When tracking down an elusive Active Directory performance problem, gathering statistics with the Active Directory Diagnostics Data Collector Set (DCS) is the best way to see what the Domain Controller (DC) is doing. However, very busy DCs, combined with not knowing exactly when the problem will occur, can make it harder to capture the data and generate a useful report.
First, let’s make sure we can capture all the data on a busy DC. We’ll need to create a custom version of the default Active Directory Diagnostics DCS to be able to make the necessary changes.
Start by creating a new User Defined Data Collector Set. Give it a name, choose to create it from a template, and select Active Directory Diagnostics as the template. Then finish the wizard.
Now we’ll make a small change that allows the DCS to continuously capture data without fear of impacting performance or disk capacity too much. Open the Data Manager Settings, set the Resource Policy to Delete oldest, and adjust the size or folder count if needed.
Next, we make the DCS long running but still have a manageable size. Open the DCS Properties and switch to the Stop Condition tab. Turn off the Overall Duration checkbox, turn on Restart the data collector set at limits, then choose a limit based on either duration or size. For relatively busy DCs I’d suggest a duration time limit of 1 hour. Adjust as needed for your DCs. This change to the stop conditions causes the DCS to run until you turn it off, but it will switch to a new set of files in a new folder once the configured limit is reached. This helps keep each capture from growing too large and doesn’t risk missing any events.
Start your DCS, then wait. Once the event you wanted to capture occurs, stop the DCS. The event and the conditions around it will be somewhere in one of the collection folders. Hopefully you know when the event occurred so you can isolate it to a single set of capture files.
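If you know roughly when the event happened, a small script can narrow down which collection folder to look at. The following is a minimal Python sketch, not part of the DCS tooling. It assumes each collection folder's .etl files stop being written when the DCS rolls over to the next folder, so the newest .etl modification time approximates the end of that capture segment. The function names `segment_end` and `find_capture_folder` are hypothetical.

```python
from datetime import datetime
from pathlib import Path


def segment_end(folder: Path) -> float:
    """Newest modification time (epoch seconds) among the folder's .etl files.

    Assumption: trace files are last written when the segment ends.
    """
    return max(f.stat().st_mtime for f in folder.glob("*.etl"))


def find_capture_folder(root, event_time: datetime):
    """Return the earliest folder whose segment ended at or after event_time.

    Returns None if every segment ended before the event.
    """
    folders = [d for d in Path(root).iterdir()
               if d.is_dir() and any(d.glob("*.etl"))]
    for folder in sorted(folders, key=segment_end):
        if segment_end(folder) >= event_time.timestamp():
            return folder
    return None
```

Calling `find_capture_folder` with the DCS output directory and the local time of the event returns the folder most likely to contain it. Treat the result as a starting point and check the neighboring folders if the event sits close to a rollover.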
When the DCS stops, it will automatically create the report for the final set of files but not for any prior sets, so we’ll have to generate the reports for those manually. In each DCS collection folder, you should see four files: Active Directory.etl, AD Registry.xml, NtKernel.etl and Performance Counter.blg. Generating the report requires a fifth file, which we will create manually. Create a new text file in the folder called reportdefinition.txt. In that text file, add the following XML and save it.
<Report name="wpdcAdvisor" version="1" threshold="9999">
  <Import file="%systemroot%\pla\reports\Report.System.Common.xml"/>
  <Import file="%systemroot%\pla\reports\Report.System.Summary.xml"/>
  <Import file="%systemroot%\pla\reports\Report.System.Performance.xml"/>
  <Import file="%systemroot%\pla\reports\Report.System.CPU.xml"/>
  <Import file="%systemroot%\pla\reports\Report.System.Network.xml"/>
  <Import file="%systemroot%\pla\reports\Report.System.Disk.xml"/>
  <Import file="%systemroot%\pla\reports\Report.System.Memory.xml"/>
  <Import file="%systemroot%\pla\reports\Report.System.Configuration.xml"/>
  <Import file="%systemroot%\pla\Reports\Report.AD.xml"/>
</Report>
You may notice these are the same files that show up in the Data Manager Settings, Rules section for the DCS.
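If there are many older collection folders, creating reportdefinition.txt by hand in each one gets tedious. Here is a minimal Python sketch that writes the same XML into every collection folder under a given root. The function names and the rule for picking folders (contains at least one .etl file and has no reportdefinition.txt yet) are my own assumptions, so adjust them to your layout.

```python
from pathlib import Path

# Report rule files listed in the article's reportdefinition.txt.
REPORT_IMPORTS = [
    r"%systemroot%\pla\reports\Report.System.Common.xml",
    r"%systemroot%\pla\reports\Report.System.Summary.xml",
    r"%systemroot%\pla\reports\Report.System.Performance.xml",
    r"%systemroot%\pla\reports\Report.System.CPU.xml",
    r"%systemroot%\pla\reports\Report.System.Network.xml",
    r"%systemroot%\pla\reports\Report.System.Disk.xml",
    r"%systemroot%\pla\reports\Report.System.Memory.xml",
    r"%systemroot%\pla\reports\Report.System.Configuration.xml",
    r"%systemroot%\pla\Reports\Report.AD.xml",
]


def report_definition_xml() -> str:
    """Build the report definition XML as a single line."""
    imports = "".join(f'<Import file="{path}"/>' for path in REPORT_IMPORTS)
    return ('<Report name="wpdcAdvisor" version="1" threshold="9999">'
            f"{imports}</Report>")


def write_report_definitions(root) -> list:
    """Write reportdefinition.txt into each capture folder that lacks one.

    Assumption: a capture folder is any direct subfolder of root holding
    at least one .etl file. Returns the folders that were written to.
    """
    written = []
    for folder in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        target = folder / "reportdefinition.txt"
        if any(folder.glob("*.etl")) and not target.exists():
            target.write_text(report_definition_xml(), encoding="utf-8")
            written.append(folder)
    return written
```

Running `write_report_definitions` against the DCS output directory leaves existing definition files untouched, so it is safe to run again after more folders have accumulated.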
Finally, execute the following command line from within the capture directory you want to use for the report.
tracerpt.exe *.blg *.etl -df reportdefinition.txt -report report.html -f html
If everything went right, you should end up with a normal DCS diagnostic report covering the time period in which the event occurred.
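To generate reports for several folders in one pass, the same command can be wrapped in a short script. This Python sketch is only an illustration and the function names are hypothetical. It passes the wildcards to tracerpt.exe unchanged, as the command above does, and runs it with the capture folder as the working directory. It assumes reportdefinition.txt already exists in that folder.

```python
import subprocess


def build_tracerpt_command(definition="reportdefinition.txt",
                           report="report.html"):
    """Return the argument list for the report command used in the article."""
    return ["tracerpt.exe", "*.blg", "*.etl",
            "-df", definition,
            "-report", report,
            "-f", "html"]


def generate_report(folder):
    """Run tracerpt.exe inside one capture folder (Windows only).

    Raises subprocess.CalledProcessError if tracerpt exits with an error.
    """
    subprocess.run(build_tracerpt_command(), cwd=folder, check=True)
```

Calling `generate_report` once per older collection folder produces a report.html next to each set of capture files. On a busy DC, consider copying the folders to another machine first, since compiling a report can take a while.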
As a neat trick, if you need to see more than the top 25 items that the report defaults to, you can run the following command to get full XML output:
tracerpt.exe -lr "Active Directory.etl"
For additional reading on similar but not identical issues that led me to this solution, see the Canberra PFE team blog post "Issues with Perfmon reporting - Turning ETL into HTML", the Directory Services Team blog post "Are your DCs too busy to be monitored?: AD Data Collector Set solutions for long report compile times or report data deletion", and the Core Infrastructure and Security blog post "Taming Perfmon: Data Collector Sets".