Guidance
Traditionally we have shied away from noting specific values or thresholds that
are indicative of good or bad performance. One reason is that coming up with
good values is quite hard, and people sometimes see that a particular value is
outside some threshold and become fixated on that being the issue when in
reality it may not be. For example, the Windows NT Resource Kit had a section
stating that a disk queue length greater than two to three times the number of
disk spindles was indicative of a performance problem. When working with SQL
Server this is not always true, especially if read-ahead activity is driving the
disk queue length. Just because there is a queue of requests waiting for IO does
not necessarily mean that SQL Server is stalled waiting for the IO to complete.
We have seen disk queue lengths up in the 20-30 range (on far fewer than 10
disks) where SQL Server performance was just fine.
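To make the rule of thumb concrete, here is a minimal illustrative sketch (the function name and `factor` parameter are our own, not from any tool) of the NT Resource Kit heuristic, together with the caveat above that a tripped threshold is a prompt to investigate, not a verdict:

```python
# Hypothetical helper: the old "queue length vs. spindles" rule of thumb
# from the NT Resource Kit. A True result only flags the disk for a
# closer look; as noted above, SQL Server read-ahead can inflate the
# queue while performance remains fine.
def queue_length_suspicious(avg_disk_queue_length, spindle_count, factor=2):
    """True if queue length exceeds `factor` times the spindle count."""
    return avg_disk_queue_length > factor * spindle_count

# A queue of 25 on 8 spindles trips the heuristic (25 > 2 * 8)...
print(queue_length_suspicious(25, 8))   # True
# ...so the next step would be to check latency counters, not to
# conclude there is a problem.
```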
It should be fairly easy for you to visually identify a counter whose value
changed substantially during a problematic time period. Quite often you will find
that there are many counters that changed significantly. With a blocking
problem, for example, you might see user connections, lock waits, and lock wait
time all increase while batch requests/sec decreases. If you focused solely on
a particular counter (or a few counters) you might come to some very
different conclusions about what the problem is, and you could very
likely be wrong. Some of the changes in counter values are the cause of the
original problem, whereas others are just side effects of that problem.
In the ideal situation, the change in the counters that indicate the
cause of the problem should lead the counters showing the effect, but
due to the granularity used to capture Performance Monitor data some
of these distinctions can be lost. If you collect data once every 15 seconds
and the problem had a quick onset, it can be hard to figure out whether user
connections went up first and then lock timeouts, or vice versa. This is where
you have to use other available information, such as other performance
counters, the customer's description of the problem, etc., to form a theory as to
what you think may be wrong and then look for other supporting data to prove or
disprove your theory.
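One simple way to approach the cause-versus-effect ordering described above is to find, for each counter, the first sample where it deviated substantially from its baseline. The sketch below is purely illustrative (the function, counter names, and the 2x deviation threshold are our own assumptions), and, as the text warns, with 15-second sampling the ordering of near-simultaneous changes is unreliable:

```python
# Hypothetical sketch: given counter samples collected at a fixed
# interval, report the first sample index at which each counter rose
# above `threshold` times its baseline. Coarse sampling means ties and
# near-ties should not be trusted as a causal ordering.
def first_deviation(samples, baseline, threshold=2.0):
    """Index of the first sample exceeding threshold * baseline, else None."""
    for i, value in enumerate(samples):
        if baseline and value > threshold * baseline:
            return i
    return None

counters = {
    "User Connections": [50, 52, 120, 130],   # made-up sample data
    "Lock Waits/sec":   [5, 6, 7, 40],
}
onsets = {name: first_deviation(s, s[0]) for name, s in counters.items()}
print(onsets)  # {'User Connections': 2, 'Lock Waits/sec': 3}
```

Here user connections deviate one sample earlier than lock waits, which is consistent with, but does not prove, a connection surge driving the blocking.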
Note:

Counter                              Threshold       Description
Paging File \ %Usage                 < 70%           The amount of the Page File instance
                                                     in use, in percent. See KB 889654.
Paging File \ %Usage Peak            < 70%           The peak usage of the Page File
                                                     instance, in percent. See KB 889654.
System \ Processor Queue Length      < 4 per CPU     For standard servers with long
                                                     quantums:
                                                       <= 4 per CPU   Excellent
                                                       < 8 per CPU    Good
                                                       < 12 per CPU   Fair
Performance Disk Counters
When the data files are placed on a SAN, ignore the following! Use the
performance tools provided by the SAN vendor instead.
Counter                              Threshold            Description
PhysicalDisk \ Avg. Disk sec/Read    < 8 ms               Measure of disk latency. Avg. Disk
                                                          sec/Read is the average time, in
                                                          seconds, of a read of data from
                                                          the disk.

More Info (Reads):
    Excellent   < 8 ms    (.008 seconds)
    Good        < 12 ms   (.012 seconds)
    Fair        < 20 ms   (.020 seconds)
    Poor        > 20 ms   (.020 seconds)

PhysicalDisk \ Avg. Disk sec/Write   < 8 ms (non-cached)  Measure of disk latency. Avg. Disk
                                     < 1 ms (cached)      sec/Write is the average time, in
                                                          seconds, of a write of data to the
                                                          disk.

More Info (Non-cached Writes):
    Excellent   < 8 ms    (.008 seconds)
    Good        < 12 ms   (.012 seconds)
    Fair        < 20 ms   (.020 seconds)
    Poor        > 20 ms   (.020 seconds)
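The latency bands above can be expressed as a small classifier. This is a sketch only (the function name is our own); note that the Avg. Disk sec/Read and sec/Write counters report values in seconds, hence the conversion to milliseconds:

```python
# Sketch of the read / non-cached-write latency bands from the table
# above: Excellent < 8 ms, Good < 12 ms, Fair < 20 ms, Poor otherwise.
def rate_disk_latency(avg_disk_sec):
    """Classify an Avg. Disk sec/Read or sec/Write sample (in seconds)."""
    ms = avg_disk_sec * 1000.0
    if ms < 8:
        return "Excellent"
    if ms < 12:
        return "Good"
    if ms < 20:
        return "Fair"
    return "Poor"

print(rate_disk_latency(0.005))  # Excellent
print(rate_disk_latency(0.025))  # Poor
```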