Detecting disk system bottlenecks

How to find out if the disk subsystem is going bad.

When your Exchange server starts getting sluggish, it's a good idea to suspect the disk subsystem first. But how can you be sure? This tip, excerpted from InformIT, discusses how to find out whether the disk subsystem really is the culprit.


With most Exchange systems, the disk subsystem has the most influence on performance. The primary consideration with the disk subsystem is not its size, but its capability to handle multiple random reads and writes quickly. For example, when Exchange users open their inboxes, the set of properties in the default folder view must be read for approximately the first 20 messages. If the property information is not in the cache, it must be read from disk. Likewise, a message transferred from one server to another must be written to disk before the receiving server can acknowledge its receipt. (This is a safety measure that prevents message loss during power outages.) Now imagine the read and write activity created by 300 heavy email users on one server. Their combined requests would generate a flood of random I/O traffic on the disk subsystem.

CAUTION

Sometimes you see extremely high % Disk Time values and think that your disk subsystem is causing a bottleneck. However, you should examine other overview counters before committing to any one direction. For example, when available memory drops to critical levels, Windows 2000 begins to page (write unused data or code to the hard drive to make room for more active programs). In a case of extreme RAM starvation, your disk subsystem can be reading and writing furiously and only appear to be the bottleneck. Looking at other general counters in PerfMon, memory counters in particular, will dispel this illusion.

When you examine both memory and disk subsystem counters, you'll notice that during prolonged memory paging, disk activity increases. The solution is to add more memory, not to increase your disk subsystem capacity.
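The cross-check described above can be sketched in a few lines of Python. This is only an illustration, not part of the original tip: the function name `likely_cause`, the idea of feeding it samples of Memory: Pages/sec and Memory: Available MBytes, and the paging and low-memory thresholds are all assumptions chosen for the example. Only the 95 percent busy figure comes from the counter guidance in this tip.

```python
from statistics import mean

def likely_cause(disk_time_pct, pages_per_sec, available_mb,
                 busy_pct=95.0, paging_rate=20.0, low_memory_mb=4.0):
    """Classify a window of PerfMon samples as 'ok', 'memory', or 'disk'.

    paging_rate and low_memory_mb are illustrative assumptions,
    not recommended values; tune them for your own server.
    """
    # If the disk is not busy on average, there is nothing to explain.
    if mean(disk_time_pct) <= busy_pct:
        return "ok"
    # Heavy paging together with little free RAM suggests the disk
    # activity is a symptom of memory starvation.
    paging_hard = (mean(pages_per_sec) > paging_rate
                   and mean(available_mb) < low_memory_mb)
    return "memory" if paging_hard else "disk"

if __name__ == "__main__":
    # Busy disk, heavy paging, almost no free memory.
    print(likely_cause([98, 99, 97], [150, 200, 180], [3, 2, 3]))
    # Busy disk, but memory looks healthy.
    print(likely_cause([98, 99, 97], [2, 1, 3], [200, 210, 190]))
```

A "memory" result points toward adding RAM, while a "disk" result suggests looking more closely at the disk counters themselves.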

If you suspect that the server's disk subsystem is forming a bottleneck that slows down user requests, examine the following Windows 2000 Performance Monitor counters:

  • Physical Disk: % Disk Time -- % Disk Time is the percentage of elapsed time that the selected disk drive is busy servicing read or write requests. In other words, this counter indicates how busy your disk subsystem is over the period that you're measuring in PerfMon. A consistent average over 95 percent indicates significant disk activity.
  • Physical Disk: Current Disk Queue Length -- This counter measures the number of requests waiting to use the disk subsystem at the time the performance data was collected. Multispindle disk devices can have numerous requests active at any instant. Requests experience a delay directly proportional to the queue length minus the number of spindles on the disk. This counter should average less than 2 for good performance. Use the Current Disk Queue Length counter together with the % Disk Time counter to get a solid overview of your disk subsystem's workload.

Both counters can monitor either your server's physically installed disk spindles or RAID bundles.
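As a rough illustration of reading the two counters together, the following Python sketch summarizes a window of samples. The function name `disk_report`, the returned dictionary keys, and the decision to flag a bottleneck only when both thresholds are exceeded are assumptions made for this example. The 95 percent and queue-length-of-2 figures, and the "queue length minus spindles" idea, come from the counter descriptions above.

```python
from statistics import mean

def disk_report(disk_time_pct, queue_lengths, spindles=1,
                busy_pct=95.0, queue_limit=2.0):
    """Summarize % Disk Time and Current Disk Queue Length samples."""
    avg_busy = mean(disk_time_pct)
    avg_queue = mean(queue_lengths)
    # Delay is described as proportional to the queue length minus
    # the number of spindles; never report a negative excess.
    excess_queue = max(0.0, avg_queue - spindles)
    return {
        "avg_busy_pct": avg_busy,
        "avg_queue": avg_queue,
        "excess_queue": excess_queue,
        # Assumption: require both counters to look bad before
        # calling the disk subsystem a bottleneck.
        "bottleneck": avg_busy > busy_pct and avg_queue >= queue_limit,
    }

if __name__ == "__main__":
    # Hypothetical samples from a two-spindle array.
    print(disk_report([97, 98, 99], [4, 6, 5], spindles=2))
```

Requiring both conditions is a conservative choice: a busy disk with a short queue is keeping up with demand, and a briefly long queue on an otherwise idle disk is usually a burst rather than a sustained problem.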


To read the entire article from which this tip is excerpted, click over to InformIT. You have to register there, but the registration is free.


This was first published in November 2002
