I recently posted on SQLServerCentral.com with regard to a question about Disk Queue Length…
“As an employee of a major data managment/san storage provider I get asked questions like this quite often, and I really liked the one answer: “It Depends.” This is by far the most accurate answer I can start out with in my own discussions.
As I read through the posts I think most everyone had the correct answers – for any given moment for any specific situation.
Just to clarify a misstatement…. RAID 1 or 0+1, or 10 is better suited for transaction logs than RAID 5. The write penalty of RAID 5 is (4) IO’s per write while the write penalty for RAID 1/0+1/10 is only (2) IO’s per write. Since the transaction logs are primarily write oriented, RAID 5 would cost you twice as many IO’s as RAID 1/0+1/10
The type of disk drive in the SAN array would determine the number of IOPs that could be expected out of that disk. 15K RPM Fibre Channel is usually rated around 180 IOPs per disk. 10K RPM Fibre Channel is rated around 140 IOPs, while SATA 7200 RPM is rated around 70 IOPs per disk. Solid State/Enterprise Flash Drives are similar in Write Performance to 15K Fibre Channel disks, but in READ scenarios are capable of 2500 or more READ IOPs.
In the case of the SAN Array with 10 drives (assuming they are running the highest performance spinning disk @ 15K RPM) there would be available in that array ~1800 IOPs. If the activity is all read then there is no additional RAID write penalty associated with this. This means that a query that performs a huge table scan would have about 1800 IOPs of capacity to satisfy that request. (for table scans – IO Bandwidth becomes an important factor as well, and along these lines IO READ SIZE is critical to determining MAX BANDWIDTH. This is a slightly different discussion but may be relative on TABLE SCAN intensive queries such as Microsoft’s Fast Track Data Warehouse reference architecture.)
So back to the original question where Spotlight triggers an alert at 477 IOs Queued up. – IF – the queue depth is satisfied in a timely manner – such as in a second or whatever seems reasonable to your specific environment – then this is usually considered to be fine. If the queue depth is held in an oversubscribed level for an extended period of time then this would indicate the drives are being asked to work harder and longer then they are physically capable of working, so requests for work are backing up in a queue waiting to be serviced by the disks. There are only two ways that I know of to fix long lines at the grocery store when we checkout – one would be to open up additional checkout lanes (for the SAN admin and the DBA this would mean: get more disks involved in the workload so the requests are satisfied in a timely manner), or two – make it more efficient for the checkers to process each customer more quickly – spending less time on each customer – thereby increasing their efficiency and reducing the time other customers must stand in line waiting to be checked out (translated as: the DBA would need to optimize the queries to use less disk resources – which as we all know is often times needed, but not always a viable option due to application constraints.)
Relational Database systems are probably the most complex and most difficult of all of the various systems that IT practitioners have to deal with because there is really no such thing as a “typical database”. For this reason we as DBA’s must think in terms of guidelines and principles, and best practices to direct our thinking towards the right answers for our individual situations at a specific point in time – and the only thing that we can be sure of is that things will change and we’ll have to readjust our thinking again.
I hope this helps to point you in the correct direction with regard to your database/SAN questions. I’m happy to answer additional questions. My name is Randy Loeschner and I work for EMC. I can be reached at my blog: http://www.randyloeschner.com. Cheers! “