As the Hard Disc Spins
indepth analysis of HDD performance @ Lost Circuits
Summary Excerpts
"Hard disk drive technology has moved more and more into the center of attention of the IT industry. The original role of HDDs was meant to be a simple mass storage media with relatively little emphasis on performance; twenty years ago, this did suffice. With the internet evolving and servers becoming a powerful factor in the electronic fabric of data communication throughout the entire world, it soon became obvious that storage media were also the most prevalent bottleneck, thus, the need for increased performance.
Performance of hard disk drives is a rather touchy topic, depending on who one asks, one may get very different answers, based on the different criteria applied and the different benchmarks used. Buzz words like internal performance, seek latencies, media transfer rates, response times are being thrown around, burst speed and effective host transfer rates are held against those and last not least, the cache size matters.
This is the level of argumentation for a single drive, once the subject switches to RAID configurations, things are getting even more complicated. In addition, there are the standard benchmarks and then there are those advanced benchmarks like IOMeter, which by their mere use already qualify the testers as experts. Or not?
In this and the following articles we will first discuss HDD performance in general, then go about a few basic functional drive architecture parameters and finally discuss how certain benchmarks fit into the grand scheme. We will also show, which benchmarks, despite their popularity are either meaningless for the end user or else create a false impression of the drive's capabilities. There will be follow-up articles that will deal in greater detail with individual benchmarks. "
I: Internal Drive Performance
What is HDD Performance Anyway?
Hard Disk Drive Architecture
Zones
Effective Internal Transfer Rate (TxD)
Internal Performnance vs Sequential Data Transfer Rate
Servo Bursts
Skew
Summary: Effective Internal Performance (TxD)
"Effective internal performance or effective media transfer rate is defined by the media density times the linear velocity relative to the head minus housekeeping data that are interspersed with the actual data on the platters. Higher area densities and higher rotational speeds will increase the effective internal performance, higher amounts of housekeeping data as required for e.g. positional corrections will reduce the effective transfer rate. Depending on the drive's targeted environment, the house-keeping overhead will vary, e.g. laptop drives that are subject to higher levels of vibrations will need more "corrective actions" / repositioning than desktop drives. Likewise, rack-mounted server drives may require more servo data since vibrations can easily propagate throughout an entire rack.
It should also be clear now why the relatively higher amount of servo data on high-end, e.g. SCSI drives will result in lower performance than that of comparable (in terms of rpm) desktop drives. On the other hand, within the environment that SCSI drives usually are operating in, the very same "faster" desktop drives would run into the need for constant recalibration which would cause a severe performance hit under operational conditions"
II: Averages, Seeks and other Paradoxes
Average Sequential Transfer
Read vs. Write Performance
Random Access
Long Seek vs. Short Seek Optimizations
Longer Platters Will Cause Higher Seek/ Random Access Latencies
Summary
"In this (second) article, we have covered some misleading internal drive parameters like the average sequential transfer rates for READ and WRITE as well as the different parameters influencing Random Access and Seek latencies and we further showed that different capacity models of the same drive can generate some performance data that paint the opposite picture of what the user would get in real world performance. Keep in mind that all parameters covered so far are only indicative of the drive's internal performance and should not vary from one interface to the other, at least in single drive test configurations.
That means that e.g. using the "average sequential transfer" to show differences in interfaces or controllers is somewhat useless since by definition the results will be the same. Comparing e.g. the SATA and PATA interface on the same setup with different drives can only be called a self-fulfilling prophecy and will not bear any relevance for the intended test either.
In the case of RAID setups, there will be differences between READ and WRITE performance, moreover, the interface bandwidth will likely become a limiting factor and the cause for speed matching conditions. "
III: Effective Host Transfer Rates
Interface Speed as Basis for Classification
A Brief History of ATA
Host Transfer Rates (TH) and Effective Host Transfer Rates (TxH)
Quantifying the CMD Overhead
Data Transfer Vs. I/O Performance
Summary
"In the last two articles, we have covered the internal performance parameters of hard disc drives, that is, more precisely, those parameters that mostly relate to a HDD as an electromechanical device. For a brief recap, HDDs use electronic and mechanical parts and the mechanical latencies are what mostly holds back the performance of any disc drive. The main reason is the inertia of any mechanical component, starting from the spindle and the platter to the read / write heads and the actuator. Whoever read through the first two articles on the subject should at least have some basic understanding of how media density and rotational speed translate into sequential transfers, likewise, the influence of rotational latency and seek latency on the random access speed should be clear"
IV: DMAs, Latencies and Speed Matching
Bus Master?
Other factors influencing host transfer rate
..PCI Latency
..Bus Parking
..Write Combining
Speed Matching Condition
Summary
"In the last article, I briefly brushed on the issue that current version of Windows, as well as Unix and Linux systems are not real time operating systems, rather, software stacks are created to allow internal scheduling of the execution of events. That, however, also means that there can be conflicts between the availability of interrupts and the software stacks, that is, the hardware is fighting against the software for priority. The result in cases like that is that there is no winner, performance is being reduced and at the same time, CPU usage is increased because of access errors and retries on the PCI bus and ineffective execution of transfers.
Some of the issues with poor utilization are plain and simply a matter of incorrect or marginally functional software. One good example are the chipset and bus master drivers that are necessary for almost any chipset not conforming to the original Intel specifications. That is, there are hardware bridges and buffers that need to be configured in order to optimally interface with the Microsoft OS environment by means of drivers. Only in a few rare instances will the new interface be completely transparent to the OS, one example is the ICH5 Serial ATA interface introduced by Intel with the Canterwood / Springdale chipset. However, even in this case, the transparency (the fact that the OS does not even notice anything has changed) is limited to the non RAID version of the same south bridge, namely the ICH5R, in that for RAID operation the installation of drivers will be necessary."
"The VIA Hyperion drivers are the currently last step in a development that we have followed for about 5 years to optimize the South Bridge and IDE controller interaction with the Windows environment by means of bus master and GART drivers."
"Third Party chipsets" and external controllers will in almost all cases require the installation of extra drivers, not necessarily for any basic functionality but definitely for enabling performance while reducing CPU utilization. A few well known examples are the VIA or nVidia or any other chipset bus master drivers."
V: Protocol Differences For Reduced Latencies
Cabling and Parallel Signaling Properties
Serial ATA
Shared Bus Issues
Parallel vs. Serial Command Overhead
Serial ATA: First Party DMA
SATA "Out Of Order Data Delivery" To Reduce Rotational Latencies
"we have concentrated on some of the internal design parameters that influence the internal performance of Hard Disc Drives as well as some of the issues that relate to the effective host transfer rate and the interfacing of the drive with the Host Bus Adapter (HBA) and the DMA and busmastering channels needed to interface with the system logic. The current article will concentrate more on the differences between Parallel and Serial ATA with a focus on the different interfacing protocols
Most of the marketing strategies promoting the migration from Parallel to Serial ATA have focused on the physical cabling properties, that is, the reduction from 16 bidirectional data channels to two unidirectional pairs of Low Voltage Differential Signaling ((LVDS) lines. On the surface, the obvious effect is a greatly facilitated ease of routing, reduced obstacles in the air flow and a smaller connector footprint. The main reasons, however, relate to the signaling properties in the context of the fact that parallel signaling across long distances had no headroom left for further speed grades. "
and since these are related to Part V
SATA and the 7 Deadly Sins of Parallel ATA
ATA Not So Frequently Asked Questions
both also @ Lost Circuits
VI: Command Queing
Queuing Schemes: Parallel ATA vs.Serial ATA
Mechanical Overhead
Seek Latencies
Supermarkets and Elevators
Rotational Latencies
Different Queues in Different Standards
The Hardware Behind Queuing
Queue Depth against The Rest Of The World
The Big Picture
In summary, here is the short and sweet on the different forms of Command Queuing. In Parallel ATA, the merely passive role of the drive, along with the command overhead associated with the disconnect and polling of master and slave devices on the same cable renders the legacy command queuing scheme somewhat ineffective. Therefore, there has been little incentive to move to the more sophisticated albeit more expensive command queuing scheme.
Serial ATA extensions add Native Command Queuing to the FirstPartyDMA engine setup. The point to point topology allows a continuous communication between the device and the controller, which in turn allows to take full advantage of advanced features like for example a so-called non-zero offset DMA engine setup to allow for out of order data delivery, as well as reordering of commands within the queue. SATA NCQ does not allow prioritizing of queues, however, Virtual Head of Queue attributes are possible with the effect that any such command will trash any existing queue. This virtual Head of Queue command will, thus, grant priority to itself in a Last-Man-Standing fashion. Thereafter, the previously uncompleted commands will have to be reissued.
As The Disc Spins, indepth analysis of HDD performance @ Lo
Jul 6 2004, 07:06 PM, updated 21y ago
Quote
0.1252sec
0.62
5 queries
GZIP Disabled