As I tried to explain to Andrew just the other day, this is neither a
drive nor HBA problem. You've essentially constructed a benchmark where
a single process can get so far ahead of the I/O subsystem in terms of
buffered writes that there is no choice but for there to be a long delay
for the device to handle your read. Consider that because you are queuing,
the drive will completely fill its cache with write data that is pending
to hit the platters. The number of transactions in the cache is marginally
dependant on the number of tags in use since that will affect the ability
of the controller to saturate the drive cache with write data. Depending
on your drive, mode page settings, etc, the drive may allow your read to
pass the write, but many do not. So you have to wait for the cache to
at least have space to handle your read and perhaps have even additional
write data flush before your read can even be started. If you don't like
this behavior, which actually maximizes the throughput of the device, have
the I/O scheduler hold back a single processes from creating such a large
backlog. You can also read the SCSI spec and tune your disk to behave
differently.
Now consider the read case. I maintain that any reasonable drive will
*always* outperform the OS's transaction reordering/elevator algorithms
for seek reduction. This is the whole point of having high tag depths.
In all I/O studies that have been performed todate, reads far outnumber
writes *unless* you are creating an ISO image on your disk. In my opinion
it is much more important to optimize for the more common, concurrent
read case, than it is for the sequential write case with intermittent
reads. Of course, you can fix the latter case too without any change to
the driver's queue depth as outlined above. Why not have your cake and
eat it too?
-- Justin - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/