This is the case. We end up sleeping with 8,000 buffer head requests
(4K block size) queued per adapter. After I made the change and
increased the size to 1024, this number increased to 17,000
buffer heads queued per adapter. Performance went up and processor
utilization went down.
With the current default we sleep far too much. This value should be
maintained on a per-card basis, and for RAID controllers that present
a single virtual disk for many physical disks (e.g. on 3ware this
number is 8), we should make the queue 8x the default. I guess each
driver would need to change this value based on how many actual drives
were attached to the controller.
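
To make the per-adapter idea concrete, here is a rough sketch (plain C;
the helper name and the way a driver would report its drive count are
made up -- this is not actual ll_rw_blk.c code):

    #define QUEUE_NR_REQUESTS_DEFAULT 128   /* today's per-queue value */

    /*
     * Hypothetical helper: scale the free request list by the number of
     * physical spindles behind the virtual disk a controller exports.
     */
    static int adapter_nr_requests(int physical_drives)
    {
            int nr = QUEUE_NR_REQUESTS_DEFAULT;

            if (physical_drives > 1)
                    nr *= physical_drives;  /* e.g. 3ware: 8 drives -> 1024 */

            return nr;
    }

A 3ware card reporting 8 drives would then get 8 x 128 = 1024 request
slots instead of making all 8 spindles fight over one default-sized
free list.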
>
> The "request starvation" thing could conceivably result in more
> seeky behaviour. In your kernel, disk writeouts come from
> two places:
This is not happening. The elevator code is above this level, and I am
seeing that the requests are, for the most part, already ordered at
this layer.
>
> - Off the tail of the dirty buffer LRU
>
> - Basically random guess, from the page LRU.
There are two scenarios: one does not use Linus' buffer cache but a
custom cache maintained between SCI nodes; the other implementation
uses Linus' buffer cache. We are seeing > 300 MB/s on the SCI cache.
>
> It's competition between these two writeout sources which causes
> decreased bandwidth - I've seen kernels in which ext2 writeback
> performance was down 40% due to this.
>
> Anyway. You definitely need to try 2.4.19-pre1. Max sleep times
> in __get_request_wait will be improved, and it's possible that the
> bandwidth will improve. Or not. My gut feel is that it won't
> help.
>
How about just increasing the value of queue_nr_requests or making
it adapter-specific?
> And yes, 128 requests is too few. It used to be ~1000. I think
> this change was made in a (misguided, unsuccessful) attempt to
> manage latency for readers. The request queue is the only mechanism
> we have for realigning out-of-order requests and it needs to be
> larger so it can do this better. I've seen 15-25% throughput
> improvements from a 1024-slot request queue.
>
> And if a return to a large request queue damages latency (it doesn't)
> then we need to fix that latency *without* damaging request merging.
>
> First step: please test 2.4.19-pre1 or -pre2. Also 2.4.19-pre1-ac2
> may provide some surprises..
>
I will test it, but unless this value is higher, I am skeptical that I
will see the needed improvement. The issue here is that it sleeps too
much; what's really happening is that we are forcing 8 disk drives to
share 64/128 request buffers rather than providing each physical disk
with what it really needs.
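
To put rough numbers on it: with 128 slots shared across the 8 drives
behind one of these adapters, each spindle only has about 128 / 8 = 16
outstanding requests to merge and reorder against; with a 1024-slot
queue each spindle effectively gets the full 128.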
Jeff