Nitin,
I got a chance to run the NetBench benchmark with your patch on a 2.5.54-mjb2
kernel. NetBench measures SMB/CIFS performance by using several SMB clients
(in this case 44 Windows 2000 systems) to send SMB requests to a Linux
server running Samba 2.2.3a+sendfile. Results are reported as throughput in
Mbps. The network traffic on the server is generally about 60% receive and
40% transmit.
I believe we have very similar systems. Mine is a 4 x 1.6 GHz, 1 MB L3 P4
Xeon with 4 GB DDR memory (3.2 GB/sec I believe). The chipset is "Summit".
I also have more than one Intel e1000 adapter.
I decided to run a few configurations: first with just one adapter, with and
without HT support in the kernel (HT disabled with acpi=off), then with a
second adapter added, again with and without HT.
Here are the results:
4P, no HT, 1 x e1000, no kirq: 1214 Mbps, 4% idle
4P, no HT, 1 x e1000, kirq: 1223 Mbps, 4% idle, +0.74%
I suppose we didn't see much of an improvement here because we never ran into
the situation where more than one high-rate interrupt is routed to a single
CPU with irq_balance.
4P, HT, 1 x e1000, no kirq: 1214 Mbps, 25% idle
4P, HT, 1 x e1000, kirq: 1220 Mbps, 30% idle, +0.49%
Again, not much of a difference just yet, but lots of idle time. We may have
reached the limit at which one logical CPU can process interrupts for an
e1000 adapter. There are other things I can probably do to help this, such as
interrupt delay and NAPI, which I will get to eventually.
4P, HT, 2 x e1000, no kirq: 1269 Mbps, 23% idle
4P, HT, 2 x e1000, kirq: 1329 Mbps, 18% idle, +4.7%
OK, almost 5% better! This probably has to do with a couple of things: your
code does not route two different interrupts to different logical CPUs on the
same core (quite obvious by looking at /proc/interrupts), and it avoids
sending more than one interrupt to the same CPU if possible. I suspect
irq_balance did some of those [bad] things some of the time, so without kirq
we hit an interrupt processing bottleneck at a lower throughput than with
kirq.
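For anyone who wants to check the interrupt routing without eyeballing
/proc/interrupts, here is a rough Python sketch. It finds the CPU that has
serviced most of each busy interrupt and reports CPUs that own more than one
of them. The sample text, the threshold, and all function names are made up
for illustration; they are not output from my test box. The counters in
/proc/interrupts are cumulative, so this uses totals as a stand-in for rate;
sampling twice and subtracting would be more accurate.

```python
# Illustrative sample only; NOT captured from the benchmark system.
SAMPLE = """
           CPU0       CPU1       CPU2       CPU3
  0:      40000      41000      39000      40500    IO-APIC-edge  timer
 24:     900000         12          0          5   IO-APIC-level  eth0
 25:     850000          3          7          0   IO-APIC-level  eth1
NMI:          0          0          0          0
"""

def parse_interrupts(text):
    """Return (cpu_names, {irq_label: [count per CPU]})."""
    lines = text.strip().splitlines()
    cpus = lines[0].split()                      # ["CPU0", "CPU1", ...]
    counts = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()[:len(cpus)]
        # Skip lines without one numeric counter per CPU (e.g. ERR, MIS).
        if len(fields) < len(cpus) or not all(f.isdigit() for f in fields):
            continue
        counts[label.strip()] = [int(f) for f in fields]
    return cpus, counts

def busiest_cpu_per_irq(text, min_count):
    """Map each IRQ with at least min_count total events to its busiest CPU."""
    cpus, counts = parse_interrupts(text)
    return {irq: cpus[per_cpu.index(max(per_cpu))]
            for irq, per_cpu in counts.items()
            if sum(per_cpu) >= min_count}

def shared_cpus(text, min_count):
    """Return CPUs that are the busiest CPU for more than one busy IRQ."""
    by_cpu = {}
    for irq, cpu in busiest_cpu_per_irq(text, min_count).items():
        by_cpu.setdefault(cpu, []).append(irq)
    return {cpu: irqs for cpu, irqs in by_cpu.items() if len(irqs) > 1}

if __name__ == "__main__":
    print(shared_cpus(SAMPLE, min_count=500000))
```

On a live system the text would come from reading /proc/interrupts; an empty
result would mean no CPU is the main target of two busy interrupts.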
I don't think all of the idle time is because of an interrupt processing
bottleneck. I'm just not sure what it is yet :) Hopefully something will
become obvious to me...
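The percentage gains quoted with each result can be reproduced from the
throughput numbers. A small Python sketch follows; the throughput values are
the ones reported above, while the variable and function names are just for
illustration.

```python
# (configuration, Mbps without kirq, Mbps with kirq), from the results above.
RESULTS = [
    ("4P, no HT, 1 x e1000", 1214, 1223),
    ("4P, HT, 1 x e1000", 1214, 1220),
    ("4P, HT, 2 x e1000", 1269, 1329),
]

def gain_percent(baseline, patched):
    """Relative throughput change of patched over baseline, in percent."""
    return (patched - baseline) / baseline * 100.0

if __name__ == "__main__":
    for config, base, kirq in RESULTS:
        print(f"{config}: {gain_percent(base, kirq):+.2f}%")
```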
Overall I like the way it works, and I believe it can be tweaked to work with
NUMA when necessary. I hope to have access to a specweb system on a NUMA box
soon, so we can verify that.
-Andrew Theurer