Nitin,
I am seeing if I can get together a test for Netbench with/without your patch.
Looking at the patch, I would expect a slight increase in performance. I'll
let you know the results as soon as I have them.
I do have one question for you: Have you tested netperf using only one
gigabit adapter? If so, have you been able to max out the adapter when using
hyperthreading? If not, could you test this? So far I have not been able
to, while I can quite easily with no hyperthreading (in Netbench). This is
the case with both irq_balance and irq affinity. having ints processed by
one and only one logical CPU at one time really seems to bottleneck network
throughput. I'm sure some of this has to do with sharing those resources
among 2 logical CPUs, but I also wonder if int processing is just a lot
slower than P3 overall.
I am bringing this up, because I recall James Cleverdon having some code which
allows interrupts to be dynamically routed to two CPU destinations, a pair of
CPUs with consecutive CPU ID's. Interrupts are dynamically routed to the
least loaded CPU, and if both are idle, to the CPU with the lower CPUID. I
like this idea, because when in HT, if consecutive logical CPU ID's map to
one physical core, we get to use "whole" processor, and both destinations
share the cache. Anyway, just a thought.
-Andrew Theurer
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/