FYI, I have used a topology to map HT-aware processors (in this case, P4) to
a NUMA topology while using this scheduler. This was done to help address
the same problems that Ingo's shared runqueue implementation fixed. The
topology is quite simple: sibling logical procs are members of the same
node, and the number of nodes equals the number of physical procs.
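(As a rough illustration only -- these helper names are hypothetical, not
taken from the patch -- the mapping amounts to something like the sketch
below. It assumes HT siblings are numbered adjacently, i.e. logical cpus
0/1 share a physical proc, 2/3 share the next, and so on:

	/* Hypothetical sketch: two logical procs (HT siblings) per node. */
	#define SIBLINGS_PER_NODE 2

	/* Node of a logical cpu; siblings map to the same node. */
	static inline int ht_cpu_to_node(int cpu)
	{
		return cpu / SIBLINGS_PER_NODE;
	}

	/* Number of nodes == number of physical procs. */
	static inline int ht_num_nodes(int num_logical_cpus)
	{
		return num_logical_cpus / SIBLINGS_PER_NODE;
	}

On the 8-logical-proc system below, this yields 4 nodes, so a low load
gets spread one task per physical proc.)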
This primarily avoids sharing CPU cores (and the resource contention that
comes with it) under low loads. In my case, with 4 tasks on an 8-logical-proc
system, we want to balance the tasks across nodes/cores for better
performance. For my test, I did a make -j4 on a 2.4.18 kernel. Results are:
stock sched, no NUMA:   56.523s elapsed, 202.899s user, 18.266s sys, 390.6% CPU
NUMA sched, HT topo:    53.088s elapsed, 189.424s user, 18.360s sys, 391.0% CPU

~6.5% better elapsed time. These results are the average of 10 kernel compiles.
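(For reference, 56.523 / 53.088 = ~1.065, i.e. roughly 6.5% faster in
elapsed time.)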
* I did make one minor change to sched_best_cpu(): the first test case was
eliminated, and that change is currently under discussion.
I did this mainly to demonstrate that a NUMA scheduler's policies may be
able to help HT systems, and to capture wider interest in the NUMA
scheduler. By no means is P4 HT required to use this; it is simply a NUMA
topology implementation. I would like some feedback on whether there is
interest in this.
One of the reasons we probably have not had much interest in NUMA patches is
that NUMA systems are not that prevalent. However, NUMA-like qualities are
showing up in commonly available systems, and I believe we can take
advantage of the policies that these patches, such as the NUMA scheduler,
provide. Does anyone have any other ideas on where NUMA-like qualities lie?
x86-64?
-Andrew Theurer
P.S. I am working on a topology patch to send out. It's quite hackish right
now.