Yes, I like it! I needed some time to understand that the per_cpu
variables can spread the exec'd tasks across the nodes as well as the
atomic sched_node. Sure, I'd like to select the least loaded node instead
of the least loaded CPU. It can well be that you have just created 10
threads on a node (by fork, therefore still on their original CPU), while
an idle CPU in the same node hasn't yet stolen any of the newly created
tasks. Suppose your instant load looks like this:
node 0: cpu0: 1 , cpu1: 1, cpu2: 1, cpu3: 1
node 1: cpu4:10 , cpu5: 0, cpu6: 1, cpu7: 1
If you exec on cpu0 before cpu5 manages to steal something from cpu4,
you'll aim for cpu5. That would only increase the node imbalance and
force more of the threads on cpu4 to migrate to node 0, which may be bad
for them. Just an example... Once you start considering non-trivial
cpus_allowed masks, you get more of these cases.
We could take this as a design target for the initial load balancer
and keep the fastest version we have for the benchmarks we currently
use (Michael's).
Regards,
Erich