OK, well I hacked this for now:
sched.c somewhere:
- lnode_number[i] = pnode_to_lnode[SAPICID_TO_PNODE(cpu_physical_id(i))];
+ lnode_number[i] = i/4;
Which makes the pools work properly. I think you should just use
the cpu_to_node macro here, but the hack will allow us to do some
testing.
Results, averaged over 5 kernel compiles:
Before:
Elapsed: 20.82s User: 191.262s System: 59.782s CPU: 1206.4%
After:
Elapsed: 21.918s User: 190.224s System: 59.166s CPU: 1137.4%
So you actually take a little less horsepower to do the work, but
don't utilize the CPUs quite as well, so elapsed time is higher.
I seem to recall getting better results from Mike's quick hack
though ... that was a long time back. What were the balancing
issues you mentioned?
M.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/