Re: CPU affinity & IPI latency

Larry McVoy (lm@bitmover.com)
Thu, 12 Jul 2001 17:36:41 -0700

Be careful tuning for LMbench (says the author :-)

Especially this benchmark. It's certainly possible to get dramatically better
SMP numbers by pinning all the lat_ctx processes to a single CPU, because
the benchmark is single threaded: only one process is runnable at a time.
In other words, if we have 5 processes, call them A, B, C, D, and E, then
the benchmark is passing a token from A to B to C to D to E and around again.

If the amount of data/instructions needed by all 5 processes fits in the
cache and you pin all the processes to the same CPU you'll get much
better performance than simply letting them float.
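
To make the token-passing pattern concrete, here is a minimal sketch of a
ring of processes connected by pipes, with optional pinning. It is not the
lat_ctx source: the name token_ring and its parameters are made up for
illustration, and it assumes Linux, since os.sched_setaffinity() is not
available elsewhere.

```python
import os
import time


def token_ring(nprocs=5, laps=1000, pin_cpu=None):
    """Pass a one-byte token around a ring of nprocs processes.

    Returns the average wall-clock seconds per hop. If pin_cpu is given,
    the caller and every forked child are restricted to that one CPU.
    (Hypothetical helper for illustration, not part of LMbench.)
    """
    old_mask = None
    if pin_cpu is not None:
        old_mask = os.sched_getaffinity(0)
        os.sched_setaffinity(0, {pin_cpu})  # forked children inherit this mask

    # pipes[i] carries the token from process i to process (i + 1) % nprocs.
    pipes = [os.pipe() for _ in range(nprocs)]
    children = []
    try:
        for i in range(1, nprocs):
            pid = os.fork()
            if pid == 0:
                # Child i: wait for the token, then forward it to its neighbour.
                try:
                    rfd, wfd = pipes[i - 1][0], pipes[i][1]
                    for _ in range(laps):
                        os.write(wfd, os.read(rfd, 1))
                finally:
                    os._exit(0)
            children.append(pid)

        # The parent is process 0: it injects the token and waits for it
        # to come all the way around, so only one process runs at a time.
        rfd, wfd = pipes[nprocs - 1][0], pipes[0][1]
        start = time.perf_counter()
        for _ in range(laps):
            os.write(wfd, b"x")
            os.read(rfd, 1)
        elapsed = time.perf_counter() - start
    finally:
        for pid in children:
            os.waitpid(pid, 0)
        for r, w in pipes:
            os.close(r)
            os.close(w)
        if old_mask is not None:
            os.sched_setaffinity(0, old_mask)  # undo the pinning
    return elapsed / (laps * nprocs)
```

Calling token_ring(5) and then token_ring(5, pin_cpu=min(os.sched_getaffinity(0)))
on an SMP box gives a rough floating-versus-pinned comparison. The numbers
include interpreter overhead, so treat them as relative, not as lat_ctx
results.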

But making the system do that naively is a bad idea.

This is a really hard area to get right but you can take a page from all
the failed process migration efforts. In general, moving stuff is a bad
idea; it's much better to leave it where it is. Everything scales better
if there is a process queue per CPU and the default is that you leave the
processes on the queue on which they last ran. However, if the load average
for a queue starts going up and there is another queue with a substantially
lower load average, then, and ONLY then, should you move the process.

I think if you experiment with that you'll see that lat_ctx does well and
so do a lot of other things.
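
The per-CPU queue policy described above can be sketched in a few lines.
This is a toy model, not kernel code: queue length stands in for load
average, "substantially lower" is reduced to a fixed threshold, and the
name rebalance and the choice of which process to move are assumptions
made for illustration.

```python
def rebalance(queues, threshold=2):
    """Move at most one process, and only when the imbalance is large.

    queues is a list of per-CPU run queues (lists of process names).
    Queue length is used as a stand-in for load average, and threshold
    is a hypothetical cutoff for "substantially lower".
    Returns (process, source_cpu, destination_cpu), or None if every
    process stays where it last ran.
    """
    loads = [len(q) for q in queues]
    src = loads.index(max(loads))
    dst = loads.index(min(loads))
    if loads[src] - loads[dst] < threshold:
        return None  # default case: leave everything in place
    proc = queues[src].pop()  # arbitrary pick; a real policy would choose
    queues[dst].append(proc)
    return proc, src, dst
```

With queues [["A", "B", "C", "D"], ["E"]] the first call moves one process
to the second queue. A second call does nothing, because the remaining
difference is below the threshold.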

An optimization on that requires hardware support. If you knew the number
of cache misses associated with each time slice, you could factor that in
and start moving processes that have a "too high" cache miss rate, with the
idea being that we want to keep all processes on the same CPU if we can
but if that is causing an excessive cache miss rate, it's time to move.
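
As a rough illustration of the cache-miss idea, the check below flags a
process for migration only when its miss count stays above a limit for
several consecutive time slices. The per-slice counts would have to come
from hardware performance counters; the function name, the limit, and the
window size are all assumptions for the sake of the example.

```python
def should_migrate(misses_per_slice, limit, window=3):
    """Return True if the last `window` time slices all exceeded `limit`.

    misses_per_slice is a list of cache-miss counts, one per time slice,
    oldest first. Requiring several bad slices in a row (a hypothetical
    choice) avoids moving a process because of a single noisy sample.
    """
    recent = misses_per_slice[-window:]
    return len(recent) == window and all(m > limit for m in recent)
```

For example, should_migrate([100, 900, 950, 980], limit=500) is True, while
a history with only one or two slices over the limit leaves the process
where it is.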

Another optimization is to always schedule an exec-ed process (as opposed
to a forked process) on a different CPU than its parent. In general, when
you exec you have a clear boundary and it's good to spread those out.
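
A small sketch of that placement rule, under stated assumptions: the name
place, the event strings, and the "least loaded other CPU" tie-breaker are
illustrative choices. The text above only says the exec-ed process should
go to a different CPU than its parent.

```python
def place(event, parent_cpu, loads):
    """Pick a CPU for a new process image.

    event is "fork" or "exec"; loads is a list of per-CPU load figures.
    A forked child stays on the parent's CPU. An exec-ed process goes to
    the least loaded CPU other than the parent's (the tie-breaker is an
    assumption made for this sketch).
    """
    if event == "fork" or len(loads) == 1:
        return parent_cpu
    others = [cpu for cpu in range(len(loads)) if cpu != parent_cpu]
    return min(others, key=lambda cpu: loads[cpu])
```

Note that place("exec", 0, [0, 1, 2]) returns 1 even though CPU 0 is the
least loaded: the point of the rule is to spread exec-ed processes out,
not just to balance load.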

All of this is based on my somewhat dated performance efforts that led to
LMbench. I don't know of any fundamental changes that invalidate these
opinions but I could be wrong.

This is an area in which I've done a pile of work and I'd be interested
in keeping a finger in any efforts to fix up the scheduler.

-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 