> only that it's nontrivial to estimate the migration costs, I think. at
> one point, around 2.3.3*, there was some effort at doing this - or
> something like it. specifically, the scheduler kept track of how long
> a process ran on average, and was slightly more willing to migrate a
> short-slice process than a long-slice. "short" was defined relative
> to cache size and a WAG at dram bandwidth.
yes. I added the avg_slice code, and I removed it as well - it was
hopeless to get it right and it was causing bad performance for certain
application loads. Current CPUs simply do not support any good way of
tracking the cache footprint of processes. There are methods that are an
approximation (e.g. uninterrupted runtime and cache footprint are in a
monotonic relationship), but none of the methods (including cache traffic
machine counters) are good enough to cover all the important corner cases,
due to cache aliasing, MESI-invalidation and other effects.
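
For readers who have not seen the idea, here is a rough sketch of a
slice-length heuristic of the kind described above. It is not the code
that was in the kernel: the struct, the function names, the averaging
weight, and the cache size and DRAM bandwidth figures are all
assumptions made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-task bookkeeping; not the real kernel structure. */
struct task_stats {
    uint64_t avg_slice_ns;   /* decaying average of uninterrupted runtime */
};

/* Assumed machine parameters, i.e. the "WAG" mentioned in the thread. */
#define CACHE_SIZE_BYTES   (512u * 1024u)   /* assumed cache size */
#define DRAM_BYTES_PER_US  400u             /* assumed DRAM bandwidth */

/* Fold the slice that just ended into the running average (weight 1/2). */
static void update_avg_slice(struct task_stats *t, uint64_t slice_ns)
{
    t->avg_slice_ns = (t->avg_slice_ns + slice_ns) / 2;
}

/* Rough time to refill the whole cache from DRAM, in nanoseconds. */
static uint64_t cache_refill_ns(void)
{
    return (uint64_t)CACHE_SIZE_BYTES / DRAM_BYTES_PER_US * 1000u;
}

/*
 * A task whose average slice is shorter than the refill time is assumed
 * to have a small footprint, so moving it is assumed to be cheap.
 */
static bool cheap_to_migrate(const struct task_stats *t)
{
    return t->avg_slice_ns < cache_refill_ns();
}
```

The weakness is visible in the sketch itself: runtime is only a proxy
for footprint, and nothing here can see aliasing or invalidation
effects.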
> the rationale was that if you run for only 100 us, you probably don't
> have a huge working set. that justification is pretty thin, and
> perhaps that's why the code gradually disappeared.
yes.
> hmm, you really want to monitor things like paging and cache misses,
> but both might be tricky, and would be tricky to use sanely. a really
> simple, and appealing heuristic is to migrate a process that hasn't
> run for a long while - any cache state it may have had is probably
> gone by now, so there *should* be no affinity.
well, it doesn't take much for a process to populate the whole L1 cache with
dirty cachelines. (which then have to be cross-invalidated if this process
is moved to another CPU.)
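
A minimal sketch of the "hasn't run for a long while" test from the
quoted text, for comparison. The struct, the function name and the
10 ms decay threshold are hypothetical, chosen only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed time after which cached state is presumed evicted. */
#define CACHE_DECAY_NS (10u * 1000u * 1000u)

/* Hypothetical per-task record; not a real kernel structure. */
struct task_info {
    uint64_t last_ran_ns;   /* timestamp when the task last left a CPU */
    int      last_cpu;      /* CPU it last ran on */
};

/*
 * Treat the task as still "warm" on a CPU only if it last ran there
 * and did so recently. Anything else is considered free to migrate.
 */
static bool has_affinity(const struct task_info *t, int cpu, uint64_t now_ns)
{
    if (t->last_cpu != cpu)
        return false;
    return now_ns - t->last_ran_ns < CACHE_DECAY_NS;
}
```

The check looks only at elapsed time, so it knows nothing about how
many dirty cachelines the task left behind on its old CPU.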
Ingo