> I prefer a single point of entry called load_balance() to multiple
> functionally different balancers. [...]
agreed - my cleanup patch keeps that property.
> [...] IIRC, his conclusion for the multi-queue scheduler was that an
> order of magnitude of 10ms is long enough, below you start feeling the
> balancing overhead, above you waste useful cycles.
this is one reason why we do the idle rebalancing at 1 msec granularity
right now.
> On a NUMA system this is even more important: the longer you leave fresh
> tasks on an overloaded node, the more probable it is that they allocate
> their memory there. And then they will run with poor performance on the
> node which stayed idle for 200-400ms before stealing them. So one wastes
> 200-400ms on each CPU of the idle node and at the end gets tasks which
> perform poorly, anyway. If the tasks are "old", at least we didn't waste
> too much time beeing idle. The long-term target should be that the tasks
> should remember where their memory is and return to that node.
i'd much rather vote for fork() and exec() time 'pre-balancing' and then
making it quite hard to move a task across nodes.
> > The inter-node balancing (which is heavier than even the global SMP
> > balancer), should never be triggered from the high-frequency path.
>
> Hmmm, we made it really slim. [...]
this is a misunderstanding. I'm not worried about the algorithmic overhead
_at all_, i'm worried about the effect of too frequent balancing - tasks
being moved between runqueues too often. That has shown to be a problem on
SMP. The prev_load type of statistic measurement relies on a constant
frequency - it can lead to over-balancing if it's called too often.
> So if the CPU is idle, it won't go through schedule(), except we get an
> interrupt from time to time... [...]
(no, it's even better than that, we never leave the idle loop except when
we _know_ that there is scheduling work to be done. Hence the
need_resched() test. But i'm not worried about balancing overhead at all.)
Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/