No can do.
If we tear down the page tables, we _have_ to flush the TLB on x86,
because even if we don't touch them later on, speculative execution may
end up causing TLB fills, and if we don't tell the TLB fill hw that we've
torn down the pages (by invalidating the TLB), you can get all the same
nasty behaviour.
And we cannot just defer the TLB flush to a later date ("who cares if we
get crap in the TLB, we'll flush it anyway"), because some of the bogus
TLB contents might get the "Global" bit set too. Which would mean that
those bogus entries wouldn't be flushed at all.
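
To make the "Global" bit problem concrete, here is a toy model in C. It is
only a sketch: the toy_tlb structure and every toy_* name are hypothetical and
are not kernel code. The one hardware fact it relies on is that an ordinary
x86 flush (such as reloading CR3) leaves entries marked Global in place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_TLB_SIZE 8

/* Hypothetical toy model of a TLB entry, not a real hardware layout. */
struct toy_tlb_entry {
	bool valid;
	bool global;		/* models the x86 "Global" page bit */
	unsigned long vpn;	/* virtual page number that was cached */
};

struct toy_tlb {
	struct toy_tlb_entry e[TOY_TLB_SIZE];
};

/* Models a (possibly speculative) fill: the hw caches whatever it read. */
static void toy_tlb_fill(struct toy_tlb *t, size_t slot,
			 unsigned long vpn, bool global)
{
	assert(slot < TOY_TLB_SIZE);
	t->e[slot].valid = true;
	t->e[slot].global = global;
	t->e[slot].vpn = vpn;
}

/* Models an ordinary flush: every non-global entry is dropped. */
static void toy_tlb_flush_nonglobal(struct toy_tlb *t)
{
	for (size_t i = 0; i < TOY_TLB_SIZE; i++)
		if (t->e[i].valid && !t->e[i].global)
			t->e[i].valid = false;
}

/* Counts the entries that are still cached. */
static size_t toy_tlb_count_valid(const struct toy_tlb *t)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_TLB_SIZE; i++)
		if (t->e[i].valid)
			n++;
	return n;
}
```

If a bogus fill from torn-down page tables happens to carry the Global bit,
toy_tlb_flush_nonglobal() leaves it in place, which is exactly why the flush
cannot simply be postponed.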
In short:
- if we tear down the page tables, we _have_ to flush the TLB, even if we
turn it into a lazy TLB.
- at least on x86, once you flush the TLB, the incremental cost of doing
a full mm switch is basically zero. The TLB flush is, after all, the
real cost of the mm switch (this is likely to be true on other CPUs
too).
- so we can choose between just flushing the TLB (and leaving it lazy),
in which case we flush it again on the next switch_mm() when we switch
into the next process, _OR_ we could try to opportunistically switch
mm's "early".
The early switch would at least on x86 be likely to result in the minimal
amount of TLB flushing theoretically possible. Which I kind of like (if
you can _prove_ that you cannot do better, you're in a good position ;).
But the "just flush the TLB" approach certainly also works.
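
The trade-off between the two approaches can be counted with a toy model in C.
This is only a sketch under stated assumptions: the toy_* names are
hypothetical and not kernel code, every change of mm is assumed to cost
exactly one TLB flush, and the early switch is assumed to pick its target by
guessing which mm will run next.

```c
#include <assert.h>

enum toy_policy { TOY_FLUSH_AND_STAY_LAZY, TOY_SWITCH_EARLY };

/* Hypothetical toy CPU state, not a kernel structure. */
struct toy_cpu {
	int active_mm;		/* id of the mm whose page tables are loaded */
	unsigned int flushes;	/* TLB flushes performed so far */
};

static void toy_flush_tlb(struct toy_cpu *cpu)
{
	cpu->flushes++;
}

/* Assumption: switching to a different mm always costs one flush. */
static void toy_switch_mm(struct toy_cpu *cpu, int next_mm)
{
	if (cpu->active_mm == next_mm)
		return;		/* already there, nothing to flush */
	cpu->active_mm = next_mm;
	toy_flush_tlb(cpu);
}

/*
 * Count the flushes from tearing down dead_mm's page tables until
 * actual_mm is running. guessed_mm is only used by the early switch.
 */
static unsigned int toy_count_flushes(enum toy_policy policy, int dead_mm,
				      int guessed_mm, int actual_mm)
{
	struct toy_cpu cpu = { .active_mm = dead_mm, .flushes = 0 };

	assert(dead_mm != guessed_mm && dead_mm != actual_mm);

	if (policy == TOY_SWITCH_EARLY)
		toy_switch_mm(&cpu, guessed_mm); /* the flush is the switch */
	else
		toy_flush_tlb(&cpu);		 /* flush, keep the dead mm */

	toy_switch_mm(&cpu, actual_mm);		 /* the next real switch */
	return cpu.flushes;
}
```

In this model, staying lazy always costs two flushes, while the early switch
costs one when the guess is right and two when it is wrong, so it is never
worse.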
Linus