However, the time cost affects everybody. The overhead of pte-chains is
very significant ... people seem to be conveniently forgetting that for
some reason. Ingo's rmap_pages thing solves the lowmem space problem, but
the time problem is still there, if not worse.

Please don't create the impression that rmap methodologies are only an
issue for large 32-bit machines - that's not true at all.

People seem to be focused on one corner case of objrmap's performance ...
If you want a countercase for pte-chain based rmap, try creating 1000
processes on a machine with a decent amount of RAM. Make them share
libraries (libc, etc.), then fork and exit in a FIFO rolling fashion
(there's a rough sketch of this below).
Just forking off a bunch of stuff (regular programs / shell scripts) that
does a similar amount of work will presumably approximate this. Kernel
compiles see large benefits here, for instance; workloads less dominated
by userspace computation would see even bigger changes.
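
Something like the sketch below is what I mean. It's untested, and the
child count, per-child "work", and roll count are placeholders rather
than anything measured, but it has the right shape: every child shares
the parent's library mappings, and the oldest child is reaped and
replaced on each iteration.

    /* Untested sketch of the workload above: keep NPROCS forked
     * children alive, all sharing the parent's mappings (libc etc.),
     * and replace them FIFO-style as they exit. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    #define NPROCS 1000
    #define ROLLS  10000

    static pid_t pids[NPROCS];
    static int head;            /* index of the oldest child */

    static void spawn(int slot)
    {
            pid_t pid = fork();

            if (pid < 0) {
                    perror("fork");
                    exit(1);
            }
            if (pid == 0) {
                    usleep(10000);  /* stand-in for a bit of real work */
                    _exit(0);
            }
            pids[slot] = pid;
    }

    int main(void)
    {
            int i;

            for (i = 0; i < NPROCS; i++)
                    spawn(i);

            /* FIFO roll: reap the oldest child, fork a replacement. */
            for (i = 0; i < ROLLS; i++) {
                    waitpid(pids[head], NULL, 0);
                    spawn(head);
                    head = (head + 1) % NPROCS;
            }

            for (i = 0; i < NPROCS; i++)
                    waitpid(pids[i], NULL, 0);
            return 0;
    }

The point is that every fork and exit has to set up and tear down rmap
entries for all the shared library pages, which is where pte-chains pay
and objrmap largely doesn't.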

I've not seen anything but a focused microbenchmark deliberately written
for the job do better on pte-chain based rmap than on partial objrmap
yet. If we had something more realistic, it would become rather more
interesting.

M.