It has created exceptional situations which are rather tying our hands in
other areas.
> I mean it's not too bad but it's a mere
> workaround for:
>
> 1) lack of 64bit address space that will be fixed
> 2) lack of O(log(N)) mmap, that will be fixed too
Yes: mmap() overhead due to the linear search, VMA space consumption,
additional TLB invalidations and additional faults. The additional faults
could be fixed up via MAP_PREFAULT and are independent of nonlinearity.
Here's Ingo's original summary:

- really complex remappings (used by databases or virtualizing
  applications) create a *huge* amount of vmas - and vma's are per-process
  which puts a really big load on kernel memory allocations, especially on
  32-bit systems. I've seen applications that had a mapping setup that
  generated 128 *thousand* vmas per process, causing lots of problems.

- setting up separate mappings is expensive, causes one pagefault per page
  and also causes TLB flushes.

- even on 64-bit systems, when mapping really large (terabyte size) and
  really sparse files, sparse mappings can be a disadvantage - in the
  worst-case there can be as much as 1 more pagetable page allocated for
  every file page that is mapped in.
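
To put rough numbers on the first and third points, here is a minimal
userspace sketch, not kernel code. It assumes 4 KiB pages and 512 entries
per pagetable page (x86-64 style), and it assumes that two neighbouring
linear mappings can share a vma only when their file offsets are contiguous.
The names vmas_needed() and pte_pages_needed() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 512 PTEs per last-level pagetable page (x86-64, 4 KiB pages). */
#define PTRS_PER_PTE 512UL

/*
 * file_page[i] is the file page that backs virtual page i of a window.
 * A linear vma can only cover a run in which the file offset rises by
 * exactly one per virtual page, so the number of such runs is the number
 * of vmas a purely linear mmap() setup would need for this layout.
 */
size_t vmas_needed(const unsigned long *file_page, size_t n)
{
    size_t count = 0;

    assert(file_page != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* Start a new vma whenever the file offset is not contiguous. */
        if (i == 0 || file_page[i] != file_page[i - 1] + 1)
            count++;
    }
    return count;
}

/*
 * vpn[] holds the virtual page numbers that are actually mapped, sorted
 * in ascending order. Each distinct group of PTRS_PER_PTE consecutive
 * page numbers needs its own last-level pagetable page.
 */
size_t pte_pages_needed(const unsigned long *vpn, size_t n)
{
    size_t count = 0;

    assert(vpn != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* A new pagetable page is needed when the group index changes. */
        if (i == 0 || vpn[i] / PTRS_PER_PTE != vpn[i - 1] / PTRS_PER_PTE)
            count++;
    }
    return count;
}
```

Under these assumptions a fully scrambled window of N pages needs N vmas,
and N mapped pages spaced at least 512 pages apart need N pagetable pages,
which is the "1 more pagetable page for every file page" worst case.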
> 1) and 2) are the only reason why there's huge interest in such syscall
> right now. So I don't like it too much and I'm not convinced it was
> right to merge it in 2.5 given 2) is a software problem and I've the
> design to fix it with a rbtree extension, and 1) is a hardware problem
> that will be fixed very soon. The API is not too bad but there is a
> reason we have the vma for all other mappings.
>
> Maybe I'm missing something, I'm curious to hear what you think and what
> other cases need this syscall even after 1) and 2) are fixed.
I think that's right - the system call is very specialised and is targeted at
solving problems which have been encountered in a small number of
applications, but important ones.
Right now, I do not feel that we are going to be able to come up with an
acceptably simple VM which has both nonlinear mappings and objrmap.