Bonwick has a newer paper
(http://www.usenix.org/events/usenix01/bonwick.html)
that describes how per-cpu support can be added. I've forgotten my Usenix
password, so I can't get the full text of the paper online at the moment.
But, if I recall correctly, his magazine layer included support to
dynamically adjust the size of the per-cpu lists.
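
To make the idea concrete, here is a rough sketch of a per-cpu object
cache whose capacity can be adjusted at run time. It is written from memory
of the general idea, not from the paper's text, and none of the names
(struct magazine, mag_alloc, mag_free, mag_grow, MAG_MAX) are kernel or
Solaris API; they are made up for illustration. Locking and the shared
layer behind the cache are left out.

```c
#include <stddef.h>

#define MAG_MAX 16	/* hypothetical upper bound on the per-cpu list size */

/* One of these per cpu; "size" is the tunable capacity. */
struct magazine {
	size_t size;		/* current capacity, 1..MAG_MAX */
	size_t rounds;		/* number of objects cached right now */
	void *objs[MAG_MAX];	/* cached free objects, used as a stack */
};

/* Return a cached object, or NULL if the caller must fall back to
 * the shared slab layer. */
static void *mag_alloc(struct magazine *m)
{
	return m->rounds ? m->objs[--m->rounds] : NULL;
}

/* Return 1 if the object was cached, 0 if the list is full and the
 * caller must hand the object back to the shared slab layer. */
static int mag_free(struct magazine *m, void *obj)
{
	if (m->rounds >= m->size)
		return 0;
	m->objs[m->rounds++] = obj;
	return 1;
}

/* Grow the capacity by one, e.g. when the shared layer is being hit
 * too often. The trigger policy is an assumption and is not shown. */
static void mag_grow(struct magazine *m)
{
	if (m->size < MAG_MAX)
		m->size++;
}
```

The extra complexity is mostly in deciding when to call something like
mag_grow(), which is the part that would have to pay for itself.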
The question becomes: Are the performance benefits high enough to justify
this extra code complexity? Especially as tuning using /proc/slabinfo is
already available to mitigate problems that are bad enough for people to
notice.
Can you quantify the SMP/NUMA benefits? I took some measurements a while
ago that showed that a huge percentage of slab allocations were freed by the
same cpu after a very short lifetime. I didn't look into how often the
problems that you cite occur.
> I agree that preserving read only variables that can be used between uses
> will help performance. We still can do that by revising the assumption to
> leave the first 4 or whatever bytes needed to store the links. What do you
> think?
You'd need enough bytes to store your pointer (so "whatever" == 8 on 64-bit
architectures). Users that care to arrange the fields of their structures
in "used together" order for better cache locality tend to put their efforts
into the first elements of a structure. You might get less resistance to
change if you use the tail end of the object? But this is a potentially big
change. Drivers can create their own slab caches, and if you change the
semantics, then you may well break something.
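
A minimal sketch of what I mean by using the tail end, assuming every
object in the cache is at least sizeof(void *) bytes. The helper names
(link_offset, push_free, pop_free) are hypothetical and not existing slab
code; the point is only that the free-list link occupies the last
pointer-sized bytes, so the fields at the front of the object survive
between uses.

```c
#include <stddef.h>
#include <string.h>

/* Offset of the free-list link: the last sizeof(void *) bytes of the
 * object. Assumes obj_size >= sizeof(void *). */
static size_t link_offset(size_t obj_size)
{
	return obj_size - sizeof(void *);
}

/* Put obj on the free list; only the tail of the object is written. */
static void push_free(void **head, void *obj, size_t obj_size)
{
	/* memcpy avoids assuming the tail is pointer-aligned */
	memcpy((char *)obj + link_offset(obj_size), head, sizeof(void *));
	*head = obj;
}

/* Take an object off the free list, or return NULL if it is empty. */
static void *pop_free(void **head, size_t obj_size)
{
	void *obj = *head;

	if (obj)
		memcpy(head, (char *)obj + link_offset(obj_size),
		       sizeof(void *));
	return obj;
}
```

Any cache user that keeps preserved state in the last bytes of its objects
would still be broken by this, which is why it needs care.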
-Tony