> Before:
> Elapsed: 20.18s User: 192.914s System: 48.292s CPU: 1195.6%
>
> After:
> Elapsed: 19.798s User: 191.61s System: 43.322s CPU: 1186.4%
>
> That's for a kernel compile.
>
UP or SMP?
And was that the complete patch, or just the modification to slab.c?
I've made a microbenchmark of kmem_cache_alloc/free of 4 kB objects, on
UP, AMD Duron:
            1 object      4 objects
cur         145 cycles     662 cycles
patched     133 cycles    2733 cycles
Summary:
* for one object, the patch is a slight performance improvement. The
reason is that the fallback from partial to free list in
kmem_cache_alloc_one is avoided.
* the overhead of kmem_cache_grow/shrink is around 500 cycles, which in
the 4-object case is a slowdown of roughly a factor of 4. The cache had
no constructor/destructor.
* everything was measured in a cache-hot state. [100 runs in a loop,
loop overhead subtracted. 98 or 99 runs completed in the given time,
except for patched-4obj, where 24 runs completed in 2735 cycles and 72
in 2733 cycles.]
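
For reference, the arithmetic behind the summary can be checked with a
few lines of C. This is only a sketch built on the four cycle counts in
the table. The helper names are made up for illustration, and dividing
the extra cost evenly across the four objects is an assumption, not
something that was measured separately.

```c
#include <assert.h>
#include <stdio.h>

/* Cycle counts copied from the table (UP, AMD Duron, 4 kB objects). */
struct result {
	const char *name;
	unsigned long one_obj;   /* cycles for alloc/free of 1 object  */
	unsigned long four_obj;  /* cycles for alloc/free of 4 objects */
};

static const struct result cur     = { "cur",     145,  662 };
static const struct result patched = { "patched", 133, 2733 };

/* Cycles saved by the patch in the 1-object case. */
unsigned long saving_one_object(void)
{
	return cur.one_obj - patched.one_obj;
}

/* Extra cycles the patched allocator needs in the 4-object case. */
unsigned long extra_cycles(void)
{
	return patched.four_obj - cur.four_obj;
}

/* Extra cost per object, ASSUMING it is spread evenly over all 4. */
unsigned long extra_per_object(void)
{
	return extra_cycles() / 4;
}

/* Slowdown factor in the 4-object case, scaled by 100 (integer math). */
unsigned long slowdown_x100(void)
{
	assert(cur.four_obj != 0);
	return patched.four_obj * 100 / cur.four_obj;
}

/* Print the derived numbers; call this from main() to see them. */
void report(void)
{
	printf("1 object : patch saves %lu cycles\n", saving_one_object());
	printf("4 objects: patch costs %lu extra cycles (%lu per object)\n",
	       extra_cycles(), extra_per_object());
	printf("4 objects: slowdown factor %lu.%02lu\n",
	       slowdown_x100() / 100, slowdown_x100() % 100);
}
```

Under that even-split assumption the extra cost comes out a little over
500 cycles per object, which matches the "around 500 cycles" estimate.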
For SMP and slabs that are per-cpu cached, the change could be right,
because the arrays should absorb bursts. But I do not think that the
change is the right approach for UP.
--
	Manfred