I can give you one data point, for the SGI SN1 platform. This is a NUMA
platform running with the DISCONTIGMEM patch that is on SourceForge.
The virt_to_page() function currently generates the following code:
	23 instructions
		18 "real" instructions
		 5 noops (I would like to believe the compiler can eventually
		   use these instruction slots for something else)
The code has
	2 load instructions that always reference node-local memory & have
	  a high probability of hitting in the caches
	1 load to the node that contains the target page
I think I see a couple of opportunities for reducing the amount of code.
However, I consider the code to be "fast enough" for most purposes.
>
> * one slab chain for each node, one spinlock for each node.
> * 2 per-cpu arrays for each cpu: one for "correct node" kmem_cache_free
> calls, one for "foreign node" kmem_cache_free calls.
> * kmem_cache_alloc allocates from the "correct node" per-cpu array,
> fallback to the per-node slab chain, then fallback to __get_free_pages.
> * kmem_cache_free checks to which node the freed object belongs and adds
> it to the appropriate per-cpu array. The array overflow function then
> sorts the objects into the correct slab chains.
>
> If virt_to_page is slow we need a different design. Currently it's
> called in every kmem_cache_free/kfree call.
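
For reference, the design quoted above could be modelled roughly as below.
This is only a user-space sketch, not kernel code. Every name in it
(sketch_free, sketch_alloc, sketch_flush, struct obj, ARRAY_MAX, the
cpu-to-node mapping) is made up for illustration. The node is stored in the
object itself, where a real allocator would derive it via virt_to_page(),
and the per-node spinlock is only indicated by a comment.

```c
#include <assert.h>
#include <stddef.h>

#define NR_NODES   2
#define NR_CPUS    4
#define ARRAY_MAX  4   /* hypothetical per-cpu array capacity */

/* Stand-in object: a real allocator would find the node via virt_to_page(). */
struct obj { int node; struct obj *next; };

/* One free chain per node; the per-node spinlock would live here. */
struct node_chain { struct obj *free; int nfree; };

/* Two arrays per cpu: "correct node" frees and "foreign node" frees. */
struct cpu_cache {
    struct obj *local[ARRAY_MAX];   int nlocal;
    struct obj *foreign[ARRAY_MAX]; int nforeign;
};

struct node_chain chains[NR_NODES];
struct cpu_cache  caches[NR_CPUS];

/* Assumed mapping for this model only: cpus are interleaved across nodes. */
int sketch_cpu_to_node(int cpu) { return cpu % NR_NODES; }

/* Overflow path: sort every object in the array onto its own node's chain. */
void sketch_flush(struct obj **arr, int *n)
{
    while (*n > 0) {
        struct obj *o = arr[--*n];
        struct node_chain *c = &chains[o->node];   /* take c's lock here */
        o->next = c->free;
        c->free = o;
        c->nfree++;
    }
}

/* Free: check which node the object belongs to, pick the matching array. */
void sketch_free(int cpu, struct obj *o)
{
    struct cpu_cache *cc = &caches[cpu];

    assert(o->node >= 0 && o->node < NR_NODES);
    if (o->node == sketch_cpu_to_node(cpu)) {
        if (cc->nlocal == ARRAY_MAX)
            sketch_flush(cc->local, &cc->nlocal);
        cc->local[cc->nlocal++] = o;
    } else {
        if (cc->nforeign == ARRAY_MAX)
            sketch_flush(cc->foreign, &cc->nforeign);
        cc->foreign[cc->nforeign++] = o;
    }
}

/* Alloc: "correct node" per-cpu array first, then the per-node chain. */
struct obj *sketch_alloc(int cpu)
{
    struct cpu_cache *cc = &caches[cpu];
    struct node_chain *c = &chains[sketch_cpu_to_node(cpu)];

    if (cc->nlocal > 0)
        return cc->local[--cc->nlocal];
    if (c->free) {                       /* take c's lock here */
        struct obj *o = c->free;
        c->free = o->next;
        c->nfree--;
        return o;
    }
    return NULL;   /* the real design would fall back to __get_free_pages */
}
```

The point of the model is that the node lookup sits on every free, which is
why the cost of virt_to_page() matters for this design.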
BTW, I think Tony Luck (at Intel) is currently changing the slab allocator
to be NUMA-aware. Are you coordinating your work with his?
--
Thanks

Jack Steiner    (651-683-5302)   (vnet 233-5302)      steiner@sgi.com