Re: [Lse-tech] [RFC] [PATCH] Scalable Statistics Counters

Jack Steiner (steiner@sgi.com)
Sat, 8 Dec 2001 21:46:15 -0600 (CST)

Next message: Trever L. Adams: "Re: 2.4.14/16 load reboots"
Previous message: Richard Gooch: "Re: devfs unable to handle permission: 2.4.17-pre[4,5]"
Next in thread: Paul Jackson: "Re: [Lse-tech] [RFC] [PATCH] Scalable Statistics Counters"
Reply: Paul Jackson: "Re: [Lse-tech] [RFC] [PATCH] Scalable Statistics Counters"

>
> On Fri, 7 Dec 2001, Dipankar Sarma wrote:
> > .... If we extend kmem_cache_alloc() to allocate memory
> > in a particular NUMA node, we could simply do this for placing the
> > counters -
> >
> > static int pcpu_ctr_mem_grow(struct pcpu_ctr_ctl *ctl, int flags)
> > {
> >         ...
> >
> >         /* Get per cpu cache lines for the block */
> >         for_each_cpu(cpu) {
> >                 blkp->lineaddr[cpu] = kmem_cache_alloc_node(ctl->cachep,
> >                                         flags, CPU_TO_NODE(cpu));
> >                 ...
> >         }
> >
> > This would put the block of counters corresponding to a CPU in
> > memory local to the NUMA node.
>
> Rather than baking into each call of kmem_cache_alloc_node()
> the CPU_TO_NODE() transformation, how about having a
> kmem_cache_alloc_cpu() call that allocates closest to a
> specified cpu.

I think it depends on whether the slab allocator manages memory by cpu or
node. Since the number of cpus per node is rather small (<=8) for
most NUMA systems, I would expect the slab allocator to manage by node.
Managing by cpu would likely add extra fragmentation and no real performance
benefit.

>
> I would prefer to avoid spreading the assumption that for each
> cpu there is an identifiable node that has a single memory
> that is best for all cpus on that node.

But this is already true for the page allocator (alloc_pages_node()).

The page pools are managed by node, not cpu. All memory on a node is
managed by a single pg_data_t structure. This structure contains/points-to
the tables for the memory on the node (page structs, free lists, etc).

If a cpu needs to allocate local memory, it determines its node_id.
This node_id is in the cpu_data structure for the cpu, so this is an easy
calculation (one memory reference). The node_id is used to find the pg_data_t
struct for the node (index into an array of node-local pointers, so again, no
off-node references).
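
As a rough illustration of that two-step lookup, here is a small user-space
model. It is only a sketch: the struct and function names (pgdat_model,
cpu_data_model, cpu_to_pgdat, ...) and the 8-cpu/2-node layout are made up
for the example and are not the kernel's definitions. The model also uses a
single pointer array, whereas the real array is replicated so that each
node reads its own local copy.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS  8   /* hypothetical sizes for the model */
#define NR_NODES 2

/* Stand-in for pg_data_t: one per node, describing that node's memory. */
struct pgdat_model {
    int node_id;
    unsigned long free_pages;
};

/* Stand-in for the per-cpu cpu_data entry; only the node id matters here. */
struct cpu_data_model {
    int node_id;
};

static struct cpu_data_model cpu_data[NR_CPUS];
static struct pgdat_model node_data[NR_NODES];
static struct pgdat_model *pgdat_list[NR_NODES];

/* Assumed topology: cpus 0-3 on node 0, cpus 4-7 on node 1. */
static void init_topology(void)
{
    int cpus_per_node = NR_CPUS / NR_NODES;

    for (int n = 0; n < NR_NODES; n++) {
        node_data[n].node_id = n;
        node_data[n].free_pages = 1024;
        pgdat_list[n] = &node_data[n];
    }
    for (int c = 0; c < NR_CPUS; c++)
        cpu_data[c].node_id = c / cpus_per_node;
}

/* Step 1: one memory reference into the cpu's cpu_data entry. */
static int cpu_to_node(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return cpu_data[cpu].node_id;
}

/* Step 2: index the per-node pointer array to reach the node's pgdat. */
static struct pgdat_model *cpu_to_pgdat(int cpu)
{
    return pgdat_list[cpu_to_node(cpu)];
}
```

After init_topology(), cpu_to_pgdat(5) returns the structure for node 1
without touching any other node's data.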

Assuming the slab allocator manages by node, kmem_cache_alloc_node() &
kmem_cache_alloc_cpu() would be identical (except for spelling :-).
Each would pick up the nodeid from the cpu_data struct, then allocate
from the slab cache for that node.
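
To make the equivalence concrete, here is a minimal user-space sketch,
assuming the cache is managed per node. The names cache_model,
cache_alloc_node() and cache_alloc_cpu() are hypothetical stand-ins, not
the proposed kernel functions, and malloc() stands in for taking an object
from the node's slab. To keep the two entry points visibly distinct, the
node variant here takes the node id directly and the cpu variant does the
cpu-to-node translation itself.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS  8   /* hypothetical sizes for the model */
#define NR_NODES 2

/* Toy cache: tracks how many objects were handed out per node. */
struct cache_model {
    size_t objsize;
    unsigned long allocs[NR_NODES];
};

/* Assumed topology: cpus 0-3 on node 0, cpus 4-7 on node 1. */
static const int cpu_node[NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/* Allocate from the given node's part of the cache. */
static void *cache_alloc_node(struct cache_model *c, int node)
{
    assert(node >= 0 && node < NR_NODES);
    c->allocs[node]++;
    return malloc(c->objsize);
}

/* With per-node management this is only a translation plus the call above. */
static void *cache_alloc_cpu(struct cache_model *c, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return cache_alloc_node(c, cpu_node[cpu]);
}
```

Under this assumption cache_alloc_cpu(c, 6) and cache_alloc_node(c, 1)
draw from the same per-node pool, so the choice between the two interfaces
is about what the caller has to know, not about where the memory comes from.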

> Instead, assume that
> for each cpu, there is an identifiable best memory ... and let
> the internal implementation of kmem_cache_alloc_cpu() find that
> best memory for the specified cpu.
>
> Given this change, the kmem_cache_alloc_cpu() implementation
> could use the CpuMemSets NUMA infrastructure that my group is
> working on to find the best memory. With CpuMemSets, the
> kernel will have, for each cpu, a list of memories, sorted
> by distance from that cpu. Just take the first memory block
> off the selected cpu's memory list for the above purpose.
>
>
> I won't rest till it's the best ...
> Manager, Linux Scalability
> Paul Jackson <pj@sgi.com> 1.650.933.1373
>
>
> _______________________________________________
> Lse-tech mailing list
> Lse-tech@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/lse-tech
>
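
For comparison, the "list of memories sorted by distance" lookup described
in the quoted text could be modeled roughly as below. This is only a sketch
under assumptions: the distance table, the sizes, and the names
build_mem_lists() and best_mem_for_cpu() are invented for the example and
say nothing about how CpuMemSets is actually implemented.

```c
#include <assert.h>

#define NR_CPUS 4   /* hypothetical sizes for the model */
#define NR_MEMS 3

/* Hypothetical distance table: distance[cpu][mem], smaller is closer. */
static const int distance[NR_CPUS][NR_MEMS] = {
    { 10, 20, 30 },
    { 10, 20, 30 },
    { 20, 10, 30 },
    { 30, 20, 10 },
};

/* mem_list[cpu] holds memory ids ordered nearest-first for that cpu. */
static int mem_list[NR_CPUS][NR_MEMS];

/* Build each cpu's list with an insertion sort keyed on distance. */
static void build_mem_lists(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        for (int mem = 0; mem < NR_MEMS; mem++) {
            int j = mem;

            /* Shift farther memories one slot to the right. */
            while (j > 0 &&
                   distance[cpu][mem_list[cpu][j - 1]] > distance[cpu][mem]) {
                mem_list[cpu][j] = mem_list[cpu][j - 1];
                j--;
            }
            mem_list[cpu][j] = mem;
        }
    }
}

/* "Take the first memory off the cpu's list." */
static int best_mem_for_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return mem_list[cpu][0];
}
```

When every cpu on a node has the same nearest memory, as in rows 0 and 1
of the table, this gives the same answer as a per-node lookup.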

-- Thanks

Jack Steiner (651-683-5302) (vnet 233-5302) steiner@sgi.com

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
