Re: [PATCH 2.5.52] NUMA scheduler (1/2)

Erich Focht (efocht@ess.nec.de)
Fri, 20 Dec 2002 18:44:32 +0100

Messages sorted by: [ date ][ thread ][ subject ][ author ]
Next message: Linus Torvalds: "Re: PTRACE_GET_THREAD_AREA"
Previous message: Linus Torvalds: "Re: PTRACE_GET_THREAD_AREA"

Hi Christoph,

thanks for the comments, I'll try to fix the things you mentioned when
preparing the next patch and add some more comments.

Regards,
Erich

On Friday 20 December 2002 16:17, Christoph Hellwig wrote:
> diff -urN a/arch/i386/kernel/smpboot.c b/arch/i386/kernel/smpboot.c
> --- a/arch/i386/kernel/smpboot.c 2002-12-16 03:07:56.000000000 +0100
> +++ b/arch/i386/kernel/smpboot.c 2002-12-18 16:53:12.000000000 +0100
> @@ -1202,6 +1202,9 @@
> void __init smp_cpus_done(unsigned int max_cpus)
> {
> zap_low_mappings();
> +#ifdef CONFIG_NUMA
> + build_node_data();
> +#endif
>
> I think it would be much nicer if you had a proper stub for !CONFIG_NUMA..
>
> @@ -20,6 +20,7 @@
> #include <asm/mmu.h>
>
> #include <linux/smp.h>
> +#include <asm/topology.h>
>
> Another header in sched.h?? And it doesn't look like it's used at all.
>
> #include <linux/sem.h>
> #include <linux/signal.h>
> #include <linux/securebits.h>
> @@ -446,6 +447,9 @@
> # define set_cpus_allowed(p, new_mask) do { } while (0)
> #endif
>
> +#ifdef CONFIG_NUMA
> +extern void build_node_data(void);
> +#endif
>
> The ifdef is superflous.
>
> diff -urN a/kernel/sched.c b/kernel/sched.c
> --- a/kernel/sched.c 2002-12-16 03:08:14.000000000 +0100
> +++ b/kernel/sched.c 2002-12-18 16:53:13.000000000 +0100
> @@ -158,6 +158,10 @@
> struct list_head migration_queue;
>
> atomic_t nr_iowait;
> +
> + unsigned long wait_time;
> + int wait_node;
> +
>
> Here OTOH a #if CONFIG_MUA could help to avoid a little bit of bloat.
>
> } ____cacheline_aligned;
>
> +#define cpu_to_node(cpu) __cpu_to_node(cpu)
>
> I wonder why we don't have a proper cpu_to_node() yet, but as long
> as it doesn't exist please use __cpu_to_node() directly.
>
> +#define LOADSCALE 128
>
> Any explanation?
>
> +#ifdef CONFIG_NUMA
>
> sched.c uses #if CONFIG_FOO, not #ifdef CONFIG_FOO, it would be cool
> if you could follow the style of existing source files.
>
> +/* Number of CPUs per node: sane values until all CPUs are up */
> +int _node_nr_cpus[MAX_NUMNODES] = { [0 ... MAX_NUMNODES-1] = NR_CPUS };
> +int node_ptr[MAX_NUMNODES+1]; /* first cpu of node (logical cpus are
> sorted!)*/ +#define node_ncpus(node) _node_nr_cpus[node]
>
> Parametrized macros and variables aren't in the ßame namespace, what about
> just node_nr_cpus for the macro, too. And should these be static?
>
> +
> +#define NODE_DELAY_IDLE (1*HZ/1000)
> +#define NODE_DELAY_BUSY (20*HZ/1000)
>
> Comments, please..
>
> +/* the following macro implies that logical CPUs are sorted by node number
> */ +#define loop_over_node(cpu,node) \
> + for(cpu=node_ptr[node]; cpu<node_ptr[node+1]; cpu++)
>
> Move to asm/topology.h?
>
> + ptr=0;
> + for (n=0; n<numnodes; n++) {
>
> You need to add lots of space to match Documentation/odingStyle.. :)
>
> + for (cpu = 0; cpu < NR_CPUS; cpu++) {
> + if (!cpu_online(cpu)) continue;
>
> And linebreaks..
>
> Btw, what about a for_each_cpu that has the cpu_online check in topology.h?
>
> + /* Wait longer before stealing if own node's load is average. */
> + if (NODES_BALANCED(avg_load,nd[this_node].average_load))
>
> Shouldn't NODES_BALANCED shout less and be an inline called nodes_balanced?
>
> + this_rq->wait_node = busiest_node;
> + this_rq->wait_time = jiffies;
> + goto out;
> + } else
> + /* old most loaded node: check if waited enough */
> + if (jiffies - this_rq->wait_time < steal_delay)
> + goto out;
>
> That indentation looks really strange, why not just
>
> /* old most loaded node: check if waited enough */
> } else if (jiffies - this_rq->wait_time < steal_delay)
> goto out;
>
> +
> + if ((!CPUS_BALANCED(nd[busiest_node].busiest_cpu_load,*nr_running))) {
>
> Dito, the name shouts a bit too much
>
> +#endif
>
> #endif /* CONFIG_NUMA */
>
> -#define BUSY_REBALANCE_TICK (HZ/4 ?: 1)
> +#define BUSY_REBALANCE_TICK (HZ/5 ?: 1)
>
> And explanation why you changed that constant?
>
> p = req->task;
> - cpu_dest = __ffs(p->cpus_allowed);
> + cpu_dest = __ffs(p->cpus_allowed & cpu_online_map);
> +
>
> This looks like a bugfix valid without the rest of the patch.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Linus Torvalds: "Re: PTRACE_GET_THREAD_AREA"
Previous message: Linus Torvalds: "Re: PTRACE_GET_THREAD_AREA"