Re: 2.5.70 (virgin) hangs running SDET

Andrew Morton (akpm@digeo.com)
Tue, 10 Jun 2003 12:46:13 -0700


"Martin J. Bligh" <mbligh@aracnet.com> wrote:
>
> OK, well backing out dcache_lock-vs-tasklist_lock-take-3.patch does
> indeed seem to fix the problem. It's done 5.5 whole cycles now, and
> still going strong.

Well that patch fixed a real bug. It abviously added or exposed another
one though.

I continue to wonder if that task_struct is scrogged. There seem to be no
other processes holding the lock.

Might be interesting to give the below patch a run with memory debug
enabled.

Also spinlock debugging enabled. It would be nice to run with preempt and
sleep-in-spinlock debugging enabled too, but I think preempt is broken on
NUMA?

(Anton had a patch which just enabled the beancounting which is needed for
might_sleep() effectiveness without turning on preempt, but it needs more
work for ia32)

diff -puN fs/proc/base.c~a fs/proc/base.c
--- 25/fs/proc/base.c~a Tue Jun 10 12:40:44 2003
+++ 25-akpm/fs/proc/base.c Tue Jun 10 12:42:01 2003
@@ -1374,6 +1374,11 @@ struct dentry *proc_pid_lookup(struct in

dentry->d_op = &pid_base_dentry_operations;

+ if (task->bite_me != 0) {
+ printk("corrupted task_struct: 0x%x\n", task->bite_me);
+ BUG();
+ }
+
spin_lock(&task->proc_lock);
task->proc_dentry = dentry;
d_add(dentry, inode);
diff -puN include/linux/sched.h~a include/linux/sched.h
--- 25/include/linux/sched.h~a Tue Jun 10 12:40:44 2003
+++ 25-akpm/include/linux/sched.h Tue Jun 10 12:40:44 2003
@@ -330,6 +330,8 @@ struct task_struct {
struct list_head run_list;
prio_array_t *array;

+ u32 bite_me;
+
unsigned long sleep_avg;
unsigned long last_run;

_

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/