ok, now it's clear what the problem is. There are in-use dirty inodes
that trigger a deadlock in the schedule-capable
try_to_sync_unused_inodes of 2.4.20rc2aa1 (the change that saved me from
backing out an otherwise corrupt lowlatency fix). It can only trigger on
UP: on SMP the other cpu can always run kupdate, which will flush all the
dirty inodes, so at worst it would lock up one cpu for 2.5 sec. This is
probably why I couldn't reproduce it. I assume all of you who reproduced
the deadlock were running on a UP machine (it doesn't matter whether the
kernel was compiled for SMP or not).

Can you give this untested incremental fix a spin?

--- 2.4.20rc2aa1/fs/inode.c.~1~ 2002-11-27 10:04:43.000000000 +0100
+++ 2.4.20rc2aa1/fs/inode.c 2002-12-02 01:09:05.000000000 +0100
@@ -459,13 +459,16 @@ static void try_to_sync_unused_inodes(vo
 {
 	struct super_block * sb;
 	int nr_inodes = inodes_stat.nr_unused;
+	int global_pass = 0, local_pass;
 restart:
 	spin_lock(&sb_lock);
+	local_pass = 0;
 	sb = sb_entry(super_blocks.next);
 	while (nr_inodes && sb != sb_entry(&super_blocks)) {
-		if (list_empty(&sb->s_dirty)) {
+		if (local_pass < global_pass || list_empty(&sb->s_dirty)) {
 			sb = sb_entry(sb->s_list.next);
+			local_pass++;
 			continue;
 		}
 		sb->s_count++;
@@ -474,6 +477,7 @@ static void try_to_sync_unused_inodes(vo
 		if (sb->s_root)
 			nr_inodes = try_to_sync_unused_list(&sb->s_dirty, nr_inodes);
 		drop_super(sb);
+		global_pass = local_pass + 1;
 		goto restart;
 	}
 	spin_unlock(&sb_lock);
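
To see the idea of the patch in isolation, here is a rough user-space
sketch, not kernel code. It assumes the loop got stuck because in-use
dirty inodes never leave s_dirty, so every restart stopped on the same
superblock again. The names fake_sb, stuck_dirty, scan and max_restarts
are made up for the illustration; only global_pass and local_pass come
from the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a superblock: stuck_dirty counts dirty
 * inodes that are still in use and therefore can never be synced. */
struct fake_sb {
	int stuck_dirty;
};

/*
 * Walk the array the way the patched loop walks super_blocks.
 * Returns how many superblocks were processed, or -1 if the scan
 * had to be aborted after max_restarts attempts (the "deadlock").
 * use_pass_counter == 0 models the old behaviour.
 */
static int scan(const struct fake_sb *sbs, size_t n,
		int use_pass_counter, int max_restarts)
{
	int global_pass = 0, local_pass;
	int processed = 0;
	size_t i;

restart:
	local_pass = 0;
	for (i = 0; i < n; i++) {
		/* Skip entries handled in an earlier pass, or clean ones. */
		if ((use_pass_counter && local_pass < global_pass) ||
		    sbs[i].stuck_dirty == 0) {
			local_pass++;
			continue;
		}
		/* "Sync" attempt: the in-use inodes stay dirty. */
		processed++;
		if (processed >= max_restarts)
			return -1;
		/* Remember how far we got before rescanning from the head. */
		global_pass = local_pass + 1;
		goto restart;
	}
	return processed;
}
```

With entries { 1, 0, 2 } the scan with the pass counter processes each
dirty entry once and terminates, while the scan without it keeps
restarting on the first entry until max_restarts is hit.
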
thanks,
Andrea