They were trying to down(&rtnl_sem);
I did an audit of rtnl_sem usage and found an obscure case where an
unlock was missing:
--- ./net/ipv4/ipconfig.c 2001/11/01 01:51:16 1.1
+++ ./net/ipv4/ipconfig.c 2001/11/01 02:42:26
@@ -194,8 +194,10 @@
printk(KERN_ERR "IP-Config: Failed to open %s\n", dev->name);
continue;
}
- if (!(d = kmalloc(sizeof(struct ic_device), GFP_KERNEL)))
+ if (!(d = kmalloc(sizeof(struct ic_device), GFP_KERNEL))) {
+ rtnl_shunlock();
return -1;
+ }
d->dev = dev;
*last = d;
last = &d->next;
But that wasn't the problem. The problem was that atalkd was doing
atif_probe_device under some sock_ioctl/atalk_ioctl/atif_ioctl and it
was taking a long time, and holding the rtnl_sem semaphore the whole
time.
The code suggests what it could try for some multiple of 253
seconds, which is several minutes. During this time such simple
commands as "netstat -i" wont work.
It's probably OK for it to take a long time probing the network, but
it is really fair that it hold the semaphore for the whole time??
Fortunately it does interruptible waits so I can just kill atalkd (or
maybe even not run it).
NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/