we've experienced deadlocks that appear to be caused by the loop in
sock_alloc_send_skb(). To trigger this, you need to combine heavy
network load with memory pressure. In this situation, sock_wmalloc()
can fail because it really can't allocate any more memory, even though
the send buffer limit for this socket is not yet exhausted.
If that happens, and the socket uses GFP_ATOMIC allocation, the while (1)
loop in sock_alloc_send_skb() will endlessly spin, without ever calling
schedule(), and all the time holding the kernel lock ...
Do you agree that this is a problem? What do you think about this fix:
diff -ur linux-2.2.19/net/core/sock.c linux-2.2.19-s390/net/core/sock.c
--- linux-2.2.19/net/core/sock.c Sun Mar 25 18:37:41 2001
+++ linux-2.2.19-s390/net/core/sock.c Thu May 10 14:02:11 2001
@@ -752,9 +752,11 @@
break;
try_size = fallback;
}
- skb = sock_wmalloc(sk, try_size, 0, sk->allocation);
+ skb = sock_wmalloc_err(sk, try_size, 0, sk->allocation, &err);
if (skb)
break;
+ if (err)
+ goto failure;
/*
* This means we have too many buffers for this socket already.
Test case is simple: keep spawning lots of long-running 'ping' processes
until physical memory is exhausted.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
-- Dr. Ulrich Weigand Linux for S/390 Design & Development IBM Deutschland Entwicklung GmbH, Schoenaicher Str. 220, 71032 Boeblingen Phone: +49-7031/16-3727 --- Email: Ulrich.Weigand@de.ibm.com
- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/