buffer_head slab memory leak, Linux bug?

Elisheva Alexander (ealexand@checkpoint.com)
Sun, 2 Sep 2001 14:01:26 +0300


Dear kernel list,

if anyone can send me some pointers or hints on how to tackle this bug i
will be very happy.

on an SMP machine i get:
"stuck on TLB IPI wait (CPU#1)"
the driver that i am debugging uses a spin lock, and sometimes we take the
lock for a pretty long time.
this happens during heavy load, which is why i think that the problem
is that in smp_flush_tlb() in ./arch/i386/kernel/smp.c, one of the CPUs gets
all upset that the other CPU is stuck in the lock for too long, and releases
it before it was ment to be released.

things i did that didn't help:

a patch that fixed a similar problem in reiserfs
(http://www.geocrawler.com/mail/msg.php3?msg_id=3962182&list=3455)
the patch for the fast pentium problem, since i have a pentium III.
(http://www.ultraviolet.org/mail-archives/reiserfs.2000/6201.html)

i put a breakpoint when this occurs using kGDB, but i am not able to get
the registers (and stack) of the CPU that is stuck, only the one that
prints the message. so i don't really know where this occurs
in our own code.
does anyone know how i may extract the stack of the second CPU at the
time of this error?

I am using an Intel pentium III with dual CPU.
I am debugging check point's firewall and vpn modules, with kernel-2.2.14
from the redhat RPM, but this also happens with the latest 2.2.19.

it happens quite often (at random), so it's not too hard to recreate it.

thanks a lot.

(please CC me, as i am not subscribed to the list.)

-- 
 Elisheva Alexander                          Software Developer
================================================================
 Email from people at checkpoint.com does not usually represent 
 official policy of Check Point (TM) Software Technologies Ltd.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/