You say there [of the semaphore field in requests]:
Drivers can use it if they want completion to be signalled for a request
(see end_that_request_last). However, see 2.4.7 where it's not ->waiting
and the interface changed.
end_that_request_ ups the semaphore if it's non-null, but for that to make
sense, someone must down it. Nobody does (in ll_rw_blk.c), so I assume it's
entirely for my use in controlling access to the request.
So I don't believe it's involved in my problem.
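To spell out the up/down pairing I mean, here is a rough userspace sketch
using POSIX semaphores. It is not kernel code: struct toy_request and the
toy_* functions are made up for illustration, and sem_post/sem_wait merely
stand in for up/down. The point is only that the "up" in the completion
path is useless unless some submitter actually does the matching "down".

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

/* Hypothetical stand-in for a block request; NOT the kernel's struct request. */
struct toy_request {
    sem_t *waiting;   /* NULL means nobody asked for a completion signal */
    int    done;      /* set by the completion path */
};

/* Completion side: "up" the semaphore only if one was attached. */
static void toy_end_request(struct toy_request *req)
{
    req->done = 1;
    if (req->waiting != NULL)
        sem_post(req->waiting);          /* the "up" */
}

/* Thread body standing in for the driver finishing the request. */
static void *toy_driver(void *arg)
{
    toy_end_request(arg);
    return NULL;
}

/* Submitter side: attaches a semaphore and performs the matching "down".
 * Returns 1 if the request was completed, -1 on setup failure. */
static int toy_submit_and_wait(void)
{
    sem_t sem;
    struct toy_request req = { .waiting = &sem, .done = 0 };
    pthread_t tid;

    if (sem_init(&sem, 0, 0) != 0)       /* starts at 0, so the wait blocks */
        return -1;
    if (pthread_create(&tid, NULL, toy_driver, &req) != 0) {
        sem_destroy(&sem);
        return -1;
    }

    /* The "down": sleeps until toy_end_request() posts; retry on EINTR. */
    while (sem_wait(&sem) == -1 && errno == EINTR)
        continue;

    pthread_join(tid, NULL);
    sem_destroy(&sem);
    return req.done;
}
```

If nothing ever attaches a semaphore (waiting stays NULL), the completion
path skips the post entirely, which is the situation I believe I am in.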
> > 2 processors + 1 userspace helper daemon on device = no bug
> > 2 processors + 2 userspace helper daemon on device = bug (lockup)
> > 1 processors + 1 userspace helper daemon on device = no bug
> > 1 processors + 2 userspace helper daemon on device = no bug
> And I'll restate here what I said then too -- SHOW THE CODE! Or send me
> a crystal ball and I'll be happy to solve your races for you.
Crystal balls would be nice. I'll see if I can get it down to something
sendable. I can confirm the above results, since I tried them again. After
about 1.2GB of transfers, one CPU ended up not responding to NMI and the
other was stuck in a spinlock (__down_writelock_failed, from memory). It
had called my request function from generic_unplug_device, and the request
function in turn took a write spinlock on the device's private request
queue. None of the spinlocks are held around sections of code that can sleep.
Peter