Let us examine the race -
CPU #0                    CPU #1                   CPU #2
(free_nmi P)              execute NMI P            syscall

delete from list
call_rcu                  NMI (doesn't see P)
wait for completion       process in kernel        process in kernel
(context switch)
                          context switch           context switch

----------- RCU complete: NMI handler P must be complete here -----------

RCU handler tasklet
callback: complete()

nmi freeing task
wakes up and proceeds.
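To make the timeline concrete, here is a minimal sketch of that free
path. The struct and function names (nmi_handler, free_nmi_handler,
etc.) are illustrative, not the actual patch, and it uses the
two-argument call_rcu() calling convention of current kernels:

#include <linux/kernel.h>
#include <linux/list.h>
#include <linux/rcupdate.h>
#include <linux/completion.h>

struct nmi_handler {
	struct list_head	link;		/* entry in the NMI handler list */
	void			(*handler)(void *data);
	struct rcu_head		rcu;
	struct completion	done;
};

/* RCU callback: runs from the RCU tasklet only after every CPU has
 * passed through a quiescent state (context switch), i.e. past the
 * dashed line above, so no CPU can still be inside this handler. */
static void nmi_handler_rcu_callback(struct rcu_head *head)
{
	struct nmi_handler *h = container_of(head, struct nmi_handler, rcu);

	complete(&h->done);			/* callback: complete() */
}

void free_nmi_handler(struct nmi_handler *h)
{
	list_del_rcu(&h->link);			/* delete from list */
	init_completion(&h->done);
	call_rcu(&h->rcu, nmi_handler_rcu_callback);
	wait_for_completion(&h->done);		/* nmi freeing task sleeps here */
	/* now it is safe to kfree(h) or reuse the slot */
}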
> > Corey's code doesn't rely on completion() to ensure this, it relies
> > on RCU to make sure that nobody is running the handler. The key is
> > that once the pointers between the prev and the next of the deleted
>
> Can you change prev and next atomically?
You don't have to; the traversal in __list_for_each_rcu() goes in only
one direction, so updating the single next pointer is atomic enough.
A subsequent NMI either still sees the deleted handler (which stays
intact until the grace period ends) or doesn't see it at all; it can
never see a half-updated list.
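As a sketch of why that single pointer update is enough (this mirrors
the list_del_rcu()/list_for_each_entry_rcu() idea rather than quoting
the kernel source; struct nmi_handler is from the sketch above and
nmi_dispatch is a made-up name):

static LIST_HEAD(nmi_handler_list);

/* Writer side: unlink the entry by rewriting pointers.  Readers walk
 * the list forward only, so the one store they can observe is
 * prev->next = next.  They load it once and either get the old value
 * (and still see the deleted handler, which stays intact until the
 * grace period ends) or the new value (and skip it). */
static inline void sketch_list_del_rcu(struct list_head *entry)
{
	entry->next->prev = entry->prev;	/* only the writer follows ->prev */
	entry->prev->next = entry->next;	/* the store forward-walking NMIs see */
}

/* Reader side, in NMI context, no locks taken.  An NMI cannot be
 * preempted, so it cannot cross a quiescent state mid-walk. */
static void nmi_dispatch(void *data)
{
	struct nmi_handler *h;

	list_for_each_entry_rcu(h, &nmi_handler_list, link)
		h->handler(data);
}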
> > spin_trylock modifies the lock cacheline, so cacheline bouncing.
>
> At a fair interrupt rate i'd rather have that fill my caches, less time
> spent in the NMI handler means more overall system time.
It isn't going to fill your caches; it is going to bounce the lock's
cache line from CPU to CPU, because every NMI on every CPU writes to
it. So you hurt performance.
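For contrast, the trylock-per-NMI alternative being discussed looks
roughly like this (illustrative, not code from this thread). Even when
the lock is never contended, spin_trylock() is an atomic
read-modify-write on nmi_list_lock, so the line holding the lock is
pulled exclusive by whichever CPU takes the NMI and ping-pongs between
CPUs at the NMI rate, whereas the RCU walk above writes nothing shared:

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(nmi_list_lock);

static void nmi_dispatch_trylock(void *data)
{
	struct nmi_handler *h;

	/* spin_trylock() writes the lock word even on success,
	 * dirtying the cache line on every NMI on every CPU. */
	if (!spin_trylock(&nmi_list_lock))
		return;		/* lock held elsewhere: this NMI is skipped */

	list_for_each_entry(h, &nmi_handler_list, link)
		h->handler(data);

	spin_unlock(&nmi_list_lock);
}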
Thanks
--
Dipankar Sarma <dipankar@in.ibm.com> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.