Whee. What a piece of "interesting hardware".
> The problem is that the
> current memory barrier in smp_call_function is a simple gcc optimisation
> barrier, we really need a memory barrier there so that the other cpu's
> get the updated value when they get IPI'd immediately afterwards.
I really think that your patch is a bit questionable. The wmb() on the
sender side looks correct as-is, and to me it looks like it is the
_receiver_ side that might need a read-barrier before it reads
call_data().
I really thought that the interrupt should be a serializing event, but I
can't find that in the intel databooks (they make "iret" a serializing
instruction, but not _taking_ an interrupt, unless I missed something).
Can you check if you get the right behaviour if you have a read barrier in
the receive path? I actually think we need a full mb() on _both_ paths,
since the current wmb() only guarantees that writes will be seen "in
order wrt other writes", and while the IPI generation really _is_ a write
in itself, I wonder if the Intel CPU's might not consider it something
special..
I'm not opposed to your patch per se, but I really do believe that it is
potentially wrong. If we have no serialization on the read side, your
patch might not actually fully plug the real bug, only hide it. I'd like
to know if a read barrier on the read side (without the full barrier on
the write side) is sufficient. It _should_ be (but see my worry about the
APIC write maybe being considered "outside the scope" of the normal cache
coherency protocols).
Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/