Re: user-mode port 0.44-2.4.7

Davide Libenzi (davidel@xmailserver.org)
Tue, 24 Jul 2001 10:31:57 -0700 (PDT)

On 24-Jul-2001 Linus Torvalds wrote:
> But it shouldn't optimize it that way _every_ time. You only want the
> specific optimizations in specific places. Which is why you use
> "barrier()" or volatile in the _code_, not the data declaration.
>
> For example, if you're holding a lock that protects it or you otherwise
> know that nothing is touching it at the same time, you do NOT want to have
> the compiler generate bad code.
>
> And trust me, "volatile" generates _bad_ code a lot more often than it
> generates correct code.
>
> Look at this:
>
> volatile int i;
> int j;
>
> int main()
> {
>         i++;
>         j++;
> }
>
> turning into this:
>
> main:
>         movl i,%eax
>         incl %eax
>         movl %eax,i
>         incl j
>         ret
>
> Now, ask yourself why? The two _should_ be the same. Both do a
> read-modify-write cycle. But the fact is, that when you add "volatile" to
> the register, it really tells gcc "Be afraid. Be very afraid. This user
> expects some random behaviour that is not actually covered by any
> standard, so just don't ever use this variable for any optimizations, even
> if they are obviously correct. That way he can't complain".

That case is too simple; this one is maybe better:

        mov homer, %edx
        ...
        ...
        ...
        ... ( 101 asm ins )
loop:
        cmp %edx, ...
        ja out
        ...
        inc %edx
        ...
        jmp loop

You're right, it can be handled with a barrier(), but it all comes down to how
often you're going to need one behaviour or the other.
When I need most of my accesses to be "strict", I'd like to have a way that
avoids littering the code with barrier()s.
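
To make the trade-off concrete, here is a minimal sketch, assuming GCC. The
barrier() definition is the usual compiler-only one; STRICT_READ() and the
two counting functions are hypothetical names made up for illustration, and
"homer" is just borrowed from the asm above.

```c
/* Compiler-only barrier (GCC syntax): the "memory" clobber makes the
 * compiler forget any values it has cached in registers. It emits no
 * instruction and is NOT a CPU memory barrier. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical helper: make ONE access strict by casting at the point
 * of use, instead of declaring the variable itself volatile. */
#define STRICT_READ(x) (*(volatile int *)&(x))

int homer = 3;  /* plain (non-volatile) shared variable */

/* Style 1: barrier() in the code. The barrier at the end of each
 * iteration forces "homer" to be re-read for the next comparison. */
int count_with_barrier(void)
{
        int n = 0;

        while (n < homer) {
                n++;
                barrier();
        }
        return n;
}

/* Style 2: strict access only where it matters. Every evaluation of
 * STRICT_READ(homer) is a real load; other uses of "homer" elsewhere
 * remain free to be optimized. */
int count_with_strict_read(void)
{
        int n = 0;

        while (n < STRICT_READ(homer))
                n++;
        return n;
}
```

Both functions compute the same thing; the difference is only in what the
compiler is allowed to keep in a register. If nearly every access has to be
strict, style 2 ends up wrapped around almost every use, which is the
situation described above.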

> Also note how the "incl j" instruction is actually _better_ from a
> "atomicity" standpoint than the "load+inc+store" instruction. In this
> case, adding a "volatile" actually made the accesses to "i" be _less_
> likely to be correct - you could have had an interrupt happen in between
> that also updated "i", and got lost when we wrote the value back.

Not by much, if you look at how incl is "decomposed" internally ( w/o LOCK )
by the CPU. If you really care about j, you need an atomic op here in any case.
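
As a rough sketch of what "an atomic op" means here, assuming a C11 compiler
with <stdatomic.h> (in kernel code one would use atomic_t and atomic_inc()
instead). bump_j() and read_j() are hypothetical names for illustration only.

```c
#include <stdatomic.h>

/* File-scope atomic counter, zero-initialized like any static object. */
static atomic_int j;

/* Atomically increment j and return the new value. The whole
 * read-modify-write is one indivisible operation, so a concurrent
 * update cannot be lost between the load and the store. */
int bump_j(void)
{
        return atomic_fetch_add(&j, 1) + 1;
}

/* Atomically read the current value of j. */
int read_j(void)
{
        return atomic_load(&j);
}
```

On x86 a compiler will typically emit a LOCK-prefixed instruction for
atomic_fetch_add(), which is exactly what a plain "incl j" lacks.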

- Davide
