> SP> Well, not necessarily. It might be that data just hasn't "arrived" yet because
> SP> of the movntq instruction.
This is wrong; the CPU-_internal_ view of the data is always consistent,
regardless of movntq vs movq.
It's only the EXTERNAL view (what other bus agents see) that is slightly
different; "sfence" takes care of syncing that.
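To make this concrete, here is a minimal sketch of the kind of copy loop
we are talking about (this is NOT the actual fast_copy_page from the
kernel; it is simplified, leaves out the FPU state save/restore you need
around MMX use in the kernel, and just assumes 4096-byte pages on an
MMX-capable x86):

/*
 * nt_copy_page: copy one 4096-byte page with non-temporal (movntq)
 * stores.  The stores go through the write-combining buffers instead
 * of the cache, so the final sfence is what guarantees the external
 * (bus/other-agent) view catches up; the CPU's own loads would see
 * the new data even without it.
 */
static void nt_copy_page(void *to, const void *from)
{
	int i;

	for (i = 0; i < 4096; i += 32) {
		__asm__ __volatile__(
			"movq	  (%0), %%mm0\n\t"
			"movq	 8(%0), %%mm1\n\t"
			"movq	16(%0), %%mm2\n\t"
			"movq	24(%0), %%mm3\n\t"
			"movntq	%%mm0,   (%1)\n\t"
			"movntq	%%mm1,  8(%1)\n\t"
			"movntq	%%mm2, 16(%1)\n\t"
			"movntq	%%mm3, 24(%1)\n\t"
			: : "r" ((const char *)from + i), "r" ((char *)to + i)
			: "memory");
	}
	/* drain the write-combining buffers: after this, memory is current */
	__asm__ __volatile__("sfence" ::: "memory");
	/* leave the MMX state clean for whoever uses the FPU next */
	__asm__ __volatile__("emms");
}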
> So why does it oops then?
> Also, we don't want this data to arrive late or whatever.
> fast_copy_page must copy the page (i.e. so that memcmp()==0).
> If it does not, it is too "optimized".
It does; but if you read it back from memory and it is corrupted, your
chipset corrupted it.
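For what it is worth, this is roughly how I would separate the two cases
in userspace (check_copy and the 0x5a fill pattern are just made up for
illustration, in the same spirit as the athlon.c test program):

#include <stdio.h>
#include <string.h>

#define PAGE_SIZE 4096

static char src[PAGE_SIZE] __attribute__((aligned(64)));
static char dst[PAGE_SIZE] __attribute__((aligned(64)));

/*
 * Fill the source page with a known pattern, copy it, and compare.
 * If this memcmp() is already non-zero, the copy routine itself is
 * broken; if it is zero here but the destination reads back corrupted
 * later, the data got mangled between CPU and RAM, i.e. by the chipset.
 */
static int check_copy(void (*copy_page)(void *, const void *))
{
	memset(src, 0x5a, PAGE_SIZE);
	copy_page(dst, src);
	if (memcmp(src, dst, PAGE_SIZE) != 0) {
		printf("copy routine produced a bad page\n");
		return 1;
	}
	return 0;
}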
> SP> One thing that also puzzles me is why the fast_copy_page() routine is laid
> SP> out like this:
[snip]
> A better way to do it is to benchmark several routines at
> startup time and pick the best one. It is done now
> for the RAID xor'ing routine.
I benchmarked several versions; see the test program at
http://www.fenrus.demon.nl/athlon.c
The interleaved one is faster on Athlons because it seems to help AMD's
register-aliasing logic operate better....
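The startup-time selection is certainly doable; below is a rough sketch of
how it could look (pick_fastest, copy_page_generic and the plain rdtsc
timing are made up for illustration; the actual RAID xor selection code is
organised differently):

#include <stdint.h>
#include <string.h>

typedef void (*copy_page_fn)(void *to, const void *from);

/* baseline candidate: plain memcpy of one 4096-byte page */
static void copy_page_generic(void *to, const void *from)
{
	memcpy(to, from, 4096);
}

/* read the CPU's timestamp counter (x86) */
static inline uint64_t rdtsc(void)
{
	uint32_t lo, hi;

	__asm__ __volatile__("rdtsc" : "=a" (lo), "=d" (hi));
	return ((uint64_t)hi << 32) | lo;
}

/* time each candidate copying a page a few times; return the fastest */
static copy_page_fn pick_fastest(copy_page_fn *candidates, int n,
				 void *to, const void *from)
{
	copy_page_fn winner = candidates[0];
	uint64_t best = ~0ULL;
	int i, j;

	for (i = 0; i < n; i++) {
		uint64_t start, elapsed;

		start = rdtsc();
		for (j = 0; j < 16; j++)
			candidates[i](to, from);
		elapsed = rdtsc() - start;

		if (elapsed < best) {
			best = elapsed;
			winner = candidates[i];
		}
	}
	return winner;
}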
Anyway, since this code works on something like 99% of the machines, and only
1% seems to be affected, it really does look like a hardware bug.
This is also more or less confirmed by the reports that certain
BIOS versions "break" working setups by doing things to the VIA chipset
that make it fail....
Greetings,
Arjan van de Ven