Re: [PATCH] Runtime memory barrier patching

Andi Kleen (ak@muc.de)
Tue, 22 Apr 2003 00:45:46 +0200


On Tue, Apr 22, 2003 at 12:05:51AM +0200, Linus Torvalds wrote:
> I would not be surprised if both AMD and Intel are playing some
> "benchmarking games" by trying to select nop's that work badly for the
> other side, and then showing how _their_ new CPU's are so much better by
> having the compilers emit the "preferred" no-ops.
>
> But maybe I'm just too cynical. And I do suspect the Hammer optimization
> guide was meant for the 64-bit mode only, because I'm pretty certain even
> AMD does badly on prefixes at least in older CPU generations.

Yes, you're right. The Athlon recommendation is pretty complicated
(they have different versions for different registers to avoid lengthening
dependency chains), but it uses different two byte nops.

They explicitely say that it works well on "other CPUs" too which
likely means Intel.

Basically it is:

NOP2_EAX TEXTEQU <DB 08Bh,0C0h> ;MOV EAX, EAX
NOP3_EAX TEXTEQU <DB 08Dh,004h,020h> ;LEA EAX, [EAX]
NOP4_EAX TEXTEQU <DB 08Dh,044h,020h,000h> ;LEA EAX, [EAX+00]
;LEA EAX, [EAX+00];NOP
NOP5_EAX TEXTEQU <DB 08Dh,044h,020h,000h,090h>
;LEA EAX, [EAX+00000000]
NOP6_EAX TEXTEQU <DB 08Dh,080h,0,0,0,0>
;LEA EAX, [EAX*1+00000000]
NOP7_EAX TEXTEQU <DB 08Dh,004h,005h,0,0,0,0>
;LEA EAX, [EAX*1+00000000] ;NOP
NOP8_EAX TEXTEQU <DB 08Dh,004h,005h,0,0,0,0,90h>
;JMP
NOP9 TEXTEQU <DB 0EBh,007h,90h,90h,90h,90h,90h,90h,90h>

(+ the same for all other GPRs). Someone must have too much time
when documenting this ;)

But I'm not gonna resubmit with that unless if you insist on it...

-Andi

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/