Re: Some functions are not inlined by gcc 3.2, resulting code is ugly
Denis Vlasenko (vda@port.imtp.ilyichevsk.odessa.ua)
Sun, 3 Nov 2002 22:17:13 -0200
On 3 November 2002 14:17, Jussi Laako wrote:
> On Sun, 2002-11-03 at 18:17, Denis Vlasenko wrote:
>
> Jump target 17e0 is aligned (with nops):
> > 17dd: 88 02 mov %al,(%edx)
> > 17df: 90 nop
> > 17e0: 89 d0 mov %edx,%eax
> > 17e2: 5a pop %edx
> >
> > 17ec: eb f2 jmp 17e0
> > <__constant_memcpy+0x20>
> >
> > 17fa: eb e4 jmp 17e0
> > <__constant_memcpy+0x20>
> >
> > 1800: eb de jmp 17e0
> > <__constant_memcpy+0x20>
> >
> > 187c: e9 5f ff ff ff jmp 17e0
> > <__constant_memcpy+0x20> 1881: eb 0d jmp 1890
> > <__constant_memcpy+0xd0> 1883: 90 nop
>
> ...
>
> > 188f: 90 nop
> > 1890: c1 e9 02 shr $0x2,%ecx
> > 1893: 89 d7 mov %edx,%edi
>
> And also jump target 1890 is aligned.
>
>
> I think the penalty of few NOPs is smaller than result of jump to
> unaligned address. This is especially true with P4 architecture.
Alignment does not eliminate jump. It only moves jump target to 16 byte
boundary. This _probably_ makes execution slightly faster but on average
it costs you 7,5 bytes. This price is too high when you take into account
L1 instruction cache wastage and current bus/core clock ratios.
--
vda
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/