So we are going to waste that space in _all_ kernels just for Intel P4
being stupid? I compile kernels for _486_! At home I have a Duron.
I don't want to pay "P4 tax" ;)
Also, once it gets cached at first access, subsequent accesses won't
hurt that much, right?
> > This _probably_ makes execution slightly faster but on average
> > it costs you 7.5 bytes. This price is too high when you take into
> > account L1 instruction cache wastage and current bus/core clock
> > ratios.
>
> 7.5 bytes is not much compared to possibility of trashed cache or
> pipeline flush.
You definitely need to play with objdump -d just to see those 7.5 bytes
everywhere.
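
For anyone wondering where 7.5 comes from: it is the average padding
needed to reach a 16-byte boundary, assuming jump targets start at
offsets spread uniformly over 0..15 (that uniformity is my assumption,
real code may differ). A rough sketch in C; pad_bytes(), avg_padding()
and print_padding_table() are names made up for this mail:

```c
#include <assert.h>
#include <stdio.h>

/* Padding bytes needed to move `offset` up to the next multiple of `align`. */
static unsigned pad_bytes(unsigned offset, unsigned align)
{
    assert(align != 0);
    return (align - offset % align) % align;
}

/* Average padding over every possible starting offset,
 * assuming (hypothetically) that offsets are uniformly distributed. */
static double avg_padding(unsigned align)
{
    unsigned total = 0;

    for (unsigned off = 0; off < align; off++)
        total += pad_bytes(off, align);
    return (double)total / align;
}

/* Print the average cost for a few common alignments. */
static void print_padding_table(void)
{
    for (unsigned a = 4; a <= 32; a *= 2)
        printf("align %2u: avg padding %.1f bytes\n", a, avg_padding(a));
}
```

Call print_padding_table() from a main() to see the numbers. For 16
the sum of paddings is 0+15+14+...+1 = 120, and 120/16 = 7.5 per
aligned label.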
Pipeline flush is moot: you pay the jump penalty regardless of alignment.
> Do you have execution time numbers of jump to 16-byte aligned address
> vs unaligned address?
I will measure it.
Let me say this clearly:
If you have a loop that runs a million iterations, maybe you should align it.
It is very dumb to align everything in the hope it will
magically make your kernel run 2x faster.
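
To illustrate what I mean by aligning selectively, here is a minimal
sketch using the GCC/Clang `aligned` function attribute. Note this
aligns the entry of one function, not the loop inside it, so it is
only an approximation of the idea. hot_sum() and cold_helper() are
hypothetical names, not anything from the kernel tree:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hot routine: only this function's entry point is
 * forced onto a 16-byte boundary (GCC/Clang extension). */
__attribute__((aligned(16), noinline))
static uint32_t hot_sum(const uint32_t *v, size_t n)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum;
}

/* Hypothetical cold routine: left at the compiler's default
 * alignment, so it gets no extra padding requested by us. */
static uint32_t cold_helper(uint32_t x)
{
    return x + 1;
}
```

The point is that the padding is paid once, where it might matter,
instead of at every label in the image.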
-- vda