Just look at the x86-64 port (2.5).
The code generated by gcc 3.1 is a lot better than the inline macros.
For example, it knows the alignment of the target and source and emits
unrolled big moves (4, 2, 1 bytes) plus some other tricks.
We're using that on x86-64. For tricky cases (where it cannot determine
the length or alignment) it'll still call out-of-line functions, which
should be optimized; notably, they should not just use rep ; s... like the
inline macros, which isn't very efficient on Athlon.
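
A rough sketch of the two cases, not taken from the x86-64 port. The
struct and the names copy_hdr/copy_buf are made up for illustration, and
whether the compiler really expands the first copy inline depends on the
gcc version and flags (compare the output of gcc -O2 -S to check):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical structure: its size and alignment are compile-time facts. */
struct hdr {
	uint32_t a;
	uint16_t b;
	uint8_t  c;
};

/*
 * Constant length, known alignment: the compiler is free to replace the
 * memcpy() with a handful of plain moves instead of a call.
 */
void copy_hdr(struct hdr *dst, const struct hdr *src)
{
	memcpy(dst, src, sizeof(*dst));
}

/*
 * Length and alignment only known at run time: this is the "tricky" case,
 * where the compiler normally falls back to the out-of-line memcpy().
 */
void copy_buf(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
}
```

Both functions copy the same way from the caller's point of view; only
the code the compiler can generate for them differs.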
-Andi