>>This came up about a year back when zerocopy networking was merged.
>>Intel boxes started running more slowly purely because of the 8+8
>>alignment thing.
>>I changed tcp to use a different copy if either source or dest were
>>not eight-byte aligned, and found that the resulting improvement
>>across a mixed networking load was only 1%. Your numbers are higher,
>>so perhaps there are different alignments in the mix...
>I will test on other workloads when I return to work after OLS
>and vacation. However, we tested an earlier version of this patch on
>Netbench using sendfile and gained around 3% improvement. The baseline
>profiling showed that Netbench was spending 10% in generic_copy_to_user.
>The TCP options are aligned on a 4-byte boundary, so depending on the
>options used, the address of the data (the source address passed to
>generic_copy_to_user) should fall on a 4- or 8-byte boundary. I agree
>with you that more testing is needed.
One correction to the above statement...
Because the TCP options are aligned on a 4-byte boundary, the source
address passed to generic_copy_to_user should fall on a 4-, 8-, 12-,
16-byte (and so on) boundary. However, I have seen that for 4- and
12-byte alignment the unrolled loop performed better than the string copy.
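
To make the alignment arithmetic concrete, here is a minimal sketch. It
assumes the data starts opt_len bytes past an 8-byte-aligned base and that
opt_len is a multiple of 4; the helper names is_aligned and data_addr are
hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if addr is a multiple of align.
 * align must be a power of two. */
static int is_aligned(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Hypothetical helper: address of the data when it starts opt_len bytes
 * past an 8-byte-aligned base (assumption). opt_len is assumed to be a
 * multiple of 4, matching the 4-byte alignment of the TCP options. */
static uintptr_t data_addr(uintptr_t base, size_t opt_len)
{
    assert(is_aligned(base, 8));
    assert(opt_len % 4 == 0);
    return base + opt_len;
}
```

Under these assumptions, offsets of 8 and 16 leave the data 8-byte aligned,
while offsets of 4 and 12 leave it only 4-byte aligned.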
>>One question: have you tested on other CPU types? This problem is
>>very specific to Intel hardware. On AMD, the eight-byte alignment
>>artifact does not exist at all. It could be that your patch is not
>>desirable on such CPUs?
>I tested only on Pentium II and III. I will test it on Pentium IV.
>When I said 8-byte alignment, it is 8 and greater. I will
>try to check out AMD also.
Same correction here: 8-byte alignment means 8, 16, and so on.
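
The idea of picking the copy routine by alignment can be sketched in user
space as follows. This is only an illustration under assumptions, not the
kernel patch: memcpy stands in for the string copy, and the names
use_unrolled, copy_unrolled and copy_by_alignment are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if either pointer is not a multiple of 8. */
static int use_unrolled(const void *dst, const void *src)
{
    return (((uintptr_t)dst | (uintptr_t)src) & 7) != 0;
}

/* Unrolled copy: four 32-bit words per iteration, then a byte tail.
 * The 4-byte memcpy calls keep the unaligned accesses well defined in C. */
static void copy_unrolled(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n >= 16) {
        uint32_t w0, w1, w2, w3;

        memcpy(&w0, s, 4);
        memcpy(&w1, s + 4, 4);
        memcpy(&w2, s + 8, 4);
        memcpy(&w3, s + 12, 4);
        memcpy(d, &w0, 4);
        memcpy(d + 4, &w1, 4);
        memcpy(d + 8, &w2, 4);
        memcpy(d + 12, &w3, 4);
        s += 16;
        d += 16;
        n -= 16;
    }
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
}

/* Choose the copy routine from the alignment of both addresses. */
static void *copy_by_alignment(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    if (use_unrolled(dst, src))
        copy_unrolled(dst, src, n);
    else
        memcpy(dst, src, n); /* stand-in for the string copy (assumption) */
    return dst;
}
```

Whether this dispatch is a win depends on the CPU, which is why the
measurements on other processors matter.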
Regards,
Mala
Mala Anand
E-mail:manand@us.ibm.com
Linux Technology Center - Performance
Phone:838-8088; Tie-line:678-8088