Good point.
The figures I quoted for the no-hw-checksum case were still
using scatter/gather. That can be turned off as well and
it makes it a tiny bit quicker. So the table is now:
2.4.1-pre10-vanilla,  using sendfile():     29.6% CPU
2.4.1-pre10-vanilla,  using read()/write(): 34.5% CPU
2.4.1-pre10+zerocopy, using sendfile():     18.2% CPU
2.4.1-pre10+zerocopy, using read()/write(): 38.1% CPU
2.4.1-pre10+zerocopy, using sendfile():     22.9% CPU  * hardware tx checksums disabled
2.4.1-pre10+zerocopy, using read()/write(): 39.2% CPU  * hardware tx checksums disabled
2.4.1-pre10+zerocopy, using sendfile():     22.4% CPU  * hardware tx checksums and SG disabled
2.4.1-pre10+zerocopy, using read()/write(): 38.5% CPU  * hardware tx checksums and SG disabled
But that's not relevant.
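For anyone reproducing this: the two userspace send paths compared in
the table differ roughly as sketched below. This is only an
illustration, not the test program used for the figures above. Python
is used for brevity (os.read, socket.sendall and os.sendfile wrap the
corresponding syscalls), and CHUNK and both function names are made up
for the example.

```python
import os
import socket

# Hypothetical buffer size; the post does not say what the test used.
CHUNK = 64 * 1024


def send_with_read_write(path, sock):
    """Copy path: read() into a user buffer, then send it back to the kernel."""
    fd = os.open(path, os.O_RDONLY)
    try:
        sent = 0
        while True:
            buf = os.read(fd, CHUNK)   # kernel -> user copy
            if not buf:                # empty read means end of file
                return sent
            sock.sendall(buf)          # user -> kernel copy
            sent += len(buf)
    finally:
        os.close(fd)


def send_with_sendfile(path, sock):
    """sendfile() path: the kernel feeds the socket without a user buffer."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        sent = 0
        while sent < size:
            # os.sendfile(out_fd, in_fd, offset, count) returns bytes sent
            n = os.sendfile(sock.fileno(), fd, sent, size - sent)
            if n == 0:
                break
            sent += n
        return sent
    finally:
        os.close(fd)
```

The first path moves every byte through a user buffer; the second
never brings the data into userspace, and it is the path where the
table shows the zerocopy patch helping.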
I just retested everything. Yes, the zerocopy patch does
appear to decrease the efficiency of TCP by 5% - 10% on hardware
which lacks SG+checksumming. Others need to test...
With an RTL8139/8139too. CPU is 500MHz PII Celeron, uniprocessor:
2.4.1-pre10-vanilla,  using sendfile():     43.8% CPU
2.4.1-pre10-vanilla,  using read()/write(): 54.1% CPU
2.4.1-pre10+zerocopy, using sendfile():     43.1% CPU
2.4.1-pre10+zerocopy, using read()/write(): 55.5% CPU
Note that the 8139 only gets 10.8 Mbytes/sec here. It randomly
jumps up to 11.5 occasionally, but spends most of its time at
10.8. Hard to know what to make of this. Of course, if you're
using an 8139 you don't care about performance anyway :)
Contradictory results. rtl8139 doesn't do Rx checksums,
and I think has an extra copy in the driver, so caching effects
may be obscuring things here.
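To put both tables in relative terms, here is a small sketch that
computes the change in CPU load against the vanilla kernel for each
transfer method (offloads left enabled in the first table). The
figures are copied from the tables above; rel_change and the
dictionary layout are just names picked for the example.

```python
def rel_change(base, new):
    """Percentage change of `new` relative to `base`, rounded to one decimal."""
    return round((new - base) / base * 100.0, 1)


# (vanilla, zerocopy) CPU load in percent, copied from the tables in this post.
first_table = {"sendfile": (29.6, 18.2), "read/write": (34.5, 38.1)}
rtl8139 = {"sendfile": (43.8, 43.1), "read/write": (54.1, 55.5)}

for name, table in (("first table", first_table), ("RTL8139", rtl8139)):
    for method, (vanilla, zerocopy) in table.items():
        print(f"{name:12s} {method:11s} {rel_change(vanilla, zerocopy):+.1f}%")
```

On these numbers read()/write() costs about 10.4% more CPU with the
patch in the first table and about 2.6% more on the RTL8139, while
sendfile() drops by 38.5% and 1.6% respectively.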
I can test with eepro100 in a couple of days.