The benefits of the kernel Athlon optimizations are higher memory bandwidth
for bulk copies/clears and less cache pollution. But LMbench isn't going to
show any difference, because its tests use generic x86 mem*() functions, not
Athlon-optimized (MMX/3DNow!) memory routines like those in the Athlon kernel.
*Local* Communication bandwidths in MB/s - bigger is better
Host      OS            Pipe AF   TCP  File   Mmap   Bcopy  Bcopy  Mem  Mem
                             UNIX      reread reread (libc) (hand) read write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
Athlon-1  Linux 2.4.10- 847. 685. 311. 332.4  501.3  176.2  206.2  471. 342.5
Athlon-2  Linux 2.4.10- 882. 586. 187. 331.6  510.2  177.6  207.1  484. 343.5
i686-1    Linux 2.4.10- 863. 586. 299. 320.0  510.2  176.3  206.6  472. 342.6
i686-2    Linux 2.4.10- 874. 318. 199. 319.6  520.2  177.7  206.8  486. 343.5
It should be obvious that LMbench uses sub-optimal memory routines, since
the numbers for "Bcopy" and "Mem read/write" bandwidth are so much lower
than the pipe and AF UNIX bandwidths! (The pipe/UNIX tests are basically
equivalent to Bcopy, plus extra user<->kernel transitions and context
switches.)
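To put rough numbers on that gap, here is a small sketch that just does
arithmetic on the table above. The dict layout and the function name are
mine, not anything from LMbench, and the pipe figures are printed truncated
("847."), so treat the ratios as approximate.

```python
# Bandwidths in MB/s copied from the LMbench table above.
# Tuple order (my own choice): pipe, AF UNIX, bcopy (libc), bcopy (hand)
results = {
    "Athlon-1": (847.0, 685.0, 176.2, 206.2),
    "Athlon-2": (882.0, 586.0, 177.6, 207.1),
    "i686-1":   (863.0, 586.0, 176.3, 206.6),
    "i686-2":   (874.0, 318.0, 177.7, 206.8),
}


def pipe_to_bcopy_ratios(table):
    """Return {host: (pipe / libc bcopy, pipe / hand bcopy)}."""
    return {
        host: (pipe / libc, pipe / hand)
        for host, (pipe, _af_unix, libc, hand) in table.items()
    }


if __name__ == "__main__":
    for host, (vs_libc, vs_hand) in pipe_to_bcopy_ratios(results).items():
        print(f"{host}: pipe is {vs_libc:.1f}x libc bcopy, "
              f"{vs_hand:.1f}x hand bcopy")
```

On all four runs the pipe figure comes out at more than four times either
Bcopy figure, which is the mismatch I'm pointing at.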
The only cases where I'd expect the Athlon kernel to do better on LMbench
are essentially kernel memcpy() benchmarks - pipe and AF UNIX bandwidths.
I'm not sure if the kernel pipe and UNIX socket code actually uses
Athlon-optimized routines; in any case the small buffer sizes (e.g. 4KB for
pipes) could be hiding any performance gain.
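If anyone wants to poke at the transfer-size question from userspace, a
rough sketch follows. Assumptions, stated loudly: this is not LMbench's pipe
test, pipe_bandwidth() is a made-up helper, it varies the size of each
write()/read() rather than the kernel's internal pipe buffer, and
interpreter overhead will dominate the absolute numbers. It only shows the
shape of the measurement.

```python
import os
import threading
import time


def pipe_bandwidth(chunk_size, total_bytes=64 * 1024 * 1024):
    """Push at least total_bytes through a pipe in chunk_size pieces.

    Returns (bytes_received, MB/s). Hypothetical helper, not LMbench code.
    """
    read_fd, write_fd = os.pipe()
    received = [0]  # list so the reader thread can update it

    def reader():
        while True:
            data = os.read(read_fd, chunk_size)
            if not data:  # EOF: the writer closed its end
                break
            received[0] += len(data)

    thread = threading.Thread(target=reader)
    buf = b"\0" * chunk_size

    start = time.perf_counter()
    thread.start()
    sent = 0
    while sent < total_bytes:
        sent += os.write(write_fd, buf)  # count what was actually written
    os.close(write_fd)  # signals EOF to the reader
    thread.join()
    elapsed = time.perf_counter() - start

    os.close(read_fd)
    return received[0], received[0] / elapsed / 1e6


if __name__ == "__main__":
    for size in (4096, 65536):
        nbytes, mbps = pipe_bandwidth(size)
        print(f"{size:6d}-byte transfers: {nbytes} bytes, {mbps:.0f} MB/s")
```

Comparing the 4096-byte run against a larger transfer size on the same box
gives a feel for how much per-transfer overhead there is, separate from the
copy itself.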
Regards,
Dan