Athlon tbird 1133.402 MHz, 133 MHz FSB, 512 MB + 128 MB PC133 SDRAM, no HD
Here are results for 3 compiler/flag combinations, each the average of 3 runs
gcc (GCC) 3.2.1 20021020 (Debian prerelease)
flags : -O3 -march=athlon-tbird -mcpu=athlon-tbird -falign-loops=4
copy_page function 'warm up run' took 17577 cycles per page
copy_page function '2.4 non MMX' took 23659 cycles per page
copy_page function '2.4 MMX fallback' took 23894 cycles per page
copy_page function '2.4 MMX version' took 17549 cycles per page
copy_page function 'faster_copy' took 10452 cycles per page
copy_page function 'even_faster' took 10159 cycles per page
copy_page function 'no_prefetch' took 9508 cycles per page
flags : -O0 -march=i686 -mcpu=i686
copy_page function 'warm up run' took 18377 cycles per page
copy_page function '2.4 non MMX' took 23688 cycles per page
copy_page function '2.4 MMX fallback' took 23671 cycles per page
copy_page function '2.4 MMX version' took 18407 cycles per page
copy_page function 'faster_copy' took 10091 cycles per page
copy_page function 'even_faster' took 10283 cycles per page
copy_page function 'no_prefetch' took 9907 cycles per page
gcc 2.95.4
flags : -O0 -march=i686 -mcpu=i686
copy_page function 'warm up run' took 18343 cycles per page
copy_page function '2.4 non MMX' took 23655 cycles per page
copy_page function '2.4 MMX fallback' took 23646 cycles per page
copy_page function '2.4 MMX version' took 18324 cycles per page
copy_page function 'faster_copy' took 10146 cycles per page
copy_page function 'even_faster' took 10438 cycles per page
copy_page function 'no_prefetch' took 9913 cycles per page
----------------------------------------------------------------------
avg difference due to compiler and flags (mean of the pairwise
differences between the three sets of results above)
copy_page function 'warm up run' took +-533 cycles per page
copy_page function '2.4 non MMX' took +-22 cycles per page
copy_page function '2.4 MMX fallback' took +-165 cycles per page
copy_page function '2.4 MMX version' took +-572 cycles per page
copy_page function 'faster_copy' took +-241 cycles per page
copy_page function 'even_faster' took +-186 cycles per page
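
For anyone who wants to check the arithmetic: the figures above look
like the mean absolute difference over the three pairs of builds. That
reading reproduces the posted +-533 for 'warm up run' and +-22 for
'2.4 non MMX'. A minimal sketch under that assumption follows; the
names CYCLES, mean_pairwise_diff and spread_table are made up for
illustration and are not part of the benchmark program.

```python
from itertools import combinations

# Cycles per page copied from the three result blocks above, in the order
# gcc 3.2.1 -O3, gcc 3.2.1 -O0, gcc 2.95.4 -O0.
CYCLES = {
    "warm up run":      (17577, 18377, 18343),
    "2.4 non MMX":      (23659, 23688, 23655),
    "2.4 MMX fallback": (23894, 23671, 23646),
    "2.4 MMX version":  (17549, 18407, 18324),
    "faster_copy":      (10452, 10091, 10146),
    "even_faster":      (10159, 10283, 10438),
    "no_prefetch":      (9508, 9907, 9913),
}

def mean_pairwise_diff(values):
    """Mean absolute difference over all unordered pairs of values."""
    pairs = list(combinations(values, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

def spread_table(cycles=CYCLES):
    """Map each function name to its mean pairwise difference in cycles."""
    return {name: mean_pairwise_diff(v) for name, v in cycles.items()}

if __name__ == "__main__":
    for name, diff in spread_table().items():
        print(f"copy_page function '{name}' took +-{diff:.0f} cycles per page")
```

The same function also covers 'no_prefetch', which is not listed above.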
Other options may give even greater differences, but it's difficult to
try them all, so I thought this would serve as an example. No set of
options beat another across the board; each did better or worse
depending on the function. Fastest test for each function
(test1 = gcc 3.2.1 -O3, test2 = gcc 3.2.1 -O0, test3 = gcc 2.95.4 -O0):
copy_page function 'warm up run' test1
copy_page function '2.4 non MMX' test3
copy_page function '2.4 MMX fallback' test3
copy_page function '2.4 MMX version' test1
copy_page function 'faster_copy' test2
copy_page function 'even_faster' test1
copy_page function 'no_prefetch' test1
bandwidth test1 = 488.3MB/sec (not ddr like other setups)
bandwidth test2 = 468.6MB/sec
bandwidth test3 = 468.3MB/sec
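
The bandwidth figures appear to follow from the 'no_prefetch' cycle
counts and the CPU clock, assuming a 4096-byte page and MB = 10^6
bytes. A rough sketch of that conversion; the function and constant
names are hypothetical, only the input numbers come from the results
above.

```python
PAGE_BYTES = 4096      # assumption: 4 KB page being copied
CPU_HZ = 1133.402e6    # clock rate quoted in the hardware line

def bandwidth_mb_per_sec(cycles_per_page, cpu_hz=CPU_HZ, page_bytes=PAGE_BYTES):
    """Convert cycles per page into MB/sec (MB taken as 10**6 bytes)."""
    pages_per_sec = cpu_hz / cycles_per_page
    return pages_per_sec * page_bytes / 1e6

if __name__ == "__main__":
    # 'no_prefetch' cycle counts for test1, test2 and test3
    for label, cycles in (("test1", 9508), ("test2", 9907), ("test3", 9913)):
        print(f"bandwidth {label} = {bandwidth_mb_per_sec(cycles):.1f}MB/sec")
```

With these inputs the sketch gives 488.3, 468.6 and 468.3 MB/sec,
matching the three lines above.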