The problem with this approach is that it doesn't prevent the linker
from placing other data in the same cacheline as the aligned
lock, at higher addresses.
The work which Juergen Doelle did recently addresses this: it just
special-cases the five hot spinlocks by adding padding at the end
of each one.
http://www.geocrawler.com/archives/3/5312/2001/7/0/6271339/
With the toolchain I'm using and the latest -ac kernels,
we happened to get lucky:
00000000c03053c0 D pagecache_lock
00000000c03053e0 D pagemap_lru_lock
00000000c03053e4 d file_shared_mmap
00000000c03053f0 d file_private_mmap
The linker happened not to put anything else in pagecache_lock's line,
and the neighbours of pagemap_lru_lock shown here are
vm_operations_structs, which are read-only.
Juergen demonstrated a 17% speedup of RAM-only dbench on an 8-way with
his patch.
We should perhaps also do something about these:
00000000c0305e64 d hash_table_lock
00000000c0305e68 d lru_list_lock
00000000c0305e6c d unused_list_lock
All three are in the same cacheline. They tend to be taken
back-to-back, so splitting them apart may be less effective. It would
be interesting to measure them independently.
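
To make the "same cacheline" observation easy to check against the
symbol listings in this message, here is a small sketch. It assumes a
32-byte cacheline, which is my reading of the addresses above rather
than something stated outright, and the helper names are made up for
illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 32-byte cachelines. Adjust for other CPUs. */
#define ASSUMED_LINE_BYTES 32u

/* Base address of the cacheline that contains addr. */
static uint32_t line_base(uint32_t addr)
{
    return addr & ~(ASSUMED_LINE_BYTES - 1u);
}

/* Nonzero if both addresses fall in the same cacheline. */
static int same_line(uint32_t a, uint32_t b)
{
    return line_base(a) == line_base(b);
}

/* Check the addresses quoted in the symbol listings. */
static void check_listed_addresses(void)
{
    /* hash_table_lock, lru_list_lock, unused_list_lock share a line */
    assert(same_line(0xc0305e64u, 0xc0305e68u));
    assert(same_line(0xc0305e68u, 0xc0305e6cu));

    /* pagecache_lock and pagemap_lru_lock sit in different lines */
    assert(!same_line(0xc03053c0u, 0xc03053e0u));

    /* pagemap_lru_lock shares its line with file_private_mmap */
    assert(same_line(0xc03053e0u, 0xc03053f0u));
}
```

Calling check_listed_addresses() from a test program passes under the
32-byte assumption. With a larger line size the pagecache_lock result
would change, so the "lucky" layout depends on the CPU.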
Juergen, I'd suggest you dust off that patch, add the conditionals
which make it a no-op on uniprocessor, and submit it. It's such a
simple, easy way to gain a significant performance improvement that
it's worth doing.
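
For anyone following along, the general shape of "pad at the end, no-op
on uniprocessor" might look something like the sketch below. This is
not Juergen's patch. The type names, the DEMO_SMP switch and the
32-byte line size are all placeholders I've made up for illustration,
and it is written as plain userspace C with GCC attributes:

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SMP        1   /* hypothetical switch; 0 models a UP build */
#define DEMO_LINE_BYTES 32  /* assumed cacheline size */

/* Stand-in for a spinlock; not the kernel's spinlock_t. */
typedef struct { volatile unsigned int slock; } demo_spinlock_t;

#if DEMO_SMP
/*
 * Align the lock to a line boundary and pad it out to a full line,
 * so nothing else can be placed after it in the same cacheline.
 */
typedef struct {
    demo_spinlock_t lock;
    char pad[DEMO_LINE_BYTES - sizeof(demo_spinlock_t)];
} __attribute__((aligned(DEMO_LINE_BYTES))) demo_hot_lock_t;

_Static_assert(sizeof(demo_hot_lock_t) == DEMO_LINE_BYTES,
               "padded lock must fill exactly one cacheline");
#else
/* Uniprocessor: no padding, no extra alignment. */
typedef struct { demo_spinlock_t lock; } demo_hot_lock_t;
#endif

/* Two adjacent hot locks, as the linker might lay them out. */
static demo_hot_lock_t demo_locks[2];

/* Nonzero if the two locks ended up in the same cacheline. */
static int demo_locks_share_line(void)
{
    uintptr_t a = (uintptr_t)&demo_locks[0] / DEMO_LINE_BYTES;
    uintptr_t b = (uintptr_t)&demo_locks[1] / DEMO_LINE_BYTES;
    return a == b;
}
```

With DEMO_SMP set to 1 each lock occupies its own line, so
demo_locks_share_line() returns 0. With it set to 0 the wrapper is the
same size as the bare lock and costs nothing.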