That's NUMA hardware. The per-hashqueue locking change made
a big improvement on that hardware. But when it was used on
Intel hardware it made no measurable difference at all.
Sorry, but the patch adds compexity and unless a significant
throughput benefit can be demonstrated on less exotic hardware,
why use it?
Here are kumon's test results from March, with and without
the hashed lock patch:
-------- Original Message --------
Subject: Re: [Fwd: Re: [Lse-tech] AIM7 scaling, pagecache_lock, multiqueue scheduler]
Date: Thu, 15 Mar 2001 18:03:55 +0900
From: kumon@flab.fujitsu.co.jp
Reply-To: kumon@flab.fujitsu.co.jp
To: Andrew Morton <andrewm@uow.edu.au>
CC: kumon@flab.fujitsu.co.jp, ahirai@flab.fujitsu.co.jp,John Hawkes <hawkes@engr.sgi.com>,kumon@flab.fujitsu.co.jp
References: <3AB032B3.87940521@uow.edu.au>,<3AB0089B.CF3496D2@uow.edu.au><200103150234.LAA28075@asami.proc><3AB032B3.87940521@uow.edu.au>
OK, the followings are a result of our brief measurement with WebBench
(mindcraft type) of 2.4.2 and 2.4.2+pcl .
Workload: WebBench 3.0 (static get)
Machine: Profusion 8way 550MHz/1MB cache 1GB mem.
Server: Apache 1.3.9-8 (w/ SINGLE_LISTEN_UNSERIALIZED_ACCEPT)
obtained from RedHat.
Clients: 32 clients each has 2 requesting threads.
The following number is Request per sec.
242 242+pcl ratio
-------------------------------------
1SMP 1,603 1,584 0.99
2(1+1)SMP 2,443 2,437 1.00
4(1+3)SMP 4,420 4,426 1.00
8(4+4)SMP 5,381 5,400 1.00
#No idle time observed in the 1 to 4 SMP runs.
#Only 8 SMP cases shows cpu-idle time, but it is about 2.1-2.8% of the
#total CPU time.
Note: The load of two buses of Profusion system isn't balance, because
the number of CPUs on each bus is unbalance.
Summary:
From the above brief test, (+pcl) patch doens't show the measurable
performance gain.
-
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/