Re: SMP/cc Cluster description

Larry McVoy (lm@bitmover.com)
Tue, 4 Dec 2001 19:23:17 -0800

Messages sorted by: [ date ][ thread ][ subject ][ author ]
Next message: Nathan Bryant: "Re: i810 audio patch"
Previous message: Mike Fedyk: "Re: [kbuild-devel] Converting the 2.5 kernel to kbuild 2.5"
In reply to: David S. Miller: "Re: SMP/cc Cluster description"
Next in thread: David S. Miller: "Re: SMP/cc Cluster description"

On Tue, Dec 04, 2001 at 06:36:01PM -0800, David S. Miller wrote:
> From: Larry McVoy <lm@bitmover.com>
> Date: Tue, 4 Dec 2001 16:36:46 -0800
>
> OK, so start throwing stones at this. Once we have a memory model that
> works, I'll go through the process model.
>
> What is the difference between your messages and spin locks?
> Both seem to shuffle between cpus anytime anything interesting
> happens.
>
> In the spinlock case, I can thread out the locks in the page cache
> hash table so that the shuffling is reduced. In the message case, I
> always have to talk to someone.

Two points:

a) if you haven't already, go read the Solaris doors paper. Now think about
a cross OS instead of a cross address space doors call. They are very
similar. Think about a TLB shootdown, it's sort of a doors style cross
call already, not exactly the same, but do you see what I mean?

b) I am not claiming that you will have less threading in the page cache.
I suspect it will be virtually identical except that in my case 90%+
of the threading is in the ccCluster file system, not the generic code.
I do not want to get into a "my threading is better than your threading"
discussion. I'll even go so far as to say that this approach will have
more threading if that makes you happy, as long as we agree it is outside
of the generic part of the kernel. So unless I have CONFIG_CC_CLUSTER,
you pay nothing or very, very little.

Where this approach wins big is everywhere except the page cache. Every
single data structure in the system becomes N-way more parallel -- with
no additional locks -- when you boot up N instances of the OS. That's
a powerful statement. Sure, I freely admit that you'll add a few locks
to deal with cross OS synchronization but all of those are configed out
unless you are running a ccCluster. There is absolutely zero chance that
you can get the same level of scaling with the same number of locks using
the traditional approach. I'll go out on a limb and predict it is at
least 100x as many locks. Think about it, there are a lot of things that
are threaded simply because of some corner case that happens to show up
in some benchmark or workload. In the ccCluster case, those by and large
go away. Less locks, less deadlocks, less memory barriers, more reliable,
less filling, tastes great and Mikey likes it :-)

-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/






 Next message: Nathan Bryant: "Re: i810 audio patch"
 Previous message: Mike Fedyk: "Re: [kbuild-devel] Converting the 2.5 kernel to kbuild 2.5"
 In reply to: David S. Miller: "Re: SMP/cc Cluster description"

 Next in thread: David S. Miller: "Re: SMP/cc Cluster description"