Re: SMP/cc Cluster description

Davide Libenzi (davidel@xmailserver.org)
Thu, 6 Dec 2001 11:11:27 -0800 (PST)


On Thu, 6 Dec 2001, Jeff V. Merkey wrote:

> Guys,
>
> I am the maintainer of SCI, the ccNUMA technology standard. I know
> a lot about this stuff, and have been involved with SCI since
> 1994. I work with it every day and the Dolphin guys on some huge
> supercomputer accounts, like Los Alamos and Sandia Labs in NM.
> I will tell you this from what I know.
>
> A shared-everything approach is a programmer's dream come true,
> but you can forget getting reasonable fault tolerance with it. The
> shared memory zealots want everyone to believe ccNUMA is better
> than sex, but it does not scale when compared to shared-nothing
> programming models. There are also a lot of tough issues for dealing
> with failed nodes, and how you recover when people's memory is
> spread all over the place across a bunch of machines.

If you can afford to rewrite/rearchitect your application, it's pretty
clear that the share-nothing model is the winner.
And if you can rewrite your application around a share-nothing model, you
don't need any fancy clustering architecture: a Beowulf-like cluster will
work for you and give you great scalability over the number of nodes.
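
Just to make the distinction concrete, here's a minimal user-space sketch
of the share-nothing model (my own illustration, nothing from Jeff's setup):
each worker process owns a private copy of its slice of the data, and the
only communication is an explicit message back to the parent, here over a
pipe. On a Beowulf-style cluster the pipe would be a socket or an MPI call
between nodes, but the programming model is the same: no memory is ever
shared, so a dead node only costs you its messages.

/*
 * Share-nothing sketch: each worker sums its own chunk of the data and
 * sends the partial result back as a message.  Nothing is shared; after
 * fork() every child works on its private copy of data[].
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

#define WORKERS 4
#define N       1000

int main(void)
{
	int data[N], fd[WORKERS][2];
	long total = 0;
	int i, w;

	for (i = 0; i < N; i++)
		data[i] = i;

	for (w = 0; w < WORKERS; w++) {
		if (pipe(fd[w]) < 0) {
			perror("pipe");
			exit(1);
		}
		switch (fork()) {
		case -1:
			perror("fork");
			exit(1);
		case 0: {
			/* child: touch only its own chunk, no shared state */
			long sum = 0;
			int lo = w * (N / WORKERS);
			int hi = lo + N / WORKERS;

			close(fd[w][0]);
			for (i = lo; i < hi; i++)
				sum += data[i];
			/* the only communication is this explicit message */
			if (write(fd[w][1], &sum, sizeof(sum)) != sizeof(sum))
				_exit(1);
			_exit(0);
		}
		default:
			close(fd[w][1]);
		}
	}

	/* parent: collect one message per worker and combine */
	for (w = 0; w < WORKERS; w++) {
		long sum;

		if (read(fd[w][0], &sum, sizeof(sum)) == sizeof(sum))
			total += sum;
		close(fd[w][0]);
		wait(NULL);
	}

	printf("total = %ld\n", total);
	return 0;
}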
The problem arises when you have to choose between a new architecture
(share-nothing) on conventional clusters and a
share-everything/keep-your-application-as-is one.
Share-nothing is cheap and gives you very nice scalability; those are the
two major pros of that solution.
The share-everything route, on the other hand, gives you fairly poor
scalability and a very expensive solution.
But you have to consider:

1) rewriting is risky

2) good developers to rewrite your stuff are expensive ($100K up to $150K
in my area)

These are the reasons that make me think conventional SMP machines still
have a future, together with my belief that technology will help a lot to
improve their scalability.

- Davide
