Re: [Lse-tech] Re: RFC: patch to allow lock-free traversal of lists with insertion

Keith Owens (kaos@ocs.com.au)
Wed, 10 Oct 2001 21:54:39 +1000


On Wed, 10 Oct 2001 05:05:10 +0000 (UTC),
torvalds@transmeta.com (Linus Torvalds) wrote:
>Now, before people get all excited, what is this particular code
>actually _good_ for?
>
>Creating a lock-free list that allows insertion concurrently with lookup
>is _easy_.
>
>But what's the point? If you insert stuff, you eventually have to remove
>it. What goes up must come down. Insert-inane-quote-here.
>
>And THAT is the hard part. Doing lookup without locks ends up being
>pretty much worthless, because you need the locks for the removal
>anyway, at which point the whole thing looks pretty moot.

<pedantic>

It is possible to do completely lock free queue management, I have done
it (once). There are several major requirements :-

* Storage must never change its type (type stable storage). Once a
block has been assigned as struct foo, it must always contain struct
foo. This can be relaxed slightly as long as the data types have a
common header format.

The biggest problem this causes is that storage can never be released
and unmapped. You never know if somebody is going to follow an old
pointer to access storage that you freed. You must put the storage
on a free list for struct foo instead of unmapping it, to maintain
the type stable storage.

* Every piece of code that traverses the data tree for structure foo
must take a copy of each struct foo into local storage. After taking
a copy it must verify that its local copy is self consistent, via
validity indicators in each struct. The validity indicators are
typically generation numbers, whatever you use, the indicators must
be atomically updated and atomically read, with suitable wmb and rmb
barriers for machines with weak memory ordering.

* After following a pointer in your local copy of struct foo to access
another data area, you must verify that the master version of your
local copy has not been changed. If the master copy has changed then
the pointer you just followed is no longer reliable. In almost every
case you have to start the traversal from the beginning. It makes
for very convoluted reader and updater code.

</pedantic>

The only reason I used lock free read and update was to hook into an
existing system that mandated sub second responses. Some of the new
operations that were performed during list traversal and update were
subject to unbounded delays that were completely outside my control. I
could not afford to lock the data structures during traversal because
any delay in the new operations would completely stall the existing
system, also the existing system had to be able to retrieve and delete
data for the new operations at any time.

I don't recommend doing lock free unless you have absolutely no
alternative. It requires convoluted code, it is difficult to debug, it
is expensive on memory bandwidth (forget zero copy) and tends to be
expensive on storage as well.

If you are interested in lock free work, take a look at
http://www.cs.pitt.edu/~moir/papers.html, particularly Practical
Implementations of Non-Blocking Synchronization Primitives. But
beware, that paper has an error which caused intermittent bugs that
took me 5 months to track down. Section 3.3, proc Copy, line 6 should
be

6: y := addr->data[i];

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/