<!-- received="Sun Apr 23 19:15:38 2000 EET DST" -->
<!-- sent="Sun, 23 Apr 2000 11:12:14 -0500" -->
<!-- name="Oliver Xymoron" -->
<!-- email="oxymoron@waste.org" -->
<!-- subject="Re: "movb" for spin-unlock (was Re: namei() query)" -->
<!-- id="" -->
<!-- inreplyto="Pine.LNX.4.10.10004221044380.11981-100000@waste.org" -->
<title>Linux-kernel mailing list archive 2000-17,: Re: "movb" for spin-unlock (was Re: namei() query)</title>
<body bgcolor="#FFFFFF"><font face="Arial,Helvetica">
<h1>Re: "movb" for spin-unlock (was Re: namei() query)</h1>
<b>Oliver Xymoron</b> (<a href="mailto:oxymoron@waste.org"><i>oxymoron@waste.org</i></a>)<br>
<i>Sun, 23 Apr 2000 11:12:14 -0500</i>
<p>
<ul>
<li> <b>Messages sorted by:</b> <a href="date.html#61">[ date ]</a><a href="index.html#61">[ thread ]</a><a href="subject.html#61">[ subject ]</a><a href="author.html#61">[ author ]</a>
<!-- next="start" -->
<li> <b>Next message:</b> <a href="0062.html">Juan J. Quintela: "Re: PROBLEM: umountfs and shutting down linux 2.3.99-pre5"</a>
<li> <b>Previous message:</b> <a href="0060.html">Fausto Saporito: "IPTABLES problem"</a>
<!-- nextthread="start" -->
<!-- reply="end" -->
</ul>
<hr>
<!-- body="start" -->
On Sat, 22 Apr 2000, Oliver Xymoron wrote:<br>
<p>
<i>&gt; On Sat, 22 Apr 2000, Jamie Lokier wrote:</i><br>
<i>&gt; </i><br>
<i>&gt; &gt; Linus Torvalds wrote:</i><br>
<i>&gt; &gt; &gt; I have conflicting reports about the safety of "movb" from Intel.</i><br>
<i>&gt; &gt; &gt; According to some people in there, "movb" is always safe, and there should</i><br>
<i>&gt; &gt; &gt; not be any need for any config option at all.</i><br>
<i>&gt; &gt; &gt; </i><br>
<i>&gt; &gt; &gt; However, at the same time my original contact at intel was Andy Glew, who</i><br>
<i>&gt; &gt; &gt; probably knows more about the ia32 core than anybody else I know. And Andy</i><br>
<i>&gt; &gt; &gt; says that yes "movb" is legal, but that some very early P6 steppings may</i><br>
<i>&gt; &gt; &gt; be buggy. And Andy is God.</i><br>
<i>&gt; &gt; </i><br>
<i>&gt; &gt; That comment in &lt;asm-i386/spinlock.h&gt; is rather tantalising.  It says</i><br>
<i>&gt; &gt; don't use "movb" because it doesn't work but gives no clues why.</i><br>
<i>&gt; &gt; </i><br>
<i>&gt; &gt; I still have the thread where this was hashed out.  And it seemed very</i><br>
<i>&gt; &gt; few people ended up understanding the precise reason for not using</i><br>
<i>&gt; &gt; "movb".  Not me :-(</i><br>
<i>&gt; </i><br>
<i>&gt; There are very few things that could cause the movb to be a problem. For</i><br>
<i>&gt; instance, it can't be in the cache coherency protocol as the unlock can</i><br>
<i>&gt; be as lazy as it likes and still be safe. My only guess is that somehow the</i><br>
<i>&gt; movb can get scheduled ahead of reads or writes inside the critical</i><br>
<i>&gt; section. If that's the case, then the whole coherency scheme is broken,</i><br>
<i>&gt; no? We'd need to rethink quite a number of things we've presumed safe.</i><br>
<i>&gt; My guess is the whole thing is apocryphal.</i><br>
<p>
Ok, I read the Pentium Pro errata, available at <br>
<p>
 <a href="http://developer.intel.com/design/pro/specupdt/242689.htm">http://developer.intel.com/design/pro/specupdt/242689.htm</a><br>
<p>
and the issues relevant to MP locking appear to be 1, 39, 41, 51, 66, and<br>
92. All but 1 and 92 were fixed in the PPro. Some of them are about<br>
processors having inconsistent MTRR settings (which should be a non-issue<br>
for us) while the others are about cache coherency and snooping. From my<br>
reading, the latter may allow violations of write-ordering but none<br>
that would allow a second processor to acquire a lock before the first had<br>
released it. This is not surprising as the spinlock *would have to see<br>
someone unlock before the unlock actually happened*. That'd be spooky. So<br>
again, the unlock is just inherently safer than the lock side. Everyone<br>
feel free to double-check this, but I still see no reason we can't use the<br>
faster movb.<br>
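<p>
To make the lock/unlock asymmetry concrete, here is a rough sketch in portable<br>
C11 rather than the kernel's own spinlock code. The names demo_lock,<br>
demo_spin_lock, demo_spin_unlock and run_lock_demo are made up for this<br>
example. The lock side needs an atomic read-modify-write; the unlock side is<br>
only a release store, which compilers typically emit as a plain mov on x86.<br>

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical names; 0 = free, 1 = held. */
static atomic_int demo_lock;
static long demo_counter;               /* protected by demo_lock */

static void demo_spin_lock(void)
{
    /* Atomic exchange: spin until the previous value was 0 (free). */
    while (atomic_exchange_explicit(&demo_lock, 1, memory_order_acquire))
        ;
}

static void demo_spin_unlock(void)
{
    /* Release store only: no read-modify-write is needed to unlock. */
    atomic_store_explicit(&demo_lock, 0, memory_order_release);
}

static void *demo_worker(void *arg)
{
    long n = *(long *)arg;
    for (long i = 0; i < n; i++) {
        demo_spin_lock();
        demo_counter++;                 /* critical section */
        demo_spin_unlock();
    }
    return NULL;
}

/* Runs two workers and returns the final counter value. */
long run_lock_demo(long iterations)
{
    pthread_t a, b;
    int rc;

    demo_counter = 0;
    rc = pthread_create(&a, NULL, demo_worker, &iterations);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, demo_worker, &iterations);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return demo_counter;
}
```

<p>
If the release store were not enough to unlock safely, run_lock_demo(n) would<br>
sometimes return less than 2*n because increments would be lost.<br>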
<p>
Below is Manfred's lock test code if people with Pentium Pros want to<br>
bang on it. If it locks up, we have a problem. Set USE_MB to 1 (the listing<br>
has it at 0) to use the mov-based unlock. My dual PPro is a later stepping, 9,<br>
and seems to work<br>
just fine. If you have an SMP system with steppings 1-8 (only 1, 2, 6, and<br>
7 should be out there) and can confirm this works for you, that'd be<br>
great.<br>
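<p>
If you are not sure which stepping you have, on x86 Linux it is listed in<br>
the "stepping" field of /proc/cpuinfo. A small sketch for pulling it out<br>
(parse_stepping_line and print_steppings are names invented here, not part<br>
of Manfred's program):<br>

```c
#include <stdio.h>
#include <string.h>

/* Returns the stepping from one /proc/cpuinfo line,
 * or -1 if the line is not a "stepping" line. */
int parse_stepping_line(const char *line)
{
    int stepping;
    const char *colon;

    if (strncmp(line, "stepping", 8) != 0)
        return -1;
    colon = strchr(line, ':');
    if (colon == NULL || sscanf(colon + 1, "%d", &stepping) != 1)
        return -1;
    return stepping;
}

/* Prints the stepping of every CPU entry found.
 * Returns the number of entries, or -1 if the file cannot be opened. */
int print_steppings(void)
{
    char line[256];
    int found = 0;
    FILE *f = fopen("/proc/cpuinfo", "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        int s = parse_stepping_line(line);
        if (s >= 0) {
            printf("cpu entry %d: stepping %d\n", found, s);
            found++;
        }
    }
    fclose(f);
    return found;
}
```

<p>
Calling print_steppings() from main should print one line per processor on<br>
an x86 box; other architectures may not report this field at all.<br>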
<p>
/*<br>
 * movopt: test for Intel memory ordering.<br>
 * Copyright (C) 1999 by Manfred Spraul.<br>
 *<br>
 * Redistribution of this file is permitted under the terms of the GNU<br>
 * Public License (GPL)<br>
 * $Header: /pub/cvs/ms/movopt/movopt.cpp,v 1.3 1999/11/25 23:38:44 manfreds Exp $<br>
 */<br>
<p>
/* <br>
   NOTE: this code will run _extremely_ slowly on UP systems<br>
<p>
   func1 and func2 are the functions that may race with each other<br>
<p>
   cpu1 and cpu2 are thread functions that synchronize func1 and func2<br>
   while adjusting timings<br>
*/<br>
<p>
#include &lt;stdio.h&gt;<br>
#include &lt;stdlib.h&gt;<br>
#include &lt;string.h&gt;<br>
#include &lt;unistd.h&gt;<br>
#include &lt;pthread.h&gt;<br>
#include &lt;assert.h&gt;<br>
<p>
#define USE_MB 0<br>
#define USE_ASM 1<br>
<p>
volatile int start = 0;<br>
volatile int current_state = 0;<br>
volatile int lock = 1;<br>
volatile int ready = 0;<br>
volatile int go = 0;<br>
<p>
#if USE_ASM == 0  /* doesn't compile, here for explanation */<br>
<p>
static inline void func1()<br>
{<br>
	  set_current_state(1);<br>
	  if(lock==1) /* lock still held: wait until woken, as in the asm version */<br>
		while (current_state!=0) <br>
			;<br>
}<br>
<p>
static inline void func2()<br>
{<br>
	  lock=0;<br>
	  current_state=0;<br>
	  mb();<br>
}<br>
<p>
#else<br>
static inline void func1()<br>
{<br>
	__asm__ __volatile(<br>
#if USE_MB == 0<br>
		"movl $1,%1\n\t" /* set current_state = 1 */<br>
#else<br>
		"lock;bts $0,%1\n\t"<br>
#endif<br>
		"movl %0,%%eax\n\t" /* lock into %%eax */<br>
		"cmpl $1,%%eax\n\t"<br>
		"jne dont_sleep\n"<br>
		"retry:\n\t"<br>
		"movl %1, %%eax\n\t"<br>
		"cmpl $0,%%eax\n\t"<br>
		"jne retry\n"<br>
		"dont_sleep:\n"<br>
		: /* no output */<br>
		: "m" (lock), "m" (current_state)<br>
		: "eax", "cc", "memory");<br>
}<br>
<p>
static inline void func2()<br>
{<br>
	__asm__ __volatile(<br>
		"movl $0,%0\n\t"<br>
		"movl $0,%1\n\t"<br>
		"lock; addl $0,(%%esp)\n\t" /* flush our write buffers*/<br>
		: /* no output */<br>
		: "m" (lock), "m" (current_state)<br>
		: "memory");<br>
}<br>
#endif<br>
<p>
void* cpu1(void* param)<br>
{<br>
	/* 1: always sleep<br>
	   10 000: never sleep<br>
	   PII/350: lock-up at delay=217<br>
	   */<br>
	volatile int delay = 1;<br>
<p>
        int i=0,j;<br>
<p>
        while(!start)<br>
                ;<br>
<p>
        for(;i&lt;50000000;i++) {<br>
                lock = 1;<br>
                current_state = 0;<br>
                go = 0;<br>
                while(!ready)<br>
                        ;<br>
                go = 1;<br>
<p>
                for(j=0;j&lt;delay;j++)<br>
                        ;<br>
<p>
		func1();<br>
<p>
                if((i%5000)==0) {<br>
			printf("delay %d: ok\n",delay);<br>
			delay++;<br>
		}<br>
	}<br>
<p>
        printf("thread %d finished.\n",(int)param);<br>
        exit(0);<br>
}<br>
<p>
void* cpu2(void* param)<br>
{<br>
	/* increase this value if no lock-up occurs,<br>
	   eg: 2000<br>
	   */<br>
	volatile int delay2 = 300;<br>
<p>
        int i=0,j;<br>
<p>
        while(!start)<br>
                ;<br>
<p>
        for(;;) {<br>
                ready = 1;<br>
                while(!go)<br>
                        ;<br>
                ready = 0;<br>
                go = 0;<br>
<p>
                for(j=0;j&lt;delay2;j++)<br>
                        ;<br>
		<br>
		func2();<br>
        }<br>
}<br>
<p>
typedef void *threadfunc(void *);<br>
<p>
void start_thread(threadfunc *f)<br>
{<br>
        pthread_t thread;<br>
        int res;<br>
<p>
        res = pthread_create(&amp;thread,NULL,f,NULL);<br>
        if(res != 0)<br>
                assert(0);<br>
}<br>
<p>
int main()<br>
{<br>
        printf("movopt:\n");<br>
        start_thread(cpu1);<br>
        start_thread(cpu2);<br>
        printf(" starting, please wait.\n");<br>
        fflush(stdout);<br>
        start = 1;<br>
        for(;;) sleep(1000);<br>
}<br>
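<p>
For anyone who would rather read the race in C than in asm, here is a rough<br>
C11 sketch of the same store-then-load pattern. It is not Manfred's code: the<br>
sk_ names and count_lost_wakeups are invented for this example, and it uses<br>
sequentially consistent atomics in place of the hand-written barriers. The<br>
waiter mirrors func1 up to the point where it would start spinning, and the<br>
waker mirrors func2. A "lost wakeup" is a round where the waiter saw the lock<br>
still held and yet its own store to current_state landed last, which is the<br>
state in which func1 would spin forever.<br>

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_int sk_lock;            /* 1 = held, 0 = released, as in the listing */
static atomic_int sk_current_state;   /* 1 = about to sleep, 0 = awake */
static int sk_saw_held;               /* written by the waiter, read after join */

/* Mirrors func1 up to the point where it would start spinning. */
static void *sk_waiter(void *unused)
{
    (void)unused;
    atomic_store(&sk_current_state, 1);            /* seq_cst store */
    sk_saw_held = (atomic_load(&sk_lock) == 1);    /* seq_cst load */
    return NULL;
}

/* Mirrors func2: release the lock, then wake the waiter. */
static void *sk_waker(void *unused)
{
    (void)unused;
    atomic_store(&sk_lock, 0);
    atomic_store(&sk_current_state, 0);
    return NULL;
}

/* Returns how many rounds ended in the "would spin forever" state. */
int count_lost_wakeups(int rounds)
{
    int lost = 0;

    for (int i = 0; i < rounds; i++) {
        pthread_t a, b;
        int rc;

        atomic_store(&sk_lock, 1);
        atomic_store(&sk_current_state, 0);
        rc = pthread_create(&a, NULL, sk_waiter, NULL);
        assert(rc == 0);
        rc = pthread_create(&b, NULL, sk_waker, NULL);
        assert(rc == 0);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        /* Waiter saw the lock held, yet its own store ended up last. */
        if (sk_saw_held && atomic_load(&sk_current_state) != 0)
            lost++;
    }
    return lost;
}
```

<p>
With sequentially consistent operations the C11 memory model rules this<br>
outcome out, so count_lost_wakeups should always return 0. With relaxed<br>
stores and loads the language would permit it, which corresponds to the<br>
plain-movl variant of func1 where the store can be delayed past the load.<br>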
<p>
<p>
<pre>
--
 "Love the dolphins," she advised him. "Write by W.A.S.T.E.." 
</pre>
<!-- body="end" -->
</font></body>
