<!-- received="Thu Jul 20 10:25:13 2000 EET DST" -->
<!-- sent="Thu, 20 Jul 2000 09:24:33 +0200" -->
<!-- name="Manfred Spraul" -->
<!-- email="manfred@colorfullife.com" -->
<!-- subject="Re: Obscure TLB flushing bug (x86 SMP)" -->
<!-- id="" -->
<!-- inreplyto="Obscure TLB flushing bug (x86 SMP)" -->
<title>Linux-kernel mailing list archive 2000-29: Re: Obscure TLB flushing bug (x86 SMP)</title>
<body bgcolor="#FFFFFF"><font face="Arial,Helvetica">
<h1>Re: Obscure TLB flushing bug (x86 SMP)</h1>
<b>Manfred Spraul</b> (<a href="mailto:manfred@colorfullife.com"><i>manfred@colorfullife.com</i></a>)<br>
<i>Thu, 20 Jul 2000 09:24:33 +0200</i>
<p>
<ul>
<li> <b>Messages sorted by:</b> <a href="date.html#544">[ date ]</a><a href="index.html#544">[ thread ]</a><a href="subject.html#544">[ subject ]</a><a href="author.html#544">[ author ]</a>
<!-- next="start" -->
<li> <b>Next message:</b> <a href="0545.html">Andre Hedrick: "ide.2.4.0-t5-2.all.4c.patch.bz2"</a>
<li> <b>Previous message:</b> <a href="0543.html">Willy Tarreau: "Re: Is Alan keeping secrets again , 2.2.17pre-13 ??? No announce . JimL"</a>
<li> <b>Maybe in reply to:</b> <a href="0470.html">David Wragg: "Obscure TLB flushing bug (x86 SMP)"</a>
<!-- nextthread="start" -->
<!-- reply="end" -->
</ul>
<hr>
<!-- body="start" -->
<p>
I found a race, but the window is only 5 or 6 instructions wide. How often does your<br>
modified thread library crash?<br>
<p>
David Wragg wrote:<br>
<i>&gt; </i><br>
<i>&gt; </i><br>
<i>&gt;  static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next, struct task_struct *tsk, unsigned cpu)</i><br>
<i>&gt;  {</i><br>
<i>&gt;         set_bit(cpu, &amp;next-&gt;cpu_vm_mask);</i><br>
<i>&gt; *       if (prev != next) {</i><br>
<i>&gt;                 /*</i><br>
<i>&gt;                  * Re-load LDT if necessary</i><br>
<i>&gt;                  */</i><br>
<i>&gt;                 if (prev-&gt;segments != next-&gt;segments)</i><br>
<i>&gt;                         load_LDT(next);</i><br>
<i>&gt;  #ifdef CONFIG_SMP</i><br>
<i>&gt;                 cpu_tlbstate[cpu].state = TLBSTATE_OK;</i><br>
<i>&gt;                 cpu_tlbstate[cpu].active_mm = next;</i><br>
<i>&gt;  #endif</i><br>
<i>&gt;                 /* Re-load page tables */</i><br>
<i>&gt;                 asm volatile("movl %0,%%cr3": :"r" (__pa(next-&gt;pgd)));</i><br>
<i>&gt;                 clear_bit(cpu, &amp;prev-&gt;cpu_vm_mask);</i><br>
<i>&gt;         }</i><br>
<i>&gt;  #ifdef CONFIG_SMP</i><br>
<i>&gt;         else {</i><br>
<i>&gt;  *              int old_state = cpu_tlbstate[cpu].state;</i><br>
<i>&gt;  *              cpu_tlbstate[cpu].state = TLBSTATE_OK;</i><br>
<i>&gt;                 if(cpu_tlbstate[cpu].active_mm != next)</i><br>
<i>&gt;                         BUG();</i><br>
<i>&gt; -               if(old_state == TLBSTATE_OLD)</i><br>
<i>&gt; +               /*if(old_state == TLBSTATE_OLD)*/</i><br>
<i>&gt;                         local_flush_tlb();</i><br>
<i>&gt;         }</i><br>
<i>&gt; </i><br>
<i>&gt;  #endif</i><br>
<i>&gt;  }</i><br>
<i>&gt; </i><br>
<p>
CPU0: flush_tlb_others<br>
<p>
CPU1: within schedule.<br>
	cpu_tlbstate[1].state==TLBSTATE_LAZY<br>
<p>
If the flush IPI arrives on CPU1 while it is executing the lines marked with *, then the IPI<br>
handler clears CPU1's bit in next-&gt;cpu_vm_mask after switch_mm has already set it.<br>
From then on, no further flush IPIs are delivered to CPU1 for that mm.<br>
<p>
Could you apply patch-tlb-test and run it? I haven't been able to test it yet, but it<br>
should report whether this is really the bug you are seeing.<br>
patch-tlb-fix should fix the problem, but I must double-check it before<br>
sending it to Linus.<br>
<p>
<pre>
--
	Manfred

patch-tlb-test:
<p>
--- 2.4/include/asm-i386/mmu_context.h	Sat May 13 10:35:31 2000
+++ build-2.4/include/asm-i386/mmu_context.h	Thu Jul 20 09:07:35 2000
@@ -46,6 +46,11 @@
 	else {
 		int old_state = cpu_tlbstate[cpu].state;
 		cpu_tlbstate[cpu].state = TLBSTATE_OK;
+		if(old_state != TLBSTATE_OLD) {
+			if((next-&gt;cpu_vm_mask &amp; (1&lt;&lt;cpu))==0) {
+				printk("First bug found!\n");
+			}
+		}
 		if(cpu_tlbstate[cpu].active_mm != next)
 			BUG();
 		if(old_state == TLBSTATE_OLD)
<p>
<p>
patch-tlb-fix:
<p>
--- 2.4/include/asm-i386/mmu_context.h	Sat May 13 10:35:31 2000
+++ build-2.4/include/asm-i386/mmu_context.h	Thu Jul 20 09:13:18 2000
@@ -27,7 +27,6 @@
 
 static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next, struct task_struct *tsk, unsigned cpu)
 {
-	set_bit(cpu, &amp;next-&gt;cpu_vm_mask);
 	if (prev != next) {
 		/*
 		 * Re-load LDT if necessary
@@ -38,6 +37,7 @@
 		cpu_tlbstate[cpu].state = TLBSTATE_OK;
 		cpu_tlbstate[cpu].active_mm = next;
 #endif
+		set_bit(cpu, &amp;next-&gt;cpu_vm_mask);
 		/* Re-load page tables */
 		asm volatile("movl %0,%%cr3": :"r" (__pa(next-&gt;pgd)));
 		clear_bit(cpu, &amp;prev-&gt;cpu_vm_mask);
@@ -48,10 +48,10 @@
 		cpu_tlbstate[cpu].state = TLBSTATE_OK;
 		if(cpu_tlbstate[cpu].active_mm != next)
 			BUG();
+		set_bit(cpu, &amp;next-&gt;cpu_vm_mask);
 		if(old_state == TLBSTATE_OLD)
 			local_flush_tlb();
 	}
-
 #endif
 }
 
<p>
<p>
</pre>
<!-- body="end" -->
</font></body>
