<!-- received="Tue Apr 25 05:11:59 2000 EET DST" -->
<!-- sent="Mon, 24 Apr 2000 19:00:59 -0700 (PDT)" -->
<!-- name="dean gaudet" -->
<!-- email="dgaudet-list-linux-kernel@arctic.org" -->
<!-- subject="Re: lockless poll() (was Re: namei() query)" -->
<!-- id="" -->
<!-- inreplyto="39045127.70A61F5C@geocities.com" -->
<title>Linux-kernel mailing list archive 2000-17: Re: lockless poll() (was Re: namei() query)</title>
<body bgcolor="#FFFFFF"><font face="Arial,Helvetica">
<h1>Re: lockless poll() (was Re: namei() query)</h1>
<b>dean gaudet</b> (<a href="mailto:dgaudet-list-linux-kernel@arctic.org"><i>dgaudet-list-linux-kernel@arctic.org</i></a>)<br>
<i>Mon, 24 Apr 2000 19:00:59 -0700 (PDT)</i>
<p>
<ul>
<li> <b>Messages sorted by:</b> <a href="date.html#257">[ date ]</a><a href="index.html#257">[ thread ]</a><a href="subject.html#257">[ subject ]</a><a href="author.html#257">[ author ]</a>
<!-- next="start" -->
<li> <b>Next message:</b> <a href="0258.html">Andrey Savochkin: "Re: Major packet loss with eepro10 in 2.3.99-pre5"</a>
<li> <b>Previous message:</b> <a href="0256.html">Wakko Warner: "Re: Linux Jobs as of 2.3.99pre6-5"</a>
<!-- nextthread="start" -->
<!-- reply="end" -->
</ul>
<hr>
<!-- body="start" -->
On Mon, 24 Apr 2000, Artur Skawina wrote:<br>
<p>
<i>&gt; <a href="mailto:kumon@flab.fujitsu.co.jp">kumon@flab.fujitsu.co.jp</a> wrote:</i><br>
<i>&gt; &gt; </i><br>
<i>&gt; &gt; In the heavy duty case, csum_partial_copy_generic() becomes the new</i><br>
<i>&gt; &gt; winner of the worst time consuming function with the poll()</i><br>
<i>&gt; &gt; optimization. We are arranging the global figure  now.</i><br>
<i>&gt; &gt; </i><br>
<i>&gt; &gt; Though csum_partial_copy_generic() is highly optimized with</i><br>
<i>&gt; &gt; hand-crafted code, it eats lots of time. It may be inevitable, but may</i><br>
<i>&gt; &gt; be reducible. We are now investigating why it does.</i><br>
<i>&gt; </i><br>
<i>&gt; csum_partial_copy_generic() could certainly be more optimized;</i><br>
<i>&gt; attached is a snapshot of a version that does upto 20% better in</i><br>
<i>&gt; dumb benchmarks, if and what difference there is for real loads</i><br>
<i>&gt; i haven't yet measured.</i><br>
<i>&gt; </i><br>
<i>&gt; [patch vs 2.3.99pre6pre5, offsets are wrong]</i><br>
<i>&gt; </i><br>
<p>
here is a patch against apache-1.3 which forces it to align the length of<br>
its headers to a 32-byte boundary.  if your benchmark is requesting<br>
objects greater than ~4k in size this will cause apache to generate<br>
writev()s such as:<br>
<p>
writev(3, [{"HTTP/1.1 200 OK\r\nDate: Tue, 25 A"..., 288}, {"&lt;!DOCTYPE HTML PUBLIC \"-//W3C//D"..., 32768}], 2) = 33056<br>
<p>
in which both buffers are 32-byte aligned and a multiple of 32 bytes long (288 = 9 * 32, 32768 = 1024 * 32).<br>
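<p>
a minimal sketch of the padding arithmetic, separate from the patch itself: given the number of header bytes buffered so far, it works out how many bytes a pad header of the form "X:XXX...\r\n" has to add so that the headers plus the terminating empty line come to a multiple of 32.  pad_to_align() is a made-up name for illustration, not an apache function.<br>

```c
#include <assert.h>

#define ALIGN 32

/* Hypothetical helper (not part of apache): mirrors the arithmetic the
 * patch performs inline.
 *
 * sent: header bytes buffered so far, NOT counting the final empty line.
 * Returns the number of bytes the pad header adds (0 if none is needed). */
static long pad_to_align(long sent)
{
    long bs;
    long added = 0;

    assert(sent >= 0);

    bs = sent + 2;              /* the terminating CRLF still to come */
    if (bs % ALIGN) {
        bs += 4;                /* "X:" plus the pad header's own CRLF */
        added += 4;
        while (bs % ALIGN) {    /* one 'X' of filler per iteration */
            ++bs;
            ++added;
        }
    }
    return added;
}
```

for example, with 250 bytes of headers buffered the helper returns 4 (just "X:\r\n"), giving 250 + 4 + 2 = 256 bytes in total.  note that the pad header can never be shorter than 4 bytes, so a header block that is 1 byte over a boundary gets padded all the way to the next one.<br>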
<p>
folks have observed a performance boost with other web-servers with this<br>
technique.  i haven't tested it at all, just thought you might want to try<br>
it if you're trying out new csum_partial_copy_generic()s.<br>
<p>
-dean<br>
<p>
Index: src/main/buff.c<br>
===================================================================<br>
RCS file: /home/cvs/apache-1.3/src/main/buff.c,v<br>
retrieving revision 1.96<br>
diff -u -r1.96 buff.c<br>
--- src/main/buff.c	2000/03/04 20:51:02	1.96<br>
+++ src/main/buff.c	2000/04/25 01:59:00<br>
@@ -390,8 +390,11 @@<br>
 <br>
     /* overallocate so that we can put a chunk trailer of CRLF into this<br>
      * buffer */<br>
-    if (flags &amp; B_WR)<br>
-	fb-&gt;outbase = ap_palloc(p, fb-&gt;bufsiz + 2);<br>
+    if (flags &amp; B_WR) {<br>
+#define ALIGN (32)<br>
+	fb-&gt;outbase = ap_palloc(p, fb-&gt;bufsiz + 2 + ALIGN);<br>
+	fb-&gt;outbase += ALIGN - ((long)fb-&gt;outbase % ALIGN);<br>
+    }<br>
     else<br>
 	fb-&gt;outbase = NULL;<br>
 <br>
Index: src/main/http_protocol.c<br>
===================================================================<br>
RCS file: /home/cvs/apache-1.3/src/main/http_protocol.c,v<br>
retrieving revision 1.289<br>
diff -u -r1.289 http_protocol.c<br>
--- src/main/http_protocol.c	2000/02/20 01:14:47	1.289<br>
+++ src/main/http_protocol.c	2000/04/25 01:59:00<br>
@@ -1445,6 +1445,21 @@<br>
     if (bs &gt;= 255 &amp;&amp; bs &lt;= 257)<br>
         ap_bputs("X-Pad: avoid browser bug" CRLF, client);<br>
 <br>
+#define ALIGN (32)<br>
+    ap_bgetopt(client, BO_BYTECT, &amp;bs);<br>
+    bs += 2; /* for the final terminating empty line */<br>
+    if (bs % ALIGN) {<br>
+	ap_bputc('X', client);<br>
+	ap_bputc(':', client);<br>
+	bs += 4;	/* 2 for "X:" and 2 for the final CRLF */<br>
+	while (bs % ALIGN) {<br>
+	    ap_bputc('X', client);<br>
+	    ++bs;<br>
+	}<br>
+	ap_bputc('\r', client);<br>
+	ap_bputc('\n', client);<br>
+    }<br>
+<br>
     ap_bputs(CRLF, client);  /* Send the terminating empty line */<br>
 }<br>
 <br>
<p>
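the buff.c hunk above over-allocates by ALIGN bytes and then bumps the pointer up to the next 32-byte boundary.  a stand-alone sketch of the same idea, assuming plain malloc() in place of ap_palloc(); alloc_aligned() and its raw out-parameter are made-up names for illustration:<br>

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define ALIGN 32

/* Hypothetical helper (not part of apache): returns a pointer to at least
 * `size` usable bytes, aligned to ALIGN.  The pointer to pass to free()
 * is stored in *raw, since the aligned pointer is not the malloc() result. */
static char *alloc_aligned(size_t size, void **raw)
{
    char *p;

    assert(raw != NULL);

    p = malloc(size + ALIGN);   /* over-allocate so the bump always fits */
    *raw = p;
    if (p == NULL)
        return NULL;

    /* Same expression shape as the patch: advances by 1..ALIGN bytes, so
     * an already-aligned pointer is moved forward by a full ALIGN. */
    p += ALIGN - ((uintptr_t)p % ALIGN);
    return p;
}
```

with a pool allocator like ap_palloc() there is no matching free, which is why the patch can simply overwrite fb-&gt;outbase; with malloc() the original pointer has to be kept around.<br>
<p>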
<p>
<!-- body="end" -->
</font></body>
