Re: [PATCH] /dev/epoll update ...

Davide Libenzi (davidel@xmailserver.org)
Thu, 20 Sep 2001 10:18:21 -0700 (PDT)


On 20-Sep-2001 Christopher K. St. John wrote:
> Davide Libenzi wrote:
>>
>> Here are examples basic functions when used with
>> coroutines.
>>
>
> I think all might be made clear if you did a quick
> test harness that didn't use coroutines. I'm guessing
> the vast majority of potential users will not be using
> a coroutine library.
>
> On "nio-improve" page, you've got:
>
> for (;;) {
> evp.ep_timeout = STD_SCHED_TIMEOUT;
> evp.ep_resoff = 0;
> nfds = ioctl(kdpfd, EP_POLL, &evp);
> pfds = (struct pollfd *) (map + evp.ep_resoff);
> for (ii = 0; ii < nfds; ii++, pfds++) {
> ...
> }
> }

Coroutines or not, this does not change the picture.
All multiplexed servers have an IO driven scheduler that calls
code sections based on the fd.
Obviously if you've a one-thread-per-socket model, epoll is not your answer.

> Assume your server is so overloaded that you need
> to avoid any unproductive calls to read() or write()
> or accept(). Assume that instead of many very fast
> connections coming in at a furious rate, you get a
> large steady current of very slow connections.

>>>> Sorry, bad editing, that should be:
>> Assume a large but bursty current of low bandwidth
>> high latency connections instead of a continuous steady
>> flood of high bandwidth low latency connections.

> If you try to flesh out the above template with those
> goals in mind, I think you'll quickly see what I've
> been trying to get at with regard to the awkwardness
> of not getting back some indication of the initial
> state of the fd.
>
> The current situation isn't fatal, just awkward. And
> fixable. For the low low price of a tiny bit of
> idealogical purity...

Again, no.
If you need to request the current status of a socket you've to f_ops->poll the fd.
The cost of the extra read, done only for fds that are not "ready", is nothing
compared to the cost of a linear scan with HUGE numbers of fds.
You could implement a solution where the low level io functions goes directly to write
inside the mmapped fd set where the data buffer is empty or the out buffer is full.
This would be a way more intrusive patch whose perf gain won't match the cost.

- Davide

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/