Re: [PATCH] /dev/epoll update ...

Dan Kegel (dank@kegel.com)
Wed, 19 Sep 2001 08:37:45 -0700


"Christopher K. St. John" wrote:
> The Banga, Mogul and Druschel[1] paper (which I understand
> was the inspiration for the Solaris /dev/poll which was the
> inspiration for /dev/epoll?) talks about having the poll
> return the current state of new descriptors. As far as I can
> tell, /dev/epoll only gives you events on state changes. So,
> for example, if you accept() a new socket and add it to the
> interest list, you (probably) won't get a POLLIN. That's
> not fatal, but it's awkward.
>...
> My vote would be to always report the initial state, but
> that would make the driver a little more complicated.
>
> What are the preferred semantics?

Taking an extreme but justifiable position for discussion's sake:

Stevens [UNPV1, in chapter on nonblocking accept] suggests that readiness
notifications from the OS should only be considered hints, and that user
programs should behave properly even if the OS feeds it false readiness
events.

Thus one possible approach would be for /dev/epoll (or users of /dev/epoll)
to assume that an fd is initially ready for all (normal) events, and just
try handling them all. That probably involves a single system call
to read() (or possibly a call to both write() and read(), or a call to accept(),
or a call to getsockopt() in the case of nonblocking connect), so the overhead
isn't very high.

(In fact, programs that use select(), poll(), or /dev/epoll would benefit
from having a test mode where false readiness events are injected at random;
the program should continue to behave normally, perhaps with slightly increased
CPU usage.)

That said, the principle of least suprise would suggest that /dev/epoll should
indeed return an accurate initial status. There are a lot of programmers who
don't agree with Stevens on this issue, and who write code that breaks if you
feed it incorrect readiness events.

- Dan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/