How to use epoll? A complete example in C

younata · on Oct 16, 2011

Don't have an example, but I've come to really like kqueues (Freebsd's equivalent to an epoll). I heard they were ported to linux as part of getting GCD to run on it, but I haven't done any research on it.

johnm · on Oct 15, 2011

Use libuv instead of writing this stuff yourself.

jrockway · on Oct 16, 2011

It's essential to understand how an event loop works before you use one from a library. If you don't "get it", you won't write an application that uses it effectively.

Don't write your own event loop for production code; write it to learn how they work.

(Also: uv? Really? That's a pretty bloated library that does everything from handling timers to figuring out the 15-minute load average on OpenBSD. If you just want timers and fds, use something simpler. select is actually fine for a large majority of applications.)

tptacek · on Oct 16, 2011

I wouldn't use select() directly in any application that needed timers.

Libevent and libev are fine solutions to this problem. And I don't think you need to learn how to use epoll or kqueue to make use of them.

jrockway · on Oct 16, 2011

There is some bookkeeping involved when you want IO watchers and timers, but it amounts to a heap for the timers ("when does the next timer go off?"), calling select with the timeout set for when the next timer expires, and then updating the timer queue when you wake up (if it's because the timer time arrived, rather than because IO is possible). That's basically why we have event loop libraries, to tie those two data structures together.

The critical thing to learn about epoll is event-triggering and level-triggering. I've seen people write libev code where they make watchers for r and w on a socket, then keep those alive regardless of whether or not they have data to write. This means their process never goes to sleep, since the fd is always writable and the write callback is called every time the process tries to go to sleep. They are expecting edge-triggering when ev provides level-triggering.

On the other hand, I've seen people using libzmq with libev have the opposite problem. libev's watchers are level-triggered, but libzmq's ZMQ_FDs are edge-triggered. This means you really have to know what you're doing to get things to work (but if you treat the ZMQ_FD like you should treat a normal fd, that is, read until EWOULDBLOCK and write until EWOULDBLOCK, it works), and reading an article that discusses how both work could be helpful in getting you to write correct code. OTOH, the terms themselves are pretty clear.

tptacek · on Oct 16, 2011

From what I can tell, less than half of all "evented" C code gets write events correct. The error far more common than busypolling the write event is not having write events at all, and just assuming that socket writes never block.

But I'm just saying that timers actually take a little bit of data structure work to get right; when it takes chin-ups just to get basic timers, code tends to have crappy (inefficient, slow) timing. Any good event library should give you the ability to schedule thousands and thousands of fine-grained timers without worrying about storage or the amount of time it takes to find the next wait interval.

justincormack · on Oct 16, 2011

Or you use timerfd(2) and get the kernel to manage the timer heap. Under Linux anyway.

Peaker · on Oct 16, 2011

That is more expensive. When using epoll, you typically do care about performance.

jrockway · on Oct 16, 2011

Nice. What's next; the kernel getting its own version of printf?

exDM69 · on Oct 16, 2011

timerfd and it's counterparts in bsd/osx kqueue are absolutely essential to write fast I/O code. it's the stuff that powers node.js and such.

the kernel has it's own version of printf, because it naturally can't run on the one that is in your userspace libc. I don't think that kprintf is exposed through the system call interface, though.

tptacek · on Oct 16, 2011

I don't buy it. I've scaled up code with tens of thousands of millisecond-granular demand-scheduled timers without a special kernel interface for them. Timers are a straightforward data structure problem. The only kernel hook you need to support them has been a part of select(2) and poll(2) for decades.

I'm see the point of timerfd(2), don't get me wrong. I'm just saying: a performant timer implementation only needs to tell the kernel one thing: when the next timer fires. Unlike poll(2) and select(2), which have O(n) bottlenecks in the u/k interface, timers are performant in pure userland code.

gettimeofday(2), on the other hand... there's a problem.

exDM69 · on Oct 17, 2011

I think I'm not on the same page with you. Are you suggesting that there is no need for timerfd and it's kqueue alternatives because you can have a priority queue that has the timer values and every time you call select/poll you look at the smallest value and wait until that?

If so, that approach is very limited.

It cannot be used in a multithreaded environment effectively, because if you want to cancel or change the time of the next timeout while select/poll is blocking, you will have to somehow wake the polling thread to inform it of the change, adding overhead and complexity. Imagine a select/poll call for every TCP socket you have and a timeout for each one of them.

select/poll interface is also quite limited, you can't e.g. tell it which clock to use. I frequently work in a soft real time environment and I need to rely on high precision timing like CLOCK_MONOTONIC.

Also, select and poll have other problems, which is why epoll and kqueue were created in the first place.

tptacek · on Oct 17, 2011

I said "tens of thousands of millisecond granular timers", you can safely assume multiple timers on any given socket. You are describing something as complicated that has been a feature of every major event library since ACE_Wrappers.

I'm at a disadvantage arguing about working with evented timers in multithreaded environments, because generally I event to avoid threads. In environments where I've done both (for instance, Cocoa), I dedicate a thread to the event loop and give it a message queue interface to the other threads.

"Waking the polling thread" is pretty straightforward. Just set up a pipe and ping it when you have a message you need delivered immediately. That's no more expensive than "timerfd".

I'm not arguing that select/poll are superior to epoll and kqueue (although select is just fine for many applications; why would I care, though; I don't write directly to select or kqueue). I'm saying that the Unix interface to timers isn't unscalable, and the scalability problems are in userland. Unlike select/poll, which did in fact have an O(n) problem in the u/k interface, the Unix interface to timers isn't a bottleneck.

jrockway · on Oct 17, 2011

ev even has a watcher type, ev_async, specifically for waking up the event loop from another thread.

vilya · on Oct 16, 2011

You mean kprintf?

http://leaf.dragonflybsd.org/cgi/web-man?command=kprintf&...

tptacek · on Oct 16, 2011

I think you're deliberately misconstruing 'jrockway's point. The kernel has an internal printf, but userland programs don't ask the kernel to printf-format strings for them.

vilya · on Oct 17, 2011

That's exactly what I was doing. It was supposed to be a tongue-in-cheek response while making the point that one could do exactly what jrockway was using as an extreme example, if one was misguided enough.

X-Istence · on Oct 16, 2011

Did you mean libev?

taf2 · on Oct 16, 2011

nah, he means libuv - because that's the layer on top of libev and libeio used by node.js to also include win32 support...

uriel · on Oct 16, 2011

Events are evil, much better use something like Plan 9 (port) libthread ( http://man.cat-v.org/plan_9/2/thread ) or Russ Cox's libtask ( http://swtch.com/libtask/ ).

Or better yet, program in Go ;)

tptacek · on Oct 16, 2011

Most network programs spend much of their entire life waiting for IO events.

I have no argument in principle for the idea that you should restructure your whole program's control flow with transparent cooperative scheduling to attempt to get the best of both worlds of straight-line coding and efficient scheduling, except that it feels to me like you're kind of not really even writing C code anymore at that point. This might be irrational of me.

In the meantime, if you're not going to adopt an exotic thread scheduling library, I think events are the way to go. POSIX threads don't make much sense to me.

SamReidHughes · on Oct 16, 2011

Development using coroutine libraries goes much faster and the resulting code is much more robust than that which uses events and callbacks. This is even more strongly the case in C++ where you can take advantage of traditional C++ RAII on any coroutine stack, which you couldn't do so nicely and clearly with callbacks. In the ten months since we started using them, we've had no stability problems and no strange and inscrutable bugs from within our coroutine implementation. There have been user errors from misuse of coroutines, but nothing anywhere near the order of magnitude of what you get when using callbacks.

p9idf · on Oct 16, 2011

"I have no argument in principle for the idea that you should restructure your whole program's control flow with transparent cooperative scheduling"

Maintaining any kind of long-term state for the functions which the event loop calls can be trickier than it needs to be, since they need to quickly return and can't leave anything on their stack.

Peaker · on Oct 16, 2011

Cooperative threads are implemented in terms of events.

GHC's thread system and IO manager is implemented in terms of epoll, for example.

tptacek · on Oct 16, 2011

Programs that use cooperative thread scheduling are written very different than programs that use events directly. At the end of the day, every networked program on your computer is implemented in terms of interrupt service routines, but that doesn't make Firefox a driver.

Peaker · on Oct 16, 2011

I'm just saying that events may be evil as a general purpose mechanism. But they are useful at the implementation layer of the less "evil" general purpose mechanisms.

And when I write performance-oriented C code, I use events, because great tight resource control is facilitated by that style.

MostAwesomeDude · on Oct 16, 2011

Use Twisted instead of inventing libuv. >:3

burgerbrain · on Oct 16, 2011

That is unfortunately only an option if you are developing in Python.

MostAwesomeDude · on Oct 16, 2011

libuv's realistically only an option if you're in Node. I know of no libevent/libev consumers at the C level who are going to switch to libuv as soon as it becomes more stable.

known · on Oct 16, 2011

http://www.kegel.com/c10k.html