I come to you with a heavy heart.

So, as Gilad found out a week or so ago, Linux has some serious issues when using epoll() with dup(). If you have two fds that refer to the same file, and you tell epoll_ctl() to listen for events on one, and then you close it but leave the other one open, epoll_wait() will continue to report the events that you registered on that fd ad infinitum, until all fds for the file are closed.
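
For anyone who wants to see the kernel behavior in isolation, here is a minimal sketch using a pipe and plain epoll calls, with no Libevent involved. The struct demo_result and the function epoll_dup_demo() are names made up for this illustration; the 100 ms timeouts are arbitrary. It assumes Linux and that nothing else in the process opens fds while it runs.

```c
/* Sketch of the epoll + dup() behavior (Linux only). */
#include <assert.h>
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical result struct, used only by this demo. */
struct demo_result {
    int registered_fd;            /* fd number we registered with epoll */
    int events_after_close;       /* epoll_wait() result after closing that fd */
    int reported_fd;              /* fd number carried in the reported event */
    int del_errno;                /* errno from EPOLL_CTL_DEL on the closed fd */
    int events_after_all_closed;  /* epoll_wait() result once the dup is closed too */
};

static struct demo_result epoll_dup_demo(void)
{
    struct demo_result r = {-1, 0, -1, 0, 0};
    struct epoll_event ev = {0}, out = {0};
    int p[2];

    int rc = pipe(p);
    assert(rc == 0);
    int dupfd = dup(p[0]);        /* second fd for the same open file */
    int epfd = epoll_create1(0);
    assert(dupfd >= 0 && epfd >= 0);

    /* Register only the original read end. */
    r.registered_fd = p[0];
    ev.events = EPOLLIN;
    ev.data.fd = p[0];
    rc = epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev);
    assert(rc == 0);

    /* Make the read end readable. */
    rc = (int)write(p[1], "x", 1);
    assert(rc == 1);

    /* Close the registered fd; dupfd keeps the file open. */
    close(p[0]);

    /* The registration is still live, so the event is still reported. */
    r.events_after_close = epoll_wait(epfd, &out, 1, 100);
    if (r.events_after_close == 1)
        r.reported_fd = out.data.fd;

    /* Deleting now fails: the fd number is no longer valid. */
    if (epoll_ctl(epfd, EPOLL_CTL_DEL, p[0], &ev) < 0)
        r.del_errno = errno;

    /* Only closing the last fd for the file ends the reports. */
    close(dupfd);
    r.events_after_all_closed = epoll_wait(epfd, &out, 1, 100);

    close(p[1]);
    close(epfd);
    return r;
}
```

The point of the EPOLL_CTL_DEL step is that once the registered fd has been closed, there is no fd number left to hand to epoll_ctl(), which is why a delete has to reach the kernel before the close() does.
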

Older versions of Libevent (1.4 and earlier) had no problem here, since event_add() and event_del() would call epoll_ctl() immediately. But in Libevent 2.0, we decided to fix another problem. It's very common for a program to add and delete the same fd several times between dispatches. For example, you might flush all your data to an fd, delete its write event, then invoke a callback which queues more data to flush, thus adding another write event. In practice, it wasn't unusual to see the same fd added and deleted three or four times between dispatches.

So to save time in Libevent 2.0, we have a "changelist" mechanism that queues up all the modifications for a backend (currently only epoll and kqueue use it) so they can all get handled at once when we go to dispatch. This is what epoll has been using since 2.0.4-alpha, back in February.

But the changelist code won't work with current Linux epoll_ctl() and dup(), since if the user deletes an event then closes a dupped fd, we really need the event_del() to call epoll_ctl() immediately, or else we'll hit the kernel issue: by the time we dispatch, the fd is already closed, and the queued delete has nothing left to act on.

I hacked up a variant epoll backend to see how hard it would be to revert to the old behavior. It turns out it isn't so tricky. Doing so, of course, means going back to making way more epoll_ctl() system calls than necessary. Also, the reverted backend is not nearly as well tested as the changelist-based backend.

So, I see four options for Libevent 2.0.

Here are two options that I am NOT considering so much:

* Include only the changelist backend. Programs like Gilad's will have no way to use an O(1) backend. Too bad for them!

* Include only the non-changelist backend. Everybody using epoll will need to do extra epoll_ctl() calls whether they do dup() or not. Too bad for them!

Here are the two options that I *am* considering:

* Include both backends; make the existing changelist backend on by default.
  The problem here is that it represents a genuine regression against Libevent 1.4, and I really hate having regressions. A library that accepts regressions for well-formed code using older versions is IMO being very rude to its users, and encouraging people to worry about upgrading.

* Include both backends; make the non-changelist backend on by default.
  The problems here are that a) the non-changelist backend is slower, and most people won't do whatever is necessary to activate the faster one, and b) the non-changelist backend has had not nearly as much testing as the current changelist-based backend. If we do this, the lack of testing means we cannot possibly call 2.0.9 "2.0.9-stable"; we'll need to call it "-rc" instead. :/

I am currently leaning towards the last option. Efficiency is important, but even more important is knowing that if you wrote a program using Libevent version N, your program will still work when Libevent N+1 is released. Having to set an option to get extra performance is more acceptable than having to set an option to get backward compatibility.

Or at least that's what I think tonight. Please, let me know if I'm wrong. But keep in mind that if you argue that it's okay for Libevent 2.0 to break a well-behaved Libevent 1.4 program, you are also arguing that it's okay for Libevent 2.1 to break any program that you are writing for Libevent today.

Software-is-hard.-Let's-go-shopping-ly yrs,
--
Nick