Ulrich Drepper wrote: > On 5/1/07, Davi Arnaut <[EMAIL PROTECTED]> wrote: >> The pollable futex approach is far superior (send and receive events from >> userspace or kernel) to eventfd and fixes (supercedes) FUTEX_FD at the same >> time. >> [...] > > You have to explain in detail how these interfaces are supposed to > work. From first sight and without understanding (all) the it seems > it's far from useful.
It's basically a asynchronous FUTEX_WAIT with notification delivery through a file descriptor. > Pollable futexes are useful, but any solution which gets implemented > must be sufficiently useful for all the uses we might have. It's very useful for asynchronous event notification libraries (libevent, liboop, libivykis, etc) because it integrates nicely with their (e)poll main loops. Usage schenario: you have 10 worker threads (and 10 futexes) for disk i/o (or whatever) and one manager thread which is a state machine serving many clients (epoll loop). In this scenario the workers threads have only two possible ways of notifying the manager thread once a job is done: signals and pipe tricks. For libraries, signals sux. They dont integrate well with poll() loops, may have overflow issues (RT), and signal numbers may clash with other libraries/code. The self-pipe trick waste resources (mostly unused pipe buffer). By using pollable futexes, all the manager thread has todo is to associate each of these futexes with a file descriptor (plfutex) and epoll() for their completion. Once the futex is signaled, epoll() returns POLLIN for the file descriptor and the manager thread may dequeue the notification status from anywhere. > - the trivial is that you have a futex and you are just interest in > seeing it change. The same as FUTEX_WAIT. I'm just interested in seeing a FUTEX_WAKE. Yes, same as FUTEX_WAIT. > I cannot figure out how all this works in your code. Every futex has a wait queue (q->waiters) which is used to track processes waiting on the futex. When the futex receives a FUTEX_WAKE it wakes up all waiters on the wait queue. Also, a futex is considered woken when it wait queue is empty (or lock_ptr == NULL). When you register a file descriptor with select(), poll() or epoll() a callback is queued into the futex wait queue. When the futex receives a FUTEX_WAKE every callback is called and the event is registered within each select(), poll() or epoll() table. This initiates a chain reaction waking up all process sleeping on poll()/whatever. > Does your read() call (that's the one to wait, yes?) work with O_NONBLOCK > or how else do you get that behavior? If the fd is marked O_NONBLOCK and the futex is not woken yet, it simply returns -EAGAIN (pfs_read_nonblock). If O_NONBLOCK is not set, it waits synchronously (pfs_read_block/wait_event_interruptible) on the futex wait queue. > - more complicated case: I have to wait for multiple futexes and lock > them all at the same time or don't return at all. This is possible with > SysV semaphores and generally useful and needed. How can this be > implemented with your scheme? Remember, it's only about FUTEX_WAIT. > - how does it work with PI futexes? It dosen't work. AFAICS PI futexes don't use FUTEX_WAKE. > - can I use a futex at the same time through this mechanism and using the > normal > FUTEX_WAIT operation? This is a killer if it's not the case. Yes. > - if you have multiple threads polling a futex and the waker wakes up > one, what happens? It is simply not acceptable to have more than one > thread return from the poll() call, this would waste too many cycles, > just to put all threads but one back to sleep. Only one is waked up (whatever matches first on the futex hashed bucket). -- Davi Arnaut - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/