On Wed, 07/08 11:58, Stefan Hajnoczi wrote:
> On Wed, Jul 08, 2015 at 09:01:27AM +0800, Fam Zheng wrote:
> > On Tue, 07/07 16:08, Stefan Hajnoczi wrote:
> > > > +#define EPOLL_BATCH 128
> > > > +static bool aio_poll_epoll(AioContext *ctx, bool blocking)
> > > > +{
> > > > +    AioHandler *node;
> > > > +    bool was_dispatching;
> > > > +    int i, ret;
> > > > +    bool progress;
> > > > +    int64_t timeout;
> > > > +    struct epoll_event events[EPOLL_BATCH];
> > > > +
> > > > +    aio_context_acquire(ctx);
> > > > +    was_dispatching = ctx->dispatching;
> > > > +    progress = false;
> > > > +
> > > > +    /* aio_notify can avoid the expensive event_notifier_set if
> > > > +     * everything (file descriptors, bottom halves, timers) will
> > > > +     * be re-evaluated before the next blocking poll(). This is
> > > > +     * already true when aio_poll is called with blocking == false;
> > > > +     * if blocking == true, it is only true after poll() returns.
> > > > +     *
> > > > +     * If we're in a nested event loop, ctx->dispatching might be true.
> > > > +     * In that case we can restore it just before returning, but we
> > > > +     * have to clear it now.
> > > > +     */
> > > > +    aio_set_dispatching(ctx, !blocking);
> > > > +
> > > > +    ctx->walking_handlers++;
> > > > +
> > > > +    timeout = blocking ? aio_compute_timeout(ctx) : 0;
> > > > +
> > > > +    if (timeout > 0) {
> > > > +        timeout = DIV_ROUND_UP(timeout, 1000000);
> > > > +    }
> > >
> > > I think you already posted the timerfd code in an earlier series. Why
> > > degrade to millisecond precision? It needs to be fixed up anyway if the
> > > main loop uses aio_poll() in the future.
> >
> > Because of a little complication: timeout here is always -1 for iothread,
> > and what is interesting is that -1 actually requires an explicit
> >
> >     timerfd_settime(timerfd, flags, &(struct itimerspec){{0, 0}}, NULL)
> >
> > to disable timerfd for this aio_poll(), which costs something. Passing -1
> > to epoll_wait() without this doesn't work because the timerfd is already
> > added to the epollfd and may have an unexpected timeout set before.
> >
> > Of course we can cache the state and optimize, but I've not reasoned about
> > what happens if another thread calls aio_poll() while we're in epoll_wait(),
> > for example when the first aio_poll() has a positive timeout but the second
> > one has -1.
>
> I'm not sure I understand the threads scenario since aio_poll_epoll()
> has a big aio_context_acquire()/release() region that protects it, but I
> guess the nested aio_poll() case is similar. Care needs to be taken so
> the extra timerfd state stays consistent.
Nested aio_poll() has no race on the timerfd because the outer aio_poll()'s
epoll_wait() has already returned by the time the inner aio_poll() runs.
Threads are a different matter with Paolo's "release AioContext around
blocking aio_poll()" change.

> The optimization can be added later unless the timerfd_settime() syscall
> is so expensive that it defeats the advantage of epoll().

That's the plan, and it must be done before this gets used by the main loop.

Fam
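
For reference, below is a minimal sketch of the caching idea discussed above,
not the actual QEMU patch: remember whether the timerfd registered with the
epollfd is currently armed, so that a -1 timeout only pays for the disarming
timerfd_settime() when an earlier aio_poll() actually left a timer programmed.
The struct, field, and function names (epoll_state, timerfd_armed,
epoll_update_timeout) are hypothetical, for illustration only.

/*
 * Sketch only, hypothetical names: cache the armed/disarmed state of the
 * timerfd that is registered with the epollfd, so a "block forever" poll
 * can skip the extra timerfd_settime() syscall when nothing is armed.
 */
#include <stdbool.h>
#include <stdint.h>
#include <sys/timerfd.h>

struct epoll_state {
    int epollfd;
    int timerfd;
    bool timerfd_armed;   /* true if a non-zero expiration is programmed */
};

/* Program (or disarm) the timerfd for a timeout given in nanoseconds;
 * timeout_ns < 0 means "block forever", so the timerfd must not fire. */
static int epoll_update_timeout(struct epoll_state *s, int64_t timeout_ns)
{
    struct itimerspec its = { { 0, 0 }, { 0, 0 } };

    if (timeout_ns < 0) {
        /* Only issue the disarming timerfd_settime() if the timerfd is
         * still armed from an earlier aio_poll(); otherwise skip the
         * syscall entirely. */
        if (!s->timerfd_armed) {
            return 0;
        }
        s->timerfd_armed = false;
    } else {
        its.it_value.tv_sec = timeout_ns / 1000000000LL;
        its.it_value.tv_nsec = timeout_ns % 1000000000LL;
        s->timerfd_armed = timeout_ns > 0;
    }

    /* A zero it_value disarms the timer, so the same call covers both
     * "arm with nanosecond precision" and "disable". */
    return timerfd_settime(s->timerfd, 0, &its, NULL);
}

In the nested case above the cached flag cannot go stale, because the outer
epoll_wait() has already returned; once the AioContext is released around
blocking aio_poll(), this flag is exactly the extra timerfd state that has
to be kept consistent across threads.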