Right.  So the situation is that the existing design dates from a time when
threads and scheduling for Unix were primitive.  It was not uncommon in those
days for threads to "go away" for half a second, a second or more in a loaded
system.

To deal with this, the current design is non-blocking all the way down.
This means that no code (correctly written) ever blocks on a mutex (we use
try_lock), which is the origin of that AIO code.
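
To make that concrete, here is a rough sketch of the pattern (illustrative
C++ only, not the actual AIO/Cache code, which goes through the event
system's try-lock machinery): if we miss the lock we must back out and
retry later, never block.

    #include <iostream>
    #include <mutex>

    // Illustrative only: the shape of "non-blocking all the way down".
    // Miss the lock -> save enough state to retry and return immediately,
    // rather than blocking the event thread.
    struct CacheOp {
      std::mutex lock;

      void reschedule() { std::cout << "missed lock, retry later\n"; }
      void do_work()    { std::cout << "work done under the lock\n"; }

      bool handle_event() {
        std::unique_lock<std::mutex> l(lock, std::try_to_lock);
        if (!l.owns_lock()) {  // someone else holds it: do NOT block
          reschedule();        // back out and try again on a later event
          return false;
        }
        do_work();             // critical section
        return true;
      }
    };

Multiply that by every level of the Cache/AIO stack and you get the
back-out-and-save-context burden I complain about below.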

Now, threading and scheduling have dramatically improved and the problems
that motivated that design decision have (most probably :) gone away.

It would be dramatically easier to do away with this requirement and write
the code with small critical sections and just block if we miss a mutex.
First, that is the way most programs are written these days, and second,
having to back out of the entire stack, saving context, is a huge burden.
It makes the code hard to read, write and debug.
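
For contrast, the blocking style would look something like this (again just
a sketch, not a patch): hold the mutex only around a small critical section
and simply wait if it is contended.

    #include <map>
    #include <mutex>
    #include <string>

    // Illustrative only: a small critical section that just blocks.
    // The lock is held only long enough to touch shared state, so a miss
    // costs a short wait instead of a full back-out-and-retry.
    struct RamCacheSketch {
      std::mutex lock;
      std::map<std::string, std::string> entries;

      void put(const std::string &key, const std::string &value) {
        std::lock_guard<std::mutex> g(lock);  // blocks briefly if contended
        entries[key] = value;
      }

      bool get(const std::string &key, std::string &value) {
        std::lock_guard<std::mutex> g(lock);
        auto it = entries.find(key);
        if (it == entries.end())
          return false;
        value = it->second;
        return true;
      }
    };

There is no retry path and no saved continuation state, which is exactly
why the code gets easier to read, write and debug.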

The upshot is that my suggestion to just add a 10 msec timeout is for the
short term.  I'd like to see a blocking, small-critical-section
AIO/Cache/RamCache patch performance-tested against the existing code, and
*then* pull the trigger on such a new design and fix the offending code
(including the AIO race).
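
For the curious, the short-term fix amounts to turning the worker's
unconditional wait into a timed wait, along these lines (names and structure
are made up for illustration; the real loop lives in the AIO code):

    #include <chrono>
    #include <condition_variable>
    #include <deque>
    #include <mutex>

    // Illustrative worker loop, not the actual AIO thread.
    struct AioQueueSketch {
      std::mutex lock;
      std::condition_variable wakeup;
      std::deque<int> requests;  // stand-in for queued AIO operations
      bool shutdown = false;

      void worker() {
        std::unique_lock<std::mutex> l(lock);
        while (!shutdown) {
          // NOTE: a request can be queued without the worker being
          // signalled (the race discussed in this thread).  Rather than
          // restructuring the queuing now, we wake up every 10 msec so a
          // lost wakeup costs at most ~10 msec of latency.  Do not
          // "optimize" this back to an untimed wait without fixing the
          // underlying race.
          wakeup.wait_for(l, std::chrono::milliseconds(10));
          while (!requests.empty()) {
            int req = requests.front();
            requests.pop_front();
            l.unlock();
            service(req);  // do the I/O outside the lock
            l.lock();
          }
        }
      }

      void enqueue(int req) {
        {
          std::lock_guard<std::mutex> g(lock);
          requests.push_back(req);
        }
        wakeup.notify_one();
      }

      void service(int /* req */) {}
    };

The big comment in the loop is the documentation Bart asks for below, so a
future code-detective knows why an idle instance can see a ~10 msec wait.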

I plan to work on such a patch and would be very interested in talking to
anyone who might be
interested in such a thing as well.

john



On Mon, Sep 12, 2011 at 7:03 AM, Bart Wyatt <wanderingb...@yooser.com> wrote:

> While I think this is not an ideal solution, for the reasons you
> mentioned (re invasiveness/ease), I'd be willing to accept it.
>
> I think I would prefer to fix the queuing so that it can't leave an
> unserviced request.  But if the "fix" of waking a thread every now and
> again is well documented it should be ok for now.  I would put a large
> comment in the thread loop that indicates that we are aware of the race
> and chose to fix it by waking the thread on an interval.  It would
> surely save some future code-detective some time if they want to know
> why their request sometimes waits for ~10ms on an otherwise idle
> instance.  It may also prevent some future "optimizer" from
> re-introducing the bug by removing the interval on the conditional
> wait.
>
> On Sun, Sep 11, 2011 at 4:35 PM, John Plevyak <jplev...@acm.org> wrote:
> > I don't think so, as this should be self-correcting in a busy system,
> > as Bart pointed out; the main concern is initialization or recovery.
> >
> > I'll take care of this...
> >
> > john
> >
> > On Sun, Sep 11, 2011 at 1:59 PM, Leif Hedstrom <zw...@apache.org> wrote:
> >
> >> On 09/10/2011 02:17 PM, John Plevyak wrote:
> >>
> >>> This is a race condition which should happen very very infrequently
> >>> (e.g. once a day on a loaded system perhaps) and it would only be
> >>> 10 msec on an unloaded system, which would make it very very very
> >>> infrequent (maybe once a year in that case).  I agree that 10 msec
> >>> is long these days, but unfortunately unix-ish systems cannot be
> >>> counted on to not busy spin when delaying less than 10 msecs.  So
> >>> while I agree with you in principle, in practice 10 msec is probably
> >>> safer.
> >>>
> >>
> >> Agreed. So, is there a bug filed for this? Is there an easy fix?
> >> John, I'm wondering if this is the same problem Yahoo Search was
> >> having, where once in a while, a cache would go MIA (unavailable)
> >> until they restarted the server.  On their busy boxes, it'd happen
> >> after ~2 weeks or so.
> >>
> >> -- leif
> >>
> >>
> >
>
