Re: AIO race condition

2011-09-12 Thread John Plevyak
Right. So the situation is that the existing design dates from a time when threads and scheduling for Unix were primitive. It was not uncommon in those days for threads to "go away" for half a second, a second or more in a loaded system. To deal with this, the current design is non-blocking all

Re: AIO race condition

2011-09-12 Thread Bart Wyatt
I don't think this is an ideal solution, but for the reasons you mentioned (re invasiveness/ease), I'd be willing to accept it. I think I would prefer to fix the queuing so that it can't leave an unserviced request. But if the "fix" of waking a thread every now and again is well documented it

Re: AIO race condition

2011-09-11 Thread John Plevyak
I don't think so as this should be self correcting in a busy system as Bart pointed out; the main concern is initialization or recovery. I'll take care of this...
john
On Sun, Sep 11, 2011 at 1:59 PM, Leif Hedstrom wrote:
> On 09/10/2011 02:17 PM, John Plevyak wrote:
>
>> This is a race condit

Re: AIO race condition

2011-09-11 Thread Leif Hedstrom
On 09/10/2011 02:17 PM, John Plevyak wrote:
> This is a race condition which should happen very very infrequently (e.g. once a day on a loaded system perhaps) and it would only be 10msec on an unloaded system, which would make it very very very infrequent (maybe once a year in that case). I agree

Re: AIO race condition

2011-09-10 Thread John Plevyak
This is a race condition which should happen very very infrequently (e.g. once a day on a loaded system perhaps) and it would only be 10msec on an unloaded system, which would make it very very very infrequent (maybe once a year in that case). I agree that 10 msec is long these days, but unfortuna

Re: AIO race condition

2011-09-10 Thread Theo Schlossnagle
People don't deploy spinning disks much anymore. 10ms seems high. <<1ms for SSDs. Perhaps we should optimize for that instead?
On Sat, Sep 10, 2011 at 3:13 PM, John Plevyak wrote:
> You are right.  My preference would be to change this to a
> pthread_cond_timedwait
> with a 10 msec timeout (or s

Re: AIO race condition

2011-09-10 Thread John Plevyak
You are right. My preference would be to change this to a pthread_cond_timedwait with a 10 msec timeout (or somesuch). The rationale being that (hard) disk latency is in that range in any case and the chance of this happening is rare so taking a 10 msec hit would not be the end of the world. The
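
For readers following the thread, here is a rough sketch of the kind of change proposed above: a worker that waits with pthread_cond_timedwait and rechecks the queue on every timeout, so a missed wakeup costs at most about one timeout period. This is not the Traffic Server code. The names request_queue, queue_push_no_signal, queue_pop and demo_lost_wakeup are hypothetical, the queue is reduced to a counter, and the 10 msec value is simply the figure mentioned in the message.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Assumed timeout, taken from the "10 msec (or somesuch)" in the thread. */
#define WAIT_TIMEOUT_NSEC (10L * 1000L * 1000L)

/* Hypothetical stand-in for the AIO request queue: just a counter. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             pending;   /* number of queued, unserviced requests */
} request_queue;

/* Build an absolute CLOCK_REALTIME deadline `nsec` nanoseconds from now. */
static struct timespec deadline_after(long nsec) {
    struct timespec ts;
    assert(nsec >= 0 && nsec < 1000000000L);
    clock_gettime(CLOCK_REALTIME, &ts);
    ts.tv_nsec += nsec;
    if (ts.tv_nsec >= 1000000000L) {   /* carry into seconds */
        ts.tv_sec  += 1;
        ts.tv_nsec -= 1000000000L;
    }
    return ts;
}

/* Enqueue WITHOUT signalling: simulates the lost wakeup under discussion. */
static void queue_push_no_signal(request_queue *q) {
    pthread_mutex_lock(&q->lock);
    q->pending++;
    pthread_mutex_unlock(&q->lock);
}

/* Take one request. On timeout the loop rechecks `pending`, so a request
 * that was queued without a signal is still picked up.
 * Returns how many times the wait timed out. */
static int queue_pop(request_queue *q) {
    int timeouts = 0;
    pthread_mutex_lock(&q->lock);
    while (q->pending == 0) {
        struct timespec ts = deadline_after(WAIT_TIMEOUT_NSEC);
        if (pthread_cond_timedwait(&q->cond, &q->lock, &ts) == ETIMEDOUT)
            timeouts++;
    }
    q->pending--;
    pthread_mutex_unlock(&q->lock);
    return timeouts;
}

static void *worker(void *arg) {
    queue_pop((request_queue *)arg);
    return NULL;
}

/* Returns 1 if the worker serviced a request that was never signalled. */
int demo_lost_wakeup(void) {
    request_queue q;
    pthread_t tid;
    int serviced;

    pthread_mutex_init(&q.lock, NULL);
    pthread_cond_init(&q.cond, NULL);
    q.pending = 0;

    if (pthread_create(&tid, NULL, worker, &q) != 0)
        return 0;
    queue_push_no_signal(&q);          /* no pthread_cond_signal here */
    pthread_join(tid, NULL);           /* returns once the worker popped */

    serviced = (q.pending == 0);
    pthread_cond_destroy(&q.cond);
    pthread_mutex_destroy(&q.lock);
    return serviced;
}
```

With a plain pthread_cond_wait in queue_pop, the demo could hang forever if the worker went to sleep before the unsignalled push. With the timed wait it finishes after roughly one timeout at worst. The trade-off is that idle workers wake up periodically even when there is nothing to do.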

AIO race condition

2011-09-07 Thread Bart Wyatt
I think I have identified a race condition that can erroneously place a new AIO request on the "temp" list without waking up a thread to service it. It seems that in most cases of this race condition the next request will rectify the issue; however, in cases such as cache volume initialization/reco