Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-21 Thread Jarek Poplawski
On Mon, May 21, 2007 at 10:59:46AM +0200, Tejun Heo wrote: ... > Well, it might be just because I'm used to what I call 'locked > enter/leave' model. If people don't find that easier to understand, no > objection from me and kudos to Oleg for getting it right from the beginning. I think most peop

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-21 Thread Tejun Heo
Hello, Jarek Poplawski wrote: > If we want to stick to more understanable, established ways, > why don't we use a global lock (not per cpu but per work or > per workqueue) for insert_work(), which would be really the > simplest construct here. Probably because some compromise is > needed. On the o

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-20 Thread Jarek Poplawski
On Fri, May 18, 2007 at 03:33:49PM +0200, Tejun Heo wrote: ... > I wasn't really aiming for performance optimization. I agree that we > have to live with barriers but it's also true that they and other > synchronization constructs can be difficult to understand and thus to > verify, so IMHO, when

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-18 Thread Tejun Heo
Hello, Jarek Poplawski wrote: > 2. IMHO the current solution with smp barriers is very good: > these barriers are really needed and they should be enough. > Oleg repeats all the time he hates barriers, but I think > it's wrong approach - they should be seen as something > natural for programming m

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-18 Thread Jarek Poplawski
On Fri, May 18, 2007 at 09:35:17AM +0200, Jarek Poplawski wrote: > On Wed, May 16, 2007 at 10:52:03PM +0400, Oleg Nesterov wrote: ... > 3. The alternative solution without barriers, based on the > idea of Tejun Heo and presented in the patch proposal from > 2007-05-13, could be probably a little fa

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-18 Thread Jarek Poplawski
On Wed, May 16, 2007 at 10:52:03PM +0400, Oleg Nesterov wrote: ... > Ah, but this is something different. Both lock/unlock are full barriers, > but they protect only one direction. A memory op must not leak out of the > critical section, but it may leak in. > > A = B; // 1 > l

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-17 Thread Tejun Heo
Hello, Oleg. Oleg Nesterov wrote: > Hello Tejun, > > On 05/16, Tejun Heo wrote: lock is read arrier, unlock is write barrier. >> Let's say there's a shared data structure protected by a spinlock and >> two threads are accessing it. >> >> 1. thr1 locks spin >> 2. thr1 updates data structure >

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-16 Thread Oleg Nesterov
Hello Tejun, On 05/16, Tejun Heo wrote: > > >> lock is read arrier, unlock is write barrier. > > Let's say there's a shared data structure protected by a spinlock and > two threads are accessing it. > > 1. thr1 locks spin > 2. thr1 updates data structure > 3. thr1 unlocks spin > 4. thr2 locks sp

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-16 Thread Tejun Heo
Hello, Oleg. Oleg Nesterov wrote: >>> Yes? I need to think more about this. >> I don't think the test for PENDING should be atomic too. cwq pointer >> and VALID is one package. PENDING lives its own life as a atomic bit >> switch. > > Yes sure, it should not be atomic. But (VALID && !PENDING) =

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-15 Thread Jarek Poplawski
On Wed, May 16, 2007 at 02:08:12AM +0400, Oleg Nesterov wrote: > On 05/15, Jarek Poplawski wrote: > > > > I've overheared somebody is talking about my favorite 2-nd bit! ... > We already discussed this... Surely, we can do this. I believe > this will complicate (and _imho_ uglify) the code too muc

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-15 Thread Oleg Nesterov
On 05/15, Jarek Poplawski wrote: > > I've overheared somebody is talking about my favorite 2-nd bit! > Probably I miss many points (your talk isn't the most clear), > but I wonder if this bit couldn't be used otherwise: try_to_grab_ > sets the bit - others know cancel is pending, so don't disturb:

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-15 Thread Oleg Nesterov
On 05/15, Tejun Heo wrote: > > Oleg Nesterov wrote: > > > So, try_to_grab_pending() should check "VALID && pointers equal" atomically. > > We can't do "if (VALID && cwq == get_wq_data(work))". We should do something > > like this > > > > (((long)cwq) | VALID | PENDING) == atomic_long_read(&wo

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-15 Thread Jarek Poplawski
On Tue, May 15, 2007 at 10:26:41AM +0200, Tejun Heo wrote: > Oleg Nesterov wrote: ... > > So, try_to_grab_pending() should check "VALID && pointers equal" atomically. > > We can't do "if (VALID && cwq == get_wq_data(work))". We should do something > > like this > > > > (((long)cwq) | VALID | P

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-15 Thread Tejun Heo
Oleg Nesterov wrote: >> Yeap, I've used the term 'clearing' as more of a logical term - making >> the pointer invalid, but is there any reason why we can't clear the >> pointer itself? > > Because this breaks cancel_work_sync(work), it needs get_wq_data(work) for > wait_on_work(work). Right. Tha

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-14 Thread Oleg Nesterov
On 05/13, Tejun Heo wrote: > > Oleg Nesterov wrote: > > Heh. I thought about another bit-in-pointer too. I can't explain this, > > but I personally hate these bits even more than barriers. > > I'm the other way around but it's like saying "I like donkey poo better > than horse poo". Let's accept

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-13 Thread Oleg Nesterov
Hi Tejun, Sorry, I can't give a "complete" reply today (will do tomorrow), just one note right now... On 05/13, Tejun Heo wrote: > > Oleg Nesterov wrote: > >> * try_to_grab_pending() checks VALID && pointers equal after grabbing > >> cwq->lock. > > > > We don't even need to check the pointers.

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-13 Thread Tejun Heo
Hello, Oleg Nesterov wrote: > On 05/12, Tejun Heo wrote: >> Oleg Nesterov wrote: >>> Yes I hate this barrier too. That is why changelog explicitly mentions it. >>> Probably we can do something else. >> Fortunately, we have one bit left in the flags and can use it to mark >> pointer validity instea

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-13 Thread Oleg Nesterov
On 05/12, Tejun Heo wrote: > > Oleg Nesterov wrote: > > > > Yes I hate this barrier too. That is why changelog explicitly mentions it. > > Probably we can do something else. > > Fortunately, we have one bit left in the flags and can use it to mark > pointer validity instead of list_empty() test. f

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-13 Thread Tejun Heo
Oleg Nesterov wrote: > On 05/11, Tejun Heo wrote: >> Oleg Nesterov wrote: >>> However, I agree, this smp_wmb() in insert_work() should die. We can >>> introduce "smp_mb__before_spinlock()" (no-op on x86 at least) to kill it. >> Yeah, right, we allow cwq pointer to change without holding the lock. >

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-11 Thread Oleg Nesterov
On 05/11, Tejun Heo wrote: > > Oleg Nesterov wrote: > > > > However, I agree, this smp_wmb() in insert_work() should die. We can > > introduce "smp_mb__before_spinlock()" (no-op on x86 at least) to kill it. > > Yeah, right, we allow cwq pointer to change without holding the lock. > Although I stil

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-11 Thread Tejun Heo
Oleg Nesterov wrote: >> and this. After grabbing cwq lock, compare it to get_wq_data() first, >> if they are the same, both are using the same lock so there's no >> reason for the barriers. If they are different, it doesn't matter >> anyway. We need to retry the locking. > > I think this is not

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-11 Thread Oleg Nesterov
On 05/11, Tejun Heo wrote: > > Oleg Nesterov wrote: > > + /* > > +* Ensure that we get the right work->data if we see the > > +* result of list_add() below, see try_to_grab_pending(). > > +*/ > > + smp_wmb(); > > I don't think we need this > > > + if (!list_empty(&work->entry))

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-11 Thread Tejun Heo
Hello, Oleg. Oleg Nesterov wrote: > + /* > + * Ensure that we get the right work->data if we see the > + * result of list_add() below, see try_to_grab_pending(). > + */ > + smp_wmb(); I don't think we need this > + if (!list_empty(&work->entry)) { > + /* >

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-08 Thread Jarek Poplawski
On Tue, May 08, 2007 at 06:05:17PM +0400, Oleg Nesterov wrote: > On 05/08, Jarek Poplawski wrote: > > > > On Tue, May 08, 2007 at 04:31:02PM +0400, Oleg Nesterov wrote: > > > > > > I thought about adding such a parameter, and I don't like this. This is > > > a matter of taste, of course, but _imho

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-08 Thread Jarek Poplawski
On Tue, May 08, 2007 at 03:56:48PM +0200, Jarek Poplawski wrote: > > So, we should either return 0, or add BUG_ON(!cwq). > > ...And you prefer endless loop. Seems brave! Sorry! (Maybe you're not so brave?) Seems clear _PENDING should save us here - I hope. Bye, bye, Jarek P. - To unsubscribe fro

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-08 Thread Oleg Nesterov
On 05/08, Jarek Poplawski wrote: > > On Tue, May 08, 2007 at 04:31:02PM +0400, Oleg Nesterov wrote: > > > > I thought about adding such a parameter, and I don't like this. This is > > a matter of taste, of course, but _imho_ this uglifies the code. > > > > In any case, unless we do completely dif

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-08 Thread Jarek Poplawski
On Tue, May 08, 2007 at 04:31:02PM +0400, Oleg Nesterov wrote: > On 05/08, Jarek Poplawski wrote: > > > > On Fri, May 04, 2007 at 12:42:26AM +0400, Oleg Nesterov wrote: > > ... > > > +static int try_to_grab_pending(struct work_struct *work) > > > +{ > > > + struct cpu_workqueue_struct *cwq; > > >

Re: [PATCH] make-cancel_rearming_delayed_work-reliable-fix

2007-05-08 Thread Jarek Poplawski
On Tue, May 08, 2007 at 04:02:21PM +0400, Oleg Nesterov wrote: > On 05/08, Jarek Poplawski wrote: > > > > On Mon, May 07, 2007 at 02:34:20PM +0400, Oleg Nesterov wrote: > > > On 05/07, Jarek Poplawski wrote: ... > I am strongly against adding many different variants of cancel_xxx(). > With this pat

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-08 Thread Oleg Nesterov
On 05/08, Jarek Poplawski wrote: > > On Fri, May 04, 2007 at 12:42:26AM +0400, Oleg Nesterov wrote: > ... > > +static int try_to_grab_pending(struct work_struct *work) > > +{ > > + struct cpu_workqueue_struct *cwq; > > + int ret = 0; > > + > > + if (!test_and_set_bit(WORK_STRUCT_PENDING, wor

Re: [PATCH] make-cancel_rearming_delayed_work-reliable-fix

2007-05-08 Thread Oleg Nesterov
On 05/08, Jarek Poplawski wrote: > > On Mon, May 07, 2007 at 02:34:20PM +0400, Oleg Nesterov wrote: > > On 05/07, Jarek Poplawski wrote: > ... > > I am not happy with the complication this patch adds, mostly > > I hate this smb_wmb() in insert_work(). I have an idea how to > > remove it later, but

Re: [PATCH] make-cancel_rearming_delayed_work-reliable-fix

2007-05-08 Thread Jarek Poplawski
On Mon, May 07, 2007 at 02:34:20PM +0400, Oleg Nesterov wrote: > On 05/07, Jarek Poplawski wrote: ... > I am not happy with the complication this patch adds, mostly > I hate this smb_wmb() in insert_work(). I have an idea how to > remove it later, but this needs another patch not related to > workq

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-08 Thread Jarek Poplawski
Hi, Below are some of my suggestions: On Fri, May 04, 2007 at 12:42:26AM +0400, Oleg Nesterov wrote: ... > --- OLD/kernel/workqueue.c~1_CRDW 2007-05-02 23:29:07.0 +0400 > +++ OLD/kernel/workqueue.c2007-05-03 22:42:29.0 +0400 > @@ -120,6 +120,11 @@ static void insert_work(

Re: [PATCH] make-cancel_rearming_delayed_work-reliable-fix

2007-05-07 Thread Oleg Nesterov
On 05/07, Anton Vorontsov wrote: > > I guess pseudo code below is not that strange, but real usecase: > > probe() > { > INIT_DELAYED_WORK(...); > /* we're not issuing queue_delayed_work() in probe(), work will >* be started by interrupt */ > return; > } > > remove() > {

Re: [PATCH] make-cancel_rearming_delayed_work-reliable-fix

2007-05-07 Thread Anton Vorontsov
On Mon, May 07, 2007 at 02:34:20PM +0400, Oleg Nesterov wrote: > On 05/07, Jarek Poplawski wrote: > > > > There is a lot of new things in the final version of this > > patch. I guess, there was no such problem in the previous > > version. > > No, this is basically the same patch + re-check-cwq-aft

Re: [PATCH] make-cancel_rearming_delayed_work-reliable-fix

2007-05-07 Thread Oleg Nesterov
On 05/07, Jarek Poplawski wrote: > > There is a lot of new things in the final version of this > patch. I guess, there was no such problem in the previous > version. No, this is basically the same patch + re-check-cwq-after-lock, the latter is mostly needed to prevent racing with CPU-hotplug. > I

Re: [PATCH] make-cancel_rearming_delayed_work-reliable-fix

2007-05-07 Thread Jarek Poplawski
On Sun, May 06, 2007 at 01:32:13AM +0400, Oleg Nesterov wrote: > on top of > make-cancel_rearming_delayed_work-reliable-spelling.patch > > Add cpu-relax() into spinloops, and a comments update. > > Small note. The new implementation has another downside. Suppose that rearming > work->func()

[PATCH] make-cancel_rearming_delayed_work-reliable-fix

2007-05-05 Thread Oleg Nesterov
on top of make-cancel_rearming_delayed_work-reliable-spelling.patch Add cpu-relax() into spinloops, and a comments update. Small note. The new implementation has another downside. Suppose that rearming work->func() gets a preemtion after setting WORK_STRUCT_PENDING but before add_timer/__

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-04 Thread Oleg Nesterov
On 05/03, Andrew Morton wrote: > > On Fri, 4 May 2007 00:42:26 +0400 > Oleg Nesterov <[EMAIL PROTECTED]> wrote: > > > Disadvantages: > > > > - this patch adds wmb() to insert_work(). > > > > - slowdowns the fast path (when del_timer() succeeds on entry) of > > cancel_rearming_delay

Re: [PATCH] make cancel_rearming_delayed_work() reliable

2007-05-03 Thread Andrew Morton
On Fri, 4 May 2007 00:42:26 +0400 Oleg Nesterov <[EMAIL PROTECTED]> wrote: > Thanks to Jarek Poplawski for the ideas and for spotting the bug in the > initial draft patch. > > cancel_rearming_delayed_work() currently has many limitations, because it > requires that dwork always re-arms itself via

[PATCH] make cancel_rearming_delayed_work() reliable

2007-05-03 Thread Oleg Nesterov
Thanks to Jarek Poplawski for the ideas and for spotting the bug in the initial draft patch. cancel_rearming_delayed_work() currently has many limitations, because it requires that dwork always re-arms itself via queue_delayed_work(). So it hangs forever if dwork doesn't do this, or cancel_rearmin