On 01/31, Paul E. McKenney wrote:
> 
> QRCU as currently written (http://lkml.org/lkml/2006/11/29/330) doesn't
> do what you want, as it acquires the lock unconditionally.  I am proposing
> that synchronize_qrcu() change to something like the following:
> 
>       void synchronize_qrcu(struct qrcu_struct *qp)
>       {
>               int idx;
>       
>               smp_mb();
>       
>               if (atomic_read(qp->ctr[0]) + atomic_read(qp->ctr[1]) <= 1) {
>                       smp_rmb();
>                       if (atomic_read(qp->ctr[0]) +
>                           atomic_read(qp->ctr[1]) <= 1)
>                               goto out;
>               }
>       
>               mutex_lock(&qp->mutex);
>               idx = qp->completed & 0x1;
>               atomic_inc(qp->ctr + (idx ^ 0x1));
>               /* Reduce the likelihood that qrcu_read_lock() will loop */
>               smp_mb__after_atomic_inc();
>               qp->completed++;
>       
>               atomic_dec(qp->ctr + idx);
>               __wait_event(qp->wq, !atomic_read(qp->ctr + idx));
>               mutex_unlock(&qp->mutex);
>       out:
>               smp_mb();
>       }
> 
> For the first "if" to give a false positive, a concurrent switch had
> to have happened.  For example, qp->ctr[0] was zero and qp->ctr[1]
> was two at the time of the first atomic_read(), but then qp->completed
> switched so that both qp->ctr[0] and qp->ctr[1] were one at the time
> of the second atomic_read.  The only way the second "if" can give us a
> false positive is if there was another change to qp->completed in the
> meantime -- but that means that all of the pre-existing qrcu_read_lock()
> holders must have gotten done, otherwise the second switch could not
> have happened.  Yes, you do incur three memory barriers on the fast
> path, but the best you could hope for with your approach was two of them
> (unless I am confused about how you were using barrier_sync()).

While doing qrcu, somehow I convinced myself we can't optimize out taking
qp->mutex. Now I think I was wrong. Good!

Q: you deleted "if (atomic_read(qp->ctr + idx) == 1)" fastpath under ->mutex,
was this needed for this optimization to work? I am asking because I can't
understand how it can make any difference.

> Oleg, does this look safe?

Yes. But let me think more about this later, I've got a fever, and can't
think properly today :)

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to