On 15/09/2015 19:38, Paul E. McKenney wrote:
> Excellent points!
> 
> Other options in such situations include the following:
> 
> o     Rework so that the code uses call_rcu*() instead of *_expedited().
> 
> o     Maintain a per-task or per-CPU counter so that every so many
>       *_expedited() invocations instead uses the non-expedited
>       counterpart.  (For example, synchronize_rcu() instead of
>       synchronize_rcu_expedited().)
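
A rough, untested sketch of that counter idea (the EXPEDITED_RATIO
value and the helper name are invented here, just for illustration):

#define EXPEDITED_RATIO 16	/* made-up ratio */

static DEFINE_PER_CPU(unsigned int, expedited_count);

static void sync_sched_throttled(void)
{
	/* Every EXPEDITED_RATIO-th invocation takes the slow path. */
	if (this_cpu_inc_return(expedited_count) % EXPEDITED_RATIO)
		synchronize_sched_expedited();
	else
		synchronize_sched();
}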

Or just use ratelimit (untested):

diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h
index 834c4e52cb2d..8fb66b2aeed9 100644
--- a/include/linux/percpu-rwsem.h
+++ b/include/linux/percpu-rwsem.h
@@ -6,6 +6,7 @@
 #include <linux/percpu.h>
 #include <linux/wait.h>
 #include <linux/lockdep.h>
+#include <linux/ratelimit.h>
 
 struct percpu_rw_semaphore {
        unsigned int __percpu   *fast_read_ctr;
@@ -13,6 +14,7 @@ struct percpu_rw_semaphore {
        struct rw_semaphore     rw_sem;
        atomic_t                slow_read_ctr;
        wait_queue_head_t       write_waitq;
+       struct ratelimit_state  expedited_ratelimit;
 };
 
 extern void percpu_down_read(struct percpu_rw_semaphore *);
diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
index f32567254867..c33f8bc89384 100644
--- a/kernel/locking/percpu-rwsem.c
+++ b/kernel/locking/percpu-rwsem.c
@@ -20,6 +20,8 @@ int __percpu_init_rwsem(struct percpu_rw_semaphore *brw,
        atomic_set(&brw->write_ctr, 0);
        atomic_set(&brw->slow_read_ctr, 0);
        init_waitqueue_head(&brw->write_waitq);
+       /* Expedite one down_write and one up_write per second.  */
+       ratelimit_state_init(&brw->expedited_ratelimit, HZ, 2);
        return 0;
 }
 
@@ -152,7 +156,10 @@ void percpu_down_write(struct percpu_rw_semaphore *brw)
         *    fast-path, it executes a full memory barrier before we return.
         *    See R_W case in the comment above update_fast_ctr().
         */
-       synchronize_sched_expedited();
+       if (__ratelimit(&brw->expedited_ratelimit))
+               synchronize_sched_expedited();
+       else
+               synchronize_sched();
 
        /* exclude other writers, and block the new readers completely */
        down_write(&brw->rw_sem);
@@ -172,7 +179,10 @@ void percpu_up_write(struct percpu_rw_semaphore *brw)
         * Insert the barrier before the next fast-path in down_read,
         * see W_R case in the comment above update_fast_ctr().
         */
-       synchronize_sched_expedited();
+       if (__ratelimit(&brw->expedited_ratelimit))
+               synchronize_sched_expedited();
+       else
+               synchronize_sched();
        /* the last writer unblocks update_fast_ctr() */
        atomic_dec(&brw->write_ctr);
 }
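
To spell out the semantics I'm relying on: __ratelimit() returns
nonzero while fewer than the burst of 2 events have occurred in the
current 1-second interval, so the first two grace periods per second
stay expedited and any excess falls back to synchronize_sched().
A standalone illustration of the API, also untested:

static DEFINE_RATELIMIT_STATE(demo_rs, HZ, 2);	/* 2 events per second */

static void demo(void)
{
	if (__ratelimit(&demo_rs))
		pr_info("under the burst: fast path\n");
	else
		pr_info("over the limit: slow path\n");
}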


> Note that synchronize_srcu_expedited() is less troublesome than are the
> other *_expedited() functions, because synchronize_srcu_expedited() does
> not inflict OS jitter on other CPUs.

Yup, synchronize_srcu_expedited() is just a busy wait, and it can
complete extremely fast if you use SRCU as a "local RCU" rather
than a "sleepable RCU".  However, it doesn't apply here, since you
want to avoid SRCU's two memory barriers per lock/unlock.
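
To make that trade-off concrete, the reader side with SRCU would look
like this (sketch only; the srcu_struct name is invented):

DEFINE_SRCU(demo_srcu);

static void reader(void)
{
	int idx;

	idx = srcu_read_lock(&demo_srcu);	/* implies smp_mb() */
	/* read-side critical section */
	srcu_read_unlock(&demo_srcu, idx);	/* implies smp_mb() */
}

Those two barriers per lock/unlock are exactly the per-reader cost
that percpu_rw_semaphore's fast path is meant to avoid.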

Paolo