On 07/15/2015 06:03 AM, Peter Zijlstra wrote:
On Tue, Jul 14, 2015 at 10:13:36PM -0400, Waiman Long wrote:
+static void pv_kick_node(struct qspinlock *lock, struct mcs_spinlock *node)
  {
        struct pv_node *pn = (struct pv_node *)node;

+       if (xchg(&pn->state, vcpu_running) == vcpu_running)
+               return;
+
        /*
+        * Kicking the next node at lock time can actually be a bit faster
+        * than doing it at unlock time because the critical section time
+        * overlaps with the wakeup latency of the next node. However, if the
+        * VM is too overcommitted, we may need to kick the CPU again at
+        * unlock time (double-kick). To avoid that, and also to fully
+        * utilize the kick-ahead functionality at unlock time, the kick
+        * will be deferred if either of the following two conditions holds:
         *
+        * 1) The VM guest has too few vCPUs for kick-ahead to be enabled.
+        *    In this case, the chance of a double-kick is higher.
+        * 2) The node after the next one is also in the halted state.
         *
+        * In either case, the hashed flag is set to indicate that the hash
+        * table entry has been filled and _Q_SLOW_VAL is set.
         */
-       if (xchg(&pn->state, vcpu_running) == vcpu_halted) {
-               pvstat_inc(pvstat_lock_kick);
-               pv_kick(pn->cpu);
+       if ((!pv_kick_ahead || pv_get_kick_node(pn, 1)) &&
+           (xchg(&pn->hashed, 1) == 0)) {
+               struct __qspinlock *l = (void *)lock;
+
+               /*
+                * As this is the same vCPU that will check the _Q_SLOW_VAL
+                * value and the hash table later on at unlock time, no atomic
+                * instruction is needed.
+                */
+               WRITE_ONCE(l->locked, _Q_SLOW_VAL);
+               (void)pv_hash(lock, pn);
+               return;
        }
+
+       /*
+        * Kicking the vCPU even if it is not really halted is safe.
+        */
+       pvstat_inc(pvstat_lock_kick);
+       pv_kick(pn->cpu);
  }

  /*
@@ -513,6 +545,13 @@ static void pv_wait_head(struct qspinlock *lock, struct mcs_spinlock *node)
                        cpu_relax();
                }

+               if (!lp && (xchg(&pn->hashed, 1) == 1))
+                       /*
+                        * The hash table entry & _Q_SLOW_VAL have already
+                        * been filled in by the lock holder.
+                        */
+                       lp = (struct qspinlock **)-1;
+
                if (!lp) { /* ONCE */
                        lp = pv_hash(lock, pn);
                        /*
*groan*, so you complained the previous version of this patch was too
complex, but let me say I vastly preferred it to this one :/

I said it was complex because maintaining a tri-state variable needs more thought than two bi-state variables. I can revert to the tri-state variable, as doing an unconditional kick at unlock simplifies the code in pv_wait_head().
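
For illustration, the two designs compare roughly like this (a
simplified sketch with hypothetical names, not the actual kernel
code):

        enum vcpu_state { vcpu_running, vcpu_halted, vcpu_hashed };

        /* (a) One tri-state variable: */
        struct pv_node_tristate {
                int state;      /* vcpu_running, vcpu_halted or vcpu_hashed */
        };
        /*
         * pv_kick_node() would xchg() the state to vcpu_hashed when it
         * defers the kick, so pv_wait_head() only has to inspect one
         * variable to know whether the hash entry was already filled.
         */

        /* (b) Two bi-state variables, as in this patch: */
        struct pv_node_bistate {
                int state;      /* vcpu_running or vcpu_halted */
                int hashed;     /* 0 or 1: hash table entry filled? */
        };
        /*
         * pv_kick_node() and pv_wait_head() race on xchg(&pn->hashed, 1);
         * whoever sees 0 is responsible for filling the hash table and
         * setting _Q_SLOW_VAL. Two variables, two xchg() sites to keep
         * consistent.
         */

The cost of the tri-state is that all three transitions have to be
reasoned about together, which is what made the earlier version look
complex.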

Cheers,
Longman