On 1/25/24 01:07, Brian Cain wrote:
Philippe,
For hexagon sysemu, while internally reviewing patches to be upstreamed, we noticed that
our design for a lock instruction is not quite suitable. The k0lock instruction will halt
if some other hexagon hardware CPU has already claimed the kernel lock, and it resumes
executing only once some CPU executes the unlock instruction. We modeled this with a lock
state enumeration member { OWNER, WAITING, UNLOCKED } in **each** vCPU, so atomically
transitioning the lock required vCPU[n] to write the updated lock state into vCPU[m]
when unlocking.
In context:
https://github.com/quic/qemu/blob/hexagon_sysemu_20_dec_2023/target/hexagon/op_helper.c#L1790-L1821
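Roughly, the shape of the current representation (names here are illustrative, not the
actual identifiers; the linked op_helper.c has the real code):

    /* Illustrative only: each vCPU carries its own copy of the lock
     * state, which is what forces vCPU[n] to reach into vCPU[m]'s
     * state when unlocking. */
    typedef enum {
        HEX_LOCK_UNLOCKED,
        HEX_LOCK_WAITING,
        HEX_LOCK_OWNER,
    } hex_lock_state_t;

    struct CPUHexagonState {
        /* ... */
        hex_lock_state_t k0_lock_state;
    };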
So instead, maybe it makes more sense to have a system device hold a single representation
of the lock’s state and each vCPU can do some kind of access via load/store and/or
interrupts to/from the device? I was thinking of raising an interrupt on the lock device
at the vCPU’s k0lock instruction to indicate demand for the lock, and then the device
could raise an interrupt to one of the vCPUs when it is granted the lock. One drawback
is that this might add significant latency to the uncontested lock case. So I also
considered doing a load of some part of the lock device’s memory that could indicate
whether the lock was available, and if available it would claim it with a store (both
ld/st while holding the BQL). Only if unavailable would it halt and raise the interrupt.
Is it possible to create an address space that's independent of the true system memory
map for this use case, or would we need to carve out some memory from the existing memory
map for this synthetic device? Or do you have a suggestion for a better approach overall?
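To make the second idea concrete, here is a minimal sketch of such a lock device (all
names hypothetical): a one-word MMIO region whose read reports whether the lock is held
and whose write claims or releases it. Since the vCPU's ld/st would run under the BQL,
the read-then-write sequence is atomic with respect to other vCPUs. On the address-space
question: a MemoryRegion does not have to live in the system memory map;
address_space_init() can wrap it in a private AddressSpace, so no carve-out is needed.

    #include "qemu/osdep.h"
    #include "exec/memory.h"

    typedef struct HexLockDev {
        MemoryRegion iomem;
        AddressSpace as;
        bool locked;
    } HexLockDev;

    static uint64_t hex_lock_read(void *opaque, hwaddr addr, unsigned size)
    {
        HexLockDev *s = opaque;
        return s->locked;                 /* 1 = held, 0 = free */
    }

    static void hex_lock_write(void *opaque, hwaddr addr,
                               uint64_t val, unsigned size)
    {
        HexLockDev *s = opaque;
        s->locked = (val != 0);           /* store 1 to claim, 0 to release */
    }

    static const MemoryRegionOps hex_lock_ops = {
        .read = hex_lock_read,
        .write = hex_lock_write,
        .endianness = DEVICE_NATIVE_ENDIAN,
    };

    static void hex_lock_init(HexLockDev *s, Object *owner)
    {
        memory_region_init_io(&s->iomem, owner, &hex_lock_ops, s,
                              "hex-k0lock", 4);
        /* Private address space, independent of the system memory map. */
        address_space_init(&s->as, &s->iomem, "hex-k0lock-as");
    }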
I think you are over-thinking this. A system device would just get in the way. I think
you want something like
struct CPUHexagonState {
    ...
    /* True while this vCPU is halted waiting on a hardware lock. */
    bool hwlock_pending;
};
static bool hexagon_cpu_has_work(CPUState *cs)
{
    CPUHexagonState *env = &HEXAGON_CPU(cs)->env;

    if (env->hwlock_pending) {
        return false;
    }
    ...
}
static void do_hwlock(CPUHexagonState *env, bool *lock)
{
    CPUState *cs = env_cpu(env);

    bql_lock();
    if (*lock) {
        /* Contended: halt until the holder unlocks and kicks us. */
        env->hwlock_pending = true;
        cs->halted = true;
        cs->exception_index = EXCP_HALTED;
        bql_unlock();
        cpu_loop_exit(cs);
    } else {
        /* Uncontended: claim the lock and continue. */
        *lock = true;
        bql_unlock();
    }
}
static void do_hwunlock(CPUHexagonState *env, bool *lock)
{
    CPUState *cs;

    BQL_LOCK_GUARD();
    *lock = false;
    /* Wake every waiter; spurious wakeups are benign because a woken
     * vCPU simply re-executes its lock insn and re-halts if needed. */
    CPU_FOREACH(cs) {
        CPUHexagonState *penv = &HEXAGON_CPU(cs)->env;
        if (penv->hwlock_pending) {
            penv->hwlock_pending = false;
            qemu_cpu_kick(cs);
        }
    }
}
static bool k0lock, tlblock;

void HELPER(k0lock)(CPUHexagonState *env)    { do_hwlock(env, &k0lock); }
void HELPER(tlblock)(CPUHexagonState *env)   { do_hwlock(env, &tlblock); }
void HELPER(k0unlock)(CPUHexagonState *env)  { do_hwunlock(env, &k0lock); }
void HELPER(tlbunlock)(CPUHexagonState *env) { do_hwunlock(env, &tlblock); }
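For completeness, my guess at the wiring (not part of the sketch above): the helper
declarations in target/hexagon/helper.h would look like

    DEF_HELPER_1(k0lock, void, env)
    DEF_HELPER_1(k0unlock, void, env)
    DEF_HELPER_1(tlblock, void, env)
    DEF_HELPER_1(tlbunlock, void, env)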
Open questions are:
(1) Do interrupts cancel the lock wait? Either way, placement in
hexagon_cpu_has_work is key (see the first sketch after this list).
(2) I assume that the PC will not be advanced, so that after the kick we will re-execute
the hwlock instruction (one way to guarantee this is sketched below). There will be a
thundering herd racing to grab the lock again, but that saves queuing logic, which might
be complicated if interrupts are involved.
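For (1), the two placements differ only in ordering; assuming an interrupt check of the
usual CPU_INTERRUPT_HARD form (an assumption on my part, Hexagon's real interrupt
plumbing may differ):

    static bool hexagon_cpu_has_work(CPUState *cs)
    {
        CPUHexagonState *env = &HEXAGON_CPU(cs)->env;

        /* With this check first, a pending interrupt does NOT wake a
         * vCPU halted in k0lock/tlblock; only the qemu_cpu_kick()
         * from do_hwunlock() does. */
        if (env->hwlock_pending) {
            return false;
        }
        /* Placing the hwlock_pending test after (or qualifying it
         * with) this interrupt test would instead let interrupts
         * cancel the lock wait. */
        return cs->interrupt_request & CPU_INTERRUPT_HARD;
    }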
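For (2), one way to guarantee re-execution (my extrapolation, not spelled out above) is
to pass GETPC() down from the helper and exit with cpu_loop_exit_restore(), which unwinds
guest state, PC included, back to the lock instruction:

    static void do_hwlock(CPUHexagonState *env, bool *lock, uintptr_t retaddr)
    {
        CPUState *cs = env_cpu(env);

        bql_lock();
        if (*lock) {
            env->hwlock_pending = true;
            cs->halted = true;
            cs->exception_index = EXCP_HALTED;
            bql_unlock();
            /* Restore guest state to the lock insn before leaving
             * the loop, so the kicked vCPU retries it. */
            cpu_loop_exit_restore(cs, retaddr);
        } else {
            *lock = true;
            bql_unlock();
        }
    }

    void HELPER(k0lock)(CPUHexagonState *env)
    {
        do_hwlock(env, &k0lock, GETPC());
    }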
r~