mshv_partition_ioctl_create_vp() initialises a VP struct (allocations,
mutex_init, init_waitqueue_head, page mappings) and then publishes the
pointer into partition->pt_vp_array.  Several ISR paths read this array
locklessly: the intercept ISR, the two scheduler ISRs, and
mshv_try_assert_irq_fast() on the irqfd fast path.

Of these, only mshv_try_assert_irq_fast() can structurally race the
publish.  It runs from an eventfd waker without holding pt_mutex, and
MSHV_IRQFD does not require the target lapic_apic_id (== vp_index) to
refer to an existing VP at registration time.  A user can therefore
register an irqfd targeting a yet-to-be-created VP, then trigger
mshv_try_assert_irq_fast() concurrently with MSHV_CREATE_VP for the
same index.  On weakly-ordered architectures the reader can observe a
non-NULL pointer in pt_vp_array before the initialising stores to the
VP struct become visible, leading to use of partially-initialised
fields (e.g. vp_register_page).

The other ISR readers cannot reach this race: the hypervisor will not
generate intercept or scheduler messages for a VP that has never been
told to run, and the user can only call MSHV_RUN_VP on the VP fd
returned by MSHV_CREATE_VP, which by construction is returned after
the publish.  Leave those readers as plain loads.

Use smp_store_release() in mshv_partition_ioctl_create_vp() to publish
the pointer, and pair it with smp_load_acquire() in
mshv_try_assert_irq_fast().  On x86 these compile to plain accesses
under TSO; on ARM64 they emit one-instruction acquire/release barriers,
acceptable on this fast path.

The destroy-side path (destroy_partition() clearing pt_vp_array[i] to
NULL after kfree(vp)) has a separate ordering and lifetime concern
that is out of scope here.

Fixes: 621191d709b14 ("Drivers: hv: Introduce mshv_root module to expose 
/dev/mshv to VMMs")
Signed-off-by: Stanislav Kinsburskii <[email protected]>
---
 drivers/hv/mshv_eventfd.c   |    9 ++++++++-
 drivers/hv/mshv_root_main.c |    8 +++++++-
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/mshv_eventfd.c b/drivers/hv/mshv_eventfd.c
index 11a6006f80194..5f0dd243e1445 100644
--- a/drivers/hv/mshv_eventfd.c
+++ b/drivers/hv/mshv_eventfd.c
@@ -197,7 +197,14 @@ static int mshv_try_assert_irq_fast(struct mshv_irqfd 
*irqfd,
        if (irq->lapic_apic_id >= MSHV_MAX_VPS)
                return -EINVAL;
 
-       vp = partition->pt_vp_array[irq->lapic_apic_id];
+       /*
+        * Pairs with smp_store_release() in mshv_partition_ioctl_create_vp().
+        * MSHV_IRQFD does not require the target lapic_apic_id to refer to an
+        * existing VP, so this read can race a concurrent VP creation; the
+        * acquire ensures that a non-NULL pointer implies the VP's
+        * initialising stores are visible.
+        */
+       vp = smp_load_acquire(&partition->pt_vp_array[irq->lapic_apic_id]);
        if (!vp)
                return -EINVAL;
 
diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
index 7e4252b6bc65c..381aa86c5b90e 100644
--- a/drivers/hv/mshv_root_main.c
+++ b/drivers/hv/mshv_root_main.c
@@ -1224,7 +1224,13 @@ mshv_partition_ioctl_create_vp(struct mshv_partition 
*partition,
 
        /* already exclusive with the partition mutex for all ioctls */
        partition->pt_vp_count++;
-       partition->pt_vp_array[args.vp_index] = vp;
+       /*
+        * Pairs with smp_load_acquire() in mshv_try_assert_irq_fast(), which
+        * can run concurrently from an irqfd waker without holding pt_mutex.
+        * The release ensures the VP's initialising stores are visible to any
+        * reader that observes a non-NULL pointer in pt_vp_array.
+        */
+       smp_store_release(&partition->pt_vp_array[args.vp_index], vp);
 
        goto out;
 



Reply via email to