On Wed, 11 Feb 2026 10:24:28 +0000
<[email protected]> wrote:

> From: Pravin M Bathija <[email protected]>
> 
> This is version v7 of the patchset and it incorporates the
> recommendations made by Maxime Coquelin and Feng Cheng Wen.
> The patchset adds support for addition and removal of memory
> regions, querying the maximum number of memory slots, and other
> changes to vhost-user
> messages.  These messages are sent from vhost-user front-end (qemu
> or libblkio) to a vhost-user back-end (dpdk, spdk). Support functions
> for these messages have been implemented in the interest of keeping
> the code optimized. Older functions in the vhost-user back-end have
> also been reworked to use these newly defined support functions.
> This implementation has been extensively tested by doing Read/Write
> I/O from multiple instances of fio + libblkio (front-end) talking to
> spdk/dpdk (back-end) based drives. Tested with qemu front-end
> talking to dpdk + testpmd (back-end) performing add/removal of memory
> regions. Also tested post-copy live migration after doing
> add_memory_region.
> 
> Version Log:
> Version v7 (Current version): Incorporate code review suggestions
> from Maxime Coquelin. Add debug messages to vhost_postcopy_register
> function.
> Version v6: Added the enablement of this feature
> as a final patch in this patch-set and other code optimizations as
> suggested by Maxime Coquelin.
> Version v5: removed the patch that increased the
> number of memory regions from 8 to 128. This will be submitted as a
> separate feature at a later point after incorporating additional
> optimizations. Also includes code optimizations as suggested by Feng
> Cheng Wen.
> Version v4: code optimizations as suggested by Feng Cheng Wen.
> Version v3: code optimizations as suggested by Maxime Coquelin
> and Thomas Monjalon.
> Version v2: code optimizations as suggested by Maxime Coquelin.
> Version v1: Initial patch set.
> 
> Pravin M Bathija (5):
>   vhost: add user to mailmap and define to vhost hdr
>   vhost_user: header defines for add/rem mem region
>   vhost_user: support function defines for back-end
>   vhost_user: Function defs for add/rem mem regions
>   vhost_user: enable configure memory slots
> 
>  .mailmap               |   1 +
>  lib/vhost/rte_vhost.h  |   4 +
>  lib/vhost/vhost_user.c | 392 ++++++++++++++++++++++++++++++++++++-----
>  lib/vhost/vhost_user.h |  10 ++
>  4 files changed, 365 insertions(+), 42 deletions(-)
> 

Lots more issues found by AI code review; this is not ready.

Review of [PATCH v7 1-5/5] vhost: configure memory slots support
Author: Pravin M Bathija <[email protected]>

This patch series adds support for the VHOST_USER_F_CONFIGURE_MEM_SLOTS
protocol feature, allowing individual memory regions to be added and
removed instead of requiring a full SET_MEM_TABLE each time.  The
approach is sound and the feature is needed, but several correctness
bugs need to be addressed first.

Patch 3/5 -- vhost_user: support function defines for back-end
--------------------------------------------------------------------

Error: dev_invalidate_vrings loses *pdev update after numa_realloc

  translate_ring_addresses() calls numa_realloc(), which can reallocate
  the dev struct and update *pdev.  In the original set_mem_table the
  caller writes "*pdev = dev;" after the call to observe this.

  dev_invalidate_vrings() takes a plain "struct virtio_net *dev" and
  passes "&dev, &vq" to translate_ring_addresses -- but dev is a local
  copy.  If numa_realloc reallocates, the caller's *pdev (in
  vhost_user_add_mem_reg) is never updated, and all subsequent accesses
  use a stale/freed pointer -- a use-after-free.

  The function should take struct virtio_net **pdev (matching the
  pattern used in set_mem_table) and propagate updates back.

Error: async_dma_map_region can underflow reg_size

  The loop "reg_size -= page->size" uses unsigned subtraction.  If
  page->size does not evenly divide reg_size (which can happen with
  misaligned guest pages), reg_size wraps to a huge value and the loop
  runs out of bounds.  The original async_dma_map avoids this by
  iterating over nr_guest_pages directly.

  Suggest checking "reg_size >= page->size" before subtracting, or
  better yet iterating like the original does.

Warning: free_all_mem_regions iterates all VHOST_MEMORY_MAX_NREGIONS
  but the original free_mem_region iterated only dev->mem->nregions

  This is intentional for the sparse-slot design, but free_mem_region()
  checks "reg->host_user_addr" while free_all_mem_regions checks
  "reg->mmap_addr" as the sentinel.  These should be consistent.
  A region that was partially initialized (mmap_addr set but
  host_user_addr not yet set, or vice versa) would behave
  differently depending on which sentinel is checked.

Warning: vhost_user_initialize_memory unconditionally allocates
  dev->mem without freeing any prior allocation

  If dev->mem is non-NULL when vhost_user_initialize_memory is called,
  the old allocation is leaked.  set_mem_table frees dev->mem before
  calling it, but add_mem_reg only checks "dev->mem == NULL" and skips
  the call otherwise.  This is safe as coded, but the function itself
  is fragile -- consider adding an assertion or early return if
  dev->mem is already set.

Info: vhost_user_postcopy_register changes use reg_msg_index but the
  second loop (postcopy_region_register) still iterates dev->mem->regions
  with nregions as the bound and just skips zeros.  This is correct,
  but the two loops use different iteration strategies, which is
  confusing.  Consider making both iterate VHOST_MEMORY_MAX_NREGIONS
  with a skip-zero check for consistency.


Patch 4/5 -- vhost_user: Function defs for add/rem mem regions
--------------------------------------------------------------------

Error: vhost_user_add_mem_reg does not set ctx->fds[0] = -1 after
  assigning reg->fd = ctx->fds[0]

  In the original set_mem_table, after "reg->fd = ctx->fds[i]" the
  code sets "ctx->fds[i] = -1" to prevent the fd from being double-
  closed if the error path calls close_msg_fds().  The new add_mem_reg
  omits this.  If vhost_user_mmap_region or vhost_user_postcopy_register
  fails, the error path calls close_msg_fds(ctx) which closes
  ctx->fds[0] -- but reg->fd still holds the same value.  Later,
  free_all_mem_regions will close(reg->fd) again -- double close.

  Fix: add "ctx->fds[0] = -1;" after "reg->fd = ctx->fds[0];".

Error: vhost_user_add_mem_reg error path frees all memory on
  single-region failure

  If postcopy_register fails after successfully mmap'ing one region
  (when other regions already exist), the "free_mem_table" label
  calls free_all_mem_regions + rte_free(dev->mem) +
  rte_free(dev->guest_pages), destroying ALL previously registered
  regions.
  This is disproportionate -- the error should only unmap the single
  region that was just added and decrement nregions.

Error: overlap check uses mmap_size for existing region but
  memory_size for proposed region

  The overlap check compares:
    current_region_guest_end = guest_user_addr + mmap_size - 1
    proposed_region_guest_end = userspace_addr + memory_size - 1

  mmap_size = memory_size + mmap_offset (set by vhost_user_mmap_region).
  So the existing region's range is inflated by the mmap_offset,
  potentially producing false overlaps.  Use current_region->size
  (which corresponds to memory_size) instead of mmap_size for a
  correct comparison.

Error: vhost_user_add_mem_reg uses guest_user_addr == 0 as "slot
  is empty" but guest_user_addr 0 is a valid guest virtual address

  The free-slot search does:
    if (dev->mem->regions[i].guest_user_addr == 0)
  
  Guest address 0 is valid (common in some VM configurations).
  If a region is mapped at guest VA 0, it would appear as a free
  slot.  The original code uses compact packing (nregions as the
  count), avoiding this problem.  Consider using a dedicated
  "in_use" flag or checking mmap_addr == NULL instead (mmap never
  returns NULL on success; failures are reported as MAP_FAILED).

Warning: vhost_user_rem_mem_reg matches on guest_phys_addr but the
  spec says "identified by guest address, user address and size"

  The comment in the code says "identified by its guest address,
  user address and size. The mmap offset is ignored." but the
  comparison also checks guest_phys_addr (which is the GPA, not
  the guest user address).  The spec says matching is by
  userspace_addr (GVA), guest_phys_addr (GPA), and memory_size.
  So the match is actually correct in practice, but the comment
  is misleading -- it should mention GPA too, or remove the
  comment about what the spec says.

Warning: nregions becomes inconsistent with actual slot usage

  After add_mem_reg, nregions is incremented.  After rem_mem_reg,
  nregions is decremented.  But regions are not packed -- removing
  a region from the middle leaves a hole.  Functions like
  qva_to_vva, hva_to_gpa, and rte_vhost_va_from_guest_pa all
  iterate "i < mem->nregions" and index "mem->regions[i]"
  sequentially.  With sparse slots these functions will miss
  regions that are stored at indices >= nregions.

  Example: 3 regions at slots 0,1,2 (nregions=3).  Remove slot 1
  (memset to 0, nregions=2).  Now qva_to_vva iterates slots 0,1
  only -- slot 2 (still valid) is never checked.  Address
  translations for memory in the third region will fail.

  This is a correctness bug that will cause packet processing
  failures.  Either pack the array on removal (shift later entries
  down) or change all iteration to use VHOST_MEMORY_MAX_NREGIONS
  with skip-zero checks.

Warning: dev_invalidate_vrings does not update *pdev in add_mem_reg

  Related to the Error in patch 3 -- even if dev_invalidate_vrings
  is fixed to take **pdev, the caller vhost_user_add_mem_reg must
  also write "*pdev = dev;" to propagate the possibly-reallocated
  pointer back through the message handler framework.


Patch 5/5 -- vhost_user: enable configure memory slots
--------------------------------------------------------------------

Warning: the feature is enabled unconditionally but several
  address translation functions (qva_to_vva, hva_to_gpa,
  rte_vhost_va_from_guest_pa) have not been updated to handle
  sparse region arrays.  Enabling the feature flag before fixing
  the nregions iteration issue means any front-end that uses
  ADD_MEM_REG/REM_MEM_REG will hit broken address translations.


Summary
--------------------------------------------------------------------

The most critical issues to fix before this can be merged:

1. nregions vs sparse-slot iteration mismatch -- this will cause
   address translation failures at runtime after any REM_MEM_REG.
   All loops that iterate mem->regions must either use compact
   packing or be updated to scan all VHOST_MEMORY_MAX_NREGIONS
   slots with skip-zero checks.

2. dev_invalidate_vrings / translate_ring_addresses *pdev propagation
   -- use-after-free if numa_realloc fires.

3. Missing ctx->fds[0] = -1 in add_mem_reg -- double-close on
   error paths.

4. Overlap check using mmap_size instead of size -- false positive
   overlaps.

5. Error path in add_mem_reg destroys all regions instead of just
   the failed one.
