[PATCH v8 0/5] Support add/remove memory region and get-max-slots

Bathija, Pravin Pravin.Bathija at dell.com
Tue Apr 7 09:46:48 CEST 2026


Dear Stephen,

Thank you for the quick review. I have just submitted an updated patch set v9 with the changes that were suggested. The responses to your comments are inline.


> -----Original Message-----
> From: Stephen Hemminger <stephen at networkplumber.org>
> Sent: Sunday, April 5, 2026 4:30 PM
> To: Bathija, Pravin <Pravin.Bathija at dell.com>
> Cc: dev at dpdk.org; thomas at monjalon.net; maxime.coquelin at redhat.com;
> fengchengwen at huawei.com
> Subject: Re: [PATCH v8 0/5] Support add/remove memory region and get-max-
> slots
>
>
> On Fri, 3 Apr 2026 01:54:29 +0000
> <pravin.bathija at dell.com> wrote:
>
> > From: Pravin M Bathija <pravin.bathija at dell.com>
> >
> > This is version v8 of the patchset and it incorporates the
> > recommendations made by Stephen Hemminger.
> > The async_dma_map_region function was rewritten to iterate guest pages
> > by host address range matching rather than assuming contiguous array
> > indices after sorting, and now returns an error code so that DMA
> > mapping failures are treated as fatal during add_mem_reg. The
> > dev_invalidate_vrings function was changed to accept a double pointer
> > to propagate pointer updates from numa_realloc through
> > translate_ring_addresses, preventing a use-after-free in both
> > add_mem_reg and rem_mem_reg callers.
> >
> > A new remove_guest_pages function was added to clean up stale guest
> > page entries when removing a region. The rem_mem_reg handler now
> > compacts the regions array using memmove after removal, keeping it
> > contiguous so that all existing nregions-based iteration in address
> > translation functions like qva_to_vva and hva_to_gpa continues to work
> > correctly. This compaction also eliminates the need for the
> > guest_user_addr == 0 free-slot sentinel in add_mem_reg, which was
> > problematic since guest virtual address zero is valid.
> >
> > The add_mem_reg error path was narrowed to only clean up the single
> > failed region instead of destroying all existing regions. A missing
> > ctx->fds[0] = -1 assignment was added after the fd handoff to prevent
> > double-close on error paths. The overlap check was corrected to use
> > region size instead of mmap_size, which included alignment padding and
> > caused false positives. The rem_mem_reg handler registration was fixed
> > to accept file descriptors as required by the vhost-user protocol,
> > with proper fd cleanup added in all exit paths. Finally, the postcopy
> > registration in add_mem_reg was changed to call
> > vhost_user_postcopy_region_register directly for the single new
> > region, avoiding the use of vhost_user_postcopy_register which reads
> > the wrong payload union member, and a defensive guard was added to
> > vhost_user_initialize_memory against double initialization.
> >
> > This implementation has been extensively tested by doing Read/Write
> > I/O from multiple instances of fio + libblkio (front-end) talking to
> > spdk/dpdk (back-end) based drives. Tested with qemu front-end talking
> > to dpdk testpmd (back-end) performing add/removal of memory regions.
> > Also tested post-copy live migration after doing add_memory_region.
> >
> > Version Log:
> > Version v8 (Current version): Incorporate code review suggestions from
> > Stephen Hemminger as described above.
> > Version v7: Incorporate code review suggestions from Maxime Coquelin.
> > Add debug messages to vhost_postcopy_register function.
> > Version v6: Added the enablement of this feature as a final patch in
> > this patch-set and other code optimizations as suggested by Maxime
> > Coquelin.
> > Version v5: removed the patch that increased the number of memory
> > regions from 8 to 128. This will be submitted as a separate feature at
> > a later point after incorporating additional optimizations. Also
> > includes code optimizations as suggested by Feng Cheng Wen.
> > Version v4: code optimizations as suggested by Feng Cheng Wen.
> > Version v3: code optimizations as suggested by Maxime Coquelin and
> > Thomas Monjalon.
> > Version v2: code optimizations as suggested by Maxime Coquelin.
> > Version v1: Initial patch set.
> >
> > Pravin M Bathija (5):
> >   vhost: add user to mailmap and define to vhost hdr
> >   vhost_user: header defines for add/rem mem region
> >   vhost_user: support function defines for back-end
> >   vhost_user: Function defs for add/rem mem regions
> >   vhost_user: enable configure memory slots
> >
> >  .mailmap               |   1 +
> >  lib/vhost/rte_vhost.h  |   4 +
> >  lib/vhost/vhost_user.c | 392 ++++++++++++++++++++++++++++++++++++-----
> >  lib/vhost/vhost_user.h |  10 ++
> >  4 files changed, 364 insertions(+), 43 deletions(-)
> >
>
> AI still finds a number of issues.
>
> Patches 1/5, 2/5, and 5/5 look fine.
>
> In patch 4/5, there is an fd leak in vhost_user_add_mem_reg() when
> vhost_user_mmap_region() fails. The fd has already been moved from
> ctx->fds[0] into reg->fd (and ctx->fds[0] set to -1) before the
> mmap call. On failure the goto lands at close_msg_fds which only closes ctx-
> >fds[] (all -1 at that point). The fd in reg->fd is never closed because
> free_mem_region() is not reached (it guards on reg->mmap_addr which mmap
> never set). Suggest adding
> close(reg->fd) before the goto, or introducing a cleanup label.

Fixed by adding close(reg->fd) and reg->fd = -1 before the goto close_msg_fds in the mmap failure path. This
closes the fd that was already moved out of ctx->fds[0] and ensures it is not leaked.
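To illustrate, here is a minimal sketch of that failure path. The names are simplified stand-ins for the real code (a bare struct instead of rte_vhost_mem_region, a flag instead of the actual vhost_user_mmap_region() call); the point is only the ordering: once the fd has been moved out of ctx->fds[0], the generic close_msg_fds() cleanup can no longer reach it, so the error branch must close it explicitly.

```c
#include <unistd.h>

/* Simplified stand-in for the region being populated. */
struct mem_region { int fd; void *mmap_addr; };

static int
add_mem_reg_sketch(struct mem_region *reg, int *ctx_fd0, int mmap_ok)
{
	reg->fd = *ctx_fd0;	/* fd handoff from the message */
	*ctx_fd0 = -1;		/* prevent double-close via ctx->fds[] */

	if (!mmap_ok) {		/* stand-in for vhost_user_mmap_region() failing */
		close(reg->fd);	/* the fix: release the moved fd */
		reg->fd = -1;
		return -1;	/* then goto close_msg_fds in the real code */
	}
	return 0;
}
```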

>
> Also in patch 4/5, dev_invalidate_vrings() hardcodes
> VHOST_USER_ASSERT_LOCK(dev, vq, VHOST_USER_ADD_MEM_REG) but is
> called from both add_mem_reg and rem_mem_reg. The static_assert passes
> because both handlers set lock_all_qps = true, but the debug assertion message
> will report the wrong message ID on the remove path. Consider passing the
> message ID as a parameter.

Fixed. Converted dev_invalidate_vrings into a macro plus implementation function pair. The macro accepts the message ID as a
parameter and performs the static_assert with the correct token at the call site, then calls _dev_invalidate_vrings() for the
actual work. The add_mem_reg caller passes VHOST_USER_ADD_MEM_REG and the rem_mem_reg caller passes VHOST_USER_REM_MEM_REG, so
the debug assertion message always reports the correct handler.
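As a rough sketch of that macro/function split (the enum values match the vhost-user protocol, but the lock check and the invalidation body are simplified placeholders, not the real VHOST_USER_ASSERT_LOCK or vring logic): because the message ID is a macro parameter, it is still a compile-time token at each call site, so the static_assert text names the actual handler.

```c
#include <assert.h>

enum vhost_msg { VHOST_USER_ADD_MEM_REG = 37, VHOST_USER_REM_MEM_REG = 38 };

/* Placeholder for the per-message lock_all_qps flag. */
#define MSG_LOCKS_ALL_QPS(id) \
	((id) == VHOST_USER_ADD_MEM_REG || (id) == VHOST_USER_REM_MEM_REG)

static void
_dev_invalidate_vrings(int *dev, enum vhost_msg id)
{
	*dev = id;	/* stand-in for the real invalidation work */
}

/* The macro checks the ID at compile time, then delegates. */
#define dev_invalidate_vrings(dev, id)					\
	do {								\
		static_assert(MSG_LOCKS_ALL_QPS(id),			\
			      #id " must lock all queue pairs");	\
		_dev_invalidate_vrings(dev, id);			\
	} while (0)
```

With this shape, dev_invalidate_vrings(&dev, VHOST_USER_REM_MEM_REG) both compiles the assertion against the remove-path ID and reports it correctly in the message.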

>
> Minor: in patch 3/5, vhost_user_initialize_memory() uses
> VHOST_MEMORY_MAX_NREGIONS as max_guest_pages. Upstream uses a
> hardcoded 8 for this. The two values happen to be equal today but have
> different semantics — max_guest_pages is about hugepage tracking
> granularity, not region count.

Agreed. Restored to the upstream hardcoded value of 8. The guest_pages array grows dynamically via add_one_guest_page when it
fills up, so this is just the initial allocation size and should not be coupled to the region count constant.
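A minimal sketch of that growth behaviour (field and function names are simplified approximations of the real guest_pages bookkeeping, not DPDK's exact definitions): the initial capacity is only a starting point, and the array doubles on demand, which is why the constant need not be coupled to the region-count limit.

```c
#include <stdlib.h>

struct guest_page { unsigned long gpa, hpa, size; };

struct guest_pages {
	struct guest_page *pages;
	unsigned int nr;	/* entries in use */
	unsigned int max;	/* entries allocated */
};

static int
add_one_guest_page_sketch(struct guest_pages *gp, struct guest_page pg)
{
	if (gp->nr == gp->max) {
		/* double the array, starting from the initial size of 8 */
		unsigned int new_max = gp->max ? gp->max * 2 : 8;
		struct guest_page *p = realloc(gp->pages,
					       new_max * sizeof(*p));
		if (p == NULL)
			return -1;
		gp->pages = p;
		gp->max = new_max;
	}
	gp->pages[gp->nr++] = pg;
	return 0;
}
```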
