[dpdk-dev] [PATCH v4 1/4] vfio: revert changes for map contiguous areas in one go
David Christensen
drc at linux.vnet.ibm.com
Wed Dec 2 19:36:34 CET 2020
On 12/1/20 9:46 PM, Nithin Dabilpuram wrote:
> In order to save DMA entries, which the kernel limits for both
> external memory and hugepage memory, an attempt was made to map
> physically contiguous memory in one go. This cannot be done because
> VFIO IOMMU type1 does not support partially unmapping a previously
> mapped memory region, while the heap can request multi-page mapping
> and partial unmapping.
> Hence, to return to the old method of mapping/unmapping at
> memseg granularity, this commit reverts
> commit d1c7c0cdf7ba ("vfio: map contiguous areas in one go")
>
> Also document the module parameter that must be used to increase
> VFIO's per-container DMA mapping limit.
>
> Fixes: d1c7c0cdf7ba ("vfio: map contiguous areas in one go")
> Cc: anatoly.burakov at intel.com
> Cc: stable at dpdk.org
>
> Signed-off-by: Nithin Dabilpuram <ndabilpuram at marvell.com>
> Acked-by: Anatoly Burakov <anatoly.burakov at intel.com>
> ---
> doc/guides/linux_gsg/linux_drivers.rst | 10 ++++++
> lib/librte_eal/linux/eal_vfio.c | 59 +++++-----------------------------
> 2 files changed, 18 insertions(+), 51 deletions(-)
>
> diff --git a/doc/guides/linux_gsg/linux_drivers.rst b/doc/guides/linux_gsg/linux_drivers.rst
> index 90635a4..9a662a7 100644
> --- a/doc/guides/linux_gsg/linux_drivers.rst
> +++ b/doc/guides/linux_gsg/linux_drivers.rst
> @@ -25,6 +25,16 @@ To make use of VFIO, the ``vfio-pci`` module must be loaded:
> VFIO kernel is usually present by default in all distributions,
> however please consult your distributions documentation to make sure that is the case.
>
> +The VFIO interface is used for DMA mapping of both external memory and
> +hugepages. VFIO does not support partially unmapping memory once it has
> +been mapped, so DPDK maps memory at hugepage or system-page granularity.
> +The number of DMA mappings is limited by the kernel: the locked-memory
> +limit of a process (rlimit) bounds system/hugepage memory, and since
> +kernel 5.1 a per-container overall limit, applicable to both external
> +memory and system memory, is set by the VFIO module parameter
> +``dma_entry_limit``, with a default value of 64K.
> +When an application runs out of DMA entries, these limits must be raised.
> +
> Since Linux version 5.7,
> the ``vfio-pci`` module supports the creation of virtual functions.
> After the PF is bound to ``vfio-pci`` module,
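As a side note on the doc addition above, a rough sizing sketch shows why the default per-container limit can be hit when mapping at memseg granularity. The memory sizes below are illustrative, and the modprobe invocation assumes the standard `vfio_iommu_type1` parameter named in the doc text:

```shell
# With mapping done at memseg granularity, each hugepage consumes one
# DMA entry. Example: 4 GiB of 2 MiB hugepages (numbers are illustrative).
mem_mb=4096
hugepage_mb=2
entries=$((mem_mb / hugepage_mb))
echo "$entries DMA entries needed for ${mem_mb} MiB of hugepage memory"

# If the total approaches the per-container default (64K since kernel 5.1),
# raise it at module load time (parameter as named in the doc above):
#   modprobe vfio_iommu_type1 dma_entry_limit=131072
```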
> diff --git a/lib/librte_eal/linux/eal_vfio.c b/lib/librte_eal/linux/eal_vfio.c
> index 0500824..64b134d 100644
> --- a/lib/librte_eal/linux/eal_vfio.c
> +++ b/lib/librte_eal/linux/eal_vfio.c
> @@ -517,11 +517,9 @@ static void
> vfio_mem_event_callback(enum rte_mem_event type, const void *addr, size_t len,
> void *arg __rte_unused)
> {
> - rte_iova_t iova_start, iova_expected;
> struct rte_memseg_list *msl;
> struct rte_memseg *ms;
> size_t cur_len = 0;
> - uint64_t va_start;
>
> msl = rte_mem_virt2memseg_list(addr);
>
> @@ -539,63 +537,22 @@ vfio_mem_event_callback(enum rte_mem_event type, const void *addr, size_t len,
>
> /* memsegs are contiguous in memory */
> ms = rte_mem_virt2memseg(addr, msl);
> -
> - /*
> - * This memory is not guaranteed to be contiguous, but it still could
> - * be, or it could have some small contiguous chunks. Since the number
> - * of VFIO mappings is limited, and VFIO appears to not concatenate
> - * adjacent mappings, we have to do this ourselves.
> - *
> - * So, find contiguous chunks, then map them.
> - */
> - va_start = ms->addr_64;
> - iova_start = iova_expected = ms->iova;
> while (cur_len < len) {
> - bool new_contig_area = ms->iova != iova_expected;
> - bool last_seg = (len - cur_len) == ms->len;
> - bool skip_last = false;
> -
> - /* only do mappings when current contiguous area ends */
> - if (new_contig_area) {
> - if (type == RTE_MEM_EVENT_ALLOC)
> - vfio_dma_mem_map(default_vfio_cfg, va_start,
> - iova_start,
> - iova_expected - iova_start, 1);
> - else
> - vfio_dma_mem_map(default_vfio_cfg, va_start,
> - iova_start,
> - iova_expected - iova_start, 0);
> - va_start = ms->addr_64;
> - iova_start = ms->iova;
> - }
> /* some memory segments may have invalid IOVA */
> if (ms->iova == RTE_BAD_IOVA) {
> RTE_LOG(DEBUG, EAL, "Memory segment at %p has bad IOVA, skipping\n",
> ms->addr);
> - skip_last = true;
> + goto next;
> }
> - iova_expected = ms->iova + ms->len;
> + if (type == RTE_MEM_EVENT_ALLOC)
> + vfio_dma_mem_map(default_vfio_cfg, ms->addr_64,
> + ms->iova, ms->len, 1);
> + else
> + vfio_dma_mem_map(default_vfio_cfg, ms->addr_64,
> + ms->iova, ms->len, 0);
> +next:
> cur_len += ms->len;
> ++ms;
> -
> - /*
> - * don't count previous segment, and don't attempt to
> - * dereference a potentially invalid pointer.
> - */
> - if (skip_last && !last_seg) {
> - iova_expected = iova_start = ms->iova;
> - va_start = ms->addr_64;
> - } else if (!skip_last && last_seg) {
> - /* this is the last segment and we're not skipping */
> - if (type == RTE_MEM_EVENT_ALLOC)
> - vfio_dma_mem_map(default_vfio_cfg, va_start,
> - iova_start,
> - iova_expected - iova_start, 1);
> - else
> - vfio_dma_mem_map(default_vfio_cfg, va_start,
> - iova_start,
> - iova_expected - iova_start, 0);
> - }
> }
> }
>
Acked-by: David Christensen <drc at linux.vnet.ibm.com>
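For anyone actually hitting the limits the new doc text describes, both values can be inspected at runtime. A small hedged sketch; the sysfs path assumes `vfio_iommu_type1` is loaded on a >= 5.1 kernel, and falls back to a message otherwise:

```shell
# Per-container DMA entry limit (path assumed; present once
# vfio_iommu_type1 is loaded on kernel >= 5.1):
cat /sys/module/vfio_iommu_type1/parameters/dma_entry_limit 2>/dev/null \
    || echo "vfio_iommu_type1 not loaded"

# Per-process locked-memory limit (rlimit) that bounds pinned DMA memory:
ulimit -l
```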