[dpdk-dev] [PATCH v2] eal/bus: use RTE_IOVA_PA only if phys addresses are available

Alejandro Lucero alejandro.lucero at netronome.com
Tue Oct 30 13:58:35 CET 2018


On Mon, Sep 17, 2018 at 2:06 PM Stojaczyk, Dariusz <dariusz.stojaczyk at intel.com> wrote:

>
>
> > -----Original Message-----
> > From: Burakov, Anatoly
> > Sent: Monday, September 17, 2018 12:34 PM
> > To: Stojaczyk, Dariusz <dariusz.stojaczyk at intel.com>; dev at dpdk.org;
> > Santosh Shukla <santosh.shukla at caviumnetworks.com>; Hemant Agrawal
> > <hemant.agrawal at nxp.com>; Jerin Jacob
> > <jerin.jacob at caviumnetworks.com>
> > Cc: Maxime Coquelin <maxime.coquelin at redhat.com>; Chas Williams
> > <chas3 at att.com>
> > Subject: Re: [PATCH v2] eal/bus: use RTE_IOVA_PA only if phys addresses
> > are available
> >
> > On 07-Sep-18 4:58 PM, Darek Stojaczyk wrote:
> > > > When neither RTE_IOVA_VA nor RTE_IOVA_PA was explicitly requested,
> > > > DPDK would currently fall back to the default RTE_IOVA_PA mode and
> > > > possibly encounter a failure later on if running as a non-privileged
> > > > user. Attempting to use RTE_IOVA_VA if no phys addresses are available
> > > > may help in this case.
> > >
> > > Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk at intel.com>
> > > ---
> > > Changes since v1:
> > >   * added a missing rte_memory.h include
> > >
> > >   lib/librte_eal/common/eal_common_bus.c | 19 +++++++++++++++----
> > >   1 file changed, 15 insertions(+), 4 deletions(-)
> > >
> > > > diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
> > > index 0943851cc..68c581b8a 100644
> > > --- a/lib/librte_eal/common/eal_common_bus.c
> > > +++ b/lib/librte_eal/common/eal_common_bus.c
> > > @@ -37,6 +37,7 @@
> > >   #include <rte_bus.h>
> > >   #include <rte_debug.h>
> > >   #include <rte_string_fns.h>
> > > +#include <rte_memory.h>
> > >
> > >   #include "eal_private.h"
> > >
> > > @@ -236,9 +237,19 @@ rte_bus_get_iommu_class(void)
> > >                     mode |= bus->get_iommu_class();
> > >     }
> > >
> > > -   if (mode != RTE_IOVA_VA) {
> > > -           /* Use default IOVA mode */
> > > -           mode = RTE_IOVA_PA;
> > > +   if (mode == RTE_IOVA_VA)
> > > +           return RTE_IOVA_VA;
> > > +
> > > +   if (mode & RTE_IOVA_PA) {
> > > +           /* Not all buses support RTE_IOVA_VA, fallback to
> > RTE_IOVA_PA */
> > > +           return RTE_IOVA_PA;
> > > +   }
> > > +
> > > +   if (rte_eal_using_phys_addrs()) {
> > > +           /* Default to RTE_IOVA_PA only if it's supported */
> > > +           return RTE_IOVA_PA;
> > >     }
> > > -   return mode;
> > > +
> > > +   /* Since RTE_IOVA_PA is unsupported, fallback to RTE_IOVA_VA */
> > > +   return RTE_IOVA_VA;
> > >   }
> > >
> >
> > > This is a good change, however I think that this is too pessimistic. If
> > > I don't have any devices that explicitly require IOVA_PA, I should be
> > > running in IOVA_VA mode.
>
> Another problem may occur when trying to hotplug devices that support only
> 39-bit DMA. You may not be able to map any memory with vfio when in
> RTE_IOVA_VA mode, as virtual addresses likely occupy more than 39 bits.
>
>
There is now a hint for mapping memory as low as possible instead of using
the default Linux mmap base address. This makes devices with addressing
limitations usable, as long as the physical memory to map does not exceed
what those devices can address.
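
To illustrate the idea, here is a minimal sketch of hinted mapping on 64-bit
Linux. It is not the DPDK implementation: map_with_low_hint() and
fits_in_bits() are hypothetical helper names, and the hint value a caller
passes is an arbitrary choice. Without MAP_FIXED the kernel treats the
address only as a hint, so the result still has to be checked against the
device's addressing limit.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical helper: 1 if [addr, addr + len) is addressable with `bits` bits. */
static int fits_in_bits(uint64_t addr, uint64_t len, unsigned bits)
{
	uint64_t limit;

	if (bits >= 64)
		return 1;
	limit = (uint64_t)1 << bits;
	return addr < limit && len <= limit - addr;
}

/*
 * Hypothetical helper: map `len` bytes of anonymous memory, asking the kernel
 * to place it near `hint`. Without MAP_FIXED the hint is only advisory, so
 * the caller must check where the mapping actually landed.
 * Returns NULL on failure.
 */
static void *map_with_low_hint(uint64_t hint, size_t len)
{
	void *va = mmap((void *)(uintptr_t)hint, len, PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return va == MAP_FAILED ? NULL : va;
}
```

A caller would map with a low hint and then test the returned address, for
example fits_in_bits((uint64_t)(uintptr_t)va, len, 39) for a device limited
to 39-bit DMA, before deciding that virtual addresses can be used as IOVAs.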



> The rte_pci bus enforces RTE_IOVA_PA whenever it finds such devices on
> init.
>
> I have no doubt the logic can be improved here, but for now RTE_IOVA_PA is
> the only safe default.
>
> D.
>
> >
> > > This of course doesn't take hotplug into account, so a command-line
> > > switch to force one or the other should also be available.
> >
> > > For example, at startup, I might have devices bound to VFIO, so IOVA_VA
> > > mode is picked. However, even though at the time of startup none of the
> > > devices require physical addresses, I also know that I might later
> > > hotplug a device that requires IOVA_PA (leaving the question of hotplug
> > > brokenness aside for now...) - currently, this scenario will not work, as
> > > I will be forced to use IOVA_VA mode unless I happen to have an IOVA_PA
> > > device available at startup.
> >
> > > Similarly, if I'm running DPDK as root but am only using virtual devices
> > > like pcap, I should be able to force DPDK into using VA addresses [*], yet
> > > currently I will be forced to use IOVA_PA if I don't *also* have a few
> > > devices bound exclusively to VFIO.
> >
> > [*] Do we have vdev devices that require IOVA_PA? I can't think of any...
> >
> > --
> > Thanks,
> > Anatoly
>
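
For reference, the selection order in the quoted patch, combined with the
forced-mode override Anatoly asks for, could be sketched roughly as below.
This is only an illustration: the enum is a local stand-in rather than the
DPDK header, and select_iova_mode() with its `forced` and `phys_addrs_ok`
parameters is a hypothetical name, not an existing EAL function or option.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the IOVA mode values; not the DPDK header. */
enum iova_mode {
	IOVA_DC = 0,        /* nothing requested */
	IOVA_PA = 1 << 0,   /* IOVA as physical addresses */
	IOVA_VA = 1 << 1,   /* IOVA as virtual addresses */
};

/*
 * forced:        mode requested by the user, IOVA_DC if none (assumption)
 * bus_modes:     OR of the modes reported by all buses
 * phys_addrs_ok: whether physical addresses are available to the process
 */
static enum iova_mode
select_iova_mode(enum iova_mode forced, int bus_modes, bool phys_addrs_ok)
{
	if (forced != IOVA_DC)
		return forced;          /* explicit user choice wins */
	if (bus_modes == IOVA_VA)
		return IOVA_VA;         /* every bus that answered wants VA */
	if (bus_modes & IOVA_PA)
		return IOVA_PA;         /* at least one bus needs PA */
	if (phys_addrs_ok)
		return IOVA_PA;         /* default to PA only if it can work */
	return IOVA_VA;             /* PA unavailable, fall back to VA */
}
```

With no override, this returns the same result as the patched
rte_bus_get_iommu_class() for the same inputs; the override only changes
behaviour when the user asks for a specific mode.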

