[dpdk-dev] [RFC] eal/memory: introducing an option to set iova as va

Jerin Jacob jerin.jacob at caviumnetworks.com
Tue Jun 6 12:38:16 CEST 2017


-----Original Message-----
> Date: Tue, 6 Jun 2017 10:57:20 +0100
> From: Bruce Richardson <bruce.richardson at intel.com>
> To: santosh <santosh.shukla at caviumnetworks.com>
> CC: thomas at monjalon.net, dev at dpdk.org, jerin.jacob at caviumnetworks.com,
>  hemant.agrawal at nxp.com
> Subject: Re: [dpdk-dev] [RFC] eal/memory: introducing an option to set iova
>  as va
> User-Agent: Mutt/1.8.1 (2017-04-11)
> 
> On Mon, Jun 05, 2017 at 10:24:11AM +0530, santosh wrote:
> > Hi Bruce,
> > 
> > 
> > On Friday 02 June 2017 02:57 PM, Bruce Richardson wrote:
> > > On Fri, Jun 02, 2017 at 09:54:46AM +0530, santosh wrote:
> > >> Ping?
> > >>
> > >> On Wednesday 24 May 2017 09:41 PM, Santosh Shukla wrote:
> > >>
> > >>> Some NPU hardware, such as OCTEONTX, follows a push model to get
> > >>> packets from the pktio device, where packet allocation and
> > >>> freeing are done by the HW. Since the HW can operate only on
> > >>> IOVA with the help of the SMMU/IOMMU, the address delivered when a
> > >>> packet is received from the Ethernet device is an IOVA (which is the PA in the existing scheme).
> > >>>
> > >>> Mapping IOVA as PA is expensive on such HW, because every
> > >>> packet address needs to be converted from PA/IOVA to VA.
> > >>>
> > >>> This patch proposes the scheme where the user can set IOVA
> > >>> as VA by using an eal command line argument. That helps to
> > >>> avoid costly lookup for VA in SW by leveraging the SMMU
> > >>> translation feature.
> > >>>
> > >>> Signed-off-by: Santosh Shukla <santosh.shukla at caviumnetworks.com>
> > >>> ---
> > > Hi,
> > >
> > > I agree this is a problem that needs to be solved, but this doesn't look
> > > like a particularly future-proofed solution. Given that we should
> > > use the IOMMU on as many platforms as possible for protection, we
> > > probably need to find an automatic way for DPDK to use IO addresses
> > > > correctly. Is this therefore better done as part of the VFIO and
> > > > UIO-specific code in EAL - as that is the part that knows how the memory
> > > > mapping is done and, in the VFIO case, what address ranges were
> > > > programmed in? The mempool driver was something else I considered, but it
> > > > is probably too high a level to implement this.
> > 
> > The other approach we evaluated, in detail:
> > 0) Introduce a new bus API whose job is to detect IOMMU-capable devices on that
> > bus (i.e. are those devices bound to an IOMMU-capable driver or not?). Let's call that
> > API rte_bus_chk_iommu_dev();
> > 
> > 1) If _all_ the devices are bound to an IOMMU-capable kdrv, then return iova=va.
> > 2) Otherwise, switch to the default mode, i.e. iova=pa.
> > 3) Based on the rte_bus_chk_iommu_dev() return value,
> > program iova=va or iova=pa accordingly in vfio_type1/spapr_map().
> > 
> > 4) The user can always override iova=va from the command line
> > if they want the default scheme (iova=pa mode). For that purpose, introduce an EAL
> > option, something like --iova-pa or --override-iova or --iova-default,
> > or some better name.
> > 
> > Proposed API snap:
> > 
> > enum iova_mode {
> >     iova_va,
> >     iova_pa,
> >     iova_unknown,
> > };
> > 
> > /**
> >  * Look for IOMMU devices on the bus and find out whether
> >  * those devices are bound to an IOMMU-capable driver,
> >  * for example vfio.
> >  *
> >  *
> >  * @return
> >  *      On success, return a valid iova mode (iova_va or iova_pa).
> >  *      On failure, return iova_unknown.
> >  */
> > typedef int (*rte_bus_chk_iommu_dev_t)(void);
> > 
> > 
> > With this approach:
> > - We can automatically detect whether iova is va or pa
> > and then program accordingly.
> > - Also, the user can always switch to the default iova mode.
> > - Drivers like dpaa2 can use this API to detect the iova mode and then
> > program dma_map accordingly. Currently they do this with ifdefs.
> > 
> > Comments? Thoughts? If anyone has a better proposal, please
> > suggest it.
> > 
> 
> That sounds a more complete solution. However, it's probably a lot of
> work to implement. :-)
> 
> I also wonder if we want to simplify things a little and disallow
> mixed-mode operation, i.e. all devices have to use UIO or all use VFIO?
> Would that help to allow simplification or other options? Having a whole
> new bus type seems strange for this. Can each bus just report whether
> its members require physical addresses? Then the EAL can manage a
> single flag to report whether we are using VA or PA.

That's the plan. Each bus op can say VA, PA or Don't care (in the
case of vdev), and an rte_bus aggregation function checks every bus's
preferred address scheme and decides the mode of operation. Yes, we will
keep the aggregation logic simple for now: when all buses say VA or
Don't care, we go with VA; otherwise PA.
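
A minimal sketch of that aggregation rule, to make the intent concrete.
All names here (enum iova_mode values, bus_get_iova_mode_t,
aggregate_iova_mode and the example_* callbacks) are hypothetical and
only for illustration; they are not an existing DPDK API. The sketch
also assumes that "all buses say Don't care" and "no buses registered"
both resolve to VA, which is a literal reading of the rule above and
may not be what the final patch does.

```c
#include <stddef.h>

/* Hypothetical per-bus preference; not an existing DPDK type. */
enum iova_mode {
	IOVA_DC, /* bus does not care, e.g. vdev */
	IOVA_PA, /* bus needs physical addresses */
	IOVA_VA, /* bus can use virtual addresses as IOVA */
};

/* Hypothetical bus op: each bus reports its preferred address scheme. */
typedef enum iova_mode (*bus_get_iova_mode_t)(void);

/*
 * Simple aggregation: if any bus asks for PA, the whole process uses PA.
 * If every bus says VA or Don't care, use VA.
 * Assumption: an empty list, or a list of only Don't care, yields VA.
 */
static enum iova_mode
aggregate_iova_mode(const bus_get_iova_mode_t *buses, size_t nb_buses)
{
	size_t i;

	for (i = 0; i < nb_buses; i++) {
		if (buses[i]() == IOVA_PA)
			return IOVA_PA;
	}
	return IOVA_VA;
}

/* Example callbacks standing in for real bus implementations. */
static enum iova_mode example_vfio_bus_mode(void) { return IOVA_VA; }
static enum iova_mode example_uio_bus_mode(void)  { return IOVA_PA; }
static enum iova_mode example_vdev_bus_mode(void) { return IOVA_DC; }
```

With these example callbacks, a list holding only the vfio and vdev
buses aggregates to VA, while adding the uio bus to the list forces PA
for everyone.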

> /Bruce

