[dpdk-dev] [PATCH] eal: disable IOVA mode detection by default

Jerin Jacob jerin.jacob at caviumnetworks.com
Wed Nov 1 07:54:32 CET 2017


-----Original Message-----
> Date: Tue, 31 Oct 2017 23:21:18 -0700
> From: Ferruh Yigit <ferruh.yigit at intel.com>
> To: Jerin Jacob <jerin.jacob at caviumnetworks.com>
> CC: Thomas Monjalon <thomas at monjalon.net>, Bruce Richardson
>  <bruce.richardson at intel.com>, Sergio Gonzalez Monroy
>  <sergio.gonzalez.monroy at intel.com>, dev at dpdk.org, Jianfeng Tan
>  <jianfeng.tan at intel.com>, Santosh Shukla
>  <santosh.shukla at caviumnetworks.com>
> Subject: Re: [dpdk-dev] [PATCH] eal: disable IOVA mode detection by default
> User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101
>  Thunderbird/52.4.0
> 
> On 10/31/2017 9:02 PM, Jerin Jacob wrote:
> > -----Original Message-----
> >> Date: Wed, 1 Nov 2017 01:07:26 +0000
> >> From: Ferruh Yigit <ferruh.yigit at intel.com>
> >> To: Thomas Monjalon <thomas at monjalon.net>, Bruce Richardson
> >>  <bruce.richardson at intel.com>, Sergio Gonzalez Monroy
> >>  <sergio.gonzalez.monroy at intel.com>
> >> CC: dev at dpdk.org, Ferruh Yigit <ferruh.yigit at intel.com>, Jianfeng Tan
> >>  <jianfeng.tan at intel.com>, Santosh Shukla
> >>  <santosh.shukla at caviumnetworks.com>
> >> Subject: [dpdk-dev] [PATCH] eal: disable IOVA mode detection by default
> >> X-Mailer: git-send-email 2.13.6
> >>
> >> Fix kernel crash with KNI because KNI requires physical addresses.
> > 
> > The actual fix would be to make KNI IOMMU aware based on the DPDK mode.
> > 
> > ie. On slow path,
> > 
> > /* Get iommu domain for iova to physical addr conversion */             
> > if (rte_eal_iova_mode() == RTE_IOVA_VA)
> > 	kni->iommu_domain = iommu_get_domain_for_dev(dev);
> > else
> > 	kni->iommu_domain = NULL;
> > 
> > On fast path,
> > 
> > static inline u64 kni_iova_to_phys(struct ... *kni, dma_addr_t dma_addr)    
> > {                                                                               
> >         /* Translation is installed only when IOMMU is present */               
> >         if (kni->iommu_domain)                                                  
> >                 return iommu_iova_to_phys(kni->iommu_domain, dma_addr);         
> >         return dma_addr;                                                        
> > } 
> > 
> >>
> >> A config option introduced to disable IOVA mode detection and to set it
> >> to physical address by default. Disabling config option will enable IOVA
> >> mode detection.
> >>
> >> When there is no intension to use KNI, it is safe to enable detection.
> >>
> >> Config option disable IOVA mode detection by default to be sure only who
> >> is aware of result enable it.
> >>
> >> Fixes: 72d013644bd6 ("mem: honor IOVA mode in malloc virt2phy")
> >>
> >> Signed-off-by: Ferruh Yigit <ferruh.yigit at intel.com>
> >> ---
> >> Cc: Jianfeng Tan <jianfeng.tan at intel.com>
> >> Cc: Santosh Shukla <santosh.shukla at caviumnetworks.com>
> >> Cc: Thomas Monjalon <thomas at monjalon.net>
> >> ---
> >>  config/common_base                | 5 +++++
> >>  lib/librte_eal/bsdapp/eal/eal.c   | 4 ++++
> >>  lib/librte_eal/linuxapp/eal/eal.c | 4 ++++
> >>  3 files changed, 13 insertions(+)
> >>
> >> diff --git a/config/common_base b/config/common_base
> >> index 82ee75456..903e7685b 100644
> >> --- a/config/common_base
> >> +++ b/config/common_base
> >> @@ -107,6 +107,11 @@ CONFIG_RTE_MALLOC_DEBUG=n
> >>  CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES=n
> >>  
> >>  #
> >> +# Disabling PHYS_IOVA may crash kernel for KNI, use with caution
> >> +#
> >> +CONFIG_RTE_EAL_USE_PHYS_IOVA=y
> > 
> > Defeat the purpose of all dynamic probing scheme.
> > Either we can fix the KNI or revert the following patch for this release.
> > 
> > http://dpdk.org/commit/f37dfab2
> 
> This commit just enables IOVA VA mode for Intel drivers, that is how I can able
> to observe the issue, but it is not the source of the problem. Reverting that
> commit will not solve KNI crash with any other PMD that enables IOVA VA mode.

I don't understand why a PMD needs to enable IOVA_VA if it can support IOVA_PA.
IMO, IOVA_VA should be enabled only for those device it can WORK ONLY on
IOVA_VA mode. Forget about KNI, If we set CONFIG_RTE_EAL_USE_PHYS_IOVA
as y then the normal stuff wont work for if PMD can operate only in
IOVA_VA mode(like octeontx).

Regarding the KNI crash, it can be avoid by first checking the exiting
mode(rte_eal_iova_mode()). i.e since legacy driver like KNI need real
physical address to work "now", it can grace full exit on the init time if
mode == IOVA_VA;

> 
> Related to the KNI, iommu is not involved at all, I am not clear with your above
> suggestion, physical address is required for kernelspace - userspace communication.

vhost-kernel addressed this case with IOVA as VA. I need to spend cycles
on what takes to remove physical address dependency from KNI.
But someone can add that support if it is required in future.


> KNI uses physical address for two things:
> 
> 1- Creates a buffer in userspace, a memzone, and shares its physical address to
> the kernel, so that both kernel and app can access same buffer.
> 
> The question is, even all devices supports IOVA VA mode, why creating a memzone
> that has real physical address info is not possible anymore?
> For KNI case this is not related at all with what devices supports.
> 
> 2- For each mbuf that will be sent to kernel, dpdk app puts physical address of
> that mbuf into shared buffer, so that kernel can access it. So that kernel can
> access to mbuf data that can be coming from any mempool.
> For this the physical address of the mbuf is required.
> 
> 
> Overall KNI needs to know physical address of mbufs and memzone.
> 
> This patch provides a way to have old behavior with a config option, so that KNI
> can work and anyone who needs to create memzone with physical address can a way
> to have it, and all PMDs will work fine.
> 
> If there is no need for these, it is possible to disable config and dynamic
> probing scheme is already there.
> 
> Thanks,
> ferruh


More information about the dev mailing list