[dpdk-dev] [RFC] Add support for device dma mask

Burakov, Anatoly anatoly.burakov at intel.com
Thu Jun 28 12:03:55 CEST 2018


On 28-Jun-18 10:56 AM, Alejandro Lucero wrote:
> 
> 
> On Thu, Jun 28, 2018 at 9:54 AM, Burakov, Anatoly
> <anatoly.burakov at intel.com> wrote:
> 
>     On 27-Jun-18 5:52 PM, Alejandro Lucero wrote:
> 
> 
> 
>         On Wed, Jun 27, 2018 at 2:24 PM, Burakov, Anatoly
>         <anatoly.burakov at intel.com> wrote:
> 
>              On 27-Jun-18 11:13 AM, Alejandro Lucero wrote:
> 
> 
> 
>                  On Wed, Jun 27, 2018 at 9:17 AM, Burakov, Anatoly
>                  <anatoly.burakov at intel.com> wrote:
> 
>                       On 26-Jun-18 6:37 PM, Alejandro Lucero wrote:
> 
>                           This RFC tries to handle devices with addressing
>                           limitations. NFP 4000/6000 devices can only handle
>                           40-bit addresses, which is a problem for physical
>                           addresses on machines with more than 1TB of memory.
>                           But because of how iovas are configured, which can
>                           be equivalent to physical addresses or based on
>                           virtual addresses, this problem can be even more
>                           likely to occur.
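
For illustration, a minimal sketch of the kind of fit check this limitation
implies; the constant and helper names here are hypothetical, not part of the
RFC:

/* Illustrative only: does a DMA region fit under a 40-bit mask? */
#include <stdbool.h>
#include <stdint.h>

#define NFP_DMA_MASK_BITS 40    /* hypothetical constant for NFP 4000/6000 */

static bool
iova_fits_dma_mask(uint64_t iova, uint64_t len, unsigned int mask_bits)
{
        uint64_t mask = (mask_bits >= 64) ? UINT64_MAX :
                        ((1ULL << mask_bits) - 1);

        /* The last byte of the region must still be addressable. */
        return len == 0 || (iova + len - 1) <= mask;
}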
> 
>                           I tried to solve this some time ago:
> 
>                           https://www.mail-archive.com/dev@dpdk.org/msg45214.html
> 
>                           It was delayed because there were some changes in
>                           progress with EAL device handling, and, being
>                           honest, I completely forgot about this until now,
>                           when I have had to work on supporting NFP devices
>                           with DPDK and non-root users.
> 
>                           I was working on a patch to be applied to the main
>                           DPDK branch upstream, but because of changes to
>                           memory initialization during the last months, this
>                           cannot be backported to stable versions, at least
>                           not the part where the hugepage iovas are checked.
> 
>                           I realize stable versions only allow bug fixes, and
>                           this patchset could arguably not be considered one.
>                           But without this, DPDK could be used, although
>                           unlikely, on a machine with more than 1TB of
>                           memory, and then the NFP would use the wrong DMA
>                           host addresses.
> 
>                           Although virtual addresses used as iovas are more
>                           dangerous, for DPDK versions before 18.05 this is
>                           not worse than with physical addresses, because
>                           iovas, when physical addresses are not available,
>                           are based on a starting address set to 0x0.
> 
> 
>                       You might want to look at the following patch:
> 
>                       http://patches.dpdk.org/patch/37149/
> 
>                       Since this patch, IOVA as VA mode uses VA addresses,
>                       and that has been backported to earlier releases. I
>                       don't think there's any case where we used zero-based
>                       addresses any more.
> 
> 
>                  But memsegs get their iova based on the hugepage physaddr,
>                  and for VA mode that is based on 0x0 as the starting point.
> 
>                  And as far as I know, memseg iovas are what end up being
>                  used for IOMMU mappings and what devices will use.
> 
> 
>              When physaddrs are available, IOVA as PA mode assigns IOVA
>              addresses to PA, while IOVA as VA mode assigns IOVA addresses
>              to VA (both 18.05+ and pre-18.05, as per the above patch,
>              which was applied to pre-18.05 stable releases).
> 
>              When physaddrs aren't available, IOVA as VA mode assigns IOVA
>              addresses to VA, both 18.05+ and pre-18.05, as per the above
>              patch.
> 
> 
>         This is right.
> 
>              If physaddrs aren't available and IOVA as PA mode is used,
>              then, as far as I can remember, even though technically
>              memsegs get their addresses set to 0x0 onwards, the actual
>              addresses we get in memzones etc. are RTE_BAD_IOVA.
> 
> 
>         This is not right. Not sure if this was the intention, but with PA
>         mode and physaddrs not available, this code inside
>         vfio_type1_dma_map:
> 
>         if (rte_eal_iova_mode() == RTE_IOVA_VA)
>                 dma_map.iova = dma_map.vaddr;
>         else
>                 dma_map.iova = ms[i].iova;
> 
>         does the IOMMU mapping using the iovas and not the vaddr, with the
>         iovas starting at 0x0.
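
For reference, a minimal standalone sketch of the VFIO type1 call that this
snippet feeds; it is not the DPDK code itself, just an illustration of why
the iova field, not vaddr, is what the device ends up using on the bus:

/* Sketch: vaddr tells the kernel which process memory to pin; iova is
 * the bus address the device will actually use for DMA. */
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

static int
map_one_segment(int container_fd, void *va, uint64_t iova, uint64_t len)
{
        struct vfio_iommu_type1_dma_map dma_map;

        memset(&dma_map, 0, sizeof(dma_map));
        dma_map.argsz = sizeof(dma_map);
        dma_map.vaddr = (uint64_t)(uintptr_t)va;
        dma_map.iova = iova;
        dma_map.size = len;
        dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;

        return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
}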
> 
> 
>     Yep, you're right, apologies. I confused this with the --no-huge option.
> 
> 
> So, what do you think about the patchset? Could it be applied to stable
> versions?
> 
> I'll send a patch for current 18.05 code which will have the dma mask 
> and the hugepage check, along with changes for doing the mmaps below the 
> dma mask limit.
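
On the "mmaps below the dma mask limit" part, a rough sketch of the general
idea - pass a hint address under the limit to mmap() and re-check the result;
the hint value and sizing here are purely illustrative:

/* Sketch: ask the kernel for virtual addresses below a 40-bit limit so
 * that VA-based iovas also fit the device's DMA mask. The hint is only
 * advisory; the result still has to be validated against the mask. */
#include <stdint.h>
#include <sys/mman.h>

static void *
map_below_mask(size_t len, unsigned int mask_bits)
{
        uint64_t limit = 1ULL << mask_bits;
        void *hint = (void *)(uintptr_t)(limit / 2);  /* arbitrary spot under the limit */
        void *va;

        va = mmap(hint, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (va == MAP_FAILED)
                return NULL;

        /* Without MAP_FIXED the kernel may ignore the hint, so re-check. */
        if ((uintptr_t)va + len > limit) {
                munmap(va, len);
                return NULL;
        }
        return va;
}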

I've looked through the code; it looks OK to me (bar some things like
missing .map file additions and a gratuitous rte_panic :) ).

There was a patch/discussion not too long ago about DMA masks for some
IOMMUs - perhaps we can also extend this approach to that?

https://patches.dpdk.org/patch/33192/
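
If an IOMMU-reported address width were available alongside the device mask,
combining the two is just a matter of taking the more restrictive one; a
trivial sketch (names hypothetical):

/* Sketch: effective addressing width is the minimum of what the device
 * and the IOMMU can handle. */
static unsigned int
effective_dma_mask_bits(unsigned int dev_mask_bits, unsigned int iommu_addr_bits)
{
        return dev_mask_bits < iommu_addr_bits ? dev_mask_bits : iommu_addr_bits;
}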


-- 
Thanks,
Anatoly

