[dpdk-dev] [RFC] Add support for device dma mask

Burakov, Anatoly anatoly.burakov at intel.com
Thu Jun 28 12:30:30 CEST 2018


On 28-Jun-18 11:27 AM, Alejandro Lucero wrote:
> 
> 
> On Thu, Jun 28, 2018 at 11:03 AM, Burakov, Anatoly
> <anatoly.burakov at intel.com> wrote:
> 
>     On 28-Jun-18 10:56 AM, Alejandro Lucero wrote:
> 
> 
> 
>         On Thu, Jun 28, 2018 at 9:54 AM, Burakov, Anatoly
>         <anatoly.burakov at intel.com> wrote:
> 
>              On 27-Jun-18 5:52 PM, Alejandro Lucero wrote:
> 
> 
> 
>                  On Wed, Jun 27, 2018 at 2:24 PM, Burakov, Anatoly
>                  <anatoly.burakov at intel.com> wrote:
> 
>                       On 27-Jun-18 11:13 AM, Alejandro Lucero wrote:
> 
> 
> 
>                           On Wed, Jun 27, 2018 at 9:17 AM, Burakov, Anatoly
>                           <anatoly.burakov at intel.com> wrote:
> 
>                                On 26-Jun-18 6:37 PM, Alejandro Lucero wrote:
> 
>                                    This RFC tries to handle devices with
>                                    addressing limitations. NFP 4000/6000
>                                    devices can only handle 40-bit addresses,
>                                    which causes problems with physical
>                                    addresses on machines with more than 1TB
>                                    of memory. But because of how iovas are
>                                    configured, which can be equivalent to
>                                    physical addresses or based on virtual
>                                    addresses, this can be a more likely
>                                    problem.
> 
>                                    I tried to solve this some time ago:
> 
>         https://www.mail-archive.com/dev@dpdk.org/msg45214.html
> 
>                                    It was delayed because there were some
>                                    changes in progress with EAL device
>                                    handling and, to be honest, I completely
>                                    forgot about this until now, when I have
>                                    had to work on supporting NFP devices
>                                    with DPDK and non-root users.
> 
>                                    I was working on a patch to be applied to
>                                    the main DPDK branch upstream, but
>                                    because of changes to memory
>                                    initialization during the last months, it
>                                    cannot be backported to stable versions,
>                                    at least not the part where the hugepage
>                                    iovas are checked.
> 
>                                    I realize stable versions only allow bug
>                                    fixes, and this patchset could arguably
>                                    not be considered one. But without it,
>                                    DPDK could, although it is unlikely, be
>                                    used on a machine with more than 1TB of
>                                    memory, and the NFP would then use the
>                                    wrong DMA host addresses.
> 
>                                    Although virtual addresses used as iovas
>                                    are more dangerous, for DPDK versions
>                                    before 18.05 this is not worse than with
>                                    physical addresses, because iovas, when
>                                    physical addresses are not available, are
>                                    based on a starting address set to 0x0.
> 
> 
>                                You might want to look at the following patch:
> 
>         http://patches.dpdk.org/patch/37149/
> 
>                                Since this patch, IOVA as VA mode uses VA
>                                addresses, and that has been backported to
>                                earlier releases. I don't think there's any
>                                case where we used zero-based addresses any
>                                more.
> 
> 
>                           But memsegs get the iova based on hugepages
>                           physaddr, and for VA mode that is based on 0x0 as
>                           starting point.
> 
>                           And as far as I know, memsegs iovas are what end
>                           up being used for IOMMU mappings and what devices
>                           will use.
> 
> 
>                       When physaddrs are available, IOVA as PA mode assigns
>                       IOVA addresses to PA, while IOVA as VA mode assigns
>                       IOVA addresses to VA (both 18.05+ and pre-18.05 as per
>                       above patch, which was applied to pre-18.05 stable
>                       releases).
> 
>                       When physaddrs aren't available, IOVA as VA mode
>                       assigns IOVA addresses to VA, both 18.05+ and
>                       pre-18.05, as per above patch.
> 
> 
>                  This is right.
> 
>                       If physaddrs aren't available and IOVA as PA mode is
>                       used, then, as far as I can remember, even though
>                       technically memsegs get their addresses set to 0x0
>                       onwards, the actual addresses we get in memzones etc.
>                       are RTE_BAD_IOVA.
> 
> 
>                  This is not right. Not sure if this was the intention, but
>                  in PA mode with physaddrs not available, this code inside
>                  vfio_type1_dma_map:
> 
>                  if (rte_eal_iova_mode() == RTE_IOVA_VA)
>                          dma_map.iova = dma_map.vaddr;
>                  else
>                          dma_map.iova = ms[i].iova;
> 
> 
>                  does the IOMMU mapping using the iovas and not the vaddr,
>                  with the iovas starting at 0x0.
> 
> 
>              Yep, you're right, apologies. I confused this with no-huge
>              option.
> 
> 
>         So, what do you think about the patchset? Could this be
>         applied to stable versions?
> 
>         I'll send a patch for current 18.05 code which will have the dma
>         mask and the hugepage check, along with changes for doing the
>         mmaps below the dma mask limit.
> 
> 
>     I've looked through the code, it looks OK to me (bar some things
>     like missing .map file additions and a gratuitous rte_panic :) ).
> 
>     There was a patch/discussion not too long ago about DMA masks for
>     some IOMMUs - perhaps we can also extend this approach to that?
> 
>     https://patches.dpdk.org/patch/33192/
> 
> 
> 
> I completely missed that patch.
> 
> It seems it could also be applied to that case, setting a dma mask if
> it is an emulated VT-d with that 39-bit restriction.
> 
> I'll take a look at that patch and submit a new patchset including 
> changes for that case. I did also forget the hotplug case where the 
> hugepage checking needs to be invoked.

Great.

Just in case, the original link I provided was to a v2. v3 was accepted:

https://patches.dpdk.org/patch/33650/
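
For reference, a minimal sketch of the kind of check being discussed
(hugepage iovas against a device dma mask). This is not the RFC code:
struct seg, iova_fits_dma_mask() and check_segs() are hypothetical names
made up for illustration, and the only assumption about the device is a
mask width in bits, such as the 40-bit NFP limit or the 39-bit emulated
VT-d limit mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memseg: only the fields this sketch needs. */
struct seg {
	uint64_t iova; /* start address as seen by the device */
	uint64_t len;  /* length in bytes */
};

/*
 * Hypothetical helper, not a DPDK API.
 * Returns true if every byte of [iova, iova + len) can be addressed
 * with maskbits address bits.
 */
static bool
iova_fits_dma_mask(uint64_t iova, uint64_t len, unsigned int maskbits)
{
	uint64_t limit;

	if (maskbits >= 64)
		return true; /* no restriction on a 64-bit address */

	limit = UINT64_C(1) << maskbits; /* first address out of reach */

	/* Written this way so that iova + len cannot overflow. */
	return iova < limit && len <= limit - iova;
}

/* Returns 0 if all segments fit under the mask, -1 otherwise. */
static int
check_segs(const struct seg *segs, size_t n, unsigned int maskbits)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!iova_fits_dma_mask(segs[i].iova, segs[i].len, maskbits))
			return -1;
	return 0;
}
```

With a 40-bit mask, a segment starting at 0x0 passes, while one starting
at 1TB (1 << 40) fails. The same check with 39 bits would cover the
emulated VT-d case.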

Thanks!

> 
> Thanks
> 
> 
> 
> 
>              --
>              Thanks,
>              Anatoly
> 
> 
> 
> 
>     -- 
>     Thanks,
>     Anatoly
> 
> 


-- 
Thanks,
Anatoly

