[dpdk-dev] [PATCH] bus/pci: fix IOVA as VA mode selection

Jerin Jacob Kollanukkaran jerinj at marvell.com
Wed Jul 10 11:49:59 CEST 2019


I am not sure whether the problem is in my email client or in David's email settings, but I am receiving David's emails ONLY as HTML.
Outlook then introduces formatting issues when the reply is converted to plain text.
It looks like it is due to "Content-Type: multipart/alternative".

Please find inline reply.


From: David Marchand <david.marchand at redhat.com> 
Sent: Wednesday, July 10, 2019 1:40 PM
To: Jerin Jacob Kollanukkaran <jerinj at marvell.com>; Burakov, Anatoly <anatoly.burakov at intel.com>
Cc: dev <dev at dpdk.org>; Thomas Monjalon <thomas at monjalon.net>; Ben Walker <benjamin.walker at intel.com>
Subject: Re: [EXT] Re: [dpdk-dev] [PATCH] bus/pci: fix IOVA as VA mode selection

Hello guys,


> 
> Currently (again, disregarding your interpretation of how IOVA as VA works
> and looking at the actual commit history), we always seem to imply that IOVA
> as PA works for all devices, and we use IOVA_AS_VA flag to indicate that the
> device *also* supports IOVA as VA mode.
> 
> But we don't have any way to express a *requirement* for IOVA as VA mode
> - only for IOVA as PA mode. That is the purpose of the new flag. You are
> stating that the IOVA_AS_VA drv flag is an expression of that requirement,
> but that is not reflected in the codebase - our commit history indicates that
> we don't treat IOVA as VA as hard requirement whenever this flag is
> specified (and i would argue that we shouldn't).

> No objection to further classify it.

I propose to introduce:
* RTE_PCI_DRV_IOVA_AS_PA which means "the combination of the pmd+kmod+hw supports usage of Physical Addresses"
* RTE_PCI_DRV_IOVA_AS_VA which means "the combination of the pmd+kmod+hw supports usage of Virtual Addresses"

- For the pci bus, the algorithm would be:

devices_want_pa = false
devices_want_va = false

Foreach pci device
  Skip blacklisted devices
  Skip unbound devices (i.e. we only consider devices bound to a known kernel driver)
  Skip unsupported devices (i.e. we only consider devices that have a pmd that supports them)

  If the combination pmd+kmod only supports VA (RTE_PCI_DRV_IOVA_AS_VA capability in driver flags), then devices_want_va = true
  Else if the combination pmd+kmod only supports PA (RTE_PCI_DRV_IOVA_AS_PA capability in driver flags), then devices_want_pa = true

If devices_want_va and !devices_want_pa
  return RTE_IOVA_VA
If devices_want_pa and !devices_want_va
  return RTE_IOVA_PA

return RTE_IOVA_DC
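
The pseudocode above could be sketched in C roughly as follows. This is only an illustration, not the actual DPDK code: the enum, the DRV_IOVA_AS_* values, struct pci_dev_info and pci_get_iova_mode() are hypothetical stand-ins for the real RTE_* definitions, and "only supports VA/PA" is assumed to mean that exactly one of the two capability flags is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the real DPDK definitions or values. */
enum iova_mode { IOVA_DC, IOVA_PA, IOVA_VA };

#define DRV_IOVA_AS_PA 0x1u /* pmd+kmod+hw can use physical addresses */
#define DRV_IOVA_AS_VA 0x2u /* pmd+kmod+hw can use virtual addresses */

/* Hypothetical per-device summary used only for this sketch. */
struct pci_dev_info {
	bool blacklisted;
	bool bound;        /* bound to a known kernel driver */
	bool has_pmd;      /* some PMD supports this device */
	unsigned int caps; /* DRV_IOVA_AS_* capabilities of pmd+kmod */
};

static enum iova_mode
pci_get_iova_mode(const struct pci_dev_info *devs, size_t n)
{
	bool devices_want_pa = false;
	bool devices_want_va = false;
	size_t i;

	assert(devs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		const struct pci_dev_info *d = &devs[i];
		unsigned int caps = d->caps & (DRV_IOVA_AS_PA | DRV_IOVA_AS_VA);

		/* Skip blacklisted, unbound and unsupported devices. */
		if (d->blacklisted || !d->bound || !d->has_pmd)
			continue;

		if (caps == DRV_IOVA_AS_VA)      /* VA only */
			devices_want_va = true;
		else if (caps == DRV_IOVA_AS_PA) /* PA only */
			devices_want_pa = true;
		/* Both flags (or none): no preference from this device. */
	}

	if (devices_want_va && !devices_want_pa)
		return IOVA_VA;
	if (devices_want_pa && !devices_want_va)
		return IOVA_PA;

	/* No preference, or conflicting preferences. */
	return IOVA_DC;
}
```

With this reading, a device advertising both capabilities never forces a mode, and a mix of VA-only and PA-only devices falls through to DC.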

---------------
[Jerin] I am fine with introducing RTE_PCI_DRV_IOVA_AS_PA instead of RTE_PCI_DRV_IOVA_AS_DC (as I proposed earlier).
My only concern is that there may not be any PCIe device that hits the following branch:

If devices_want_pa and !devices_want_va
  return RTE_IOVA_PA

I assume that for i40e etc. you will change the flags to RTE_PCI_DRV_IOVA_AS_PA | RTE_PCI_DRV_IOVA_AS_VA.
No strong opinion on RTE_PCI_DRV_IOVA_AS_PA vs RTE_PCI_DRV_IOVA_AS_DC.

---------------
Notes:
* the IOMMU limitations are considered as a per device/driver thing, since the kmod is the one that configures the system IOMMU,
* the case "devices_want_pa and devices_want_va" is considered as DC: we let EAL decide based on physical address availability, because we can't comply with all the devices/drivers present in the system.
  This means that at bus probe time for a device, we must add a check that the combination is fulfilled (and avoid this check in the drivers themselves).
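
One possible shape for such a probe-time check is sketched below. It is an assumption about what "the combination is fulfilled" means, namely that the driver advertises the capability matching the mode EAL finally selected. The names pci_probe_iova_ok(), DRV_IOVA_AS_* and enum iova_mode are hypothetical and do not exist in DPDK under these names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; NOT the real DPDK definitions or values. */
enum iova_mode { IOVA_DC, IOVA_PA, IOVA_VA };

#define DRV_IOVA_AS_PA 0x1u /* pmd+kmod+hw can use physical addresses */
#define DRV_IOVA_AS_VA 0x2u /* pmd+kmod+hw can use virtual addresses */

/*
 * Return true if a driver with capabilities drv_caps can be probed
 * under the IOVA mode that EAL finally selected. By probe time the
 * mode is assumed to be resolved to PA or VA, never DC.
 */
static bool
pci_probe_iova_ok(unsigned int drv_caps, enum iova_mode selected)
{
	assert(selected == IOVA_PA || selected == IOVA_VA);

	if (selected == IOVA_VA)
		return (drv_caps & DRV_IOVA_AS_VA) != 0;
	return (drv_caps & DRV_IOVA_AS_PA) != 0;
}
```

A bus would call something like this once per device before invoking the driver's probe callback and skip the device on failure, so individual drivers would not need to repeat the check.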


- For the global bus code, which aggregates the preferences of the different buses, we need to do the same; I suspect there is a bug there at the moment.

The algorithm:

buses_want_pa = false
buses_want_va = false

Foreach bus
  If the bus reports RTE_IOVA_VA, then buses_want_va = true
  Else if the bus reports RTE_IOVA_PA, then buses_want_pa = true

If buses_want_va and !buses_want_pa
  return RTE_IOVA_VA
If buses_want_pa and !buses_want_va
  return RTE_IOVA_PA

return RTE_IOVA_DC
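
The bus-level aggregation above could look roughly like this in C. Again this is a sketch under assumptions, not the current EAL code: enum iova_mode and buses_get_iova_mode() are hypothetical stand-ins, and each bus is assumed to have already reported a single mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in; NOT the real DPDK definition or values. */
enum iova_mode { IOVA_DC, IOVA_PA, IOVA_VA };

/* Aggregate the per-bus preferences into one global preference. */
static enum iova_mode
buses_get_iova_mode(const enum iova_mode *bus_modes, size_t n)
{
	bool buses_want_pa = false;
	bool buses_want_va = false;
	size_t i;

	assert(bus_modes != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (bus_modes[i] == IOVA_VA)
			buses_want_va = true;
		else if (bus_modes[i] == IOVA_PA)
			buses_want_pa = true;
		/* A bus reporting DC expresses no preference. */
	}

	if (buses_want_va && !buses_want_pa)
		return IOVA_VA;
	if (buses_want_pa && !buses_want_va)
		return IOVA_PA;

	/* No preference, or conflicting preferences: EAL decides. */
	return IOVA_DC;
}
```

The point of tracking two booleans is that a bus reporting DC cannot mask a real preference from another bus, while a PA/VA conflict is passed up to EAL as DC.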

- Finally at EAL level, we keep the current code.

----------------------------------

[Jerin] The algorithm looks OK to me. All of the following devices [1] added RTE_PCI_DRV_IOVA_AS_VA in their drivers
in order to run in VA mode, which allows DPDK to run without root privileges. But due to a recent change they now end up
in PA mode, i.e. they need root privileges to run.

Maybe it is a separate topic to discuss what the default should be if the system has an IOMMU and the device is RTE_PCI_DRV_IOVA_AS_VA | RTE_PCI_DRV_IOVA_AS_PA.
I would say RTE_IOVA_VA. But I don't know why it was changed to RTE_IOVA_PA. Was it for hotplug or SPDK?


[1] http://patchwork.dpdk.org/patch/53206/
[2] http://patchwork.dpdk.org/patch/50274/
[3] http://patchwork.dpdk.org/patch/50991/
[4] http://patchwork.dpdk.org/patch/46134/


-----------------------------------

Hope I did not miss anything.
If we agree on this, I will send the changes and an update in the documentation.


-- 
David Marchand


