[dpdk-dev] [PATCH v1 1/1] kernel/linux: introduce vfio_pf kernel module

Jerin Jacob jerinjacobk at gmail.com
Fri Nov 1 13:12:54 CET 2019


On Fri, Nov 1, 2019 at 5:24 PM Luca Boccassi <bluca at debian.org> wrote:
>
> For distros, out-of-tree kernel modules are painful. From my POV, it

I agree.

> would be preferable to try and find a solution upstream, even if it is
> going to be difficult and require a lot of negotiation and work.

I understand that RH does not package out-of-tree modules.
I would like to know which distributions
package the existing KNI and IGB_UIO modules.

IMO, packaging 1 module vs. N modules is the same pain, as it is a matter of
kernel dependency in packaging.
If, for some reason, we decide to remove IGB_UIO, then we can remove
this module as well.



>
> On Thu, 2019-10-31 at 18:03 +0100, Thomas Monjalon wrote:
> > We don't get enough attention on this topic.
> > Let me rephrase the issue and the proposals with more people Cc'ed.
> >
> > We are talking about SR-IOV VFs in VMs
> > with a PF managed on the host by DPDK.
> > The PF driver is either (1) bifurcated (Mellanox case),
> > or (2) bound to UIO with igb_uio, or (3) bound to VFIO.
> >
> > In case 1, the PF is still managed by a kernel driver, so no issue.
> >
> > In case 2, the PF is managed by UIO.
> > There is no SR-IOV support in upstream UIO,
> > but the out-of-tree module igb_uio works.
> > However we would like to drop this legacy module from DPDK.
> > Some (most) Linux distributions do not package igb_uio anyway.
> > The other issue is that igb_uio uses physical addressing,
> > which is not acceptable with OCTEON TX2 for performance reasons.
> >
> > In case 3, the PF is managed by VFIO. This is the case we want to
> > fix.
> > VFIO does not allow creating VFs.
> > The workaround is to create VFs before binding the PF to VFIO.
> > But since Linux 4.19, VFIO forbids any SR-IOV VF management.
> > There is a security concern about allowing userspace to manage SR-IOV
> > VF messages and taking the responsibility for VFs in the guest.
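As a rough illustration of the case 3 workaround described above (create
the VFs first, then bind the PF to VFIO), here is a minimal Python sketch
of the sysfs writes involved. It is not taken from the patch. The sysfs
files used (sriov_numvfs, driver/unbind, driver_override, drivers_probe)
are the standard Linux PCI ones, but the exact order, the helper names and
the example address are assumptions. Whether the VFs survive the unbind
depends on the PF driver, and as noted above this no longer works with
vfio-pci since Linux 4.19.

```python
from pathlib import Path


def vf_workaround_writes(bdf, num_vfs, sysfs_root="/sys"):
    """Return the ordered (path, value) sysfs writes for the workaround.

    Assumed order: create the VFs while the kernel PF driver is still
    bound, then hand the PF over to vfio-pci.
    """
    if num_vfs < 1:
        raise ValueError("num_vfs must be at least 1")
    root = Path(sysfs_root)
    dev = root / "bus" / "pci" / "devices" / bdf
    return [
        # 1. Ask the currently bound PF driver to create the VFs.
        (dev / "sriov_numvfs", str(num_vfs)),
        # 2. Detach the PF from its kernel driver.
        (dev / "driver" / "unbind", bdf),
        # 3. Make vfio-pci the only driver allowed to claim the PF.
        (dev / "driver_override", "vfio-pci"),
        # 4. Trigger a re-probe so that vfio-pci binds to the PF.
        (root / "bus" / "pci" / "drivers_probe", bdf),
    ]


def apply_writes(writes):
    """Perform the writes in order (needs root on a real system)."""
    for path, value in writes:
        path.write_text(value)
```

Calling vf_workaround_writes("0000:01:00.0", 4) only builds the list, so
the plan can be inspected before apply_writes() touches the system.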
> >
> > It is desired to allow the system admin to decide the security level,
> > by adding a flag in VFIO "let me manage VFs, I know what I am doing".
> > Reference of "recent" discussion:
> > https://lkml.org/lkml/2018/3/6/855
> >
> > For now, there is no upstream solution merged.
> >
> > This patch is proposing a solution using an out-of-tree module.
> > In this case, the admin will decide explicitly to bind the PF to
> > vfio_pf.
> > Unfortunately this solution won't work in environments which
> > forbid any out-of-tree module.
> > Another concern is that it looks like a DPDK-only solution.
> >
> > We have an issue but we do not want to propose a half-solution
> > which would harm other projects and users.
> > So the question is:
> > Do we accept this patch as a temporary solution?
> > Or can we get an agreement soon for an upstream kernel solution?
> >
> > Thanks for reading and giving your (clear) opinion.
> >
> >
> > 06/09/2019 15:27, Jerin Jacob Kollanukkaran:
> > > From: Thomas Monjalon <thomas at monjalon.net>
> > > > 06/09/2019 11:12, vattunuru at marvell.com:
> > > > > From: Vamsi Attunuru <vattunuru at marvell.com>
> > > > >
> > > > > DPDK use cases such as VF representer or OVS offload call for
> > > > > both PF and VF PCIe devices to be bound to the vfio-pci module
> > > > > to enable IOMMU protection.
> > > > >
> > > > > In addition to the vSwitch use case, unlike other PCI classes
> > > > > of devices, network-class PCIe devices have additional
> > > > > responsibilities on the PF device, such as promiscuous mode
> > > > > support.
> > > > >
> > > > > The above use cases demand that VFIO be bound to the PF and
> > > > > its VF devices. This use case is not supported in the Linux
> > > > > kernel due to a security issue: a DoS is possible if a VF is
> > > > > attached to a guest over vfio-pci and a netdev kernel driver
> > > > > runs on it, which is something a VF representer would like to
> > > > > enable.
> > > > >
> > > > > Since we cannot tell whether a VF bound to vfio-pci runs a
> > > > > DPDK application or a netdev driver in the guest, we cannot
> > > > > introduce any scheme to fix the DoS case, and therefore there
> > > > > is no proper support for this in the upstream kernel.
> > > > >
> > > > > The igb_uio module enables such PF and VF binding for non-iommu
> > > > > devices, so that VF representer or OVS offload can run on
> > > > > non-iommu devices, with the DoS vulnerability when a netdev
> > > > > driver runs as the VF.
> > > > >
> > > > > This kernel module enables SRIOV on PF devices, so that both
> > > > > PF and VF devices can run in VFIO mode with known impacts, as
> > > > > the igb_uio driver does for non-iommu devices.
> > > > >
> > > > > Signed-off-by: Vamsi Attunuru <vattunuru at marvell.com>
> > > > > Signed-off-by: Jerin Jacob <jerinj at marvell.com>
> > > >
> > > > Sorry I fail to properly understand the explanation above.
> > > > Please try to split in shorter sentences.
> > > >
> > > > About the request to add an out-of-tree Linux kernel driver, I
> > > > guess Jerin is well
> > > > aware that we don't want such anymore.
> > >
> > > Yes, I am aware of it. I don't like out-of-tree modules either.
> > > But in this case, I suggested that Vamsi use an out-of-tree module.
> > >
> > > Let me describe the issue and let us discuss how to tackle
> > > the problem:
> > >
> > > # The Linux kernel won't allow a VFIO PF to have SRIOV enabled.
> > >
> > > Patches and on going discussion are here:
> > > https://patchwork.kernel.org/patch/10522381/
> > >
> > > https://lwn.net/Articles/748526/
> > >
> > >
> > > Based on my understanding, the reason for NOT allowing the
> > > VFIO PF to have SRIOV enabled is genuine from the kernel point of
> > > view but not from the DPDK point of view.
> > >
> > > Here is the sequence to describe the problem:
> > > 1) Assume the Linux kernel allowed enabling SRIOV through VFIO PCI.
> > > 2) The PF is bound to vfio-pci.
> > > 3) VFs are created using the SRIOV infrastructure of the vfio-pci
> > > PF driver.
> > > 4) A DPDK application is bound to both PF and VF. No issue here.
> > > 5) Assume a DPDK application is bound to the PF and the VF is bound
> > > to a netdev kernel driver. Now there is a genuine concern
> > > from the kernel point of view: the DPDK PF can intercept
> > > VF mailbox messages or so and deny the kernel's request.
> > > Or what if the DPDK PF application crashes?
> > >
> > > To avoid case (5), step (3) is not allowed in the stock kernel,
> > > which makes sense IMO.
> > >
> > > Now, from the DPDK PoV, step 5 is valid, as we have
> > > rte_flow's VF action etc. to enable such a case,
> > > where the user can program the PF's rte_flow to steer
> > > some traffic to a VF, and the VF can be a DPDK application or
> > > a Linux kernel netdev driver.
> > >
> > > This patch enables step (3) in order to enable step (5) from the DPDK
> > > PoV, i.e., DPDK needs to allow a PF to bind to DPDK with VFs.
> > >
> > > Why this issue now:
> > > - The igb_uio kernel driver is used to enable step (3).
> > > See store_max_vfs() in kernel/linux/igb_uio/igb_uio.c.
> > > This is fine for non-iommu devices; IOMMU devices
> > > need VFIO.
> > > - We would like to support VFIO for IOMMU protection
> > > and enable step (5), as DPDK supports it at the spec level,
> > > i.e., we need to fix the feature disparity between iommu and
> > > non-iommu based devices.
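To make the disparity in the list above concrete: with igb_uio, VFs are
requested through the max_vfs sysfs attribute that store_max_vfs() backs,
while in-kernel PF drivers use the standard sriov_numvfs attribute. Below
is a minimal Python sketch under those assumptions. The helper names and
the sysfs_root parameter are hypothetical and only there so the logic can
be tried against a scratch directory.

```python
from pathlib import Path


def vf_count_attribute(bdf, pf_driver, sysfs_root="/sys"):
    """Return the sysfs attribute used to request VFs on a PF."""
    dev = Path(sysfs_root) / "bus" / "pci" / "devices" / bdf
    if pf_driver == "igb_uio":
        # Attribute provided by the out-of-tree igb_uio module.
        return dev / "max_vfs"
    # Standard attribute for PF drivers with kernel SR-IOV support.
    return dev / "sriov_numvfs"


def request_vfs(bdf, pf_driver, num_vfs, sysfs_root="/sys"):
    """Write the requested VF count and return the attribute written."""
    attr = vf_count_attribute(bdf, pf_driver, sysfs_root)
    attr.write_text(str(num_vfs))
    return attr
```

For example, request_vfs("0000:01:00.0", "igb_uio", 2) would write "2" to
the PF's max_vfs attribute. A PF bound to vfio-pci has no equivalent in
the stock kernel discussed here, which is the gap the patch targets.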
> > >
> > > Note:
> > > We may not need a brand new kernel module; we could move
> > > this logic to igb_uio if maintenance is a concern.
> >
> >
> >
> >
> --
> Kind regards,
> Luca Boccassi

