[dpdk-dev] [dpdk-dev, RFC] drivers: advertise kmod dependencies in pmdinfo

Trahe, Fiona fiona.trahe at intel.com
Fri Sep 2 15:52:46 CEST 2016



> -----Original Message-----
> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> Sent: Friday, September 2, 2016 2:33 PM
> To: Trahe, Fiona <fiona.trahe at intel.com>
> Cc: Stephen Hemminger <stephen at networkplumber.org>; dev at dpdk.org;
> Olivier Matz <olivier.matz at 6wind.com>; Thomas Monjalon
> <thomas.monjalon at 6wind.com>
> Subject: Re: [dpdk-dev] [dpdk-dev, RFC] drivers: advertise kmod dependencies
> in pmdinfo
> 
> On Fri, Sep 02, 2016 at 09:19:26AM +0000, Trahe, Fiona wrote:
> >
> >
> > > -----Original Message-----
> > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > Sent: Thursday, September 1, 2016 8:16 PM
> > > To: Stephen Hemminger <stephen at networkplumber.org>
> > > Cc: Trahe, Fiona <fiona.trahe at intel.com>; dev at dpdk.org; Olivier Matz
> > > <olivier.matz at 6wind.com>; Thomas Monjalon
> > > <thomas.monjalon at 6wind.com>
> > > Subject: Re: [dpdk-dev] [dpdk-dev, RFC] drivers: advertise kmod
> > > dependencies in pmdinfo
> > >
> > > On Thu, Sep 01, 2016 at 10:41:22AM -0700, Stephen Hemminger wrote:
> > > > On Thu, 1 Sep 2016 13:35:19 -0400
> > > > Neil Horman <nhorman at tuxdriver.com> wrote:
> > > >
> > > > > On Thu, Sep 01, 2016 at 12:55:27PM +0000, Trahe, Fiona wrote:
> > > > > > Hi Neil and Olivier,
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Olivier
> > > > > > > Matz
> > > > > > > Sent: Wednesday, August 31, 2016 2:40 PM
> > > > > > > To: Neil Horman <nhorman at tuxdriver.com>
> > > > > > > Cc: dev at dpdk.org; thomas.monjalon at 6wind.com
> > > > > > > Subject: Re: [dpdk-dev] [dpdk-dev, RFC] drivers: advertise
> > > > > > > kmod dependencies in pmdinfo
> > > > > > >
> > > > > > > Hi Neil,
> > > > > > >
> > > > > > > On 08/31/2016 03:27 PM, Neil Horman wrote:
> > > > > > > > On Wed, Aug 31, 2016 at 11:21:18AM +0200, Olivier Matz wrote:
> > > > > > > >> Hi Neil,
> > > > > > > >>
> > > > > > > >> On 08/30/2016 03:23 PM, Neil Horman wrote:
> > > > > > > >>> On Fri, Aug 26, 2016 at 03:20:46PM +0200, Olivier Matz wrote:
> > > > > > > >>>> Add a new macro DRIVER_REGISTER_KMOD_DEP() that allows
> > > > > > > >>>> a driver to declare the list of kernel modules required to run
> properly.
> > > > > > > >>>>
> > > > > > > >>>> Today, most PCI drivers require uio/vfio.
> > > > > > > >>>>
> > > > > > > >>>> Signed-off-by: Olivier Matz <olivier.matz at 6wind.com>
> > > > > > > >>>>
> > > > > > > >>>> ---
> > > > > > > >>>> In this RFC, I supposed that all PCI drivers require a
> > > > > > > >>>> the loading of a uio/vfio module (except mlx*), this may be
> wrong.
> > > > > > > >>>> Comments are welcome!
> > > > > > > >>>>
> > > > > > > >>>>
> > > > > > > >>>>  buildtools/pmdinfogen/pmdinfogen.c      |  1 +
> > > > > > > >>>>  buildtools/pmdinfogen/pmdinfogen.h      |  1 +
> > > > > > > >>>>  drivers/crypto/qat/rte_qat_cryptodev.c  |  2 ++
> > > > > > > >>>>  drivers/net/bnx2x/bnx2x_ethdev.c        |  4 ++++
> > > > > > > >>>>  drivers/net/bnxt/bnxt_ethdev.c          |  2 ++
> > > > > > > >>>>  drivers/net/cxgbe/cxgbe_ethdev.c        |  2 ++
> > > > > > > >>>>  drivers/net/e1000/em_ethdev.c           |  2 ++
> > > > > > > >>>>  drivers/net/e1000/igb_ethdev.c          |  4 ++++
> > > > > > > >>>>  drivers/net/ena/ena_ethdev.c            |  2 ++
> > > > > > > >>>>  drivers/net/enic/enic_ethdev.c          |  2 ++
> > > > > > > >>>>  drivers/net/fm10k/fm10k_ethdev.c        |  2 ++
> > > > > > > >>>>  drivers/net/i40e/i40e_ethdev.c          |  2 ++
> > > > > > > >>>>  drivers/net/i40e/i40e_ethdev_vf.c       |  2 ++
> > > > > > > >>>>  drivers/net/ixgbe/ixgbe_ethdev.c        |  4 ++++
> > > > > > > >>>>  drivers/net/mlx4/mlx4.c                 |  2 ++
> > > > > > > >>>>  drivers/net/mlx5/mlx5.c                 |  3 +++
> > > > > > > >>>>  drivers/net/nfp/nfp_net.c               |  2 ++
> > > > > > > >>>>  drivers/net/qede/qede_ethdev.c          |  4 ++++
> > > > > > > >>>>  drivers/net/szedata2/rte_eth_szedata2.c |  2 ++
> > > > > > > >>>>  drivers/net/thunderx/nicvf_ethdev.c     |  2 ++
> > > > > > > >>>>  drivers/net/virtio/virtio_ethdev.c      |  2 ++
> > > > > > > >>>>  drivers/net/vmxnet3/vmxnet3_ethdev.c    |  2 ++
> > > > > > > >>>>  lib/librte_eal/common/include/rte_dev.h | 14
> ++++++++++++++
> > > > > > > >>>>  tools/dpdk-pmdinfo.py                   |  5 ++++-
> > > > > > > >>>>  24 files changed, 69 insertions(+), 1 deletion(-)
> > > > > > > >>>>
> > > > > > > >>>
> > > > > > > >>> Generally speaking, I like the idea, it makes sense to
> > > > > > > >>> me in terms of using pmdinfo to export this information
> > > > > > > >>>
> > > > > > > >>> That said, This may need to be a set of macros.  By that
> > > > > > > >>> I mean (and correct
> > > > > > > me
> > > > > > > >>> if I'm wrong here), but the relationship between pmd's
> > > > > > > >>> and kernel modules
> > > > > > > is in
> > > > > > > >>> some cases, more complex than a 'requires' or 'depends'
> > > > > > > >>> relationship.  That
> > > > > > > is
> > > > > > > >>> to say, some pmd may need user space hardware access,
> > > > > > > >>> but can use either
> > > > > > > uio OR
> > > > > > > >>> vfio, but doesn't need both, and can continue to
> > > > > > > >>> function if only one is available.  Other PMD's may be
> > > > > > > >>> able to use vfio or uio, but can still function without
> > > > > > > >>> either.  And some, as your patch implements, simply
> > > > > > > >>> require one or
> > > > > > > the
> > > > > > > >>> other to function.  As such it seems like you may want a
> > > > > > > >>> few macros, in the
> > > > > > > form
> > > > > > > >>> of:
> > > > > > > >>>
> > > > > > > >>> DRIVER_REGISTER_KMOD_REQUEST - List of modules to
> > > > > > > >>> attempt loading,
> > > > > > > ignore any
> > > > > > > >>> failures
> > > > > > > >>> DRIVER_REGISTER_KMOD_REQUIRE - List of modules required
> > > > > > > >>> to be
> > > > > > > loaded after
> > > > > > > >>> request macro completes, fail if any are not loaded
> > > > > > > >>>
> > > > > > > >>> Thats just spitballing, mind you, theres probably a
> > > > > > > >>> better way to do it, but
> > > > > > > the
> > > > > > > >>> idea is to list a set of modules you would like to have,
> > > > > > > >>> and then create a parsable syntax to describe the
> > > > > > > >>> modules that need to be loaded after the
> > > > > > > request
> > > > > > > >>> is complete so that you can accurately codify the
> > > > > > > >>> situations I described
> > > > > > > above.
> > > > > > > >>
> > > > > > > >> Thank you for your feedback.
> > > > > > > >> However, I'm not sure I'm perfectly getting what you suggest.
> > > > > > > >>
> > > > > > > >> Do you think some PMDs could request a kernel module
> > > > > > > >> without really requiring it? Do you have an example in mind?
> > > > > > > >>
> > > > > > > > Yes, thats precisely it.  The most clear example I could
> > > > > > > > think of (though I'm not sure if any pmd currently
> > > > > > > > supports this), is a pmd that supports both UIO and VFIO
> > > > > > > > communication with the kernel.  Such a PMD requires that
> > > > > > > > one of
> > > > > > > those
> > > > > > > > two modules be loaded, but only one (i.e. both are not
> > > > > > > > required), so if only
> > > > > > > the
> > > > > > > > uio kernel module loads is a success case, likewise if
> > > > > > > > only the vfio module loads can be treated as success.
> > > > > > > > Both loading are clearly successful.  Only if neither load
> > > > > > > > do we have a failure case.  I'm suggesting that the
> > > > > > > > grammer that your exports define should take those cases
> > > > > > > > into account.  Its not always as
> > > simple as "I must have the following modules"
> > > > > > > >
> > > > > > > >> The syntax I've submitted lets you define several lists
> > > > > > > >> of modules, so that the user or the script that starts
> > > > > > > >> the application can decide which kmod list is better
> > > > > > > >> according to the
> > > environment.
> > > > > > > >>
> > > > > > > > If you have a human intervening in the module load
> > > > > > > > process, sure, then its
> > > > > > > fine.
> > > > > > > > But it seems that this particular feature that you're
> > > > > > > > implemnting might have automated uses.  That is to say the
> > > > > > > > dpdk core library might be interested in parsing this
> > > > > > > > particular information to direct module autoloading, and
> > > > > > > > if thats desireable then you need to define these lists
> > > > > > > > such that you can
> > > codify failure and success conditions.
> > > > > > > >
> > > > > > > >> For example, most drivers will advertise
> > > > > > > >> "uio,igb_uio:uio,uio_pci_generic:vfio,vfio-pci", and the
> > > > > > > >> user or script will have to choose between loading:
> > > > > > > >> - uio igb_uio
> > > > > > > >> - uio uio_pci_generic
> > > > > > > >> - vfio vfio-pci
> > > > > > > >>
> > > > > > > > Oh, I see, so your list is a colon delimited list of
> > > > > > > > module load sets, where at least one set must succeed by
> > > > > > > > loading all modules in its set, but the failure of any one
> > > > > > > > set isn't fatal to the
> > > process?  e.g. a string like this:
> > > > > > > >
> > > > > > > > uio,igb_uio:vfio,vfio-pci
> > > > > > > >
> > > > > > > > could be interpreted to mean "I must load (uio AND
> > > > > > > > igb_uio) OR (vfio AND vfio-pci).  If the evaluation of
> > > > > > > > that statement results in false, then the operation fails, otherwise
> it succedes.
> > > > > > > >
> > > > > > > > If thats the case, then, apologies, we're on the same
> > > > > > > > page, and this will work just fine.
> > > > > > >
> > > > > > > Yep, that's the idea.
> > > > > > >
> > > > > > > Colon and commas are the best separators I've thought about,
> > > > > > > but any idea to make the syntax clearer is welcome ;)
> > > > > > >
> > > > > > > Maybe a syntax like is clearer:
> > > > > > >   "(mod1 & mod2)|(mod3 & mod4)" ?
> > > > > > > But it would let the user think that more complex
> > > > > > > expressions are valid, like "(mod1 & (mod2 | mod3)) | mod4",
> > > > > > > which is probably
> > > overkill.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Olivier
> > > > > >
> > > > > > This RFC seems like a good idea - and something the Intel
> > > > > > QuickAssist PMD
> > > could benefit from.
> > > > > > However the (mod1 & mod2) can handle the QAT case better in my
> > > opinion.
> > > > > > i.e.
> > > > > > as well as needing one of
> > > > > > * uio igb_uio
> > > > > > * uio uio_pci_generic
> > > > > > * vfio vfio-pci
> > > > > > QAT PMD also needs one of (depending on which physical device
> > > > > > is
> > > > > > plugged)
> > > > > >  * qat_dh895xcc
> > > > > >  * qat_c62x
> > > > > >  * qat_c3xxx
> > > > > >
> > > > > > So the original syntax would result in a very long list of possible
> variations.
> > > > > > What really reflects the dependencies would be ((uio &
> > > > > > igb_uio) | (uio & uio_pci_generic) | (vfio & vfio_pci)) &
> > > > > > (qat_dh895xcc | qat_c62x | qat_c3xxx)
> > > > > >
> > > > > Ah, I didn't consider that hardware specifics might create a use
> > > > > case where a pmd must have one or more kernel modules available
> > > > > for hw support.  Perhaps it is worthwhile to automate hardware
> > > > > support - that is to say, any module loading script should
> > > > > automatically look at the pci table exported from a pmd, and, if
> > > > > found, load any modules that claim support for that
> > > > > device:vendor tuple?  Though that might break in the case of
> > > > > uio, if there are separate driver modules that
> > > support native hardware and uio access.
> >
> > Actually if the script output was intended to be used to auto-load
> > dependent kmods, then even the above would not suffice for the QAT
> > driver (and presumably for other PMDs with specific HW dependencies).
> > i.e. the qat_dhxxxx modules have further dependencies themselves on an
> > intel_qat module, and there are other steps documented in the
> But any dependency chain such as what you describe is covered in the next step
> of the chain.  That is to say if the qat pmd has a hardware dependency on
> qat_dhxxx (or qat_cxxx, etc), and those modules depend on intel_qat, the pmd
> doesn't need to know that, because qat_dhxxx and companions should all list
> intel_qat as a dependency that modprobe will resolve when installing the kernel
> module.
> 
> > guide which must be taken after loading the kmods.
> I'm not sure what you mean by this.  Are you referring to the qat
> documentation that comes with the DPDK?  I only see three additional items
> there to address
> 
> 1) Removing other modules when using the 01.org kernel modules
> 
> 2) installation of firmware
> 
> 3) Binding of the device to user space for VFIO/UIO
> 
> All three of these tasks fall outside the scope of what this macro is meant to do.
> We could try to create macros for them to export information for use in a
> loading script if you like, but I wouldn't.  All three of the above items fall in my
> mind under the category of administrative responsibilities.  That is to say, they
> are orthogonoal to defining a module dependency structure, and if they're
> arent properly completed, the module dependency chain won't matter anyway.
> 

Another manual step is documented, which must be done after insmod: 
echo 32 > /sys/bus/pci/drivers/dh895xcc/0000\:03\:00.0/sriov_numvfs
(steps will vary for different hardware types)
Which I agree like the others are outside the scope of what this macro is meant to do.
So using the macro to facilitate auto-loading of modules isn't a very useful feature
for the QAT driver.

> > The use-case I'd addressed was for the script to identify and just
> > throw an error where dependent modules are missing.
> >
> 
> That doesn't really add much value then, since missing modules already result in
> errors when the PMD tries to initalize.
> 
> > I don't see a simple solution, but also don't see a strong need to find one.
> > Documentation and if necessary a driver-specific script seem sufficient to me.
> >
> > My conclusion is the RFC is a nice feature for some drivers, but if introduced
> needs
> > to be optional as it doesn't handle the complexities of all drivers.
> >
> 
> I agree its an optional export. If there are no dependencies, or if the author
> wishes to to simply not supply any, thats fine, the results will be in
> accordance with that, but I strongly disagree that its optional implies the fact
> that we can ignore the complexities of the depedencies that can be exported.
> 
> The more I think about it the more I like Stephens idea, possibly with some
> macro assistance.  That is to say:
> 
> 1) Start by loading hardware specific modules, the information for which is
> already available.  You can parse the pci table that a pmd exports and match it
> with the pci aliases retrieved via modinfo
> 
> 2) Load a special virt driver if no hardware is found on the system in (1).
> special virt drivers might be worth tagging with a VIRT/VFIO/UIO tag export for
> pmdinfo
> 
> That allows to set asside the complexities of our dependency chain, as we can
> assume hardware support modules will codify any real dependencies there, and
> a
> VIRT tag will let us find any modules needed for hardware the is assigned into
> our guest.
> 
> Neil
> 
> Neil
> 
> > > >
> > > > I ended up writing a script that went the other way.
> > > > First look at the hardware and load VFIO if IOMMU is available.
> > > > Then look for special driver needed for Xen and HyperV Lastly fallback
> > > > to loading igb_uio if no VFIO and PCI device present.
> > > >
> > > > In other words it is a system not driver issue.
> > > >
> > > That sounds like a reasonable approach, yes.
> > > Neil
> > >
> > > >
> >


More information about the dev mailing list