[dpdk-dev] [RFC PATCH 0/6] Restructure EAL device model for bus support

Jan Blunck jblunck at infradead.org
Thu Nov 17 17:54:23 CET 2016


On Thu, Nov 17, 2016 at 2:08 PM, Shreyansh Jain <shreyansh.jain at nxp.com> wrote:
> On Thursday 17 November 2016 05:25 PM, Jan Blunck wrote:
>>
>> On Thu, Nov 17, 2016 at 6:29 AM, Shreyansh Jain <shreyansh.jain at nxp.com>
>> wrote:
>>>
>>> DPDK has been inherently a PCI inclined framework. Because of this, the
>>> design of device tree (or list) within DPDK is also PCI inclined. A
>>> non-PCI
>>> device doesn't have a way of being expressed without using hooks started
>>> from
>>> EAL to PMD.
>>>
>>> With this cover letter, some patches are presented which try to break
>>> this
>>> strict linkage of EAL with PCI devices. Aim is to generalize the device
>>> hierarchy on the lines of how Linux handles it:
>>>
>>>         device A1
>>>           |
>>>   +==.===='==============.============+ Bus A.
>>>      |                    `--> driver A11     \
>>>   device A2                `-> driver A12      \______
>>>                                                 |CPU |
>>>                                                 /`````
>>>         device A1                              /
>>>           |                                   /
>>>   +==.===='==============.============+ Bus A`
>>>      |                    `--> driver B11
>>>   device A2                `-> driver B12
>>>
>>> Simply put:
>>>  - a bus is connect to CPU (or core)
>>>  - devices are conneted to Bus
>>>  - drivers are running instances which manage one or more devices
>>>  - bus is responsible for identifying devices (and interrupt propogation)
>>>  - driver is responsible for initializing the device
>>>
>>> (*Reusing text from email [1])
>>> In context of DPDK EAL:
>>>  - a generic bus (not a driver, not a device). I don't know how to
>>> categorize
>>>    a bus. It is certainly not a device, and then handler for a bus
>>> (physical)
>>>    can be considered a 'bus driver'. So, just 'rte_bus'.
>>>  - there is a bus for each physical implementation (or virtual). So, a
>>> rte_bus
>>>    Object for PCI, VDEV, ABC, DEF and so on.
>>>  - Buses are registered just like a PMD - RTE_PMD_BUS_REGISTER()
>>>  - Each registered bus is part of a doubly list.
>>>    -- Each device inherits rte_bus
>>>    -- Each driver inherits rte_bus
>>>    -- Device and Drivers lists are part of rte_bus
>>>  - eth_driver is no more required - it was just a holder for PMDs to
>>> register
>>>    themselves. It can be replaced with rte_xxx_driver and corresponding
>>> init/
>>>    uninit moved to rte_driver
>>>  - rte_eth_dev modified to disassociate itself from rte_pci_device and
>>> connect
>>>    to generic rte_device
>>>
>>> Once again, improvising from [1]:
>>>
>>>                                   __ rte_bus_list
>>>                                  /
>>>                      +----------'---+
>>>                      |rte_bus       |
>>>                      | driver_list------> List of rte_bus specific
>>>                      | device_list----    devices
>>>                      | scan         | `-> List of rte_bus associated
>>>                      | match        |     drivers
>>>                      | dump         |
>>>                      | ..some refcnt| (#)
>>>                      +--|------|----+
>>>               _________/        \_________
>>>     +--------/----+                     +-\---------------+
>>>     |rte_device   |                     |rte_driver       |
>>>     | rte_bus     |                     | rte_bus         |
>>>     | rte_driver  |(#)                  | init            |
>>>     |             |                     | uninit          |
>>>     |  devargs    |                     | dev_private_size|
>>>     +---||--------+                     | drv_flags       |(#)
>>>         ||                              | intr_handle(2*) |(#)
>>>         | \                             +----------\\\----+
>>>         |  \_____________                           \\\
>>>         |                \                          |||
>>>  +------|---------+ +----|----------+               |||
>>>  |rte_pci_device  | |rte_xxx_device | (4*)          |||
>>>  | PCI specific   | | xxx device    |               |||
>>>  | info (mem,)    | | specific fns  |              / | \
>>>  +----------------+ +---------------+             /  |  \
>>>                             _____________________/  /    \
>>>                            /                    ___/      \
>>>             +-------------'--+    +------------'---+    +--'------------+
>>>             |rte_pci_driver  |    |rte_vdev_driver |    |rte_xxx_driver |
>>>             | PCI id table,  |    | <probably,     |    | ....          |
>>>             | other driver   |    |  nothing>      |    +---------------+
>>>             | data           |    +----------------+
>>>             +----------------+
>>>
>>>     (1*) Problem is that probe functions have different arguments. So,
>>>          generalizing them might be some rework in the respective device
>>>          layers
>>>     (2*) Interrupt handling for each driver type might be different. I am
>>> not
>>>          sure how to generalize that either. This is grey area for me.
>>>     (3*) Probably exposing a bitmask for device capabilities. Nothing
>>> similar
>>>          exists now to relate it. Don't know if that is useful. Allowing
>>>          applications to question a device about what it supports and
>>> what it
>>>          doesn't - making it more flexible at application layer (but more
>>> code
>>>          as well.)
>>>     (4*) Even vdev would be an instantiated as a device. It is not being
>>> done
>>>          currently.
>>>     (#)  Items which are still pending
>>>
>>> With this cover letter, some patches have been posted. These are _only_
>>> for
>>> discussion purpose. They are not complete and neither compilable.
>>> All the patches, except 0001, have sufficient context about what the
>>> changes
>>> are and rationale for same. Obviously, code is best answer.
>>>
>>> === Patch description: ===
>>>
>>> Patch 0001: Introduce container_of macro. Originally a patch from Jan.
>>>
>>> Patch 0002: introduce changes to rte_device/rte_driver for rte_bus, and
>>>             rte_bus definition itself.
>>>
>>> Patch 0003: Add a new layer for 'bus driver' with linuxapp PCI as an
>>> example
>>>
>>> Patch 0004: Changes with respect to rte_bus APIs and impact on
>>> eal_common_pci
>>
>>
>> From my point of view it is beneficial to keep the PCI support in one
>> place. Moving the match() and scan()
>> to the drivers directory doesn't seem to be technically required but
>> it makes the code harder to read and understand.
>
>
> It is not a technical requirement - just logical placement. These are bus
> specific logic and should exist with the bus driver - where ever that is
> placed.
> Though, I am not entirely sure about 'harder to read'. If a person is
> reading through PCI's bus implementation, I am assuming it would be nice to
> have all the hooks (or default hooks) at same place. Isn't it?
>
> Or, maybe you are suggesting move all librte_eal/*/*pci* out to some common
> location. If so, that is something I haven't yet given serious thought.
>

Yes, either have everything PCI related stuff in drivers or in
librte_eal (or librte_eal_pci). I don't believe it is required to have
two different locations for the rte_pci_bus and the rte_pci_*
functions that are now in librte_eal.


>>
>>
>>> Patch 0005: Change to rte_eal_init (of linuxapp only, for now) for
>>> supporting
>>>             bus->scan. Probe is still being done old way, but in a new
>>> wrapper
>>>
>>> Patch 0006: eth_driver removal and corresponding changes in ixgbe_ethdev,
>>> as
>>>             an example. Includes changes to rte_ethdev to remove most
>>> possible
>>>             PCI references. But, work still remains.
>>
>>
>> Making rte_ethdev independent from PCI device is not directly related
>> to the rest of the patches that add bus support. I think it makes
>> sense to handle this separately because of the impact of refactoring
>> rte_ethdev.
>
>
> Once rte_device is available, the change is independent.
> Only dependency on it was changes required to rte_ethdev.c file because of
> various PCI naming/variables/functions. And that is indeed a very large
> change - including changes to drivers/* which I haven't yet done.
>
> Are you suggesting breaking out of this series? If so, can be done but only
> when formal patches start coming out.
>

Yes, that is fine too.

>
>>
>>
>>>
>>> === Pending Items/Questions: ===
>>>
>>>  - Interrupt and notification handling. How to allow drivers to be
>>> notified
>>>    of presence/plugging of a device.
>>>  - Placement of bus driver/handling code. librte_bus, bus/, drivers/bus?
>>>  -- Also from a pespective of a external library and whether symbols
>>> would be
>>>     available in that.
>>>  -- and secondary processes
>>>  - VDEV bus is missing from current set.
>>>  - Locking of list for supporting hotplugging. Or, at the least safe add/
>>>    remove
>>>  - PMDINFOGEN support or lack of it.
>>>  - Is there ever a case where rte_eth_dev needs to be resolved from
>>>    rte_pci_device? I couldn't find any such use and neither a use-case
>>> for it.
>>>  - There should be a way for Bus APIs to return a generic list handle so
>>> that
>>>    EAL doesn't need to bother about bus->driver_list like dereferencing.
>>> This
>>>    is untidy as well as less portable (in terms of code movement, not
>>> arch).
>>>  - Are more helper hooks required for a bus?
>>>  -- I can think of scan, match, dump, find, plug (device), unplug
>>> (device),
>>>     associate (driver), disassociate (driver). But, most of the work is
>>>     already being done by lower instances (rte_device/driver etc).
>>>
>>> Further:
>>>  - In next few days I will make all necessary changes on the lines
>>> mentioned
>>>    in the patches. This would include changing the drivers/* and
>>> librte_eal/*
>>>  - As an when review comments float in and agreement reached, I will keep
>>>    changing the model
>>>  - There are grey areas like interrupt, notification, locking of bus/list
>>>    which require more discussion. I will try and post a rfc for those as
>>> well
>>>    or if someone can help me on those - great
>>
>>
>> As already hinted on IRC I'm taking a look at the interrupt usage of
>> ethdev.
>
>
> Great. Thank you. Do let me know feedback for any changes that you might
> require along the way.
>
>
>>
>>>  - Change would include PCI bus and VDEV bus handling. A new bus (NXP's
>>> FSLMC)
>>>    would also be layered over this series to verify the model of 'bus
>>>    registration'. This is also part of 17.02 roadmap.
>>>
>>> [1] http://dpdk.org/ml/archives/dev/2016-November/050186.html
>>>
>>> Jan Viktorin (1):
>>>   eal: define container macro
>>>
>>> Shreyansh Jain (5):
>>>   eal: introduce bus-device-driver structure
>>>   bus: add bus driver layer
>>>   eal/common: handle bus abstraction for device/driver objects
>>>   eal: supporting bus model in init process
>>>   eal: removing eth_driver
>>>
>>>  bus/Makefile                               |  36 +++
>>>  bus/pci/Makefile                           |  37 +++
>>>  bus/pci/linuxapp/pci_bus.c                 | 418
>>> +++++++++++++++++++++++++++++
>>>  bus/pci/linuxapp/pci_bus.h                 |  55 ++++
>>>  drivers/net/ixgbe/ixgbe_ethdev.c           |  49 ++--
>>>  lib/librte_eal/common/eal_common_bus.c     | 188 +++++++++++++
>>>  lib/librte_eal/common/eal_common_dev.c     |  31 ++-
>>>  lib/librte_eal/common/eal_common_pci.c     | 226 +++++++++-------
>>>  lib/librte_eal/common/include/rte_bus.h    | 243 +++++++++++++++++
>>>  lib/librte_eal/common/include/rte_common.h |  18 ++
>>>  lib/librte_eal/common/include/rte_dev.h    |  36 +--
>>>  lib/librte_eal/common/include/rte_pci.h    |  11 +-
>>>  lib/librte_eal/linuxapp/eal/Makefile       |   1 +
>>>  lib/librte_eal/linuxapp/eal/eal.c          |  51 +++-
>>>  lib/librte_eal/linuxapp/eal/eal_pci.c      | 298 --------------------
>>>  lib/librte_ether/rte_ethdev.c              |  36 ++-
>>>  lib/librte_ether/rte_ethdev.h              |   6 +-
>>>  17 files changed, 1262 insertions(+), 478 deletions(-)
>>>  create mode 100644 bus/Makefile
>>>  create mode 100644 bus/pci/Makefile
>>>  create mode 100644 bus/pci/linuxapp/pci_bus.c
>>>  create mode 100644 bus/pci/linuxapp/pci_bus.h
>>>  create mode 100644 lib/librte_eal/common/eal_common_bus.c
>>>  create mode 100644 lib/librte_eal/common/include/rte_bus.h
>>>
>>> --
>>> 2.7.4
>>>
>>
>
> -
> Shreyansh


More information about the dev mailing list