[dpdk-dev] [RFC PATCH 2/6] eal: introduce bus-device-driver structure

Jan Blunck jblunck at infradead.org
Thu Nov 17 17:13:44 CET 2016


On Thu, Nov 17, 2016 at 2:00 PM, Shreyansh Jain <shreyansh.jain at nxp.com> wrote:
>
> On Thursday 17 November 2016 04:49 PM, Jan Blunck wrote:
>>
>> On Thu, Nov 17, 2016 at 6:30 AM, Shreyansh Jain <shreyansh.jain at nxp.com>
>> wrote:
>>>
>>> A device is connected to a bus and services by a driver associated with
>>> the bus. It is responsibility of the bus to identify the devices (scan)
>>> and then assign each device to a matching driver.
>>>
>>> A PMD would allocate a rte_xxx_driver and rte_xxx_device.
>>> rte_xxx_driver has rte_driver and rte_bus embedded. Similarly,
>>> rte_xxx_device
>>> has rte_device and rte_bus embedded.
>>
>>
>> I don't think so: the rte_xxx_device embeds the generic rte_device and
>> references a the rte_bus
>> that it is attached to.
>
>
> You mean?
>
>  struct rte_pci_device {
>    struct rte_device device;
>    struct rte_bus *bus;
>    ...
> }
>
> If yes then I have a different view.
> 'device' is connected to a bus. pci_device is just a type of device. Only
> way it should know about the bus is through the parent rte_device.
> rte_device can reference the bus.
>

No. Actually my English was bad. I meant that the rte_device
references the rte_bus but shouldn't embed it.

>
>>
>>> When a ethernet or crypto device (rte_eth_dev, rte_cryptodev) is
>>> allocated,
>>> it contains a reference of rte_device and rte_driver.
>>> Each ethernet device implementation would use container_of for finding
>>> the
>>> enclosing structure of rte_xxx_*.
>>>
>>>                             +-------------------+
>>>  +--------------+           |rte_pci_device     |
>>>  |rte_eth_dev   |           |+-----------------+|
>>>  |+------------+|   .-------->rte_device       ||
>>>  ||rte_device*-----'        |+-----------------+|
>>>  |+------------+|           ||rte_bus          ||
>>>  |              |           |+-----------------+|
>>>  /              /           +-------------------+
>>>
>>> Signed-off-by: Shreyansh Jain <shreyansh.jain at nxp.com>
>>> ---
>>>  lib/librte_eal/common/include/rte_bus.h | 243
>>> ++++++++++++++++++++++++++++++++
>>>  lib/librte_eal/common/include/rte_dev.h |  36 ++---
>>>  2 files changed, 261 insertions(+), 18 deletions(-)
>>>  create mode 100644 lib/librte_eal/common/include/rte_bus.h
>>>
>>> diff --git a/lib/librte_eal/common/include/rte_bus.h
>>> b/lib/librte_eal/common/include/rte_bus.h
>>> new file mode 100644
>>> index 0000000..dc3aeb8
>>> --- /dev/null
>>> +++ b/lib/librte_eal/common/include/rte_bus.h
>>> @@ -0,0 +1,243 @@
>>> +/*-
>>> + *   BSD LICENSE
>>> + *
>>> + *   Copyright(c) 2016 NXP
>>> + *   All rights reserved.
>>> + *
>>> + *   Redistribution and use in source and binary forms, with or without
>>> + *   modification, are permitted provided that the following conditions
>>> + *   are met:
>>> + *
>>> + *     * Redistributions of source code must retain the above copyright
>>> + *       notice, this list of conditions and the following disclaimer.
>>> + *     * Redistributions in binary form must reproduce the above
>>> copyright
>>> + *       notice, this list of conditions and the following disclaimer in
>>> + *       the documentation and/or other materials provided with the
>>> + *       distribution.
>>> + *     * Neither the name of NXP nor the names of its
>>> + *       contributors may be used to endorse or promote products derived
>>> + *       from this software without specific prior written permission.
>>> + *
>>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
>>> FOR
>>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
>>> COPYRIGHT
>>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
>>> INCIDENTAL,
>>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
>>> USE,
>>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
>>> ANY
>>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
>>> USE
>>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
>>> DAMAGE.
>>> + */
>>> +
>>> +#ifndef _RTE_BUS_H_
>>> +#define _RTE_BUS_H_
>>> +
>>> +/**
>>> + * @file
>>> + *
>>> + * RTE PMD Bus Abstraction interfaces
>>> + *
>>> + * This file exposes APIs and Interfaces for Bus Abstraction over the
>>> devices
>>> + * drivers in EAL.
>>> + */
>>> +
>>> +#ifdef __cplusplus
>>> +extern "C" {
>>> +#endif
>>> +
>>> +#include <stdio.h>
>>> +#include <sys/queue.h>
>>> +
>>> +#include <rte_log.h>
>>> +#include <rte_dev.h>
>>> +
>>> +
>>> +/** Double linked list of buses */
>>> +TAILQ_HEAD(rte_bus_list, rte_bus);
>>> +
>>> +/**
>>> + * Bus specific scan for devices attached on the bus.
>>> + * For each bus object, the scan would be reponsible for finding devices
>>> and
>>> + * adding them to its private device list.
>>> + *
>>> + * Successful detection of a device results in rte_device object which
>>> is
>>> + * embedded within the respective device type (rte_pci_device, for
>>> example).
>>> + * Thereafter, PCI specific bus would need to perform
>>> + * container_of(rte_pci_device) to obtain PCI device object.
>>> + *
>>> + * Scan failure of a bus is not treated as exit criteria for
>>> application. Scan
>>> + * for all other buses would still continue.
>>> + *
>>> + * @param void
>>> + * @return
>>> + *     0 for successful scan
>>> + *     !0 (<0) for unsuccessful scan with error value
>>> + */
>>> +typedef int (* bus_scan_t)(void);
>>> +
>>> +/**
>>> + * Bus specific match for devices and drivers which can service them.
>>> + * For each scanned device, during probe the match would link the
>>> devices with
>>> + * drivers which can service the device.
>>> + *
>>> + * It is the work of each bus handler to obtain the specific device
>>> object
>>> + * using container_of (or typecasting, as a less preferred way).
>>> + *
>>> + * @param drv
>>> + *     Driver object attached to the bus
>>> + * @param dev
>>> + *     Device object which is being probed.
>>> + * @return
>>> + *     0 for successful match
>>> + *     !0 for unsuccessful match
>>> + */
>>> +typedef int (* bus_match_t)(struct rte_driver *drv, struct rte_device
>>> *dev);
>>
>>
>> Do you think this should do match & probe?
>
>
> It is only matching as of now.
> rte_eal_probe() from eal.c calls this function for the bus and probe for
> each device of that bus. (Look at 'pci_probe_all_drivers' called from
> rte_eal_pci_probe).
>
>>
>> I believe it is better to separate this into two functions to match()
>> and probe(). The
>> result of matching tells if the driver would want to claim the device
>> in general. But
>> probe()ing the device should only happen if the device isn't claimed
>> by a driver yet and
>> is not blacklisted.
>
>
> Agree. This is what rte_eal_pci_probe() is doing for PCI. Similar
> implementation would exists in bus specific area (this is a different debate
> where...) for probe on each driver (associated with that bus).
>
>
>>
>>> +
>>> +/**
>>> + * Dump the devices on the bus.
>>> + * Each bus type can define its own definition of information to dump.
>>> + *
>>> + * @param bus
>>> + *     Handle for bus, device from which are to be dumped.
>>> + * @param f
>>> + *     Handle to output device or file.
>>> + * @return void
>>> + */
>>> +typedef void (* bus_dump_t)(struct rte_bus *bus, FILE *f);
>>> +
>>> +/**
>>> + * Search for a specific device in device list of the bus
>>> + * This would rely on the bus specific addressing. Each implementation
>>> would
>>> + * extract its specific device type and perform address compare.
>>> + *
>>> + * @param dev
>>> + *     device handle to search for.
>>> + * @return
>>> + *     rte_device handle for matched device, or NULL
>>> + */
>>> +typedef struct rte_device * (* bus_device_get_t)(struct rte_device
>>> *dev);
>>
>>
>> From my experience it is better to delegate this to a helper:
>>
>> int bus_for_each_dev(struct rte_bus *bus, struct rte_device *start,
>> int (*match)(struct rte_device *dev, void *data), void *data);
>>
>> Based on that you can easily implement all kinds of functions like
>> bus_find_dev(), bus_find_dev_by_name(), bus_dump(), ...
>
>
> Interesting idea. Thanks. I will have a look on this.
>
>>
>>> +
>>> +struct rte_bus {
>>> +       TAILQ_ENTRY(rte_bus) next;   /**< Next bus object in linked list
>>> */
>>> +       struct rte_driver_list driver_list; /**< List of all drivers of
>>> bus */
>>
>>
>> TAILQ_HEAD?
>
>
> Yes. That should be TAILQ_HEAD. I just reused the definition create in
> rte_dev.h using TAILQ_HEAD. I will change this.
>
>>
>>> +       struct rte_device_list device_list; /**< List of all devices on
>>> bus */
>>
>>
>> TAILQ_HEAD?
>
>
> Same as above. I will change this.
>
>>
>>> +       const char *name;            /**< Name of the bus */
>>> +       /* Mandatory hooks */
>>> +       bus_scan_t *scan;            /**< Hook for scanning for devices
>>> */
>>> +       bus_match_t *match;          /**< Hook for matching device &
>>> driver */
>>> +       /* Optional hooks */
>>> +       bus_dump_t *dump_dev;        /**< Hook for dumping devices on bus
>>> */
>>> +       bus_device_get_t *find_dev;  /**< Search for a device on bus */
>>> +};
>>> +
>>> +/** @internal
>>> + * Add a device to a bus.
>>> + *
>>> + * @param bus
>>> + *     Bus on which device is to be added
>>> + * @param dev
>>> + *     Device handle
>>> + * @return
>>> + *     None
>>> + */
>>> +void
>>> +rte_eal_bus_add_device(struct rte_bus *bus, struct rte_device *dev);
>>
>>
>> Why do we need this? From my understanding the rte_bus->scan() is
>> adding the devices to the rte_bus->device_list.
>
>
> Plugging a device *after* scan has been completed. Hotplugging.
> Specially for vdev, this would be required, in my opinion.
>
>
>>
>>> +/** @internal
>>> + * Remove a device from its bus.
>>> + *
>>> + * @param dev
>>> + *     Device handle to remove
>>> + * @return
>>> + *     None
>>> + */
>>> +void
>>> +rte_eal_bus_remove_device(struct rte_device *dev);
>>> +
>>> +/** @internal
>>> + * Associate a driver with a bus.
>>> + *
>>> + * @param bus
>>> + *     Bus on which driver is to be added
>>> + * @param dev
>>> + *     Driver handle
>>> + * @return
>>> + *     None
>>> + */
>>> +void
>>> +rte_eal_bus_add_driver(struct rte_bus *bus, struct rte_driver *drv);
>>> +
>>
>>
>> What happens if a driver is added at runtime to a bus? Does that
>> immediately
>> trigger a match & probe of unclaimed devices?
>
>
> My take:
>  - scan the bus for any newly plugged devices. Might be required in case a
> device is logical entity represented by multiple entries in sysfs.
>  -- add only those which are not already added - maybe address/id match
>  - Trigger driver probe for all devices which don't have a driver assigned
> to them (unclaimed, as you stated).
>
> (This is part of my further changes - I think I forgot to put them as note
> in cover letter).
>
>
>>
>>> +/** @internal
>>> + * Disassociate a driver from its bus.
>>> + *
>>> + * @param dev
>>> + *     Driver handle to remove
>>> + * @return
>>> + *     None
>>> + */
>>> +void
>>> +rte_eal_bus_remove_driver(struct rte_driver *drv);
>>> +
>>> +/**
>>> + * Register a Bus handler.
>>> + *
>>> + * @param driver
>>> + *   A pointer to a rte_bus structure describing the bus
>>> + *   to be registered.
>>> + */
>>> +void rte_eal_bus_register(struct rte_bus *bus);
>>> +
>>> +/**
>>> + * Unregister a Bus handler.
>>> + *
>>> + * @param driver
>>> + *   A pointer to a rte_bus structure describing the bus
>>> + *   to be unregistered.
>>> + */
>>> +void rte_eal_bus_unregister(struct rte_bus *bus);
>>> +
>>> +/**
>>> + * Obtain handle for bus given its name.
>>> + *
>>> + * @param bus_name
>>> + *     Name of the bus handle to search
>>> + * @return
>>> + *     Pointer to Bus object if name matches any registered bus object
>>> + *     NULL, if no matching bus found
>>> + */
>>> +struct rte_bus * rte_eal_get_bus(const char *bus_name);
>>> +
>>> +/**
>>> + * Register a device driver.
>>> + *
>>> + * @param driver
>>> + *   A pointer to a rte_dev structure describing the driver
>>> + *   to be registered.
>>> + */
>>> +void rte_eal_driver_register(struct rte_driver *driver);
>>> +
>>> +/**
>>> + * Unregister a device driver.
>>> + *
>>> + * @param driver
>>> + *   A pointer to a rte_dev structure describing the driver
>>> + *   to be unregistered.
>>> + */
>>> +void rte_eal_driver_unregister(struct rte_driver *driver);
>>> +
>>> +/** Helper for Bus registration */
>>> +#define RTE_PMD_REGISTER_BUS(nm, bus) \
>>> +RTE_INIT(businitfn_ ##nm); \
>>> +static void businitfn_ ##nm(void) \
>>> +{\
>>> +       (bus).name = RTE_STR(nm);\
>>> +       rte_eal_bus_register(&bus); \
>>> +}
>>> +
>>> +#ifdef __cplusplus
>>> +}
>>> +#endif
>>> +
>>> +#endif /* _RTE_BUS_H */
>>> diff --git a/lib/librte_eal/common/include/rte_dev.h
>>> b/lib/librte_eal/common/include/rte_dev.h
>>> index 8840380..b08bab5 100644
>>> --- a/lib/librte_eal/common/include/rte_dev.h
>>> +++ b/lib/librte_eal/common/include/rte_dev.h
>>> @@ -116,12 +116,14 @@ TAILQ_HEAD(rte_device_list, rte_device);
>>>
>>>  /* Forward declaration */
>>>  struct rte_driver;
>>> +struct rte_bus;
>>>
>>>  /**
>>>   * A structure describing a generic device.
>>>   */
>>>  struct rte_device {
>>>         TAILQ_ENTRY(rte_device) next; /**< Next device */
>>> +       struct rte_bus *bus;          /**< Bus on which device is placed
>>> */
>>>         struct rte_driver *driver;    /**< Associated driver */
>>>         int numa_node;                /**< NUMA node connection */
>>>         struct rte_devargs *devargs;  /**< Device user arguments */
>>> @@ -144,31 +146,29 @@ void rte_eal_device_insert(struct rte_device *dev);
>>>  void rte_eal_device_remove(struct rte_device *dev);
>>>
>>>  /**
>>> - * A structure describing a device driver.
>>> + * @internal
>>> + * TODO
>>>   */
>>> -struct rte_driver {
>>> -       TAILQ_ENTRY(rte_driver) next;  /**< Next in list. */
>>> -       const char *name;                   /**< Driver name. */
>>> -       const char *alias;              /**< Driver alias. */
>>> -};
>>> +typedef int (*driver_init_t)(struct rte_device *eth_dev);
>>>
>>>  /**
>>> - * Register a device driver.
>>> - *
>>> - * @param driver
>>> - *   A pointer to a rte_dev structure describing the driver
>>> - *   to be registered.
>>> + * @internal
>>> + * TODO
>>>   */
>>> -void rte_eal_driver_register(struct rte_driver *driver);
>>> +typedef int (*driver_uninit_t)(struct rte_device *eth_dev);
>>>
>>>  /**
>>> - * Unregister a device driver.
>>> - *
>>> - * @param driver
>>> - *   A pointer to a rte_dev structure describing the driver
>>> - *   to be unregistered.
>>> + * A structure describing a device driver.
>>>   */
>>> -void rte_eal_driver_unregister(struct rte_driver *driver);
>>> +struct rte_driver {
>>> +       TAILQ_ENTRY(rte_driver) next;  /**< Next in list. */
>>> +       struct rte_bus *bus;           /**< Bus which drivers services */
>>
>>
>> I think this should be TAILQ_ENTRY instead.
>
>
> rte_bus
>       `-> device_list-. <- TAILQ_HEAD (rte_device)
>                       |
>                 rte_pci_device:TAILQ_ENTRY(rte_device)
>
> rte_device just references the bus it belong to.
>
> Am I missing something?
>

This is the rte_driver though. Don't we keep the global list of all
registered drivers? If yes, we need a second TAILQ_ENTRY() that is
registered with the rte_bus.


>>
>>> +       const char *name;              /**< Driver name. */
>>> +       const char *alias;             /**< Driver alias. */
>>> +       driver_init_t *init;           /**< Driver initialization */
>>> +       driver_uninit_t *uninit;       /**< Driver uninitialization */
>>
>>
>> Shouldn't this be probe() and remove()?
>
>
> rte_xxx_driver already includes probe and remove.
> Mandate for init is to allocate the ethernet or cryptodev (or some other
> functional unit). Whereas, probe is responsible for driver specific
> intialization - PCI specific, in case of PCI driver.
>

There is close to zero PCI specific code left in
rte_eth_dev_pci_probe() function. Basically it generates the correct
name from the pci_dev. Everything else it delegates to eth_driver
(which should die anyway).

Example:

struct rte_pci_driver foo_ethernet_driver {
     .driver = {
               "net_foo",
     },
     .id_table = ...,
     .drv_flags = RTE_PCI_DRV_NEED_MAPPING,
     .probe = foo_probe_pci,
}

struct rte_pci_driver foo_crypto_driver {
     .driver = {
               "crypto_foo",
     },
    .id_table = ...,
     .drv_flags = RTE_PCI_DRV_NEED_MAPPING,
     .probe = foo_probe_pci,
}

The bus looks at the driver and does everything requested (mapping
etc) before it calls the driver specific probe. If we delegate the
functional initialization to the next layer (rte_driver) this mean the
probe() operates on a rte_device. In this case it would need to upcast
to the rte_pci_device.

>From my understanding this is going two steps forward and one step
back. Maybe this confusion arises because the rte_bus->probe()
function is missing? My understanding is that the PCI rte_pci_bus is
providing the functionality that is required (kernel driver unloading,
mapping, interrupt wiring, ...) by an instance of a rte_pci_driver.



> Initially I had thought of moving rte_pci_driver->probe() to rte_driver.
> But, I couldn't understand where would functional device initialization
> exist. It cannot be rte_xxx_driver. It should be generic place.
>
> Such functional device initialization is in path of the probe function, but
> so does driver specific information. If we remove rte_pci_driver->probe and
> move to rte_driver->probe(), it would only mean rte_driver calling a hook
> within the PCI driver specific domain.
>
>>
>>> +       unsigned int dev_private_size; /**< Size of device private data
>>> ??*/
>>
>>
>> I don't think that dev_private_size is really required at this level.
>> First of all this is related to the rte_eth_dev structure and
>> therefore it really depends on the driver_init_t (aka probe()) if it
>> is actually allocating an rte_eth_dev or not. Anyway that is up to the
>> drivers probe() function.
>
>
> I moved it based on input through ML (or IRC, I don't remember).
> My understanding: if this is common for all rte_xxx_driver instances (for
> allocating their ethernet/crypto structure), it would actually make sense to
> keep at this level so that init() can use it for allocating necessary space
> before handling over the rte_xxx_driver.
>
> But again, I am OK keeping it in the underlying layer - as you have rightly
> pointed out, it would only be used at rte_xxx_driver layer.
>
>>
>>> +};
>>>
>>>  /**
>>>   * Initalize all the registered drivers in this process
>>> --
>>> 2.7.4
>>>
>>
>
> -
> Shreyansh


More information about the dev mailing list