[PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports
Jiawei(Jonny) Wang
jiaweiw at nvidia.com
Fri Feb 17 07:44:23 CET 2023
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit at amd.com>
> Sent: Friday, February 17, 2023 1:58 AM
> To: Jiawei(Jonny) Wang <jiaweiw at nvidia.com>; Slava Ovsiienko
> <viacheslavo at nvidia.com>; Ori Kam <orika at nvidia.com>; NBU-Contact-
> Thomas Monjalon (EXTERNAL) <thomas at monjalon.net>;
> andrew.rybchenko at oktetlabs.ru; Aman Singh <aman.deep.singh at intel.com>;
> Yuying Zhang <yuying.zhang at intel.com>
> Cc: dev at dpdk.org; Raslan Darawsheh <rasland at nvidia.com>
> Subject: Re: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated
> ports
>
> On 2/14/2023 3:48 PM, Jiawei Wang wrote:
> > When multiple ports are aggregated into a single DPDK port
> > (example: Linux bonding, DPDK bonding, failsafe, etc.), we want to
> > know which port is used for Tx via a queue.
> >
> > This patch introduces the new ethdev API
> > rte_eth_dev_map_aggr_tx_affinity(); it is used to map a Tx queue with
> > an aggregated port of the DPDK port (specified with port_id). The
> > affinity is the number of the aggregated port.
> > Value 0 means no affinity, and traffic could be routed to any
> > aggregated port; this is the current default behavior.
> >
> > The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().
> >
> > Add the trace point for ethdev rte_eth_dev_count_aggr_ports() and
> > rte_eth_dev_map_aggr_tx_affinity() functions.
> >
> > Add the testpmd command line:
> > testpmd> port config (port_id) txq (queue_id) affinity (value)
> >
> > For example, two physical ports are connected to a single DPDK
> > port (port id 0); affinity 1 stands for the first physical port and
> > affinity 2 stands for the second physical port.
> > Use the commands below to configure the Tx physical affinity per Tx queue:
> > port config 0 txq 0 affinity 1
> > port config 0 txq 1 affinity 1
> > port config 0 txq 2 affinity 2
> > port config 0 txq 3 affinity 2
> >
> > These commands configure Tx queues 0 and 1 with physical affinity 1,
> > so packets sent on Tx queue 0 or Tx queue 1 leave via the first
> > physical port; similarly, packets sent on Tx queue 2 or Tx queue 3
> > leave via the second physical port.
> >
> > Signed-off-by: Jiawei Wang <jiaweiw at nvidia.com>
>
> <...>
>
> > diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c index
> > dc0a4eb12c..1d5b3a16b2 100644
> > --- a/lib/ethdev/rte_ethdev.c
> > +++ b/lib/ethdev/rte_ethdev.c
> > @@ -6915,6 +6915,55 @@
> rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t
> *ptypes
> > return j;
> > }
> >
> > +int rte_eth_dev_count_aggr_ports(uint16_t port_id) {
> > + struct rte_eth_dev *dev;
> > + int ret;
> > +
> > + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > + dev = &rte_eth_devices[port_id];
> > +
> > + if (*dev->dev_ops->count_aggr_ports == NULL)
> > + return -ENOTSUP;
>
> What do you think about returning a default value when dev_ops is not defined,
> assuming the device is not a bonded device?
> Not sure which is better for the application, returning a default value or an error.
>
For a device that isn't a bonded device, the count should be zero.
So we can return 0 as the default value if the PMD doesn't support the operation.
From the application's perspective, it only needs to check whether the count > 0.
>
> > + ret = eth_err(port_id, (*dev->dev_ops->count_aggr_ports)(port_id));
> > +
> > + rte_eth_trace_count_aggr_ports(port_id, ret);
> > +
> > + return ret;
> > +}
> > +
> > +int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t
> tx_queue_id,
> > + uint8_t affinity)
> > +{
> > + struct rte_eth_dev *dev;
> > + int ret;
> > +
> > + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > + dev = &rte_eth_devices[port_id];
> > +
> > + if (tx_queue_id >= dev->data->nb_tx_queues) {
> > + RTE_ETHDEV_LOG(ERR, "Invalid Tx queue_id=%u\n",
> tx_queue_id);
> > + return -EINVAL;
> > + }
> > +
>
> Although the documentation says this API should be called after configure, if the
> user misses that, I guess the above can crash. Is there a way to add a runtime
> check, like checking 'dev->data->dev_configured'?
>
OK, I will add the check and report an error if (dev->data->dev_configured == 0).
>
> > + if (*dev->dev_ops->map_aggr_tx_affinity == NULL)
> > + return -ENOTSUP;
> > +
> > + if (dev->data->dev_started) {
> > + RTE_ETHDEV_LOG(ERR,
> > + "Port %u must be stopped to allow configuration\n",
> > + port_id);
> > + return -EBUSY;
> > + }
> > +
> > + ret = eth_err(port_id, (*dev->dev_ops->map_aggr_tx_affinity)(port_id,
> > + tx_queue_id, affinity));
> > +
>
> Should the API check if port_id is a bonding port before it continues with the mapping?
>
I added this check in the application before; I will move it to the ethdev layer.
> > + rte_eth_trace_map_aggr_tx_affinity(port_id, tx_queue_id, affinity,
> > +ret);
> > +
> > + return ret;
> > +}
> > +
> > RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
> >
> > RTE_INIT(ethdev_init_telemetry)
> > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
> > c129ca1eaf..07b8250eb8 100644
> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -2589,6 +2589,52 @@ int rte_eth_hairpin_bind(uint16_t tx_port,
> > uint16_t rx_port); __rte_experimental int
> > rte_eth_hairpin_unbind(uint16_t tx_port, uint16_t rx_port);
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Get the number of aggregated ports.
> > + *
> > + * @param port_id
> > + * The port identifier of the Ethernet device.
> > + * @return
> > + * - (>=0) the number of aggregated port if success.
> > + * - (-ENOTSUP) if not supported.
> > + */
> > +__rte_experimental
> > +int rte_eth_dev_count_aggr_ports(uint16_t port_id);
>
>
> Can you please give more details in the function description? In the context of
> this patch it is clear, but someone seeing it for the first time can be confused
> about what "aggregated ports" are.
>
OK, when multiple ports are aggregated into a single one, we can call these ports "aggregated ports".
I will add more description in the next patch.
> What is the expected value for a regular physical port that doesn't have any
> sub-devices, 0 or 1? Can you please document it?
>
OK, the API returns 0 for a regular physical port (without bonding).
I will document this in the next patch.
>
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Map a Tx queue with an aggregated port of the DPDK port (specified with
> port_id).
> > + * When multiple ports are aggregated into a single one,
> > + * it allows choosing which port to use for Tx via a queue.
> > + *
> > + * The application should use rte_eth_dev_map_aggr_tx_affinity()
> > + * after rte_eth_dev_configure(), rte_eth_tx_queue_setup(), and
> > + * before rte_eth_dev_start().
> > + *
> > + * @param port_id
> > + * The identifier of the port used in rte_eth_tx_burst().
> > + * @param tx_queue_id
> > + * The index of the transmit queue used in rte_eth_tx_burst().
> > + * The value must be in the range [0, nb_tx_queue - 1] previously supplied
> > + * to rte_eth_dev_configure().
> > + * @param affinity
> > + * The number of the aggregated port.
> > + * Value 0 means no affinity and traffic could be routed to any aggregated
> port.
> > + * The first aggregated port is number 1 and so on.
> > + * The maximum number is given by rte_eth_dev_count_aggr_ports().
> > + *
> > + * @return
> > + * Zero if successful. Non-zero otherwise.
> > + */
> > +__rte_experimental
> > +int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t
> tx_queue_id,
> > + uint8_t affinity);
> > +
> > /**
> > * Return the NUMA socket to which an Ethernet device is connected
> > *
> > diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map index
> > dbc2bffe64..685aa71e51 100644
> > --- a/lib/ethdev/version.map
> > +++ b/lib/ethdev/version.map
> > @@ -300,6 +300,8 @@ EXPERIMENTAL {
> > rte_mtr_meter_profile_get;
> >
> > # added in 23.03
> > + rte_eth_dev_count_aggr_ports;
> > + rte_eth_dev_map_aggr_tx_affinity;
> > rte_flow_async_create_by_index;
> > };
> >
Thanks.