[dpdk-dev] [RFC 0/7] hide eth dev related structures
Jerin Jacob
jerinjacobk at gmail.com
Wed Sep 15 11:45:05 CEST 2021
On Tue, Sep 14, 2021 at 7:03 PM Ananyev, Konstantin
<konstantin.ananyev at intel.com> wrote:
>
>
> Hi Jerin,
>
> > > NOTE: This is just an RFC to start further discussion and collect the feedback.
> > > Due to significant amount of work, changes required are applied only to two
> > > PMDs so far: net/i40e and net/ice.
> > > So to build it you'll need to add:
> > > -Denable_drivers='common/*,mempool/*,net/ice,net/i40e'
> > > to your config options.
> >
> > >
> > > That approach was selected to avoid(/minimize) possible performance losses.
> > >
> > > So far I done only limited amount functional and performance testing.
> > > Didn't spot any functional problems, and performance numbers
> > > remains the same before and after the patch on my box (testpmd, macswap fwd).
> >
> >
> > Based on testing on octeonxt2. We see some regression in testpmd and
> > bit on l3fwd too.
> >
> > Without patch: 73.5mpps/core in testpmd iofwd
> > With out patch: 72 5mpps/core in testpmd iofwd
> >
> > Based on my understanding it is due to additional indirection.
>
> From your patch below, it looks like not actually additional indirection,
> but extra memory dereference - func and dev pointers are now stored
> at different places.
Yup. I meant the same. We are on the same page.
> Plus the fact that now we dereference rte_eth_devices[]
> data inside PMD function. Which probably prevents compiler and CPU to load
> rte_eth_devices[port_id].data and rte_eth_devices[port_id]. pre_tx_burst_cbs[queue_id]
> in advance before calling actual RX/TX function.
Yes.
> About your approach: I don’t mind to add extra opaque 'void *data' pointer,
> but would prefer not to expose callback invocations code into inline function.
> Main reason for that - I think it still need to be reworked to allow adding/removing
> callbacks without stopping the device. Something similar to what was done for cryptodev
> callbacks. To be able to do that in future without another ABI breakage callbacks related part
> needs to be kept internal.
> Though what we probably can do: add two dynamic arrays of opaque pointers to rte_eth_burst_api.
> One for rx/tx queue data pointers, second for rx/tx callback pointers.
> To be more specific, something like:
>
> typedef uint16_t (*rte_eth_rx_burst_t)( void *rxq, struct rte_mbuf **rx_pkts, uint16_t nb_pkts, void *cbs);
> typedef uint16_t (*rte_eth_tx_burst_t)(void *txq, struct rte_mbuf **tx_pkts, uint16_t nb_pkts, void *cbs);
> ....
>
> struct rte_eth_burst_api {
> rte_eth_rx_burst_t rx_pkt_burst;
> /**< PMD receive function. */
> rte_eth_tx_burst_t tx_pkt_burst;
> /**< PMD transmit function. */
> rte_eth_tx_prep_t tx_pkt_prepare;
> /**< PMD transmit prepare function. */
> rte_eth_rx_queue_count_t rx_queue_count;
> /**< Get the number of used RX descriptors. */
> rte_eth_rx_descriptor_status_t rx_descriptor_status;
> /**< Check the status of a Rx descriptor. */
> rte_eth_tx_descriptor_status_t tx_descriptor_status;
> /**< Check the status of a Tx descriptor. */
> struct {
> void **queue_data; /* point to rte_eth_devices[port_id].data-> rx_queues */
> void **cbs; /* points to rte_eth_devices[port_id].post_rx_burst_cbs */
> } rx_data, tx_data;
> } __rte_cache_aligned;
>
> static inline uint16_t
> rte_eth_rx_burst(uint16_t port_id, uint16_t queue_id,
> struct rte_mbuf **rx_pkts, const uint16_t nb_pkts)
> {
> struct rte_eth_burst_api *p;
>
> if (port_id >= RTE_MAX_ETHPORTS || queue_id >= RTE_MAX_QUEUES_PER_PORT)
> return 0;
>
> p = &rte_eth_burst_api[port_id];
> return p->rx_pkt_burst(p->rx_data.queue_data[queue_id], rx_pkts, nb_pkts, p->rx_data.cbs[queue_id]);
That works.
> }
>
> Same for TX.
>
> If that looks ok to everyone, I'll try to prepare next version based on that.
Looks good to me.
> In theory that should avoid extra dereference problem and even reduce indirection.
> As a drawback data->rxq/txq should always be allocated for RTE_MAX_QUEUES_PER_PORT entries,
> but I presume that’s not a big deal.
>
> As a side question - is there any reason why rte_ethdev_trace_rx_burst() is invoked at very last point,
> while rte_ethdev_trace_tx_burst() after CBs but before actual tx_pkt_burst()?
> It would make things simpler if tracng would always be done either on entrance or exit of rx/tx_burst.
exit is fine.
>
> >
> > My suggestion to fix the problem by:
> > Removing the additional `data` redirection and pull callback function
> > pointers back
> > and keep rest as opaque as done in the existing patch like [1]
> >
> > I don't believe this has any real implication on future ABI stability
> > as we will not be adding
> > any new item in rte_eth_fp in any way as new features can be added in slowpath
> > rte_eth_dev as mentioned in the patch.
Ack
I will happy to test again after the rework and report any performance
issues if any.
Thaks for the great work :-)
> >
> > [2] is the patch of doing the same as I don't see any performance
> > regression after [2].
> >
> >
> > [1]
> > - struct rte_eth_burst_api {
> > - struct rte_eth_fp {
> > + void *data;
> > rte_eth_rx_burst_t rx_pkt_burst;
> > /**< PMD receive function. */
> > rte_eth_tx_burst_t tx_pkt_burst;
> > @@ -85,8 +100,19 @@ struct rte_eth_burst_api {
> > /**< Check the status of a Rx descriptor. */
> > rte_eth_tx_descriptor_status_t tx_descriptor_status;
> > /**< Check the status of a Tx descriptor. */
> > + /**
> > + * User-supplied functions called from rx_burst to post-process
> > + * received packets before passing them to the user
> > + */
> > + struct rte_eth_rxtx_callback
> > + *post_rx_burst_cbs[RTE_MAX_QUEUES_PER_PORT];
> > + /**
> > + * User-supplied functions called from tx_burst to pre-process
> > + * received packets before passing them to the driver for transmission.
> > + */
> > + struct rte_eth_rxtx_callback *pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT];
> > uintptr_t reserved[2];
> > -} __rte_cache_min_aligned;
> > +} __rte_cache_aligned;
> >
> > [2]
> > https://pastebin.com/CuqkrCW4
More information about the dev
mailing list