[dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays

Morten Brørup mb at smartsharesystems.com
Tue Jun 22 12:01:03 CEST 2021


> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ananyev,
> Konstantin
> 
> >
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ananyev,
> > > Konstantin
> > >
> > > >
> > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ananyev,
> > > > > Konstantin
> > > > >
> > > > > > > How can we hide the callbacks since they are used by inline
> > > burst
> > > > > functions.
> > > > > >
> > > > > > I probably I owe a better explanation to what I meant in
> first
> > > mail.
> > > > > > Otherwise it sounds confusing.
> > > > > > I'll try to write a more detailed one in next few days.
> > > > >
> > > > > Actually I gave it another thought over weekend, and might be
> we
> > > can
> > > > > hide rte_eth_dev_cb even in a simpler way. I'd use
> eth_rx_burst()
> > > as
> > > > > an example, but the same principle applies to other 'fast'
> > > functions.
> > > > >
> > > > >  1. Needed changes for PMDs rx_pkt_burst():
> > > > >     a) change function prototype to accept 'uint16_t port_id'
> and
> > > > > 'uint16_t queue_id',
> > > > >          instead of current 'void *'.
> > > > >     b) Each PMD rx_pkt_burst() will have to call
> > > rte_eth_rx_epilog()
> > > > > function at return.
> > > > >          This  inline function will do all CB calls for that
> queue.
> > > > >
> > > > > To be more specific, let say we have some PMD: xyz with RX
> > > function:
> > > > >
> > > > > uint16_t
> > > > > xyz_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
> uint16_t
> > > > > nb_pkts)
> > > > > {
> > > > >      struct xyz_rx_queue *rxq = rx_queue;
> > > > >      uint16_t nb_rx = 0;
> > > > >
> > > > >      /* do actual stuff here */
> > > > >     ....
> > > > >     return nb_rx;
> > > > > }
> > > > >
> > > > > It will be transformed to:
> > > > >
> > > > > uint16_t
> > > > > xyz_recv_pkts(uint16_t port_id, uint16_t queue_id, struct
> rte_mbuf
> > > > > **rx_pkts, uint16_t nb_pkts)
> > > > > {
> > > > >          struct xyz_rx_queue *rxq;
> > > > >          uint16_t nb_rx;
> > > > >
> > > > >          rxq = _rte_eth_rx_prolog(port_id, queue_id);
> > > > >          if (rxq == NULL)
> > > > >              return 0;
> > > > >          nb_rx = _xyz_real_recv_pkts(rxq, rx_pkts, nb_pkts);
> > > > >          return _rte_eth_rx_epilog(port_id, queue_id, rx_pkts,
> > > > > nb_pkts);
> > > > > }
> > > > >
> > > > > And somewhere in ethdev_private.h:
> > > > >
> > > > > static inline void *
> > > > > _rte_eth_rx_prolog(uint16_t port_id, uint16_t queue_id);
> > > > > {
> > > > >    struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> > > > >
> > > > > #ifdef RTE_ETHDEV_DEBUG_RX
> > > > >         RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> > > > >         RTE_FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, NULL);
> > > > >
> > > > >         if (queue_id >= dev->data->nb_rx_queues) {
> > > > >                 RTE_ETHDEV_LOG(ERR, "Invalid RX queue_id=%u\n",
> > > > > queue_id);
> > > > >                 return NULL;
> > > > >         }
> > > > > #endif
> > > > >   return dev->data->rx_queues[queue_id];
> > > > > }
> > > > >
> > > > > static inline uint16_t
> > > > > _rte_eth_rx_epilog(uint16_t port_id, uint16_t queue_id, struct
> > > rte_mbuf
> > > > > **rx_pkts, const uint16_t nb_pkts);
> > > > > {
> > > > >     struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> > > > >
> > > > > #ifdef RTE_ETHDEV_RXTX_CALLBACKS
> > > > >         struct rte_eth_rxtx_callback *cb;
> > > > >
> > > > >         /* __ATOMIC_RELEASE memory order was used when the
> > > > >          * call back was inserted into the list.
> > > > >          * Since there is a clear dependency between loading
> > > > >          * cb and cb->fn/cb->next, __ATOMIC_ACQUIRE memory
> order is
> > > > >          * not required.
> > > > >          */
> > > > >         cb = __atomic_load_n(&dev->post_rx_burst_cbs[queue_id],
> > > > >                                 __ATOMIC_RELAXED);
> > > > >
> > > > >         if (unlikely(cb != NULL)) {
> > > > >                 do {
> > > > >                         nb_rx = cb->fn.rx(port_id, queue_id,
> > > rx_pkts,
> > > > > nb_rx,
> > > > >                                                 nb_pkts, cb-
> > > >param);
> > > > >                         cb = cb->next;
> > > > >                 } while (cb != NULL);
> > > > >         }
> > > > > #endif
> > > > >
> > > > >         rte_ethdev_trace_rx_burst(port_id, queue_id, (void
> > > **)rx_pkts,
> > > > > nb_rx);
> > > > >         return nb_rx;
> > > > >  }
> > > >
> > > > That would make the compiler inline _rte_eth_rx_epilog() into the
> > > driver when compiling the DPDK library. But
> > > > RTE_ETHDEV_RXTX_CALLBACKS is a definition for the application
> > > developer to use when compiling the DPDK application.
> > >
> > > I believe it is for both - user app and DPDK drivers.
> > > AFAIK, they both have to use the same rte_config.h, otherwise
> things
> > > will be broken.
> > > If let say RTE_ETHDEV_RXTX_CALLBACKS is not enabled in ethdev, then
> > > user wouldn't be able to add a callback at first place.
> >
> > In the case of RTE_ETHDEV_RXTX_CALLBACKS, it is independent:
> 
> Not really.
> There are few libraries within DPDK that do rely on rx/tx callbacks:
> bpf, latencystat, pdump, power.

I do not consider these to be core libraries in DPDK. If these libraries are used in an application, that application must be compiled with RTE_ETHDEV_RXTX_CALLBACKS.

> With the approach above their functionality will be broken -
> setup functions will return success, but actual callbacks will not be
> invoked.

I just took a look at bpf and latencystat. Bpf correctly checks for the return code, and returns an error if ethdev has been compiled without RTE_ETHDEV_RXTX_CALLBACKS. Latencystat checks for the return code, but only logs the error and continues as if everything is good anyway. That is a bug in the latencystat library.

> From other side, some libraries do invoke rx/tx burst on their own: ip-
> pipeline, graph.
> For them callback invocation will continue to work, even when
> RTE_ETHDEV_RXTX_CALLBACKS is disabled in the app.
> In general, building DPDK libs and user app with different rte_config.h
> is really a bad idea.
> It might work in some cases, but I believe it is not supported and user
> should not rely on it.
> If user needs to disable RTE_ETHDEV_RXTX_CALLBACKS in his app, then the
> proper way would be:
> - update rte_config.h
> - rebuild both DPDK and the app with new config

In principle, I completely agree with your reasoning from a high level perspective.

However, accepting it would probably lead to the RTE_ETHDEV_RXTX_CALLBACKS compile time configuration option being completely removed, and ethdev callbacks being always supported. And I don't think such a performance degradation of a core DPDK library should be accepted.

<rant on>
I was opposed to the "callback hooks" concept from the beginning, and still am. The path that packets take through various functions and pipeline stages should be determined and implemented by the application, not by the DPDK libraries.

If we want to provide a standardized advanced IP pipeline for DPDK, we could offer it as a middle layer library using the underlying DPDK libraries to implement various callbacks, IP fragmentation reassembly, etc.. Don't tweak the core libraries (costing memory and/or performance) to support an increasing amount of supplemental libraries, which may not be used by all applications.

We don't want DPDK to become like the Linux IP stack, with callback hooks and runtime installable protocol handling everywhere. All this cruft with their small performance degradations adds up!
<rant off>

> 
> >
> > If it is not compiled with the DPDK library, attempts to install
> callbacks from the application will fail with ENOTSUP.
> >
> > If it is not compiled with the DPDK application, no time will be
> spent trying to determine if any there are any callbacks to call.
> >
> > > BTW,  such change will allow us to make RTE_ETHDEV_RXTX_CALLBACKS
> > > internal for ethdev/PMD layer, which is a good thing from my
> > > perspective.
> >
> > If it can be done without degrading performance for applications not
> using callbacks.



More information about the dev mailing list