[dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists

Bruce Richardson bruce.richardson at intel.com
Wed Jun 15 16:22:23 CEST 2016


On Wed, Jun 15, 2016 at 03:20:27PM +0100, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: Ivan Boule [mailto:ivan.boule at 6wind.com]
> > Sent: Wednesday, June 15, 2016 3:07 PM
> > To: Richardson, Bruce; Ananyev, Konstantin
> > Cc: Thomas Monjalon; Pattan, Reshma; dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> > 
> > On 06/15/2016 03:29 PM, Bruce Richardson wrote:
> > > On Wed, Jun 15, 2016 at 12:40:16PM +0000, Ananyev, Konstantin wrote:
> > >> Hi Ivan,
> > >>
> > >>> -----Original Message-----
> > >>> From: Ivan Boule [mailto:ivan.boule at 6wind.com]
> > >>> Sent: Wednesday, June 15, 2016 1:15 PM
> > >>> To: Thomas Monjalon; Ananyev, Konstantin
> > >>> Cc: Pattan, Reshma; dev at dpdk.org
> > >>> Subject: Re: [dpdk-dev] [PATCH v9 1/8] ethdev: use locks to protect Rx/Tx callback lists
> > >>>
> > >>> On 06/15/2016 10:48 AM, Thomas Monjalon wrote:
> > >>>
> > >>>>>
> > >>>>>> I think the read access would need locking but we do not want it
> > >>>>>> in fast path.
> > >>>>>
> > >>>>> I don't think it would be needed.
> > >>>>> As I said - read/write interaction didn't change from what we have right now.
> > >>>>> But if you have some particular scenario in mind that you believe would cause
> > >>>>> a race condition - please speak up.
> > >>>>
> > >>>> If we add/remove a callback during a burst? Is it possible that the next
> > >>>> pointer would have a wrong value leading to a crash?
> > >>>> Maybe we need a comment to state that we should not alter burst
> > >>>> callbacks while running burst functions.
> > >>>>
> > >>>
> > >>> Hi Reshma,
> > >>>
> > >>> You claim that the "rte_eth_rx_cb_lock" does not need to be hold in the
> > >>> function "rte_eth_rx_burst()" while parsing the "post_rx_burst_cbs" list
> > >>> of RX callbacks associated with the polled RX queue to safely remove RX
> > >>> callback(s) in parallel.
> > >>> The problem is not [only] with the setting and the loading of "cb->next"
> > >>> that you assume to be atomic operations, which is certainly true on most
> > >>> CPUs.
> > >>> I see the 2 important following issues:
> > >>>
> > >>> 1) the "rte_eth_rxtx_callback" data structure associated with a removed
> > >>> RX callback could still be accessed in the callback parsing loop of the
> > >>> function "rte_eth_rx_burst()" after having been freed in parallel.
> > >>>
> > >>> BUT SUCH A BAD SITUATION WILL NOT CURRENTLY HAPPEN, THANKS TO THE NICE
> > >>> MEMORY LEAK BUG in the function "rte_eth_remove_rx_callback()"  that
> > >>> does not free the "rte_eth_rxtx_callback" data structure associated with
> > >>> the removed callback !
> > >>
> > >> Yes, though it is documented behaviour, someone can probably
> > >> refer it as a feature, not a bug ;)
> > >>
> > >
> > > +1
> > > This is definitely not a bug, this is absolutely by design. One may argue with
> > > the design, but it was done for a definite reason, so as to avoid paying the
> > > penalty of having locks. It pushes more responsibility onto the app, but it
> > > does allow the app to choose the best solution for managing the freeing of
> > > memory for its situation. The alternative is to force all apps to pay the cost
> > > of having locks, even if better options for freeing the memory are available.
> > >
> > > /Bruce
> > >
> > 
> > -1 (not to say 0xFFFFFFFF)
> > 
> > This is definitely an API design bug !
> > I would say that if you don't know how to free a resource that you
> > allocate, it is very likely that you are wrong allocating it.
> > And this is exactly what happens here with RX/TX callback data structures.
> > This problem can easily be addressed by just changing the API as follows:
> > 
> > Change
> >      void *
> >      rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
> >                              rte_rx_callback_fn fn, void *user_param)
> > 
> > to
> >      int
> >      rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
> >                              struct rte_eth_rxtx_callback *cb)
> > 
> > In addition of solving the problem, this approach makes the API
> > consistent and let the application allocate "rte_eth_rxtx_callback" data
> > structures through any appropriate mean.
> 
> That might make API a bit cleaner, but I don't see how it fixes the generic problem:
> there is still no way to know by the upper layer when it is safe to free/re-use
> removed callback, but to make sure that all IO on that queue is stopped
> (I.E. some external synchronisation around the queue).   

Actually, it allows other, more creative solutions, like an app having a
statically allocated set/pool of callback structures that it doesn't need to
allocate or free at all.

/Bruce



More information about the dev mailing list