[dpdk-dev] [PATCH v4 2/8] lib/librte_ether: defind RX/TX lock mode

Lu, Wenzhuo wenzhuo.lu at intel.com
Mon Jun 13 03:06:54 CEST 2016


Hi Konstantin,

> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Monday, June 13, 2016 7:17 AM
> To: Lu, Wenzhuo; Tao, Zhe; dev at dpdk.org
> Cc: Richardson, Bruce; Chen, Jing D; Liang, Cunming; Wu, Jingjing; Zhang,
> Helin
> Subject: RE: [PATCH v4 2/8] lib/librte_ether: defind RX/TX lock mode
> 
> Hi Wenzhuo,
> 
> >
> > Hi Konstantin,
> >
> >
> > > -----Original Message-----
> > > From: Ananyev, Konstantin
> > > Sent: Wednesday, June 8, 2016 5:20 PM
> > > To: Lu, Wenzhuo; Tao, Zhe; dev at dpdk.org
> > > Cc: Richardson, Bruce; Chen, Jing D; Liang, Cunming; Wu, Jingjing;
> > > Zhang, Helin
> > > Subject: RE: [PATCH v4 2/8] lib/librte_ether: defind RX/TX lock mode
> > >
> > >
> > >
> > > >
> > > > Hi Konstantin,
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Ananyev, Konstantin
> > > > > Sent: Tuesday, June 7, 2016 5:59 PM
> > > > > To: Tao, Zhe; dev at dpdk.org
> > > > > Cc: Lu, Wenzhuo; Richardson, Bruce; Chen, Jing D; Liang,
> > > > > Cunming; Wu, Jingjing; Zhang, Helin
> > > > > Subject: RE: [PATCH v4 2/8] lib/librte_ether: defind RX/TX lock
> > > > > mode
> > > > >
> > > > >
> > > > > Hi Zhe & Wenzhuo,
> > > > >
> > > > > Please find my comments below.
> > > > > BTW, for clarification - is that patch for 16.11?
> > > > > I believe it's too late to introduce such significant change in 16.07.
> > > > > Thanks
> > > > > Konstantin
> > > > Thanks for the comments.
> > > > Honestly, our purpose is 16.07. Realizing the big impact, we use
> > > > NEXT_ABI to comment our change. So, I think although we want to
> > > > merge it in
> > > 16.07 this change will become effective after we remove NEXT_ABI in
> 16.11.
> > >
> > > I don't think it is achievable.
> > > First I think your code is not in proper shape yet, right now.
> > > Second, as you said, it is a significant change and I would like to
> > > hear opinions from the rest of the community.
> > Agree it should have risk. I mean our target is 16.07. But surely if it can be
> achieved depends on the feedback from the community.
> >
> > >
> > > >
> > > > >
> > > > > > Define lock mode for RX/TX queue. Because when resetting the
> > > > > > device we want the resetting thread to get the lock of the
> > > > > > RX/TX queue to make sure the RX/TX is stopped.
> > > > > >
> > > > > > Using next ABI macro for this ABI change as it has too much
> > > > > > impact. 7 APIs and 1 global variable are impacted.
> > > > > >
> > > > > > Signed-off-by: Wenzhuo Lu <wenzhuo.lu at intel.com>
> > > > > > Signed-off-by: Zhe Tao <zhe.tao at intel.com>
> > > > > > ---
> > > > > >  lib/librte_ether/rte_ethdev.h | 62
> > > > > > +++++++++++++++++++++++++++++++++++++++++++
> > > > > >  1 file changed, 62 insertions(+)
> > > > > >
> > > > > > diff --git a/lib/librte_ether/rte_ethdev.h
> > > > > > b/lib/librte_ether/rte_ethdev.h index 74e895f..4efb5e9 100644
> > > > > > --- a/lib/librte_ether/rte_ethdev.h
> > > > > > +++ b/lib/librte_ether/rte_ethdev.h
> > > > > > @@ -354,7 +354,12 @@ struct rte_eth_rxmode {
> > > > > >  		jumbo_frame      : 1, /**< Jumbo Frame Receipt
> enable. */
> > > > > >  		hw_strip_crc     : 1, /**< Enable CRC stripping by
> hardware. */
> > > > > >  		enable_scatter   : 1, /**< Enable scatter packets rx
> handler */
> > > > > > +#ifndef RTE_NEXT_ABI
> > > > > >  		enable_lro       : 1; /**< Enable LRO */
> > > > > > +#else
> > > > > > +		enable_lro       : 1, /**< Enable LRO */
> > > > > > +		lock_mode        : 1; /**< Using lock path */
> > > > > > +#endif
> > > > > >  };
> > > > > >
> > > > > >  /**
> > > > > > @@ -634,11 +639,68 @@ struct rte_eth_txmode {
> > > > > >  		/**< If set, reject sending out tagged pkts */
> > > > > >  		hw_vlan_reject_untagged : 1,
> > > > > >  		/**< If set, reject sending out untagged pkts */
> > > > > > +#ifndef RTE_NEXT_ABI
> > > > > >  		hw_vlan_insert_pvid : 1;
> > > > > >  		/**< If set, enable port based VLAN insertion */
> > > > > > +#else
> > > > > > +		hw_vlan_insert_pvid : 1,
> > > > > > +		/**< If set, enable port based VLAN insertion */
> > > > > > +		lock_mode : 1;
> > > > > > +		/**< If set, using lock path */ #endif
> > > > > >  };
> > > > > >
> > > > > >  /**
> > > > > > + * The macros for the RX/TX lock mode functions  */ #ifdef
> > > > > > +RTE_NEXT_ABI #define RX_LOCK_FUNCTION(dev, func) \
> > > > > > +	(dev->data->dev_conf.rxmode.lock_mode ? \
> > > > > > +	func ## _lock : func)
> > > > > > +
> > > > > > +#define TX_LOCK_FUNCTION(dev, func) \
> > > > > > +	(dev->data->dev_conf.txmode.lock_mode ? \
> > > > > > +	func ## _lock : func)
> > > > > > +#else
> > > > > > +#define RX_LOCK_FUNCTION(dev, func) func
> > > > > > +
> > > > > > +#define TX_LOCK_FUNCTION(dev, func) func #endif
> > > > > > +
> > > > > > +/* Add the lock RX/TX function for VF reset */ #define
> > > > > > +GENERATE_RX_LOCK(func, nic) \ uint16_t func ## _lock(void
> > > > > > +*rx_queue, \
> > > > > > +		      struct rte_mbuf **rx_pkts, \
> > > > > > +		      uint16_t nb_pkts) \
> > > > > > +{					\
> > > > > > +	struct nic ## _rx_queue *rxq = rx_queue; \
> > > > > > +	uint16_t nb_rx = 0; \
> > > > > > +						\
> > > > > > +	if (rte_spinlock_trylock(&rxq->rx_lock)) { \
> > > > > > +		nb_rx = func(rx_queue, rx_pkts, nb_pkts); \
> > > > > > +		rte_spinlock_unlock(&rxq->rx_lock); \
> > > > > > +	} \
> > > > > > +	\
> > > > > > +	return nb_rx; \
> > > > > > +}
> > > > > > +
> > > > > > +#define GENERATE_TX_LOCK(func, nic) \ uint16_t func ##
> > > > > > +_lock(void *tx_queue, \
> > > > > > +		      struct rte_mbuf **tx_pkts, \
> > > > > > +		      uint16_t nb_pkts) \
> > > > > > +{					\
> > > > > > +	struct nic ## _tx_queue *txq = tx_queue; \
> > > > > > +	uint16_t nb_tx = 0; \
> > > > > > +						\
> > > > > > +	if (rte_spinlock_trylock(&txq->tx_lock)) { \
> > > > > > +		nb_tx = func(tx_queue, tx_pkts, nb_pkts); \
> > > > > > +		rte_spinlock_unlock(&txq->tx_lock); \
> > > > > > +	} \
> > > > > > +	\
> > > > > > +	return nb_tx; \
> > > > > > +}
> > > > >
> > > > > 1. As I said in off-line dicussiion, I think this locking could
> > > > > (and I think better be) impelented completely on rte_ethdev layer.
> > > > > So actual PMD code will be unaffected.
> > > > > Again that avoids us to introduce _lock version of every RX/Tx
> > > > > function in each PMD.
> > > > One purpose of implementing the lock in PMD layer is to avoid ABI
> > > > change. But we introduce the field lock_mode in struct
> > > > rte_eth_rx/txmode. So seems it's not a good reason now :) The
> > > > other purpose is we want to add a lock for every queue. But in rte
> > > > layer the queue is void *, so we add the lock in the specific
> > > > structures of the NICs. But as
> > > you mentioned below, we can add the lock as
> > > dev->data->rx_queue_state it the struct rte_eth_dev_data.
> > > > So, I prefer to add the lock in rte layer now.
> > >
> > > OK.
> > >
> > > >
> > > > >
> > > > > 2. Again, as discussed offline, I think it is better to have an
> > > > > explicit
> > > > > rte_eth_(rx|tx)_burst_lock(sync?) API, instead of add new fileds
> > > > > into RX/TX config strcutures.
> > > > > Would help to avoid any confusion, I think.
> > > > We want the users to choose the rx/tx path without  lock if
> > > > they're sensitive to the performance and can handle the reset
> > > > event in their APP. After
> > > introducing new fields of config struct, users can change the config
> > > to choose the different path.
> > >
> > > I understand what you are doing.
> > >
> > > > If we introduce new API, it may be harder for the use to use it. I
> > > > mean when users want to use lock mode, they may need to replace
> > > > all the
> > > rte_eth_rx/tx_burst by rte_eth_rx/tx_burst_lock.
> > >
> > > Yes, my opinion if users would like to use locking API they need to
> > > call it explicitly.
> > >
> > >
> > > >So if we add the lock in rte layer, I still prefer adding lock_mode
> > > >in the  configuration, and the rte_eth_rx/tx_burst is changed like
> > > >this, rte_eth_rx/tx_burst  {
> > > > + if lock_mode
> > > > + try_lock
> > > > ......
> > > > + if lock_mode
> > > > + release_lock
> > > > }
> > >
> > > My preference is to keep existing rx/tx_burst() functions unaffected
> > > by that patch.
> > > At least for now.
> > > I suppose that will minimise the risks and help users to avoid
> > > confusion what API
> > > (locking/non-locking) is in use.
> > OK. Let me add new APIs.
> >
> > >
> > > >
> > > >
> > > > >
> > > > > 3.  I thought the plan was to introduce a locking in all
> > > > > appropriate control path functions (dev_start/dev_stop etc.)
> > > > > Without that locking version of RX/TX seems a bit useless.
> > > > > Yes, I understand that you do use locking inside dev_reset, but
> > > > > I suppose the plan was to have a generic solution, no?
> > > > > Again, interrupt fire when user invokes dev_start/stop or so, so
> > > > > we still need some synchronisation between them.
> > > > >
> > > > > To be more specific, I thought about something like that:
> > > > >
> > > > > static inline uint16_t
> > > > > rte_eth_rx_burst_lock(uint8_t port_id, uint16_t queue_id,
> > > > >                  struct rte_mbuf **rx_pkts, const uint16_t nb_pkts) {
> > > > >         struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> > > > >
> > > > > #ifdef RTE_LIBRTE_ETHDEV_DEBUG
> > > > >         RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
> > > > >         RTE_FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, 0);
> > > > >
> > > > >         if (queue_id >= dev->data->nb_rx_queues) {
> > > > >                 RTE_PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n",
> queue_id);
> > > > >                 return 0;
> > > > >         }
> > > > > #endif
> > > > >
> > > > > + if
> > > > > + (rte_spinlock_trylock(&dev->data->rx_queue_state[rx_queue_id].
> > > > > + lock)
> > > == 0)
> > > > > +	return 0;
> > > > > +  else if (dev->data->rx_queue_state[rx_queue_id] ==
> > > > > RTE_ETH_QUEUE_STATE_STOPPED)) {
> > > > > +	rte_spinlock_unlock(&dev->data-
> >rx_queue_state[rx_queue_id].unlock);
> > > > > +	return 0;
> > > > > +
> > > > >
> > > > >  nb_rx = (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id],
> > > > >                         rx_pkts, nb_pkts);
> > > > >
> > > > > + rte_spinlock_unlock(&dev->data-
> >rx_queue_state[rx_queue_id].un
> > > > > + lock
> > > > > + );
> > > > >
> > > > > ....
> > > > >
> > > > > return nb_rx;
> > > > > }
> > > > >
> > > > > And inside queue_start:
> > > > >
> > > > > int
> > > > > rte_eth_dev_rx_queue_start(uint8_t port_id, uint16_t rx_queue_id)
> {
> > > > >         struct rte_eth_dev *dev;
> > > > >
> > > > >         RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
> > > > >
> > > > >         dev = &rte_eth_devices[port_id];
> > > > >         if (rx_queue_id >= dev->data->nb_rx_queues) {
> > > > >                 RTE_PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n",
> > > rx_queue_id);
> > > > >                 return -EINVAL;
> > > > >         }
> > > > >
> > > > >         RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_start,
> > > > > -ENOTSUP);
> > > > >
> > > > >
> > > > > rte_spinlock_lock(&dev->data->rx_queue_state[rx_queue_id].lock)
> > > > I think you add the lock here to stop the rx/tx.
> > > > But to my opinion, we should lock the rx/tx much earlier before
> > > > starting the queue. For example, when stop the port, the resource
> > > > of the
> > > queues may be released.
> > >
> > > I didn't get you here...
> > > Before releasing the queue resources, queue_stop() has to be executed,
> right?
> > Sorry, I saw your example with rte_eth_dev_rx_queue_start, I didn't
> > know you also want to change rte_eth_dev_rx_queue_stop too.
> > Agree this should work it we call queue_start/stop when reset the
> > port. But we will not call them. I find the queue_stop/start are per- queue
> functions and not supported by all NICs.
> 
> But right now you do reset only for ixgbe/i40e.
Not only for ixgbe/i40e. You forget igb, which doesn't support queue_start/stop :)

> For these devices we defiantly do support queue start/stop.
> And again, it is not only about reset op.
> If we want to add rx locked (synced), I think it should be in sync with all
> control API that changes queue state.
> As I said before it is a lot of work and a lot of hassle...
> So probably the easiest (and might be safiest) way just leave things as there
> are right now:
> we allow user to setup a callback on VF reset, and it is user responsibility to
> make sure no RX/TX is active while reset operation is performed.
> Pretty much what Olivier and Stephen suggested, as I understand.
Agree. It's not a good way to add lock for just one feature. It could be tricky if we want to extend the lock to other features. A whole picture is needed.
We've sent another patch set to let the user setup a callback on VF reset. Depend on that, user can use existing rte APIs to reset the VF port. But how about your opinion if we add a specific rte_reset API? It may be easier for the user.

> Konstantin
> 
> > Our solution now is stop the whole port and restart the whole port. We
> will not stop/restart queue by queue.
> >
> > >
> > > >The rx/tx cannot be executed. So I prefer to get the lock before
> > > >stopping the
> > > ports.
> > >
> > > Might be I wasn't clear enough here.
> > > What I think we need to have:
> > >  -To stop/start/rx/tx the queue (or do any other action that might
> > > change the queue internal structure)
> > >    you have to grab the lock.
> > >    After queue is stopped it's state has to be changed to
> > > QUEUE_STATE_STOPPED (whti queue lock grabbed),
> > >    so rx/tx_locked wouldn't    proceed with that queue.
> > >   - dev_stop() - has to stop all its queues first, i.e. it needs to
> > > call queue_stop() for all of them.
> > >  So after dev_stop() had finished  - all device queues have to be in
> > > QUEUE_STATE_STOPPED Same about dev_start() - after it does all other
> > > things - it will call queue_start() for all it's queues.
> > > that will bring them into QUEUE_STARTED.
> > > After that rx/tx_locked can use them again.
> > >
> > > >Maybe better to keep the spinlock in the dev_reset.
> > >
> > > Might be not :)
> > >
> > > >
> > > > >
> > > > >         if (dev->data->rx_queue_state[rx_queue_id] !=
> > > > > RTE_ETH_QUEUE_STATE_STOPPED) {
> > > > >                 RTE_PMD_DEBUG_TRACE("Queue %" PRIu16" of device
> > > > > with port_id=%" PRIu8
> > > > >                         " already started\n",
> > > > >                         rx_queue_id, port_id);
> > > > >                 ret = -EINVAL 0;
> > > > >         } else
> > > > >         	ret = dev->dev_ops->rx_queue_start(dev, rx_queue_id);
> > > > >
> > > > >
> > > > > rte_spinlock_unlock(&dev->data-
> >rx_queue_state[rx_queue_id].unlo
> > > > > ck);
> > > > >
> > > > >    return ret;
> > > > > }
> > > > >
> > > > > Then again, we don't need to do explicit locking inside dev_reset().
> > > > > Does it make sense to you guys?
> > > > Please see the answer above.
> > > >
> > > > >
> > > > >
> > > > > > +
> > > > > > +/**
> > > > > >   * A structure used to configure an RX ring of an Ethernet port.
> > > > > >   */
> > > > > >  struct rte_eth_rxconf {
> > > > > > --
> > > > > > 2.1.4



More information about the dev mailing list