[dpdk-dev] [PATCH v2 1/4] ether: support deferred queue setup

Ananyev, Konstantin konstantin.ananyev at intel.com
Thu Mar 15 14:16:50 CET 2018


Hi Qi,

> -----Original Message-----
> From: Zhang, Qi Z
> Sent: Thursday, March 15, 2018 3:14 AM
> To: Ananyev, Konstantin <konstantin.ananyev at intel.com>; thomas at monjalon.net
> Cc: dev at dpdk.org; Xing, Beilei <beilei.xing at intel.com>; Wu, Jingjing <jingjing.wu at intel.com>; Lu, Wenzhuo <wenzhuo.lu at intel.com>
> Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue setup
> 
> Hi Konstantin:
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Wednesday, March 14, 2018 8:32 PM
> > To: Zhang, Qi Z <qi.z.zhang at intel.com>; thomas at monjalon.net
> > Cc: dev at dpdk.org; Xing, Beilei <beilei.xing at intel.com>; Wu, Jingjing
> > <jingjing.wu at intel.com>; Lu, Wenzhuo <wenzhuo.lu at intel.com>; Zhang, Qi Z
> > <qi.z.zhang at intel.com>
> > Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue setup
> >
> > Hi Qi,
> >
> > >
> > > The patch let etherdev driver expose the capability flag through
> > > rte_eth_dev_info_get when it support deferred queue configuraiton,
> > > then base on the flag rte_eth_[rx|tx]_queue_setup could decide
> > > continue to setup the queue or just return fail when device already
> > > started.
> > >
> > > Signed-off-by: Qi Zhang <qi.z.zhang at intel.com>
> > > ---
> > >  doc/guides/nics/features.rst  |  8 ++++++++
> > > lib/librte_ether/rte_ethdev.c | 30 ++++++++++++++++++------------
> > > lib/librte_ether/rte_ethdev.h | 11 +++++++++++
> > >  3 files changed, 37 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/doc/guides/nics/features.rst
> > > b/doc/guides/nics/features.rst index 1b4fb979f..36ad21a1f 100644
> > > --- a/doc/guides/nics/features.rst
> > > +++ b/doc/guides/nics/features.rst
> > > @@ -892,7 +892,15 @@ Documentation describes performance values.
> > >
> > >  See ``dpdk.org/doc/perf/*``.
> > >
> > > +.. _nic_features_queue_deferred_setup_capabilities:
> > >
> > > +Queue deferred setup capabilities
> > > +---------------------------------
> > > +
> > > +Supports queue setup / release after device started.
> > > +
> > > +* **[provides] rte_eth_dev_info**:
> > >
> > ``deferred_queue_config_capa:DEV_DEFERRED_RX_QUEUE_SETUP,DEV_DEFE
> > RRED_
> > > TX_QUEUE_SETUP,DEV_DEFERRED_RX_QUEUE_RELE
> > > ASE,DEV_DEFERRED_TX_QUEUE_RELEASE``.
> > > +* **[related]  API**: ``rte_eth_dev_info_get()``.
> > >
> > >  .. _nic_features_other:
> > >
> > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > b/lib/librte_ether/rte_ethdev.c index a6ce2a5ba..6c906c4df 100644
> > > --- a/lib/librte_ether/rte_ethdev.c
> > > +++ b/lib/librte_ether/rte_ethdev.c
> > > @@ -1425,12 +1425,6 @@ rte_eth_rx_queue_setup(uint16_t port_id,
> > uint16_t rx_queue_id,
> > >  		return -EINVAL;
> > >  	}
> > >
> > > -	if (dev->data->dev_started) {
> > > -		RTE_PMD_DEBUG_TRACE(
> > > -		    "port %d must be stopped to allow configuration\n", port_id);
> > > -		return -EBUSY;
> > > -	}
> > > -
> > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_infos_get,
> > -ENOTSUP);
> > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_setup,
> > -ENOTSUP);
> > >
> > > @@ -1474,10 +1468,19 @@ rte_eth_rx_queue_setup(uint16_t port_id,
> > uint16_t rx_queue_id,
> > >  		return -EINVAL;
> > >  	}
> > >
> > > +	if (dev->data->dev_started &&
> > > +		!(dev_info.deferred_queue_config_capa &
> > > +			DEV_DEFERRED_RX_QUEUE_SETUP))
> > > +		return -EINVAL;
> > > +
> >
> > I think now you have to check here that the queue is stopped.
> > Otherwise you might attempt to reconfigure running queue.
> 
> I'm not sure if it's necessary to let application use different API sequence for a deferred configure and deferred re-configure.
> Can we just call dev_ops->rx_queue_stop before rx_queue_release here

I don't follow you here.
Let say now inside queue_start() we do check:

if (dev->data->rx_queue_state[rx_queue_id] != RTE_ETH_QUEUE_STATE_STOPPED)

Right now it is not possible to call queue_setup() without dev_stop() before it -
that's why we have check if (dev->data->dev_started) in queue_setup() right now.
Though with your patch it not the case anymore - user is able to call queue_setup()
without stopping the whole device.
But he still has to stop the queue. 

> 
> >
> >
> > >  	rxq = dev->data->rx_queues;
> > >  	if (rxq[rx_queue_id]) {
> > >  		RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_release,
> > >  					-ENOTSUP);
> >
> > I don't think it is *that* straightforward.
> > rx_queue_setup() parameters can imply different rx function (and related dev
> > icesettings) that are already setuped by previous queue_setup()/dev_start.
> > So I think you need to do one of 2 things:
> > 1. rework ethdev layer to introduce a separate rx function (and related
> > settings) for each queue.
> > 2. at rx_queue_setup() if it is invoked after dev_start - check that given
> > queue settings wouldn't contradict with current device settings  (rx function,
> > etc.).
> > If they do - return an error.
> Yes, I think what we have is option 2 here, the dev_ops->rx_queue_setup will return fail if conflict with previous setting

Hmm and what makes you think that?
As I know it is not the case  right now.
Let say I do:
    ....
   rx_queue_setup(port=0,queue=0, mp=mb_size_2048);
   dev_start(port=0);
   ...
   rx_queue_setup(port=0,queue=1,mp=mb_size_1024);
   
 If current rx function doesn't support multi-segs then second rx_queue_setup() should fail.
 Though I don't think that would happen with the current implementation. 

Same story for TX offloads, though it probably not that critical, as for most Intel PMDs HW TX offloads will become per port in 18.05.

As I can see you do have either of these options implemented right  now - that's the problem.

> I'm also thinking about option 1, the idea is to move per queue rx/tx function into driver layer, so it will not break existing API.
> 
> 1. driver can expose the capability like per_queue_rx or per_queue_tx
> 2. application can enable this capability by dev_config with rte_eth_conf
> 3, if per_queue_rx is not enable, nothing change, so we are at option 2
> 4. if per_queue_rx is enabled, driver will set rx_pkt_burst with a hook function which redirect to an function ptr in a per queue rx function
> tables ( I guess performance is impacted somehow, but this is the cost if you want different offload for different queue)

I don't think we need to overcomplicate things here.
It should be transparent to the user - user just calls queue_setup() - based on its input parameters
PMD selects a function that fits best.
Pretty much what we have right now, just possibly have an array of functions (one per queue).

> 
> >
> > From my perspective - 1) is a better choice though it required more work,
> > and possibly ABI breakage.
> > I did some work in that direction as RFC:
> > http://dpdk.org/dev/patchwork/patch/31866/
> 
> I will learn this, thanks for the heads up.
> >
> > 2) might be also possible, but looks a bit clumsy as rx_queue_setup() might
> > now fail even with valid parameters - all depends on previous queue
> > configurations.
> >
> > Same story applies for TX.
> >
> >
> > > +		if (dev->data->dev_started &&
> > > +			!(dev_info.deferred_queue_config_capa &
> > > +				DEV_DEFERRED_RX_QUEUE_RELEASE))
> > > +			return -EINVAL;
> > >  		(*dev->dev_ops->rx_queue_release)(rxq[rx_queue_id]);
> > >  		rxq[rx_queue_id] = NULL;
> > >  	}
> > > @@ -1573,12 +1576,6 @@ rte_eth_tx_queue_setup(uint16_t port_id,
> > uint16_t tx_queue_id,
> > >  		return -EINVAL;
> > >  	}
> > >
> > > -	if (dev->data->dev_started) {
> > > -		RTE_PMD_DEBUG_TRACE(
> > > -		    "port %d must be stopped to allow configuration\n", port_id);
> > > -		return -EBUSY;
> > > -	}
> > > -
> > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_infos_get,
> > -ENOTSUP);
> > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_setup,
> > -ENOTSUP);
> > >
> > > @@ -1596,10 +1593,19 @@ rte_eth_tx_queue_setup(uint16_t port_id,
> > uint16_t tx_queue_id,
> > >  		return -EINVAL;
> > >  	}
> > >
> > > +	if (dev->data->dev_started &&
> > > +		!(dev_info.deferred_queue_config_capa &
> > > +			DEV_DEFERRED_TX_QUEUE_SETUP))
> > > +		return -EINVAL;
> > > +
> > >  	txq = dev->data->tx_queues;
> > >  	if (txq[tx_queue_id]) {
> > >  		RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_release,
> > >  					-ENOTSUP);
> > > +		if (dev->data->dev_started &&
> > > +			!(dev_info.deferred_queue_config_capa &
> > > +				DEV_DEFERRED_TX_QUEUE_RELEASE))
> > > +			return -EINVAL;
> > >  		(*dev->dev_ops->tx_queue_release)(txq[tx_queue_id]);
> > >  		txq[tx_queue_id] = NULL;
> > >  	}
> > > diff --git a/lib/librte_ether/rte_ethdev.h
> > > b/lib/librte_ether/rte_ethdev.h index 036153306..410e58c50 100644
> > > --- a/lib/librte_ether/rte_ethdev.h
> > > +++ b/lib/librte_ether/rte_ethdev.h
> > > @@ -981,6 +981,15 @@ struct rte_eth_conf {
> > >   */
> > >  #define DEV_TX_OFFLOAD_SECURITY         0x00020000
> > >
> > > +#define DEV_DEFERRED_RX_QUEUE_SETUP 0x00000001 /**< Deferred
> > setup rx
> > > +queue */ #define DEV_DEFERRED_TX_QUEUE_SETUP 0x00000002 /**<
> > Deferred
> > > +setup tx queue */ #define DEV_DEFERRED_RX_QUEUE_RELEASE
> > 0x00000004
> > > +/**< Deferred release rx queue */ #define
> > > +DEV_DEFERRED_TX_QUEUE_RELEASE 0x00000008 /**< Deferred release
> > tx
> > > +queue */
> > > +
> >
> > I don't think we do need flags for both setup a and release.
> > If runtime setup is supported - surely dynamic release should be supported
> > too.
> > Also probably RUNTIME_RX_QUEUE_SETUP sounds a bit better.
> 
> Agree
> 
> Thanks
> Qi
> 
> >
> > Konstantin
> >
> > >  /*
> > >   * If new Tx offload capabilities are defined, they also must be
> > >   * mentioned in rte_tx_offload_names in rte_ethdev.c file.
> > > @@ -1029,6 +1038,8 @@ struct rte_eth_dev_info {
> > >  	/** Configured number of rx/tx queues */
> > >  	uint16_t nb_rx_queues; /**< Number of RX queues. */
> > >  	uint16_t nb_tx_queues; /**< Number of TX queues. */
> > > +	uint64_t deferred_queue_config_capa;
> > > +	/**< queues can be setup/release after dev_start (DEV_DEFERRED_). */
> > >  };
> > >
> > >  /**
> > > --
> > > 2.13.6



More information about the dev mailing list