[dpdk-dev] [PATCH v2 1/4] ether: support deferred queue setup

Zhang, Qi Z qi.z.zhang at intel.com
Fri Mar 16 01:42:16 CET 2018



> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Thursday, March 15, 2018 11:39 PM
> To: Zhang, Qi Z <qi.z.zhang at intel.com>; thomas at monjalon.net
> Cc: dev at dpdk.org; Xing, Beilei <beilei.xing at intel.com>; Wu, Jingjing
> <jingjing.wu at intel.com>; Lu, Wenzhuo <wenzhuo.lu at intel.com>
> Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue setup
> 
> 
> 
> > -----Original Message-----
> > From: Zhang, Qi Z
> > Sent: Thursday, March 15, 2018 3:09 PM
> > To: Ananyev, Konstantin <konstantin.ananyev at intel.com>;
> > thomas at monjalon.net
> > Cc: dev at dpdk.org; Xing, Beilei <beilei.xing at intel.com>; Wu, Jingjing
> > <jingjing.wu at intel.com>; Lu, Wenzhuo <wenzhuo.lu at intel.com>
> > Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue
> > setup
> >
> >
> >
> > > -----Original Message-----
> > > From: Ananyev, Konstantin
> > > Sent: Thursday, March 15, 2018 9:17 PM
> > > To: Zhang, Qi Z <qi.z.zhang at intel.com>; thomas at monjalon.net
> > > Cc: dev at dpdk.org; Xing, Beilei <beilei.xing at intel.com>; Wu, Jingjing
> > > <jingjing.wu at intel.com>; Lu, Wenzhuo <wenzhuo.lu at intel.com>
> > > Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue
> > > setup
> > >
> > > Hi Qi,
> > >
> > > > -----Original Message-----
> > > > From: Zhang, Qi Z
> > > > Sent: Thursday, March 15, 2018 3:14 AM
> > > > To: Ananyev, Konstantin <konstantin.ananyev at intel.com>;
> > > > thomas at monjalon.net
> > > > Cc: dev at dpdk.org; Xing, Beilei <beilei.xing at intel.com>; Wu,
> > > > Jingjing <jingjing.wu at intel.com>; Lu, Wenzhuo
> > > > <wenzhuo.lu at intel.com>
> > > > Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred
> > > > queue setup
> > > >
> > > > Hi Konstantin:
> > > >
> > > > > -----Original Message-----
> > > > > From: Ananyev, Konstantin
> > > > > Sent: Wednesday, March 14, 2018 8:32 PM
> > > > > To: Zhang, Qi Z <qi.z.zhang at intel.com>; thomas at monjalon.net
> > > > > Cc: dev at dpdk.org; Xing, Beilei <beilei.xing at intel.com>; Wu,
> > > > > Jingjing <jingjing.wu at intel.com>; Lu, Wenzhuo
> > > > > <wenzhuo.lu at intel.com>; Zhang, Qi Z <qi.z.zhang at intel.com>
> > > > > Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred
> > > > > queue setup
> > > > >
> > > > > Hi Qi,
> > > > >
> > > > > >
> > > > > > The patch let etherdev driver expose the capability flag
> > > > > > through rte_eth_dev_info_get when it support deferred queue
> > > > > > configuraiton, then base on the flag
> > > > > > rte_eth_[rx|tx]_queue_setup could decide continue to setup the
> > > > > > queue or just return fail when device already started.
> > > > > >
> > > > > > Signed-off-by: Qi Zhang <qi.z.zhang at intel.com>
> > > > > > ---
> > > > > >  doc/guides/nics/features.rst  |  8 ++++++++
> > > > > > lib/librte_ether/rte_ethdev.c | 30
> > > > > > ++++++++++++++++++------------ lib/librte_ether/rte_ethdev.h |
> > > > > > 11 +++++++++++
> > > > > >  3 files changed, 37 insertions(+), 12 deletions(-)
> > > > > >
> > > > > > diff --git a/doc/guides/nics/features.rst
> > > > > > b/doc/guides/nics/features.rst index 1b4fb979f..36ad21a1f
> > > > > > 100644
> > > > > > --- a/doc/guides/nics/features.rst
> > > > > > +++ b/doc/guides/nics/features.rst
> > > > > > @@ -892,7 +892,15 @@ Documentation describes performance
> > > values.
> > > > > >
> > > > > >  See ``dpdk.org/doc/perf/*``.
> > > > > >
> > > > > > +.. _nic_features_queue_deferred_setup_capabilities:
> > > > > >
> > > > > > +Queue deferred setup capabilities
> > > > > > +---------------------------------
> > > > > > +
> > > > > > +Supports queue setup / release after device started.
> > > > > > +
> > > > > > +* **[provides] rte_eth_dev_info**:
> > > > > >
> > > > >
> > >
> ``deferred_queue_config_capa:DEV_DEFERRED_RX_QUEUE_SETUP,DEV_DEFE
> > > > > RRED_
> > > > > > TX_QUEUE_SETUP,DEV_DEFERRED_RX_QUEUE_RELE
> > > > > > ASE,DEV_DEFERRED_TX_QUEUE_RELEASE``.
> > > > > > +* **[related]  API**: ``rte_eth_dev_info_get()``.
> > > > > >
> > > > > >  .. _nic_features_other:
> > > > > >
> > > > > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > > > > b/lib/librte_ether/rte_ethdev.c index a6ce2a5ba..6c906c4df
> > > > > > 100644
> > > > > > --- a/lib/librte_ether/rte_ethdev.c
> > > > > > +++ b/lib/librte_ether/rte_ethdev.c
> > > > > > @@ -1425,12 +1425,6 @@ rte_eth_rx_queue_setup(uint16_t
> > > > > > port_id,
> > > > > uint16_t rx_queue_id,
> > > > > >  		return -EINVAL;
> > > > > >  	}
> > > > > >
> > > > > > -	if (dev->data->dev_started) {
> > > > > > -		RTE_PMD_DEBUG_TRACE(
> > > > > > -		    "port %d must be stopped to allow configuration\n",
> > > port_id);
> > > > > > -		return -EBUSY;
> > > > > > -	}
> > > > > > -
> > > > > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_infos_get,
> > > > > -ENOTSUP);
> > > > > >
> 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_setup,
> > > > > -ENOTSUP);
> > > > > >
> > > > > > @@ -1474,10 +1468,19 @@ rte_eth_rx_queue_setup(uint16_t
> > > port_id,
> > > > > uint16_t rx_queue_id,
> > > > > >  		return -EINVAL;
> > > > > >  	}
> > > > > >
> > > > > > +	if (dev->data->dev_started &&
> > > > > > +		!(dev_info.deferred_queue_config_capa &
> > > > > > +			DEV_DEFERRED_RX_QUEUE_SETUP))
> > > > > > +		return -EINVAL;
> > > > > > +
> > > > >
> > > > > I think now you have to check here that the queue is stopped.
> > > > > Otherwise you might attempt to reconfigure running queue.
> > > >
> > > > I'm not sure if it's necessary to let application use different
> > > > API sequence
> > > for a deferred configure and deferred re-configure.
> > > > Can we just call dev_ops->rx_queue_stop before rx_queue_release
> > > > here
> > >
> > > I don't follow you here.
> > > Let say now inside queue_start() we do check:
> > >
> > > if (dev->data->rx_queue_state[rx_queue_id] !=
> > > RTE_ETH_QUEUE_STATE_STOPPED)
> > >
> > > Right now it is not possible to call queue_setup() without
> > > dev_stop() before it - that's why we have check if
> > > (dev->data->dev_started) in queue_setup() right now.
> > > Though with your patch it not the case anymore - user is able to
> > > call
> > > queue_setup() without stopping the whole device.
> > > But he still has to stop the queue.
> >
> > >
> > > >
> > > > >
> > > > >
> > > > > >  	rxq = dev->data->rx_queues;
> > > > > >  	if (rxq[rx_queue_id]) {
> > > > > >
> > > 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_release,
> > > > > >  					-ENOTSUP);
> > > > >
> > > > > I don't think it is *that* straightforward.
> > > > > rx_queue_setup() parameters can imply different rx function (and
> > > > > related dev
> > > > > icesettings) that are already setuped by previous
> > > queue_setup()/dev_start.
> > > > > So I think you need to do one of 2 things:
> > > > > 1. rework ethdev layer to introduce a separate rx function (and
> > > > > related
> > > > > settings) for each queue.
> > > > > 2. at rx_queue_setup() if it is invoked after dev_start - check
> > > > > that given queue settings wouldn't contradict with current
> > > > > device settings  (rx function, etc.).
> > > > > If they do - return an error.
> > > > Yes, I think what we have is option 2 here, the
> > > > dev_ops->rx_queue_setup will return fail if conflict with previous
> > > > setting
> > >
> > > Hmm and what makes you think that?
> > > As I know it is not the case  right now.
> > > Let say I do:
> > >     ....
> > >    rx_queue_setup(port=0,queue=0, mp=mb_size_2048);
> > >    dev_start(port=0);
> > >    ...
> > >    rx_queue_setup(port=0,queue=1,mp=mb_size_1024);
> > >
> > >  If current rx function doesn't support multi-segs then second
> > > rx_queue_setup() should fail.
> > >  Though I don't think that would happen with the current
> implementation.
> >
> > Why you think that would not happen? dev_ops->rx_queue_setup can fail,
> right?
> > I mean it's the responsibility of low level driver (i40e) to check the conflict
> with current implementation.
> 
> Yes it is responsibility if the PMD because only it knows its own logic of rx/tx
> function selection.
> But I don't see such changes in i40e in your patch series.
> Probably I missed them?

OK, I think we are aligned on this patch, and I see the problem in i40e patch, will fix. 

> 
> > >
> > > Same story for TX offloads, though it probably not that critical, as
> > > for most Intel PMDs HW TX offloads will become per port in 18.05.
> > >
> > > As I can see you do have either of these options implemented right
> > > now - that's the problem.
> > >
> > > > I'm also thinking about option 1, the idea is to move per queue
> > > > rx/tx
> > > function into driver layer, so it will not break existing API.
> > > >
> > > > 1. driver can expose the capability like per_queue_rx or
> > > > per_queue_tx 2. application can enable this capability by
> > > > dev_config with rte_eth_conf 3, if per_queue_rx is not enable,
> > > > nothing change, so we are at option 2 4. if per_queue_rx is
> > > > enabled, driver will set rx_pkt_burst with a hook function which
> > > > redirect to an function ptr in a per queue rx function tables ( I
> > > > guess performance is impacted somehow, but this is the cost if you
> > > > want different offload for different queue)
> > >
> > > I don't think we need to overcomplicate things here.
> > > It should be transparent to the user - user just calls queue_setup()
> > > - based on its input parameters PMD selects a function that fits best.
> > > Pretty much what we have right now, just possibly have an array of
> > > functions (one per queue).
> >
> > If we don't introduce a new capability or something like, but just
> > take per queue functions as default way, does that mean, we need to
> change all drivers to adapt this?
> > Or do you mean below?
> >
> > If (dev->rx_pkt_burst)
> > 	/* default way */
> > else
> > 	/* per queue function */
> 
> For me either way seems ok.
> Second one probably a bit easier, as no changes from PMDs are required.
> But again - might be even rte_ethdev layer can fill queue's rx_pkt_burst[]
> array for the drivers that don't support it - just by copying dev->rx_pkt_burst
> into it.
> Konstantin

Ok, I will add this in v2

Thanks
Qi



More information about the dev mailing list