[dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh

Ananyev, Konstantin konstantin.ananyev at intel.com
Tue Jun 2 19:35:07 CEST 2015



> -----Original Message-----
> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
> Sent: Tuesday, June 02, 2015 4:08 PM
> To: Ananyev, Konstantin; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh
> 
> 
> 
> On 02/06/15 14:31, Ananyev, Konstantin wrote:
> > Hi Zoltan,
> >
> >> -----Original Message-----
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zoltan Kiss
> >> Sent: Monday, June 01, 2015 5:16 PM
> >> To: dev at dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh
> >>
> >> Hi,
> >>
> >> Anyone would like to review this patch? Venky sent a NAK, but I've
> >> explained to him why it is a bug.
> >
> >
> > Well, I think Venky is right here.
> I think the comments above rte_eth_tx_burst() definition are quite clear
> about what tx_free_thresh means, e1000 and i40e use it that way, but not
> ixgbe.
> 
> > Indeed that fix, will cause more often unsuccessful checks for DD bits and might cause a
> > slowdown for TX fast-path.
> Not if the applications set tx_free_thresh according to the definition
> of this value. But we can change the default value from 32 to something
> higher, e.g I'm using nb_desc/2, and it works out well.

Sure we can, as I said below, we can unify it one way or another.
One way would be to make fast-path TX to free TXDs when number of occupied TXDs raises above tx_free_thresh
(what rte_ethdev.h comments say and what full-featured TX is doing).
Though in that case we have to change default value for tx_free_thresh, and all existing apps that 
using tx_free_thresh==32 and fast-path TX will probably experience a slowdown.
Another way would be to make all TX functions to treat tx_conf->tx_free_thresh as fast-path TX functions do
(free TXDs when number of free TXDs drops below  tx_free_thresh) and update  rte_ethdev.h comments.

Though, I am not sure that it really worth all these changes.
>From one side, whatever tx_free_thresh would be,
the app should still assume that the worst case might happen,
and up to nb_tx_desc mbufs can be consumed by the queue.
>From other side, I think the default value should work well for most cases.
So I am still for graceful deprecation of that config parameter, see below.

> 
> > Anyway, with current PMD implementation, you can't guarantee that at any moment
> > TX queue wouldn't use more than tx_free_thresh mbufs.
> 
> 
> > There could be situations (low speed, or link is down for some short period, etc), when
> > much more than tx_free_thresh TXDs are in use and none of them could be freed by HW right now.
> > So your app better be prepared, that up to (nb_tx_desc * num_of_TX_queues) could be in use
> > by TX path at any given moment.
> >
> > Though yes,  there is an inconsistency how different ixgbe TX functions treat tx_conf->tx_free_thresh parameter.
> > That probably creates wrong expectations and confusion.
> Yes, ixgbe_xmit_pkts() use it the way it's defined, this two function
> doesn't.
> 
> > We might try to unify it's usage one way or another, but I personally don't see much point in it.
> > After all, tx_free_tresh seems like a driver internal choice (based on the nb_tx_desc and other parameters).
> > So I think a better way would be:
> > 1. Deprecate tx_conf->tx_free_thresh (and remove it in later releases) and make
> > each driver to use what it thinks would be the best value.
> But how does the driver knows what's the best for the applications
> traffic pattern? I think it's better to leave the possibility for the
> app to fine tune it.

My understanding is that for most cases the default value should do pretty well.
That default value, shouldn't be too small, so we avoid unnecessary & unsuccessful checks,
and probably shouldn't be too big, to prevent unnecessary mbufs consumption
(something between nb_tx_desc / 2 and 3 * nb_tx_desc / 4 probably).

But might be you have a good example, when such tuning is needed?
For what traffic patterns you would set tx_free_thresh to some different values,
and how will it impact performance?

Again, if there would be tx_free_pkts(), why someone would also need a tx_conf->tx_free_thresh?

Konstantin

> In the meantime we can improve the default selection as well, as I
> suggested above.
> 
> > 2. As you suggested in another mail, introduce an new function:
> > uint16_t rte_eth_tx_free_pkts(port_id, queue_id, nb_to_free).
> > That would give upper layer a better control of memory usage, and might be called by the upper layer at idle time,
> > so further tx_burst, don't need to spend time on freeing TXDs/packets.
> I agree.
> 
> >
> > Konstantin
> >
> >
> >>
> >> Regards,
> >>
> >> Zoltan
> >>
> >> On 27/05/15 21:12, Zoltan Kiss wrote:
> >>> This check doesn't do what's required by rte_eth_tx_burst:
> >>> "When the number of previously sent packets reached the "minimum transmit
> >>> packets to free" threshold"
> >>>
> >>> This can cause problems when txq->tx_free_thresh + [number of elements in the
> >>> pool] < txq->nb_tx_desc.
> >>>
> >>> Signed-off-by: Zoltan Kiss <zoltan.kiss at linaro.org>
> >>> ---
> >>>    drivers/net/ixgbe/ixgbe_rxtx.c     | 4 ++--
> >>>    drivers/net/ixgbe/ixgbe_rxtx_vec.c | 2 +-
> >>>    2 files changed, 3 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
> >>> index 4f9ab22..b70ed8c 100644
> >>> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> >>> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> >>> @@ -250,10 +250,10 @@ tx_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
> >>>
> >>>    	/*
> >>>    	 * Begin scanning the H/W ring for done descriptors when the
> >>> -	 * number of available descriptors drops below tx_free_thresh.  For
> >>> +	 * number of in flight descriptors reaches tx_free_thresh. For
> >>>    	 * each done descriptor, free the associated buffer.
> >>>    	 */
> >>> -	if (txq->nb_tx_free < txq->tx_free_thresh)
> >>> +	if ((txq->nb_tx_desc - txq->nb_tx_free) > txq->tx_free_thresh)
> >>>    		ixgbe_tx_free_bufs(txq);
> >>>
> >>>    	/* Only use descriptors that are available */
> >>> diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >>> index abd10f6..f91c698 100644
> >>> --- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >>> +++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
> >>> @@ -598,7 +598,7 @@ ixgbe_xmit_pkts_vec(void *tx_queue, struct rte_mbuf **tx_pkts,
> >>>    	if (unlikely(nb_pkts > RTE_IXGBE_VPMD_TX_BURST))
> >>>    		nb_pkts = RTE_IXGBE_VPMD_TX_BURST;
> >>>
> >>> -	if (txq->nb_tx_free < txq->tx_free_thresh)
> >>> +	if ((txq->nb_tx_desc - txq->nb_tx_free) > txq->tx_free_thresh)
> >>>    		ixgbe_tx_free_bufs(txq);
> >>>
> >>>    	nb_commit = nb_pkts = (uint16_t)RTE_MIN(txq->nb_tx_free, nb_pkts);
> >>>


More information about the dev mailing list