[dpdk-dev] [PATCH v2 2/2] ethdev: introduce Tx queue offloads API

Jerin Jacob jerin.jacob at caviumnetworks.com
Tue Sep 12 06:01:09 CEST 2017


-----Original Message-----
> Date: Mon, 11 Sep 2017 11:02:07 +0000
> From: "Ananyev, Konstantin" <konstantin.ananyev at intel.com>
> To: Jerin Jacob <jerin.jacob at caviumnetworks.com>, Shahaf Shuler
>  <shahafs at mellanox.com>
> CC: Stephen Hemminger <stephen at networkplumber.org>, Thomas Monjalon
>  <thomas at monjalon.net>, "dev at dpdk.org" <dev at dpdk.org>, "Zhang, Helin"
>  <helin.zhang at intel.com>, "Wu, Jingjing" <jingjing.wu at intel.com>
> Subject: RE: [dpdk-dev] [PATCH v2 2/2] ethdev: introduce Tx queue offloads
>  API
> 
> 
> > > > >
> > > > > I don't understand.
> > > > > From the exact link above, you explicitly say that *you* will move this flags
> > > > once the series is integrated. Quoting:
> > > > >
> > > > > "
> > > > > > Please Jerin, could you work on moving these settings in a new API?
> > > > >
> > > > > Sure. Once the generic code is in place. We are committed to fix the
> > > > > PMDs by 18.02.
> > > >
> > > > Yes. I will take care of the PMD(nicvf) side of the changes. Not in ethdev or
> > > > mempool. Meaning, you need to decide how you are going to expose the
> > > > equivalent of these flags and enable the generic code for those flags in
> > > > ethdev or mempool. The drivers side of changes I can take care.
> > > >
> > >
> > > How about doing it a PMD option?
> > > Seems like nicvf is the only PMD which care about them.
> > 
> > Lets take flag by flag:
> > ETH_TXQ_FLAGS_NOMULTMEMP - I think, this should be removed. But we can have
> > common code in ethdev pmd to detect all pool being configured from on the same pool
> > as on the rx_configure() application passes the mempool.
> 
> 
> This is TX offloads, not RX.
> At tx_queue_setup() user doesn't have to provide the mempool pointer,
> and can pass mbuf from any mempool to the TX routine.
> BTW, how do you know one which particular mempool to use?
> Still read it from xmitted mbuf (At least first one), I presume?

Yes. It still reads the mempool from the first transmitted mbuf.

> 
> > 
> > ETH_TXQ_FLAGS_NOREFCOUNT: This one has i40e and nicvf consumers.
> 
> About i40e - as far as I know, no-one use i40e PMD with this flag.
> As far as I remember, it was added purely for benchmarking purposes on some early stages.
> So my vote would be to remove it from i40e.
> Helin, Jingjing - what are your thoughts here.
> About nicvf - as I can see it is used only in conjunction with ETH_TXQ_FLAGS_NOMULTMEMP,
> never alone.
> My understanding is that current meaning of these flags
> is a promise for PMD that for that particular TX queue user would submit only mbufs that:
> - all belong to the same mempool
> - always would have refcount==1
>  - would always be a direct ones (no indirect mbufs)

Yes, only when ETH_TXQ_FLAGS_NOMULTMEMP and ETH_TXQ_FLAGS_NOREFCOUNT
are selected at Tx queue configuration.
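
To make that promise concrete, here is a minimal sketch in plain C of a
debug-time check an application could run on a burst before handing it to
the Tx routine. struct toy_mempool, struct toy_mbuf and
burst_honours_promise() are hypothetical stand-ins for illustration only;
they are not the rte_mbuf/rte_mempool API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for rte_mempool / rte_mbuf; not the DPDK types. */
struct toy_mempool { int id; };

struct toy_mbuf {
	struct toy_mempool *pool; /* pool the buffer was allocated from */
	uint16_t refcnt;          /* reference count */
	bool indirect;            /* true if it borrows another mbuf's data */
};

/* Return true if every mbuf in the burst meets the three conditions. */
static bool
burst_honours_promise(struct toy_mbuf *const *pkts, size_t n)
{
	size_t i;

	assert(pkts != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (pkts[i]->pool != pkts[0]->pool)
			return false; /* more than one mempool in use */
		if (pkts[i]->refcnt != 1)
			return false; /* buffer is shared */
		if (pkts[i]->indirect)
			return false; /* indirect mbuf */
	}
	return true;
}
```

A check like this would belong in a debug build only; the whole point of
the flags is that the fast path does not have to look.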

> 
> So literally, yes it is not a TX HW offload, though I understand your intention to
> have such possibility - it might help to save some cycles. 

It is not just a few cycles. We could see a ~24% drop per core (with 64B
packets) with testpmd and l3fwd on some SoCs. It is not specific to nicvf HW;
the problem is the limited cache hierarchy in very low-end arm64 machines.
In the Tx buffer recycling case, the PMD needs to touch the mbuf again to find
the associated mempool to free it to. That is fine if the application demands
it, but not all applications demand it.
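
A rough sketch of the difference, in plain C with toy types (struct
toy_mempool, struct toy_mbuf, tx_free_generic() and tx_free_fast() are all
hypothetical names, not nicvf or ethdev code). The generic path has to
dereference every completed mbuf to learn its pool and refcount; the fast
path relies on the application's promise and returns the pointers to a pool
it learned once per queue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins; a real mempool keeps the objects, this one only counts. */
struct toy_mempool { size_t nb_free; };
struct toy_mbuf { struct toy_mempool *pool; uint16_t refcnt; };

/*
 * Generic recycle path: each completed mbuf is read again (a likely cache
 * miss on a small cache) to find its refcount and owning pool.
 * Returns the number of mbufs that had to be dereferenced.
 */
static size_t
tx_free_generic(struct toy_mbuf **done, size_t n)
{
	size_t touched = 0, i;

	for (i = 0; i < n; i++) {
		struct toy_mbuf *m = done[i];

		touched++;
		if (m->refcnt == 1)
			m->pool->nb_free++; /* last user: back to its pool */
		else
			m->refcnt--;        /* still referenced elsewhere */
	}
	return touched;
}

/*
 * Fast recycle path: valid only if all mbufs come from txq_pool, have
 * refcnt == 1 and are direct. No mbuf is dereferenced at all.
 */
static size_t
tx_free_fast(struct toy_mempool *txq_pool, struct toy_mbuf **done, size_t n)
{
	(void)done; /* pointers would be handed to the pool unread */
	txq_pool->nb_free += n;
	return 0;
}
```

The sketch only counts dereferences; it does not model the cache itself, so
it illustrates where the extra memory accesses come from rather than
reproducing the measured drop.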

We have two categories of arm64 machines: the high-end machines, where the
cache hierarchy is similar to an x86 server machine, and the low-end ones with
very limited cache resources. Unfortunately, we need to have the same binary
on both machines.


> Wonder would some new driver specific function would help in that case?
> nicvf_txq_pool_setup(portid, queueid, struct rte_mempool *txpool, uint32_t flags);
> or so?

It is possible, but how do we make such a change in testpmd, l3fwd or
ipsec-gw, the in-tree applications which need only NOMULTMEMP &
NOREFCOUNT?

If there is a concern about making it Tx queue level, that is fine. We can
move from queue level to port level or global level.
IMO, the application should express in some form that it wants only
NOMULTMEMP & NOREFCOUNT, and that is the case for l3fwd and ipsec-gw.


> So the user can call it just before rte_eth_tx_queue_setup()?
> Konstantin
> 
> > And it is driven by the use case too. So it should available in some
> > form.	
> > 
> > >
> > > If there will be more PMDs later, we can think about which API is needed.
> > >

