[dpdk-dev] [PATCH v2] doc: announce ABI change for rte_eth_dev structure

Ananyev, Konstantin konstantin.ananyev at intel.com
Thu Jul 28 15:01:16 CEST 2016



> -----Original Message-----
> From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> Sent: Thursday, July 28, 2016 12:39 PM
> To: Ananyev, Konstantin <konstantin.ananyev at intel.com>
> Cc: Thomas Monjalon <thomas.monjalon at 6wind.com>; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for rte_eth_dev structure
> 
> On Thu, Jul 28, 2016 at 10:36:07AM +0000, Ananyev, Konstantin wrote:
> > > If it does not cope up then it can skip tx'ing in the actual tx
> > > burst itself and move the "skipped" tx packets to end of the list in
> > > the tx burst so that application can take the action on "skipped"
> > > packet after the tx burst
> >
> > Sorry, that's too cryptic for me.
> > Can you reword it somehow?
> 
> OK.
> 1) lets say application requests 32 packets to send it using tx_burst.
> 2) packets are from p0 to p31
> 3) in driver due to some reason, it is not able to send the packets due to some constraints in the driver(say expect p2 and p16 everything
> else sent successfully by the driver)
> 4) driver can move p2 and p16 at pkt[0] and pkt[1] on tx_burst and return 30
> 5) application can take action on p2 and p16 based the return value of 30(ie 32-30 = 2 packets needs to handle at pkt[0] and pkt[1]

That would introduce packets reordering and unnecessary complicate the PMD TX functions.
Again it would require changes in *all* existing PMD tx functions.
So we don't plan to do things that way.

> 
> 
> >
> > >
> > >
> > > > Instead it just setups the ol_flags, fills tx_offload fields and calls tx_prep().
> > > > Please read the original Tomasz's patch, I think he explained
> > > > possible use-cases with lot of details.
> > >
> > > Sorry, it is not very clear in terms of use cases.
> >
> > Ok, what I meant to say:
> > Right now, if user wants to use HW TX cksum/TSO offloads he might have to:
> > - setup ipv4 header cksum field.
> > - calculate the pseudo header checksum
> > - setup tcp/udp cksum field.
> >
> > Rules how these calculations need to be done and which fields need to
> > be updated, may vary depending on HW underneath and requested offloads.
> > tx_prep() - supposed to hide all these nuances from user and allow him
> > to use TX HW offloads in a transparent way.
> 
> Not sure I understand it completely. Bit contradicting with below statement
> |We would document what tx_prep() supposed to do, and in what cases user
> |don't need it.

How that contradicts?
Right now to make HW TX offloads to work user is required to do particular actions:
1. set mbuf.ol_flags properly.
2. setup mbuf.tx_offload fields properly.
3. update L3/L4 header fields in a particular way.

We move #3 into tx_prep(), to hide that complexity from the user simplify things for him.
Though if he still prefers to do #3  by himself - that's ok too.
 
> 
> How about introducing a new ethdev generic eal command-line mode OR new ethdev_configure hint that PMD driver is in "tx_prep-
> >tx_burst" mode instead of just tx_burst? That way no fast-path performance degradation for the PMD that does not need it

User might want to send different packets over different devices,
or even over different queues over the same device.
Or even he might want to call tx_prep() for one group of packets,
but skip for different group of packets for the same TX queue. 
So I think we should allow user to decide when/where to call it.

> 
> 
> >
> > Another main purpose of tx_prep(): for multi-segment packets is to
> > check that number of segments doesn't exceed  HW limit.
> > Again right now users have to do that on their own.
> >
> > >
> > > In HW perspective, It it tries to avoid the illegal state. But not
> > > sure calling "back to back" tx prepare and then tx burst how does it
> > > improve the situation as the check illegal state check introduce in
> > > actual tx burst it self.
> > >
> > > In SW perspective, its try to avoid sending malformed packets. In my
> > > view the same can achieved with existing tx burst it self as PMD is
> > > the one finally send the packets on the wire.
> >
> > Ok, so your question is: why not to put that functionality into
> > tx_burst() itself, right?
> > For few reasons:
> > 1. putting that functionality into tx_burst() would introduce unnecessary
> >     slowdown for cases when that functionality is not needed
> >     (one segment per packet, no HW offloads).
> 
> These parameters can be configured on init time

No always, see above.

> 
> > 2. User might don't want to use tx_prep() - he/she might have its
> >     own analog, which he/she belives is faster/smarter,etc.
> 
> That's the current mode. Right?

Yes.

> > 3.  Having it a s separate function would allow user control when/where
> >       to call it, let say only for some packets, or probably call tx_prep()
> >       on one core, and do actual tx_burst() for these packets on the other.


> Why to process it under tx_prep() as application can always process the packet in one core

Because not every application wants to do it over the same core.
Some apps would like to do it on the same core, some apps would like to do it on different core.
With proposed API both models are possible.

> 
> > >
> > > proposal quote:
> > >
> > > 1. Introduce rte_eth_tx_prep() function to do necessary preparations of
> > >    packet burst to be safely transmitted on device for desired HW
> > >    offloads (set/reset checksum field according to the hardware
> > >    requirements) and check HW constraints (number of segments per
> > >    packet, etc).
> > >
> > >    While the limitations and requirements may differ for devices, it
> > >    requires to extend rte_eth_dev structure with new function pointer
> > >    "tx_pkt_prep" which can be implemented in the driver to prepare and
> > >    verify packets, in devices specific way, before burst, what should to
> > >    prevent application to send malformed packets.
> > >
> > >
> > > >
> > > > > and what if the PMD does not implement that callback then it is of waste cycles. Right?
> > > >
> > > > If you refer as lost cycles here something like:
> > > > RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_prep, -ENOTSUP); then
> > > > yes.
> > > > Though comparing to actual work need to be done for most HW TX
> > > > offloads, I think it is neglectable.
> > >
> > > Not sure.
> > >
> > > > Again, as I said before, it is totally voluntary for the application.
> > >
> > > Not according to proposal. It can't be too as application has no
> > > idea what PMD driver does with "prep" what is the implication on a
> > > HW if application does not
> >
> > Why application writer wouldn't have an idea?
> > We would document what tx_prep() supposed to do, and in what cases user don't need it.
> 
> But how he/she detect that on that run-time ?

By the application logic for example.
If let say is doing the l2fwd for that group of packets, it would know
that it doesn't need to do tx_prep().

To be honest, I don't understand what is your concern here.
That proposed change doesn't break any existing functionality,
doesn't introduce any new requirements to the existing API, 
and wouldn't introduce any performance regression for existing apps.
It is a an extension, and user is free not to use it, if it doesn't fit his needs.
>From other side there are users who are interested in that functionality,
and they do have use-cases for  it.
So what worries you?
Konstantin

> 
> > Then it would be up to the user:
> > - not to use it at all (one segment per packet, no HW TX offloads)
> 
> We already have TX flags for that
> 
> > - not to use tx_prep(), and make necessary preparations himself,
> >   that what people have to do now.
> > - use tx_prep()
> >
> > Konstantin
> >


More information about the dev mailing list