[dpdk-dev] [PATCH v2] doc: announce ABI change for rte_eth_dev structure

Ananyev, Konstantin konstantin.ananyev at intel.com
Thu Jul 28 12:36:07 CEST 2016
Previous message: [dpdk-dev] [PATCH v2] doc: announce ABI change for rte_eth_dev structure
Next message: [dpdk-dev] [PATCH v2] doc: announce ABI change for rte_eth_dev structure
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

> > > >
> > > > > -----Original Message-----
> > > > > From: Jerin Jacob [mailto:jerin.jacob at caviumnetworks.com]
> > > > > Sent: Wednesday, July 27, 2016 6:11 PM
> > > > > To: Thomas Monjalon <thomas.monjalon at 6wind.com>
> > > > > Cc: Kulasek, TomaszX <tomaszx.kulasek at intel.com>; dev at dpdk.org;
> > > > > Ananyev, Konstantin <konstantin.ananyev at intel.com>
> > > > > Subject: Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for
> > > > > rte_eth_dev structure
> > > > >
> > > > > On Wed, Jul 27, 2016 at 01:59:01AM -0700, Thomas Monjalon wrote:
> > > > > > > > Signed-off-by: Tomasz Kulasek <tomaszx.kulasek at intel.com>
> > > > > > > > ---
> > > > > > > > +* In 16.11 ABI changes are plained: the ``rte_eth_dev``
> > > > > > > > +structure will be
> > > > > > > > +  extended with new function pointer ``tx_pkt_prep`` allowing
> > > > > > > > +verification
> > > > > > > > +  and processing of packet burst to meet HW specific
> > > > > > > > +requirements before
> > > > > > > > +  transmit. Also new fields will be added to the ``rte_eth_desc_lim`` structure:
> > > > > > > > +  ``nb_seg_max`` and ``nb_mtu_seg_max`` provideing
> > > > > > > > +information about number of
> > > > > > > > +  segments limit to be transmitted by device for TSO/non-TSO packets.
> > > > > > >
> > > > > > > Acked-by: Konstantin Ananyev <konstantin.ananyev at intel.com>
> > > > > >
> > > > > > I think I understand you want to split the TX processing:
> > > > > > 	1/ modify/write in mbufs
> > > > > > 	2/ write in HW
> > > > > > and let application decide:
> > > > > > 	- where the TX prep is done (which core)
> > > > >
> > > > > In what basics applications knows when and where to call tx_pkt_prep in fast path.
> > > > > if all the time it needs to call before tx_burst then the PMD won't
> > > > > have/don't need this callback waste cycles in fast path.Is this the expected behavior ?
> > > > > Anything think it as compile time to make other PMDs wont suffer because of this change.
> > > >
> > > > Not sure what suffering you are talking about...
> > > > Current model - i.e. when application does preparations (or doesn't if
> > > > none is required) on its own and just call tx_burst() would still be there.
> > > > If the app doesn't want to use tx_prep() by some reason - that still
> > > > ok, and decision is up to the particular app.
> > >
> > > So my question is in what basics application decides to call the preparation.
> > > Can you tell me the use case in application perspective?
> >
> > I suppose one most common use-case when application uses HW TX offloads,
> > and don't' to cope on its own which L3/L4 header fields need to be filled
> > for that particular dev_type/hw_offload combination.
> 
> If it does not cope up then it can skip tx'ing in the actual tx burst
> itself and move the "skipped" tx packets to end of the list in the tx
> burst so that application can take the action on "skipped" packet after
> the tx burst

Sorry, that's too cryptic for me.
Can you reword it somehow?

> 
> 
> > Instead it just setups the ol_flags, fills tx_offload fields and calls tx_prep().
> > Please read the original Tomasz's patch, I think he explained possible use-cases
> > with lot of details.
> 
> Sorry, it is not very clear in terms of use cases.

Ok, what I meant to say:
Right now, if user wants to use HW TX cksum/TSO offloads he might have to:
- setup ipv4 header cksum field.
- calculate the pseudo header checksum
- setup tcp/udp cksum field.

Rules how these calculations need to be done and which fields need to be updated,
may vary depending on HW underneath and requested offloads.
tx_prep() - supposed to hide all these nuances from user and allow him to use TX HW offloads
in a transparent way.

Another main purpose of tx_prep(): for multi-segment packets is to check
that number of segments doesn't exceed  HW limit.
Again right now users have to do that on their own.

> 
> In HW perspective, It it tries to avoid the illegal state. But not sure
> calling "back to back" tx prepare and then tx burst how does it improve the
> situation as the check illegal state check introduce in actual tx burst
> it self.
> 
> In SW perspective, its try to avoid sending malformed packets. In my
> view the same can achieved with existing tx burst it self as PMD is the
> one finally send the packets on the wire.

Ok, so your question is: why not to put that functionality into
tx_burst() itself, right?
For few reasons:
1. putting that functionality into tx_burst() would introduce unnecessary
    slowdown for cases when that functionality is not needed
    (one segment per packet, no HW offloads).
2. User might don't want to use tx_prep() - he/she might have its
    own analog, which he/she belives is faster/smarter,etc.
3.  Having it a s separate function would allow user control when/where
      to call it, let say only for some packets, or probably call tx_prep()
      on one core, and do actual tx_burst() for these packets on the other. 
       
> 
> proposal quote:
> 
> 1. Introduce rte_eth_tx_prep() function to do necessary preparations of
>    packet burst to be safely transmitted on device for desired HW
>    offloads (set/reset checksum field according to the hardware
>    requirements) and check HW constraints (number of segments per
>    packet, etc).
> 
>    While the limitations and requirements may differ for devices, it
>    requires to extend rte_eth_dev structure with new function pointer
>    "tx_pkt_prep" which can be implemented in the driver to prepare and
>    verify packets, in devices specific way, before burst, what should to
>    prevent application to send malformed packets.
> 
> 
> >
> > > and what if the PMD does not implement that callback then it is of waste cycles. Right?
> >
> > If you refer as lost cycles here something like:
> > RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_prep, -ENOTSUP);
> > then yes.
> > Though comparing to actual work need to be done for most HW TX offloads,
> > I think it is neglectable.
> 
> Not sure.
> 
> > Again, as I said before, it is totally voluntary for the application.
> 
> Not according to proposal. It can't be too as application has no idea
> what PMD driver does with "prep" what is the implication on a HW if
> application does not

Why application writer wouldn't have an idea? 
We would document what tx_prep() supposed to do, and in what cases user don't need it.
Then it would be up to the user:
- not to use it at all (one segment per packet, no HW TX offloads)
- not to use tx_prep(), and make necessary preparations himself,
  that what people have to do now.
- use tx_prep()

Konstantin
Previous message: [dpdk-dev] [PATCH v2] doc: announce ABI change for rte_eth_dev structure
Next message: [dpdk-dev] [PATCH v2] doc: announce ABI change for rte_eth_dev structure
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the dev mailing list