[dpdk-dev] [PATCH v6 0/6] add Tx preparation
Ananyev, Konstantin
konstantin.ananyev at intel.com
Tue Oct 18 14:28:52 CEST 2016

> 
> As discussed in that thread:
> 
> http://dpdk.org/ml/archives/dev/2015-September/023603.html
> 
> Depending on the HW offloads requested, different NIC models might impose different requirements on the packets to be transmitted, in terms of:
> 
>  - Max number of fragments per packet allowed
>  - Max number of fragments per TSO segment
>  - The way pseudo-header checksum should be pre-calculated
>  - L3/L4 header fields filling
>  - etc.
> 
> 
> MOTIVATION:
> -----------
> 
> 1) Some work cannot (and should not) be done in rte_eth_tx_burst.
>    However, this work is sometimes required, and currently it is left
>    to the application.
> 
> 2) Different hardware may have different requirements for TX offloads,
>    may support a different subset of them, and so on.
> 
> 3) Some parameters (e.g. number of segments in ixgbe driver) may hang
>    the device. These parameters may vary between devices.
> 
>    For example, i40e HW allows 8 fragments per packet, but that is after
>    TSO segmentation, while ixgbe has a 38-fragment pre-TSO limit.
> 
> 4) Fields in the packet may require different initialization (e.g.
>    pseudo-header checksum precalculation, sometimes done in a
>    different way depending on packet type, and so on). Currently the
>    application has to take care of this.
> 
> 5) Calling an additional API (rte_eth_tx_prep) before rte_eth_tx_burst
>    allows a packet burst to be prepared in a form acceptable to the
>    specific device.
> 
> 6) Some additional checks may be done in debug mode, keeping the
>    tx_burst implementation clean.
> 
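To make point 4 concrete, here is a minimal sketch of an IPv4 pseudo-header sum for TCP/UDP, written in plain C without DPDK headers. The function names are hypothetical and not part of the patch set. The sketch assumes the common convention that the L4 length is left out of the sum (passed as 0) when TSO is requested, and it returns the folded, non-complemented sum; the cover letter does not say which exact form each device expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold the carries of a 32-bit accumulator into 16 bits (one's complement add). */
static uint16_t
fold_sum(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Hypothetical helper: sum the IPv4 pseudo-header as 16-bit big-endian words.
 * src/dst are the addresses in network byte order, proto is the IP protocol
 * number, l4_len is the L4 header + payload length in bytes.
 * Assumption: callers pass l4_len = 0 for TSO packets.
 * The result is folded but NOT complemented.
 */
static uint16_t
ipv4_pseudo_hdr_sum(const uint8_t src[4], const uint8_t dst[4],
		    uint8_t proto, uint16_t l4_len)
{
	uint32_t sum = 0;

	assert(src != NULL && dst != NULL);

	sum += (uint32_t)(src[0] << 8 | src[1]);
	sum += (uint32_t)(src[2] << 8 | src[3]);
	sum += (uint32_t)(dst[0] << 8 | dst[1]);
	sum += (uint32_t)(dst[2] << 8 | dst[3]);
	sum += proto;  /* zero byte followed by the protocol number */
	sum += l4_len; /* 0 in the TSO case (assumption, see above) */

	return fold_sum(sum);
}
```

An application would store such a value in the L4 checksum field before handing the packet to the NIC; with a preparation step in the driver, that per-device detail no longer has to live in application code.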
> 
> PROPOSAL:
> ---------
> 
> To help the user deal with all these variations, we propose to:
> 
> 1) Introduce the rte_eth_tx_prep() function to do the necessary
>    preparation of a packet burst so that it can be safely transmitted
>    on the device with the desired HW offloads (set/reset checksum field
>    according to the hardware requirements) and to check HW constraints
>    (number of segments per packet, etc).
> 
>    Because the limitations and requirements may differ between devices,
>    this requires extending the rte_eth_dev structure with a new function
>    pointer "tx_pkt_prep", which can be implemented in the driver to
>    prepare and verify packets in a device-specific way before the burst.
>    This should prevent the application from sending malformed packets.
> 
> 2) Also, new fields will be introduced in rte_eth_desc_lim:
>    nb_seg_max and nb_mtu_seg_max, providing information about the max
>    number of segments in TSO and non-TSO packets accepted by the device.
> 
>    This information helps the application avoid creating packets that
>    the device cannot handle, or limit them beforehand.
> 
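The two proposal points can be sketched together as follows. This is a self-contained mock, not the patch code: struct pkt, struct desc_lim, struct dev, example_tx_prep() and tx_prep() are hypothetical stand-ins for rte_mbuf, rte_eth_desc_lim, rte_eth_dev, a driver callback and rte_eth_tx_prep(). It only assumes what the cover letter states: an optional "tx_pkt_prep" pointer, a return value equal to the number of leading valid packets, and a fall-through that returns nb_pkts when the driver provides no callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct rte_mbuf (hypothetical, reduced to what the sketch needs). */
struct pkt {
	uint16_t nb_segs; /* number of segments in the packet */
	int tso;          /* non-zero if TSO is requested */
};

/* Stand-in for the two fields proposed for rte_eth_desc_lim. */
struct desc_lim {
	uint16_t nb_seg_max;     /* max segments in a TSO packet */
	uint16_t nb_mtu_seg_max; /* max segments in a non-TSO packet */
};

struct dev;
typedef uint16_t (*tx_prep_t)(const struct dev *dev, struct pkt **pkts,
			      uint16_t nb_pkts);

/* Stand-in for struct rte_eth_dev with the new function pointer. */
struct dev {
	struct desc_lim lim;
	tx_prep_t tx_pkt_prep; /* NULL when the driver has no preparation step */
};

/* Hypothetical driver callback: stop at the first packet over the limit. */
static uint16_t
example_tx_prep(const struct dev *dev, struct pkt **pkts, uint16_t nb_pkts)
{
	uint16_t i;

	for (i = 0; i < nb_pkts; i++) {
		uint16_t max = pkts[i]->tso ? dev->lim.nb_seg_max
					    : dev->lim.nb_mtu_seg_max;
		if (pkts[i]->nb_segs > max)
			break;
	}
	return i; /* number of leading packets that passed */
}

/* Hypothetical wrapper in the spirit of rte_eth_tx_prep(). */
static uint16_t
tx_prep(const struct dev *dev, struct pkt **pkts, uint16_t nb_pkts)
{
	assert(dev != NULL);

	if (dev->tx_pkt_prep == NULL)
		return nb_pkts; /* nothing to check: all packets are accepted */
	return dev->tx_pkt_prep(dev, pkts, nb_pkts);
}
```

A real callback would also fix up checksum fields; the segment check is shown alone to keep the sketch short.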
> 
> APPLICATION (USE CASE):
> -----------------------
> 
> 1) The application should initialize the burst of packets to send and
>    set the required Tx offload flags and fields, such as l2_len,
>    l3_len, l4_len and tso_segsz.
> 
> 2) The application passes the burst to rte_eth_tx_prep to check the
>    conditions required to send packets through the NIC.
> 
> 3) The result of rte_eth_tx_prep can be used to send the valid packets
>    and/or restore the invalid ones if the function fails.
> 
> e.g.
> 
> 	for (i = 0; i < nb_pkts; i++) {
> 
> 		/* initialize or process packet */
> 
> 		bufs[i]->tso_segsz = 800;
> 		bufs[i]->ol_flags = PKT_TX_TCP_SEG | PKT_TX_IPV4
> 				| PKT_TX_IP_CKSUM;
> 		bufs[i]->l2_len = sizeof(struct ether_hdr);
> 		bufs[i]->l3_len = sizeof(struct ipv4_hdr);
> 		bufs[i]->l4_len = sizeof(struct tcp_hdr);
> 	}
> 
> 	/* Prepare burst of TX packets */
> 	nb_prep = rte_eth_tx_prep(port, 0, bufs, nb_pkts);
> 
> 	if (nb_prep < nb_pkts) {
> 		printf("tx_prep failed\n");
> 
> 		/* nb_prep is the index of the first invalid packet.
> 		 * rte_eth_tx_prep can be called again on the remaining
> 		 * packets to find further invalid ones.
> 		 */
> 
> 	}
> 
> 	/* Send burst of TX packets */
> 	nb_tx = rte_eth_tx_burst(port, 0, bufs, nb_prep);
> 
> 	/* Free any unsent packets. */
> 
> 
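The comment in the example says the prep call can be repeated on the remaining packets. A minimal sketch of that loop follows, using mock stand-ins so that it compiles without DPDK: struct pkt, mock_tx_prep(), mock_tx_burst() and send_skipping_invalid() are all hypothetical names. The mock burst always sends everything it is given; a real rte_eth_tx_burst() may send fewer packets, which this sketch does not handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct rte_mbuf; the flags exist only for this sketch. */
struct pkt {
	int valid;   /* would the device accept this packet? */
	int sent;    /* set by the mock burst function */
	int dropped; /* set when the packet is skipped as invalid */
};

/* Mock of rte_eth_tx_prep(): returns the number of leading valid packets. */
static uint16_t
mock_tx_prep(struct pkt **pkts, uint16_t nb_pkts)
{
	uint16_t i;

	for (i = 0; i < nb_pkts; i++)
		if (!pkts[i]->valid)
			break;
	return i;
}

/* Mock of rte_eth_tx_burst(): marks every packet as sent. */
static uint16_t
mock_tx_burst(struct pkt **pkts, uint16_t nb_pkts)
{
	uint16_t i;

	for (i = 0; i < nb_pkts; i++)
		pkts[i]->sent = 1;
	return nb_pkts;
}

/* Send all valid packets of a burst, dropping each one that fails prep. */
static uint16_t
send_skipping_invalid(struct pkt **pkts, uint16_t nb_pkts)
{
	uint16_t done = 0, total_sent = 0;

	assert(pkts != NULL);

	while (done < nb_pkts) {
		uint16_t left = (uint16_t)(nb_pkts - done);
		uint16_t nb_prep = mock_tx_prep(pkts + done, left);

		total_sent = (uint16_t)(total_sent +
					mock_tx_burst(pkts + done, nb_prep));
		done = (uint16_t)(done + nb_prep);

		if (done < nb_pkts) {
			/* pkts[done] is the first packet that failed prep */
			pkts[done]->dropped = 1;
			done++;
		}
	}
	return total_sent;
}
```

Whether to drop, fix up, or report the offending packet is an application decision; dropping is used here only because it is the shortest option to show.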
> v5 changes:
>  - rebased csum engine modification
>  - added information to the csum engine about performance tests
>  - some performance improvements
> 
> v4 changes:
>  - tx_prep is now set to the default behavior (NULL) for the
>    simple/vector path in the fm10k, i40e and ixgbe drivers to increase
>    performance when Tx offloads are intentionally not available
> 
> v3 changes:
>  - reworked the csum testpmd engine instead of adding a new one,
>  - fixed the checksum initialization procedure to also include outer
>    checksum offloads,
>  - some minor formatting changes and optimizations
> 
> v2 changes:
>  - rte_eth_tx_prep() returns the number of packets when the device
>    doesn't support tx_prep functionality,
>  - introduced CONFIG_RTE_ETHDEV_TX_PREP, which allows tx_prep to be turned off
> 
> 
> Tomasz Kulasek (6):
>   ethdev: add Tx preparation
>   e1000: add Tx preparation
>   fm10k: add Tx preparation
>   i40e: add Tx preparation
>   ixgbe: add Tx preparation
>   testpmd: use Tx preparation in csum engine
> 
>  app/test-pmd/csumonly.c          |   36 ++++------
>  config/common_base               |    1 +
>  drivers/net/e1000/e1000_ethdev.h |   11 +++
>  drivers/net/e1000/em_ethdev.c    |    5 +-
>  drivers/net/e1000/em_rxtx.c      |   48 ++++++++++++-
>  drivers/net/e1000/igb_ethdev.c   |    4 ++
>  drivers/net/e1000/igb_rxtx.c     |   52 ++++++++++++++-
>  drivers/net/fm10k/fm10k.h        |    6 ++
>  drivers/net/fm10k/fm10k_ethdev.c |    5 ++
>  drivers/net/fm10k/fm10k_rxtx.c   |   50 +++++++++++++-
>  drivers/net/i40e/i40e_ethdev.c   |    3 +
>  drivers/net/i40e/i40e_rxtx.c     |   72 +++++++++++++++++++-
>  drivers/net/i40e/i40e_rxtx.h     |    8 +++
>  drivers/net/ixgbe/ixgbe_ethdev.c |    3 +
>  drivers/net/ixgbe/ixgbe_ethdev.h |    5 +-
>  drivers/net/ixgbe/ixgbe_rxtx.c   |   58 +++++++++++++++-
>  drivers/net/ixgbe/ixgbe_rxtx.h   |    2 +
>  lib/librte_ether/rte_ethdev.h    |   85 +++++++++++++++++++++++
>  lib/librte_mbuf/rte_mbuf.h       |    9 +++
>  lib/librte_net/Makefile          |    3 +-
>  lib/librte_net/rte_pkt.h         |  137 ++++++++++++++++++++++++++++++++++++++
>  21 files changed, 572 insertions(+), 31 deletions(-)
>  create mode 100644 lib/librte_net/rte_pkt.h
> 
> --
Acked-by: Konstantin Ananyev <konstantin.ananyev at intel.com>
> 1.7.9.5