[dpdk-dev] [PATCH v8 0/4] ethdev: introduce configurable flexible item
    Ferruh Yigit 
    ferruh.yigit at intel.com
       
    Wed Oct 20 19:05:26 CEST 2021
    
    
  
On 10/20/2021 4:14 PM, Viacheslav Ovsiienko wrote:
> 1. Introduction and Retrospective
> 
> Nowadays the networks are evolving fast and wide, the network
> structures are getting more and more complicated, the new
> application areas are emerging. To address these challenges
> the new network protocols are continuously being developed,
> considered by technical communities, adopted by industry and,
> eventually implemented in hardware and software. The DPDK
> framework follows the common trends, and a glance at the
> RTE Flow API header shows that multiple new items have been
> introduced in the years since the initial release.
> 
> The new protocol adoption and implementation process is
> not straightforward and takes time, the new protocol passes
> development, consideration, adoption, and implementation
> phases. The industry tries to mitigate and address the
> forthcoming network protocols, for example, many hardware
> vendors are implementing flexible and configurable network
> protocol parsers. As DPDK developers, could we anticipate
> the near future in the same fashion and introduce the similar
> flexibility in RTE Flow API?
> 
> Let's check what we already have merged in our project, and
> we see the nice raw item (rte_flow_item_raw). At the first
> glance, it looks superior and we can try to implement a flow
> matching on the header of some relatively new tunnel protocol,
> say on the GENEVE header with variable length options. And,
> under further consideration, we run into the raw item
> limitations:
> 
> - only fixed size network header can be represented
> - the entire network header pattern of fixed format
>    (header field offsets are fixed) must be provided
> - the search for patterns is not robust (the wrong matches
>    might be triggered), and actually is not supported
>    by existing PMDs
> - no explicitly specified relations with preceding
>    and following items
> - no tunnel hint support
> 
> As a result, implementing the support for tunnel protocols
> like aforementioned GENEVE with variable extra protocol option
> with flow raw item becomes very complicated and would require
> multiple flows and multiple raw items chained in the same
> flow (by the way, there is no support found for chained raw
> items in implemented drivers).
> 
> This RFC introduces the dedicated flex item (rte_flow_item_flex)
> to handle matches with existing and new network protocol headers
> in a unified fashion.
> 
> 2. Flex Item Life Cycle
> 
> Let's assume there are the requirements to support the new
> network protocol with RTE Flows. What is given within protocol
> specification:
> 
>    - header format
>    - header length, (can be variable, depending on options)
>    - potential presence of extra options following or included
>      in the header
>    - the relations with preceding protocols. For example,
>      the GENEVE follows UDP, eCPRI can follow either UDP
>      or L2 header
>    - the relations with following protocols. For example,
>      the next layer after tunnel header can be L2 or L3
>    - whether the new protocol is a tunnel and the header
>      is a splitting point between outer and inner layers
> 
> The supposed way to operate with flex item:
> 
>    - application defines the header structures according to
>      protocol specification
> 
>    - application calls rte_flow_flex_item_create() with desired
>      configuration according to the protocol specification, it
>      creates the flex item object over specified ethernet device
>      and prepares PMD and underlying hardware to handle flex
>      item. On the item creation call, the PMD backing the specified
>      ethernet device returns an opaque handle identifying
>      the created object
> 
>    - application uses the rte_flow_item_flex with obtained handle
>      in the flows, the values/masks to match with fields in the
>      header are specified in the flex item per flow as for regular
>      items (except that pattern buffer combines all fields)
> 
>    - flows with flex items match with packets in a regular fashion,
>      the values and masks for the new protocol header match are
>      taken from the flex items in the flows
> 
>    - application destroys flows with flex items
> 
>    - application calls rte_flow_flex_item_release() as part of
>      ethernet device API and destroys the flex item object in
>      PMD and releases the engaged hardware resources
> 
> 3. Flex Item Structure
> 
> The flex item structure is intended to be used as part of the flow
> pattern like regular RTE flow items and provides the mask and
> value to match with fields of the protocol the item was configured
> for.
> 
>    struct rte_flow_item_flex {
>      void *handle;
>      uint32_t length;
>      const uint8_t* pattern;
>    };
> 
> The handle is some opaque object maintained on per device basis
> by underlying driver.
> 
> The protocol header fields are considered as bit fields, all
> offsets and widths are expressed in bits. The pattern is the
> buffer containing the bit concatenation of all the fields
> presented at item configuration time, in the same order and
> same amount. If byte boundary alignment is needed an application
> can use a dummy type field, this is just some kind of gap filler.
> 
> The length field specifies the pattern buffer length in bytes
> and is needed to allow rte_flow_copy() operations. The approach
> of multiple pattern pointers and lengths (per field) was
> considered and found clumsy - it seems to be much more suitable for
> the application to maintain the single structure within the
> single pattern buffer.
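
A minimal sketch of how an application might assemble such a pattern buffer. The text above does not state the bit order inside the buffer, so most-significant-bit-first packing is an assumption here, and the flex_pattern_* names are hypothetical helpers, not part of the RTE Flow API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper state for building a flex item pattern buffer. */
struct flex_pattern_writer {
	uint8_t *buf;     /* output buffer, zeroed by flex_pattern_init() */
	size_t size_bits; /* buffer capacity in bits */
	size_t pos_bits;  /* position of the next free bit */
};

static void flex_pattern_init(struct flex_pattern_writer *w,
			      uint8_t *buf, size_t size_bytes)
{
	memset(buf, 0, size_bytes);
	w->buf = buf;
	w->size_bits = size_bytes * 8;
	w->pos_bits = 0;
}

/* Append the low `width` bits of `value`, MSB first (assumed order). */
static void flex_pattern_put(struct flex_pattern_writer *w,
			     uint32_t value, unsigned int width)
{
	assert(width <= 32 && w->pos_bits + width <= w->size_bits);
	for (unsigned int i = 0; i < width; i++) {
		if ((value >> (width - 1 - i)) & 1u)
			w->buf[w->pos_bits / 8] |=
				(uint8_t)(0x80u >> (w->pos_bits % 8));
		w->pos_bits++;
	}
}

/* Skip `width` bits, e.g. for a dummy gap-filler field (bits stay zero). */
static void flex_pattern_skip(struct flex_pattern_writer *w, size_t width)
{
	assert(w->pos_bits + width <= w->size_bits);
	w->pos_bits += width;
}

/* Pattern length in bytes, rounded up, as needed for the length field. */
static size_t flex_pattern_length(const struct flex_pattern_writer *w)
{
	return (w->pos_bits + 7) / 8;
}
```

For a 96-bit dummy field followed by a 32-bit field, calling flex_pattern_skip(&w, 96) and then flex_pattern_put(&w, 0x00112233, 32) leaves the first 12 bytes zero and fills bytes 12..15 with 00 11 22 33.
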
> 
> 4. Flex Item Configuration
> 
> The flex item configuration consists of the following parts:
> 
>    - header field descriptors:
>      - next header
>      - next protocol
>      - sample to match
>    - input link descriptors
>    - output link descriptors
> 
> The field descriptors tell the driver and hardware what data should
> be extracted from the packet and then control the packet handling
> in the flow engine. Besides this, sample fields can be presented
> to match with patterns in the flows. Each field is a bit pattern.
> It has width, offset from the header beginning, mode of offset
> calculation, and offset related parameters.
> 
> The next header field is special, no data are actually taken
> from the packet, but its offset is used as a pointer to the next
> header in the packet, in other words the next header offset
> specifies the size of the header being parsed by flex item.
> 
> There is one more special field - next protocol, it specifies
> where the next protocol identifier is contained and packet data
> sampled from this field will be used to determine the next
> protocol header type to continue packet parsing. The next
> protocol field is like eth_type field in MAC2, or proto field
> in IPv4/v6 headers.
> 
> The sample fields are used to represent the data to be sampled
> from the packet and then matched with established flows.
> 
> There are several methods proposed to calculate the field offset
> at runtime depending on configuration and packet content:
> 
>    - FIELD_MODE_FIXED - fixed offset. The bit offset from
>      header beginning is permanent and defined by field_base
>      configuration parameter.
> 
>    - FIELD_MODE_OFFSET - the field bit offset is extracted
>      from other header field (indirect offset field). The
>      resulting field offset to match is calculated as:
> 
>    field_base + (*offset_base & offset_mask) << offset_shift
> 
>      This mode is useful to sample some extra options following
>      the main header with field containing main header length.
>      Also, this mode can be used to calculate offset to the
>      next protocol header, for example - IPv4 header contains
>      the 4-bit field with IPv4 header length expressed in dwords.
>      One more example - this mode would allow us to skip GENEVE
>      header variable length options.
> 
>    - FIELD_MODE_BITMASK - the field bit offset is extracted
>      from other header field (indirect offset field), the latter
>      is considered as bitmask containing some number of one bits,
>      the resulting field offset to match is calculated as:
> 
>    field_base + bitcount(*offset_base & offset_mask) << offset_shift
> 
>      This mode would be useful to skip the GTP header and its
>      extra options with specified flags.
> 
>    - FIELD_MODE_DUMMY - dummy field, optionally used for byte
>      boundary alignment in pattern. Pattern mask and data are
>      ignored in the match. All configuration parameters besides
>      field size and offset are ignored.
> 
>    Note:  "*" - means the indirect field offset is calculated
>    and actual data are extracted from the packet by this
>    offset (like data are fetched by pointer *p from memory).
> 
> The offset mode list can be extended by vendors according to
> hardware supported options.
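
The calculated offset modes can be sketched in C as below. This is an illustration only: the enum, struct, and function names are hypothetical, the value at offset_base is assumed to have already been read from the packet, and the shift is assumed to apply to the masked (or bit-counted) value before field_base is added, which is how the formulas above appear to be intended.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names modelled on the offset modes described above. */
enum flex_field_mode {
	FLEX_MODE_FIXED,
	FLEX_MODE_OFFSET,
	FLEX_MODE_BITMASK,
};

struct flex_field_cfg {
	enum flex_field_mode mode;
	uint32_t field_base;   /* fixed part of the offset */
	uint32_t offset_mask;  /* mask applied to the indirect value */
	uint32_t offset_shift; /* left shift applied after masking */
};

/* Count the one bits in v (portable stand-in for a popcount builtin). */
static uint32_t flex_bitcount(uint32_t v)
{
	uint32_t n = 0;

	while (v != 0) {
		v &= v - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Compute the resulting field offset.
 * `indirect` is the value fetched from the packet at offset_base,
 * i.e. the "*offset_base" term of the formulas; it is ignored in
 * fixed mode.
 */
static uint32_t flex_field_offset(const struct flex_field_cfg *cfg,
				  uint32_t indirect)
{
	uint32_t v = indirect & cfg->offset_mask;

	assert(cfg->offset_shift < 32);
	switch (cfg->mode) {
	case FLEX_MODE_OFFSET:
		return cfg->field_base + (v << cfg->offset_shift);
	case FLEX_MODE_BITMASK:
		return cfg->field_base +
		       (flex_bitcount(v) << cfg->offset_shift);
	case FLEX_MODE_FIXED:
	default:
		return cfg->field_base;
	}
}
```

For example, with field_base = 0, offset_mask = 0xF and offset_shift = 2, an indirect value of 0x45 (the first byte of a typical IPv4 header, whose low nibble is the header length in dwords) gives 5 << 2 = 20.
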
> 
> The input link configuration section tells the driver after
> what protocols and at what conditions the flex item can follow.
> The input link specifies the preceding header pattern, for example
> for GENEVE it can be UDP item specifying match on destination
> port with value 6081. The flex item can follow multiple header
> types and multiple input links should be specified. At flow
> creation time the item with one of the input link types should
> precede the flex item and driver will select the correct flex
> item settings, depending on the actual flow pattern.
> 
> The output link configuration section tells the driver how
> to continue packet parsing after the flex item protocol.
> If multiple protocols can follow the flex item header the
> flex item should contain the field with the next protocol
> identifier and the parsing will be continued depending
> on the data contained in this field in the actual packet.
> 
> The flex item fields can participate in RSS hash calculation,
> the dedicated flag is present in the field description to specify
> what fields should be provided for hashing.
> 
> 5. Flex Item Chaining
> 
> If there are multiple protocols supposed to be supported with
> flex items in chained fashion - two or more flex items within
> the same flow and these ones might be neighbors in the pattern,
> it means the flex items are mutually referencing. In this case,
> the item that occurred first should be created with empty
> output link list or with the list including existing items,
> and then the second flex item should be created referencing
> the first flex item as input arc, drivers should adjust
> the item configuration.
> 
> Also, the hardware resources used by flex items to handle
> the packet can be limited. If there are multiple flex items
> that are supposed to be used within the same flow it would
> be nice to provide some hint for the driver that these two
> or more flex items are intended for simultaneous usage.
> The fields of items should be assigned with hint indices
> and these indices from two or more flex items supposed
> to be provided within the same flow should be the same
> as well. In other words, the field hint index specifies
> the group of fields that can be matched simultaneously
> within a single flow. If hint indices are specified,
> the driver will try to engage not overlapping hardware
> resources and provide independent handling of the field
> groups with unique indices. If the hint index is zero
> the driver assigns resources on its own.
> 
> 6. Example of New Protocol Handling
> 
> Let's suppose we have the requirements to handle the new tunnel
> protocol that follows UDP header with destination port 0xFADE
> and is followed by MAC header. Let the new protocol header format
> be like this:
> 
>    struct new_protocol_header {
>      rte_be32_t header_length; /* length in dwords, including options */
>      rte_be32_t specific0;     /* some protocol data, no intention */
>      rte_be32_t specific1;     /* to match in flows on these fields */
>      rte_be32_t crucial;       /* data of interest, match is needed */
>      rte_be32_t options[0];    /* optional protocol data, variable length */
>    };
> 
> The supposed flex item configuration:
> 
>    struct rte_flow_item_flex_field field0 = {
>      .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
>      .field_size = 96,                /* three dwords from the beginning */
>    };
>    struct rte_flow_item_flex_field field1 = {
>      .field_mode = FIELD_MODE_FIXED,
>      .field_size = 32,       /* Field size is one dword */
>      .field_base = 96,       /* Skip three dwords from the beginning */
>    };
>    struct rte_flow_item_udp spec0 = {
>      .hdr = {
>        .dst_port = RTE_BE16(0xFADE),
>      }
>    };
>    struct rte_flow_item_udp mask0 = {
>      .hdr = {
>        .dst_port = RTE_BE16(0xFFFF),
>      }
>    };
>    struct rte_flow_item_flex_link link0 = {
>      .item = {
>         .type = RTE_FLOW_ITEM_TYPE_UDP,
>         .spec = &spec0,
>         .mask = &mask0,
>      },
>    };
> 
>    struct rte_flow_item_flex_conf conf = {
>      .next_header = {
>        .tunnel = FLEX_TUNNEL_MODE_SINGLE,
>        .field_mode = FIELD_MODE_OFFSET,
>        .field_base = 0,
>        .offset_base = 0,
>        .offset_mask = 0xFFFFFFFF,
>        .offset_shift = 2,   /* Expressed in dwords, shift left by 2 */
>      },
>      .sample = {
>         &field0,
>         &field1,
>      },
>      .nb_samples = 2,
>      .input_link[0] = &link0,
>      .nb_inputs = 1
>    };
> 
> Let's suppose we have created the flex item successfully, and PMD
> returned the handle 0x123456789A. We can use the following item
> pattern to match the crucial field in the packet with value 0x00112233:
> 
>    struct new_protocol_header spec_pattern =
>    {
>      .crucial = RTE_BE32(0x00112233),
>    };
>    struct new_protocol_header mask_pattern =
>    {
>      .crucial = RTE_BE32(0xFFFFFFFF),
>    };
>    struct rte_flow_item_flex spec_flex = {
>      .handle = (void *)0x123456789A, /* opaque handle returned by the PMD */
>      .length = sizeof(struct new_protocol_header),
>      .pattern = (const uint8_t *)&spec_pattern,
>    };
>    struct rte_flow_item_flex mask_flex = {
>      .length = sizeof(struct new_protocol_header),
>      .pattern = (const uint8_t *)&mask_pattern,
>    };
>    struct rte_flow_item item_to_match = {
>      .type = RTE_FLOW_ITEM_TYPE_FLEX,
>      .spec = &spec_flex,
>      .mask = &mask_flex,
>    };
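
Conceptually, a header matches the flow when every bit selected by the mask equals the corresponding spec bit. A minimal software sketch of that comparison follows. The helper name flex_masked_match is hypothetical, the flat byte-wise layout is an assumption, and in practice the matching is performed by the PMD and the underlying hardware rather than by application code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if `hdr` agrees with `spec` on every bit set in `mask`.
 * All three buffers must be at least `len` bytes long.
 */
static bool flex_masked_match(const uint8_t *hdr, const uint8_t *spec,
			      const uint8_t *mask, size_t len)
{
	assert(hdr != NULL && spec != NULL && mask != NULL);
	for (size_t i = 0; i < len; i++) {
		/* XOR exposes differing bits; the mask keeps relevant ones. */
		if ((hdr[i] ^ spec[i]) & mask[i])
			return false;
	}
	return true;
}
```

With a mask that is all ones only over the fourth dword, the first three dwords of the header are ignored, so only the crucial field decides the match.
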
> 
> 7. Notes:
> 
>   - v7:  http://patches.dpdk.org/project/dpdk/patch/20211020150621.16517-2-viacheslavo@nvidia.com/
>   - v6:  http://patches.dpdk.org/project/dpdk/cover/20211018180252.14106-1-viacheslavo@nvidia.com/
>   - v5:  http://patches.dpdk.org/project/dpdk/patch/20211012125433.31647-2-viacheslavo@nvidia.com/
>   - v4:  http://patches.dpdk.org/project/dpdk/patch/20211012113235.24975-2-viacheslavo@nvidia.com/
>   - v3:  http://patches.dpdk.org/project/dpdk/cover/20211011181528.517-1-viacheslavo@nvidia.com/
>   - v2:  http://patches.dpdk.org/project/dpdk/patch/20211001193415.23288-2-viacheslavo@nvidia.com/
>   - v1:  http://patches.dpdk.org/project/dpdk/patch/20210922180418.20663-2-viacheslavo@nvidia.com/
>   - RFC: http://patches.dpdk.org/project/dpdk/patch/20210806085624.16497-1-viacheslavo@nvidia.com/
> 
>   - v7 -> v8:
>     - fixed the first commit Author (was accidentally altered due to re-split)
> 
>   - v6 -> v7:
>     - series re-split and patches reordered, code is the same
>     - documentation fixes
> 
>   - v5 -> v6:
>     - flex item command moved to dedicated file cmd_flex_item.c
> 
>   - v4 -> v5:
>     - comments addressed
>     - testpmd compilation issue fixed
> 
>   - v3 -> v4:
>     - comments addressed
>     - testpmd compilation issues fixed
>     - typos fixed
> 
>   - v2 -> v3:
>     - comments addressed
>     - flex item update removed as not supported
>     - RSS over flex item fields removed as not supported and non-complete
>       API
>     - tunnel mode configuration refactored
>     - testpmd updated
>     - documentation updated
>     - PMD patches are removed temporarily (update is WIP, to be presented in rc2)
> 
>   - v1 -> v2:
>     - testpmd CLI to handle flex item is provided
>     - draft PMD code is introduced
> 
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo at nvidia.com>
> 
> Gregory Etelson (3):
>    ethdev: support flow elements with variable length
>    app/testpmd: add dedicated flow command parsing routine
>    app/testpmd: add flex item CLI commands
> 
> Viacheslav Ovsiienko (1):
>    ethdev: introduce configurable flexible item
> 
This is a nice feature, but it was not reviewed thoroughly by maintainers/community.
I will proceed based on Ori's ack, but perhaps we should discuss this in the
release retrospective.

Minor meson whitespace issue fixed while merging:
$ ./devtools/check-meson.py
Error: Incorrect indent at app/test-pmd/meson.build:13
Series applied to dpdk-next-net/main, thanks.
    
    