[dpdk-dev] [PATCH v4 1/5] ethdev: introduce configurable flexible item
Ori Kam
orika at nvidia.com
Tue Oct 12 13:42:14 CEST 2021
Hi Slava,
> -----Original Message-----
> From: dev <dev-bounces at dpdk.org> On Behalf Of Viacheslav Ovsiienko
> Sent: Tuesday, October 12, 2021 2:33 PM
> Subject: [dpdk-dev] [PATCH v4 1/5] ethdev: introduce configurable flexible item
>
> 1. Introduction and Retrospective
>
> Networks are evolving quickly: network structures are becoming more and more complicated, and new
> application areas keep emerging. To address these challenges, new network protocols are continuously
> being developed, considered by technical communities, adopted by industry and, eventually, implemented
> in hardware and software. The DPDK framework follows these trends: a glance at the RTE Flow API
> header shows that multiple new items have been introduced since the initial release.
>
> Adopting and implementing a new protocol is not straightforward and takes time: the protocol passes
> through development, consideration, adoption, and implementation phases. The industry tries to
> anticipate forthcoming network protocols; for example, many hardware vendors are implementing
> flexible and configurable network protocol parsers. As DPDK developers, could we anticipate the near
> future in the same fashion and introduce similar flexibility in the RTE Flow API?
>
> Let's check what is already merged in our project: there is the raw item (rte_flow_item_raw). At first
> glance it looks suitable, and we can try to implement flow matching on the header of some relatively
> new tunnel protocol, say the GENEVE header with variable-length options. On further consideration,
> however, we run into the raw item limitations:
>
> - only fixed size network header can be represented
> - the entire network header pattern of fixed format
> (header field offsets are fixed) must be provided
> - the search for patterns is not robust (the wrong matches
> might be triggered), and actually is not supported
> by existing PMDs
> - no explicitly specified relations with preceding
> and following items
> - no tunnel hint support
>
> As a result, supporting tunnel protocols such as the aforementioned GENEVE, with its variable-length
> protocol options, using the raw item becomes very complicated and would require multiple flows and
> multiple raw items chained in the same flow (incidentally, no support for chained raw items was found
> in the implemented drivers).
>
> This RFC introduces the dedicated flex item (rte_flow_item_flex) to handle matches with existing and new
> network protocol headers in a unified fashion.
>
> 2. Flex Item Life Cycle
>
> Let's assume there is a requirement to support a new network protocol with RTE Flows. The protocol
> specification provides:
>
> - header format
> - header length (can be variable, depending on options)
> - potential presence of extra options following or included
> in the header
> - the relations with preceding protocols. For example,
> the GENEVE follows UDP, eCPRI can follow either UDP
> or L2 header
> - the relations with following protocols. For example,
> the next layer after tunnel header can be L2 or L3
> - whether the new protocol is a tunnel and the header
> is a splitting point between outer and inner layers
>
> The intended way to work with a flex item:
>
> - application defines the header structures according to
> protocol specification
>
> - application calls rte_flow_flex_item_create() with the desired
> configuration according to the protocol specification. The call
> creates the flex item object over the specified ethernet device
> and prepares the PMD and underlying hardware to handle the flex
> item. On the item creation call, the PMD backing the specified
> ethernet device returns an opaque handle identifying
> the object that has been created
>
> - application uses the rte_flow_item_flex with obtained handle
> in the flows, the values/masks to match with fields in the
> header are specified in the flex item per flow as for regular
> items (except that pattern buffer combines all fields)
>
> - flows with flex items match with packets in a regular fashion,
> the values and masks for the new protocol header match are
> taken from the flex items in the flows
>
> - application destroys flows with flex items
>
> - application calls rte_flow_flex_item_release(), which is part of
> the ethernet device API; it destroys the flex item object in the
> PMD and releases the engaged hardware resources
>
> 3. Flex Item Structure
>
> The flex item structure is intended to be used as part of the flow pattern like regular RTE flow items and
> provides the mask and value to match with fields of the protocol the item was configured for.
>
> struct rte_flow_item_flex {
> 	void *handle;
> 	uint32_t length;
> 	const uint8_t *pattern;
> };
>
> The handle is an opaque object maintained on a per-device basis by the underlying driver.
>
> The protocol header fields are treated as bit fields; all offsets and widths are expressed in bits. The
> pattern is the buffer containing the bit concatenation of all the fields presented at item configuration time,
> in the same order and number. If byte boundary alignment is needed, an application can use a
> dummy type field, which is just a gap filler.
>
> The length field specifies the pattern buffer length in bytes and is needed to allow rte_flow_copy()
> operations. The approach of multiple pattern pointers and lengths (per field) was considered and found
> clumsy: it seems much more suitable for the application to maintain a single structure within a single
> pattern buffer.
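
To make the pattern layout concrete, here is a minimal C sketch of packing fields into a single pattern buffer by bit concatenation. The helper names (sketch_put_bits, sketch_build_pattern) are hypothetical and not part of DPDK, and the most-significant-bit-first ordering within each byte is an assumption; the description above does not state it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a DPDK API): append the low `width` bits of
 * `value` to `buf`, starting at bit position *bit_pos.
 * Assumption: bits are written most significant bit first.
 */
static void
sketch_put_bits(uint8_t *buf, size_t buf_len, size_t *bit_pos,
		uint64_t value, unsigned int width)
{
	assert(width <= 64);
	assert(*bit_pos + width <= buf_len * 8);
	for (unsigned int i = 0; i < width; i++) {
		size_t pos = *bit_pos + i;
		uint8_t bit_in_byte = (uint8_t)(0x80u >> (pos % 8));

		if ((value >> (width - 1 - i)) & 1u)
			buf[pos / 8] |= bit_in_byte;
		else
			buf[pos / 8] &= (uint8_t)~bit_in_byte;
	}
	*bit_pos += width;
}

/*
 * Build a pattern for two fields configured in this order:
 * a 96-bit dummy gap filler, then a 32-bit sample field.
 * Returns the pattern length in bytes.
 */
static size_t
sketch_build_pattern(uint8_t *buf, size_t buf_len, uint32_t sample)
{
	size_t bit_pos = 0;

	memset(buf, 0, buf_len);
	sketch_put_bits(buf, buf_len, &bit_pos, 0, 64);      /* dummy, part 1 */
	sketch_put_bits(buf, buf_len, &bit_pos, 0, 32);      /* dummy, part 2 */
	sketch_put_bits(buf, buf_len, &bit_pos, sample, 32); /* sample field */
	return (bit_pos + 7) / 8;
}
```

With a 16-byte buffer and a sample value of 0x00112233, the sketch leaves bytes 0..11 zeroed and places 00 11 22 33 in bytes 12..15, so a single length value is enough to describe the whole pattern.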
>
> 4. Flex Item Configuration
>
> The flex item configuration consists of the following parts:
>
> - header field descriptors:
> - next header
> - next protocol
> - sample to match
> - input link descriptors
> - output link descriptors
>
> The field descriptors tell the driver and hardware what data should be extracted from the packet and then
> control the packet handling in the flow engine. Besides this, sample fields can be presented to match with
> patterns in the flows. Each field is a bit pattern.
> It has width, offset from the header beginning, mode of offset calculation, and offset related parameters.
>
> The next header field is special, no data are actually taken from the packet, but its offset is used as a
> pointer to the next header in the packet, in other words the next header offset specifies the size of the
> header being parsed by flex item.
>
> There is one more special field - next protocol, it specifies where the next protocol identifier is contained
> and packet data sampled from this field will be used to determine the next protocol header type to
> continue packet parsing. The next protocol field is like eth_type field in MAC2, or proto field in IPv4/v6
> headers.
>
> The sample fields are used to represent the data to be sampled from the packet and then matched with
> established flows.
>
> Several methods are provided to calculate the field offset at runtime, depending on configuration and
> packet content:
>
> - FIELD_MODE_FIXED - fixed offset. The bit offset from
> header beginning is permanent and defined by field_base
> configuration parameter.
>
> - FIELD_MODE_OFFSET - the field bit offset is extracted
> from other header field (indirect offset field). The
> resulting field offset to match is calculated as:
>
> field_base + ((*offset_base & offset_mask) << offset_shift)
>
> This mode is useful to sample some extra options following
> the main header with field containing main header length.
> Also, this mode can be used to calculate offset to the
> next protocol header, for example - IPv4 header contains
> the 4-bit field with IPv4 header length expressed in dwords.
> One more example - this mode would allow us to skip GENEVE
> header variable length options.
>
> - FIELD_MODE_BITMASK - the field bit offset is extracted
> from other header field (indirect offset field), the latter
> is considered as bitmask containing some number of one bits,
> the resulting field offset to match is calculated as:
>
> field_base + (bitcount(*offset_base & offset_mask) << offset_shift)
>
> This mode would be useful to skip the GTP header and its
> extra options with specified flags.
>
> - FIELD_MODE_DUMMY - dummy field, optionally used for byte
> boundary alignment in pattern. Pattern mask and data are
> ignored in the match. All configuration parameters besides
> field size and offset are ignored.
>
> Note: "*" - means the indirect field offset is calculated
> and actual data are extracted from the packet by this
> offset (like data are fetched by pointer *p from memory).
>
> The offset mode list can be extended by vendors according to hardware supported options.
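
The offset modes above can be sketched in plain C as follows. This is only an illustration under stated assumptions: all sketch_* names are hypothetical, the indirect offset field is assumed to be a byte-aligned 32-bit big-endian value, and the shift is assumed to be applied before field_base is added.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins; the real definitions come from the patch. */
enum sketch_field_mode {
	SKETCH_MODE_FIXED,
	SKETCH_MODE_OFFSET,
	SKETCH_MODE_BITMASK,
};

struct sketch_field {
	enum sketch_field_mode mode;
	uint32_t field_base;   /* bits from the header beginning */
	uint32_t offset_base;  /* bit offset of the indirect offset field */
	uint32_t offset_mask;
	uint32_t offset_shift;
};

/* Assumption: the indirect field is 32 bits wide and byte aligned. */
static uint32_t
sketch_read_be32(const uint8_t *hdr, size_t hdr_len, uint32_t bit_off)
{
	size_t byte = bit_off / 8;

	assert(bit_off % 8 == 0);
	assert(byte + 4 <= hdr_len);
	return ((uint32_t)hdr[byte] << 24) | ((uint32_t)hdr[byte + 1] << 16) |
	       ((uint32_t)hdr[byte + 2] << 8) | (uint32_t)hdr[byte + 3];
}

/* Count the one bits in v. */
static unsigned int
sketch_bitcount(uint32_t v)
{
	unsigned int n = 0;

	for (; v != 0; v &= v - 1)
		n++;
	return n;
}

/* Compute the resulting field offset for the three non-dummy modes. */
static uint32_t
sketch_field_offset(const struct sketch_field *f,
		    const uint8_t *hdr, size_t hdr_len)
{
	uint32_t raw;

	switch (f->mode) {
	case SKETCH_MODE_FIXED:
		return f->field_base;
	case SKETCH_MODE_OFFSET:
		raw = sketch_read_be32(hdr, hdr_len, f->offset_base) &
		      f->offset_mask;
		return f->field_base + (raw << f->offset_shift);
	case SKETCH_MODE_BITMASK:
		raw = sketch_read_be32(hdr, hdr_len, f->offset_base) &
		      f->offset_mask;
		return f->field_base + (sketch_bitcount(raw) << f->offset_shift);
	}
	return f->field_base;
}
```

For instance, a header whose first dword holds the value 5, read in offset mode with field_base 0, a full mask and offset_shift 2, gives 5 << 2 = 20 in this sketch.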
>
> The input link configuration section tells the driver after which protocols and under which conditions
> the flex item can follow.
> An input link specifies the preceding header pattern; for example, for GENEVE it can be a UDP item
> specifying a match on destination port with value 6081. The flex item can follow multiple header types,
> in which case multiple input links should be specified. At flow creation time an item with one of the
> input link types should precede the flex item, and the driver will select the correct flex item settings
> depending on the actual flow pattern.
>
> The output link configuration section tells the driver how to continue packet parsing after the flex item
> protocol.
> If multiple protocols can follow the flex item header the flex item should contain the field with the next
> protocol identifier and the parsing will be continued depending on the data contained in this field in the
> actual packet.
>
> The flex item fields can participate in RSS hash calculation; a dedicated flag is present in the field
> description to specify which fields should be provided for hashing.
>
> 5. Flex Item Chaining
>
> If multiple protocols are to be supported with flex items in a chained fashion (two or more flex items
> within the same flow, possibly neighbors in the pattern), the flex items reference each other. In this
> case, the item that occurs first should be created with an empty output link list or with a list
> including existing items, and then the second flex item should be created referencing the first flex
> item as its input arc; drivers should adjust the item configuration accordingly.
>
> Also, the hardware resources used by flex items to handle the packet can be limited. If multiple
> flex items are supposed to be used within the same flow, it would be nice to provide a hint for the
> driver that these two or more flex items are intended for simultaneous usage.
> The fields of the items should be assigned hint indices, and fields from two or more flex items that
> are intended to be used within the same flow should share the same index. In other words, the field hint
> index specifies the group of fields that can be matched simultaneously within a single flow. If hint
> indices are specified, the driver will try to engage non-overlapping hardware resources and provide
> independent handling of the field groups with unique indices. If the hint index is zero, the driver
> assigns resources on its own.
>
> 6. Example of New Protocol Handling
>
> Let's suppose we need to handle a new tunnel protocol that follows a UDP header with
> destination port 0xFADE and is followed by a MAC header. Let the new protocol header format be like this:
>
> struct new_protocol_header {
> 	rte_be32_t header_length; /* length in dwords, including options */
> 	rte_be32_t specific0; /* some protocol data, no intention */
> 	rte_be32_t specific1; /* to match in flows on these fields */
> 	rte_be32_t crucial; /* data of interest, match is needed */
> 	rte_be32_t options[0]; /* optional protocol data, variable length */
> };
>
> The supposed flex item configuration:
>
> struct rte_flow_item_flex_field field0 = {
> 	.field_mode = FIELD_MODE_DUMMY, /* Affects match pattern only */
> 	.field_size = 96, /* three dwords from the beginning */
> };
> struct rte_flow_item_flex_field field1 = {
> 	.field_mode = FIELD_MODE_FIXED,
> 	.field_size = 32, /* Field size is one dword */
> 	.field_base = 96, /* Skip three dwords from the beginning */
> };
> struct rte_flow_item_udp spec0 = {
> 	.hdr = {
> 		.dst_port = RTE_BE16(0xFADE),
> 	}
> };
> struct rte_flow_item_udp mask0 = {
> 	.hdr = {
> 		.dst_port = RTE_BE16(0xFFFF),
> 	}
> };
> struct rte_flow_item_flex_link link0 = {
> 	.item = {
> 		.type = RTE_FLOW_ITEM_TYPE_UDP,
> 		.spec = &spec0,
> 		.mask = &mask0,
> 	},
> };
>
> struct rte_flow_item_flex_conf conf = {
> 	.next_header = {
> 		.tunnel = FLEX_TUNNEL_MODE_SINGLE,
> 		.field_mode = FIELD_MODE_OFFSET,
> 		.field_base = 0,
> 		.offset_base = 0,
> 		.offset_mask = 0xFFFFFFFF,
> 		.offset_shift = 2 /* Expressed in dwords, shift left by 2 */
> 	},
> 	.sample = {
> 		&field0,
> 		&field1,
> 	},
> 	.nb_samples = 2,
> 	.input_link[0] = &link0,
> 	.nb_inputs = 1
> };
>
> Let's suppose we have created the flex item successfully, and PMD returned the handle 0x123456789A.
> We can use the following item pattern to match the crucial field in the packet with value 0x00112233:
>
> struct new_protocol_header spec_pattern = {
> 	.crucial = RTE_BE32(0x00112233),
> };
> struct new_protocol_header mask_pattern = {
> 	.crucial = RTE_BE32(0xFFFFFFFF),
> };
> struct rte_flow_item_flex spec_flex = {
> 	.handle = (void *)0x123456789A, /* handle returned by the PMD */
> 	.length = sizeof(struct new_protocol_header),
> 	.pattern = (const uint8_t *)&spec_pattern,
> };
> struct rte_flow_item_flex mask_flex = {
> 	.length = sizeof(struct new_protocol_header),
> 	.pattern = (const uint8_t *)&mask_pattern,
> };
> struct rte_flow_item item_to_match = {
> 	.type = RTE_FLOW_ITEM_TYPE_FLEX,
> 	.spec = &spec_flex,
> 	.mask = &mask_flex,
> };
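
As a rough illustration of what this spec/mask pair expresses, the following self-contained C sketch applies a masked byte comparison to the fixed part of the example header. The sketch_* names and the 16-byte layout constant are hypothetical, and real matching is performed by the PMD and hardware rather than by code like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_HDR_LEN 16 /* four dwords: fixed part of the example header */

/*
 * Fill a pattern buffer laid out like the example header: bytes 0..11
 * (header_length, specific0, specific1) are zero, bytes 12..15 carry
 * the big-endian value for the "crucial" field.
 */
static void
sketch_fill_pattern(uint8_t pat[SKETCH_HDR_LEN], uint32_t crucial)
{
	memset(pat, 0, SKETCH_HDR_LEN);
	pat[12] = (uint8_t)(crucial >> 24);
	pat[13] = (uint8_t)(crucial >> 16);
	pat[14] = (uint8_t)(crucial >> 8);
	pat[15] = (uint8_t)crucial;
}

/* Masked comparison: match when (hdr & mask) == (spec & mask). */
static bool
sketch_flex_match(const uint8_t *hdr, size_t hdr_len,
		  const uint8_t *spec, const uint8_t *mask, size_t pat_len)
{
	assert(hdr != NULL && spec != NULL && mask != NULL);
	if (hdr_len < pat_len)
		return false;
	for (size_t i = 0; i < pat_len; i++) {
		if ((hdr[i] & mask[i]) != (spec[i] & mask[i]))
			return false;
	}
	return true;
}
```

Because the mask bytes covering the first three dwords are zero, their content in the packet is ignored, which mirrors the role of the dummy field in the configuration; only the crucial dword decides the match.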
>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo at nvidia.com>
> ---
Acked-by: Ori Kam <orika at nvidia.com>
Thanks,
Ori