[dpdk-dev] [PATCH v4] ethdev: support flow aging

Matan Azrad matan at mellanox.com
Thu Apr 30 09:36:08 CEST 2020


Hi Tom

From: Tom Barbette
> Great news!
> 
> - I can understand why there is no timeout unit. But that's calling for user
> nightmare. Eg I could only get from the code (and not from documentation
> yet? ) of the following mlx5 driver patch that the value should be in tenth of
> seconds. If I build an application that is supposed to work with "any NIC",
> what can I do? We'd need a way to query the timeout unit (have it in
> dev_info probably).

Please see the new age action structure in rte_flow.h
You can see comments there that timeout units is in seconds....

> - It's not totally clear if the rule is automatically removed or not. is this a
> helper or an OpenFlow-like notification?

Only notification, the aged-out flow should be destroyed (or other action) by the application according to the application needs...

> - Find a typo and grammar fix inline.
> - Recently, Mellanox introduced the ability to create 330K flows/s. Any
> performance considerations if those flow "expire" at the same rate?

We didn't see performance impact (should be same rate like count action).

> 
> Hope it's helpfull,
> 
> Tom
> 
> Le 21/04/2020 à 12:11, Bill Zhou a écrit :
> > From: Dong Zhou <dongz at mellanox.com>
> >
> > One of the reasons to destroy a flow is the fact that no packet
> > matches the flow for "timeout" time.
> > For example, when TCP\UDP sessions are suddenly closed.
> >
> > Currently, there is not any DPDK mechanism for flow aging and the
> > applications use their own ways to detect and destroy aged-out flows.
> >
> > The flow aging implementation need include:
> > - A new rte_flow action: RTE_FLOW_ACTION_TYPE_AGE to set the timeout
> and
> >    the application flow context for each flow.
> > - A new ethdev event: RTE_ETH_EVENT_FLOW_AGED for the driver to
> report
> >    that there are new aged-out flows.
> > - A new rte_flow API: rte_flow_get_aged_flows to get the aged-out flows
> >    contexts from the port.
> > - Support input flow aging command line in Testpmd.
> >
> > The new event type addition in the enum is flagged as an ABI breakage,
> > so an ignore rule is added for these reasons:
> > - It is not changing value of existing types (except MAX)
> > - The new value is not used by existing API if the event is not
> > registered In general, it is safe adding new ethdev event types at the
> > end of the enum, because of event callback registration mechanism.
> >
> > Signed-off-by: Dong Zhou <dongz at mellanox.com>
> > ---
> > v2: Removing "* Added support for flow Aging mechanism base on
> counter."
> > this line from doc/guides/rel_notes/release_20_05.rst, this patch does
> > not include this support.
> >
> > v3: Update file libabigail.abignore, add one new suppressed
> > enumeration type for RTE_ETH_EVENT_MAX.
> >
> > v4: Add justification in devtools/libabigail.abignore and in the
> > commit log about the modification of v3.
> > ---
> >   app/test-pmd/cmdline_flow.c              | 26 ++++++++++
> >   devtools/libabigail.abignore             |  6 +++
> >   doc/guides/prog_guide/rte_flow.rst       | 22 +++++++++
> >   doc/guides/rel_notes/release_20_05.rst   | 11 +++++
> >   lib/librte_ethdev/rte_ethdev.h           |  1 +
> >   lib/librte_ethdev/rte_ethdev_version.map |  3 ++
> >   lib/librte_ethdev/rte_flow.c             | 18 +++++++
> >   lib/librte_ethdev/rte_flow.h             | 62 ++++++++++++++++++++++++
> >   lib/librte_ethdev/rte_flow_driver.h      |  6 +++
> >   9 files changed, 155 insertions(+)
> >
> > diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> > index e6ab8ff2f7..45bcff3cf5 100644
> > --- a/app/test-pmd/cmdline_flow.c
> > +++ b/app/test-pmd/cmdline_flow.c
> > @@ -343,6 +343,8 @@ enum index {
> >   	ACTION_SET_IPV4_DSCP_VALUE,
> >   	ACTION_SET_IPV6_DSCP,
> >   	ACTION_SET_IPV6_DSCP_VALUE,
> > +	ACTION_AGE,
> > +	ACTION_AGE_TIMEOUT,
> >   };
> >
> >   /** Maximum size for pattern in struct rte_flow_item_raw. */ @@
> > -1145,6 +1147,7 @@ static const enum index next_action[] = {
> >   	ACTION_SET_META,
> >   	ACTION_SET_IPV4_DSCP,
> >   	ACTION_SET_IPV6_DSCP,
> > +	ACTION_AGE,
> >   	ZERO,
> >   };
> >
> > @@ -1370,6 +1373,13 @@ static const enum index action_set_ipv6_dscp[]
> = {
> >   	ZERO,
> >   };
> >
> > +static const enum index action_age[] = {
> > +	ACTION_AGE,
> > +	ACTION_AGE_TIMEOUT,
> > +	ACTION_NEXT,
> > +	ZERO,
> > +};
> > +
> >   static int parse_set_raw_encap_decap(struct context *, const struct
> token *,
> >   				     const char *, unsigned int,
> >   				     void *, unsigned int);
> > @@ -3694,6 +3704,22 @@ static const struct token token_list[] = {
> >   			     (struct rte_flow_action_set_dscp, dscp)),
> >   		.call = parse_vc_conf,
> >   	},
> > +	[ACTION_AGE] = {
> > +		.name = "age",
> > +		.help = "set a specific metadata header",
> > +		.next = NEXT(action_age),
> > +		.priv = PRIV_ACTION(AGE,
> > +			sizeof(struct rte_flow_action_age)),
> > +		.call = parse_vc,
> > +	},
> > +	[ACTION_AGE_TIMEOUT] = {
> > +		.name = "timeout",
> > +		.help = "flow age timeout value",
> > +		.args = ARGS(ARGS_ENTRY_BF(struct rte_flow_action_age,
> > +					   timeout, 24)),
> > +		.next = NEXT(action_age, NEXT_ENTRY(UNSIGNED)),
> > +		.call = parse_vc_conf,
> > +	},
> >   };
> >
> >   /** Remove and return last entry from argument stack. */ diff --git
> > a/devtools/libabigail.abignore b/devtools/libabigail.abignore index
> > a59df8f135..c047adbd79 100644
> > --- a/devtools/libabigail.abignore
> > +++ b/devtools/libabigail.abignore
> > @@ -11,3 +11,9 @@
> >           type_kind = enum
> >           name = rte_crypto_asym_xform_type
> >           changed_enumerators =
> RTE_CRYPTO_ASYM_XFORM_TYPE_LIST_END
> > +; Ignore ethdev event enum update because new event cannot be ;
> > +received if not registered [suppress_type]
> > +        type_kind = enum
> > +        name = rte_eth_event_type
> > +        changed_enumerators = RTE_ETH_EVENT_MAX
> > diff --git a/doc/guides/prog_guide/rte_flow.rst
> > b/doc/guides/prog_guide/rte_flow.rst
> > index 41c147913c..cf4368e1c4 100644
> > --- a/doc/guides/prog_guide/rte_flow.rst
> > +++ b/doc/guides/prog_guide/rte_flow.rst
> > @@ -2616,6 +2616,28 @@ Otherwise, RTE_FLOW_ERROR_TYPE_ACTION
> error will be returned.
> >      | ``dscp``  | DSCP in low 6 bits, rest ignore |
> >      +-----------+---------------------------------+
> >
> > +Action: ``AGE``
> > +^^^^^^^^^^^^^^^^^^^^^^^^^
> > +
> > +Set ageing timeout configuration to a flow.
> > +
> > +Event RTE_ETH_EVENT_FLOW_AGED will be reported if timeout passed
> > +without any matching on the flow.
> > +
> > +.. _table_rte_flow_action_age:
> > +
> > +.. table:: AGE
> > +
> > +   +--------------+---------------------------------+
> > +   | Field        | Value                           |
> > +   +==============+=================================+
> > +   | ``timeout``  | 24 bits timeout value           |
> > +   +--------------+---------------------------------+
> > +   | ``reserved`` | 8 bits reserved, must be zero   |
> > +   +--------------+---------------------------------+
> > +   | ``context``  | user input flow context         |
> > +   +--------------+---------------------------------+
> > +
> >   Negative types
> >   ~~~~~~~~~~~~~~
> >
> > diff --git a/doc/guides/rel_notes/release_20_05.rst
> > b/doc/guides/rel_notes/release_20_05.rst
> > index bacd4c65a2..ff0cf9f1d6 100644
> > --- a/doc/guides/rel_notes/release_20_05.rst
> > +++ b/doc/guides/rel_notes/release_20_05.rst
> > @@ -135,6 +135,17 @@ New Features
> >     by making use of the event device capabilities. The event mode currently
> supports
> >     only inline IPsec protocol offload.
> >
> > +* **Added flow Aging Support.**
> > +
> > +  Added flow Aging support to detect and report aged-out flows,
> including:
> > +
> > +  * Added new action: RTE_FLOW_ACTION_TYPE_AGE to set the timeout
> and the
> > +    application flow context for each flow.
> > +  * Added new event: RTE_ETH_EVENT_FLOW_AGED for the driver to
> report that
> > +    there are new aged-out flows.
> > +  * Added new API: rte_flow_get_aged_flows to get the aged-out flows
> contexts
> > +    from the port.
> > +
> >
> >   Removed Items
> >   -------------
> > diff --git a/lib/librte_ethdev/rte_ethdev.h
> > b/lib/librte_ethdev/rte_ethdev.h index 8d69b88f9e..00cc7b4052 100644
> > --- a/lib/librte_ethdev/rte_ethdev.h
> > +++ b/lib/librte_ethdev/rte_ethdev.h
> > @@ -3018,6 +3018,7 @@ enum rte_eth_event_type {
> >   	RTE_ETH_EVENT_NEW,      /**< port is probed */
> >   	RTE_ETH_EVENT_DESTROY,  /**< port is released */
> >   	RTE_ETH_EVENT_IPSEC,    /**< IPsec offload related event */
> > +	RTE_ETH_EVENT_FLOW_AGED,/**< New aged-out flows is detected
> */
> >   	RTE_ETH_EVENT_MAX       /**< max value of this enum */
> >   };
> >
> > diff --git a/lib/librte_ethdev/rte_ethdev_version.map
> > b/lib/librte_ethdev/rte_ethdev_version.map
> > index 3f32fdecf7..fa4b5816be 100644
> > --- a/lib/librte_ethdev/rte_ethdev_version.map
> > +++ b/lib/librte_ethdev/rte_ethdev_version.map
> > @@ -230,4 +230,7 @@ EXPERIMENTAL {
> >
> >   	# added in 20.02
> >   	rte_flow_dev_dump;
> > +
> > +	# added in 20.05
> > +	rte_flow_get_aged_flows;
> >   };
> > diff --git a/lib/librte_ethdev/rte_flow.c
> > b/lib/librte_ethdev/rte_flow.c index a5ac1c7fbd..3699edce49 100644
> > --- a/lib/librte_ethdev/rte_flow.c
> > +++ b/lib/librte_ethdev/rte_flow.c
> > @@ -172,6 +172,7 @@ static const struct rte_flow_desc_data
> rte_flow_desc_action[] = {
> >   	MK_FLOW_ACTION(SET_META, sizeof(struct
> rte_flow_action_set_meta)),
> >   	MK_FLOW_ACTION(SET_IPV4_DSCP, sizeof(struct
> rte_flow_action_set_dscp)),
> >   	MK_FLOW_ACTION(SET_IPV6_DSCP, sizeof(struct
> > rte_flow_action_set_dscp)),
> > +	MK_FLOW_ACTION(AGE, sizeof(struct rte_flow_action_age)),
> >   };
> >
> >   int
> > @@ -1232,3 +1233,20 @@ rte_flow_dev_dump(uint16_t port_id, FILE *file,
> struct rte_flow_error *error)
> >   				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> >   				  NULL, rte_strerror(ENOSYS));
> >   }
> > +
> > +int
> > +rte_flow_get_aged_flows(uint16_t port_id, void **contexts,
> > +		    uint32_t nb_contexts, struct rte_flow_error *error) {
> > +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> > +	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
> > +
> > +	if (unlikely(!ops))
> > +		return -rte_errno;
> > +	if (likely(!!ops->get_aged_flows))
> > +		return flow_err(port_id, ops->get_aged_flows(dev,
> contexts,
> > +				nb_contexts, error), error);
> > +	return rte_flow_error_set(error, ENOTSUP,
> > +				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> > +				  NULL, rte_strerror(ENOTSUP));
> > +}
> > diff --git a/lib/librte_ethdev/rte_flow.h
> > b/lib/librte_ethdev/rte_flow.h index 7f3e08fad3..fab44f6c0b 100644
> > --- a/lib/librte_ethdev/rte_flow.h
> > +++ b/lib/librte_ethdev/rte_flow.h
> > @@ -2081,6 +2081,16 @@ enum rte_flow_action_type {
> >   	 * See struct rte_flow_action_set_dscp.
> >   	 */
> >   	RTE_FLOW_ACTION_TYPE_SET_IPV6_DSCP,
> > +
> > +	/**
> > +	 * Report as aged flow if timeout passed without any matching on the
> > +	 * flow.
> > +	 *
> > +	 * See struct rte_flow_action_age.
> > +	 * See function rte_flow_get_aged_flows
> > +	 * see enum RTE_ETH_EVENT_FLOW_AGED
> > +	 */
> > +	RTE_FLOW_ACTION_TYPE_AGE,
> >   };
> >
> >   /**
> > @@ -2122,6 +2132,25 @@ struct rte_flow_action_queue {
> >   	uint16_t index; /**< Queue index to use. */
> >   };
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> > + *
> > + * RTE_FLOW_ACTION_TYPE_AGE
> > + *
> > + * Report flow as aged-out if timeout passed without any matching
> > + * on the flow. RTE_ETH_EVENT_FLOW_AGED event is triggered when a
> > + * port detects new aged-out flows.
> > + *
> > + * The flow context and the flow handle will be reported by the
> > + * rte_flow_get_aged_flows API.
> > + */
> > +struct rte_flow_action_age {
> > +	uint32_t timeout:24; /**< Time in seconds. */
> > +	uint32_t reserved:8; /**< Reserved, must be zero. */
> > +	void *context;
> > +		/**< The user flow context, NULL means the rte_flow
> pointer. */ };
> >
> >   /**
> >    * @warning
> > @@ -3254,6 +3283,39 @@ rte_flow_conv(enum rte_flow_conv_op op,
> >   	      const void *src,
> >   	      struct rte_flow_error *error);
> >
> > +/**
> > + * Get aged-out flows of a given port.
> > + *
> > + * RTE_ETH_EVENT_FLOW_AGED event will be triggered when at least
> one
> > +new aged
> > + * out flow was detected after the last call to rte_flow_get_aged_flows.
> > + * This function can be called to get the aged flows usynchronously
> > +from the
> usynchronously
> > + * event callback or synchronously regardless the event.
> > + * This is not safe to call rte_flow_get_aged_flows function with
> > + other flow
> It is not safe to
> > + * functions from multiple threads simultaneously.
> > + *
> > + * @param port_id
> > + *   Port identifier of Ethernet device.
> > + * @param[in, out] contexts
> > + *   The address of an array of pointers to the aged-out flows contexts.
> > + * @param[in] nb_contexts
> > + *   The length of context array pointers.
> > + * @param[out] error
> > + *   Perform verbose error reporting if not NULL. Initialized in case of
> > + *   error only.
> > + *
> > + * @return
> > + *   if nb_contexts is 0, return the amount of all aged contexts.
> > + *   if nb_contexts is not 0 , return the amount of aged flows reported
> > + *   in the context array, otherwise negative errno value.
> > + *
> > + * @see rte_flow_action_age
> > + * @see RTE_ETH_EVENT_FLOW_AGED
> > + */
> > +__rte_experimental
> > +int
> > +rte_flow_get_aged_flows(uint16_t port_id, void **contexts,
> > +			uint32_t nb_contexts, struct rte_flow_error *error);
> > +
> >   #ifdef __cplusplus
> >   }
> >   #endif
> > diff --git a/lib/librte_ethdev/rte_flow_driver.h
> > b/lib/librte_ethdev/rte_flow_driver.h
> > index 51a9a57a0f..881cc469b7 100644
> > --- a/lib/librte_ethdev/rte_flow_driver.h
> > +++ b/lib/librte_ethdev/rte_flow_driver.h
> > @@ -101,6 +101,12 @@ struct rte_flow_ops {
> >   		(struct rte_eth_dev *dev,
> >   		 FILE *file,
> >   		 struct rte_flow_error *error);
> > +	/** See rte_flow_get_aged_flows() */
> > +	int (*get_aged_flows)
> > +		(struct rte_eth_dev *dev,
> > +		 void **context,
> > +		 uint32_t nb_contexts,
> > +		 struct rte_flow_error *err);
> >   };
> >
> >   /**
> >


More information about the dev mailing list