[dpdk-dev] [PATCH 01/22] ethdev: introduce generic flow API

Adrien Mazarguil adrien.mazarguil at 6wind.com
Thu Dec 8 18:07:15 CET 2016


On Fri, Dec 02, 2016 at 09:06:42PM +0000, Kevin Traynor wrote:
> On 12/01/2016 08:36 AM, Adrien Mazarguil wrote:
> > Hi Kevin,
> > 
> > On Wed, Nov 30, 2016 at 05:47:17PM +0000, Kevin Traynor wrote:
> >> Hi Adrien,
> >>
> >> On 11/16/2016 04:23 PM, Adrien Mazarguil wrote:
> >>> This new API supersedes all the legacy filter types described in
> >>> rte_eth_ctrl.h. It is slightly higher level and as a result relies more on
> >>> PMDs to process and validate flow rules.
> >>>
> >>> Benefits:
> >>>
> >>> - A unified API is easier to program for, applications do not have to be
> >>>   written for a specific filter type which may or may not be supported by
> >>>   the underlying device.
> >>>
> >>> - The behavior of a flow rule is the same regardless of the underlying
> >>>   device, applications do not need to be aware of hardware quirks.
> >>>
> >>> - Extensible by design, API/ABI breakage should rarely occur if at all.
> >>>
> >>> - Documentation is self-standing, no need to look up elsewhere.
> >>>
> >>> Existing filter types will be deprecated and removed in the near future.
> >>
> >> I'd suggest to add a deprecation notice to deprecation.rst, ideally with
> >> a target release.
> > 
> > Will do, not sure about the target release though. It seems a bit early
> > since no PMD really supports this API yet.
> > 
> > [...]
> >>> diff --git a/lib/librte_ether/rte_flow.c b/lib/librte_ether/rte_flow.c
> >>> new file mode 100644
> >>> index 0000000..064963d
> >>> --- /dev/null
> >>> +++ b/lib/librte_ether/rte_flow.c
> >>> @@ -0,0 +1,159 @@
> >>> +/*-
> >>> + *   BSD LICENSE
> >>> + *
> >>> + *   Copyright 2016 6WIND S.A.
> >>> + *   Copyright 2016 Mellanox.
> >>
> >> There's Mellanox copyright but you are the only signed-off-by - is that
> >> right?
> > 
> > Yes, I'm the primary maintainer for Mellanox PMDs and this API was designed
> > on their behalf to expose several features from mlx4/mlx5 as the existing
> > filter types had too many limitations.
> > 
> > [...]
> >>> +/* Get generic flow operations structure from a port. */
> >>> +const struct rte_flow_ops *
> >>> +rte_flow_ops_get(uint8_t port_id, struct rte_flow_error *error)
> >>> +{
> >>> +	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> >>> +	const struct rte_flow_ops *ops;
> >>> +	int code;
> >>> +
> >>> +	if (unlikely(!rte_eth_dev_is_valid_port(port_id)))
> >>> +		code = ENODEV;
> >>> +	else if (unlikely(!dev->dev_ops->filter_ctrl ||
> >>> +			  dev->dev_ops->filter_ctrl(dev,
> >>> +						    RTE_ETH_FILTER_GENERIC,
> >>> +						    RTE_ETH_FILTER_GET,
> >>> +						    &ops) ||
> >>> +			  !ops))
> >>> +		code = ENOTSUP;
> >>> +	else
> >>> +		return ops;
> >>> +	rte_flow_error_set(error, code, RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
> >>> +			   NULL, rte_strerror(code));
> >>> +	return NULL;
> >>> +}
> >>> +
> >>
> >> Is it expected that the application or pmd will provide locking between
> >> these functions if required? I think it's going to have to be the app.
> > 
> > Locking is indeed expected to be performed by applications. This API only
> > documents the expected behavior and the places where locking would make
> > sense if necessary.
> > 
> > Like all control path APIs, this one assumes a single control thread.
> > Applications must take the necessary precautions.
> 
> If you look at OVS now it's quite possible that you have 2 rx queues
> serviced by different threads, which would also install the flow rules in
> the software flow caches - possibly that could extend to adding hardware
> flows. There could also be another thread that is querying for stats. So
> anything that can be done to minimise the locking would be helpful -
> maybe query() could be atomic and not require any locking?

I think we need basic functions with as few constraints as possible on PMDs
first, this API being somewhat complex to implement on their side. That
covers the common use case where applications have a single control thread
or otherwise perform locking on their own.

Once the basics are there for most PMDs, we may add new functions, items,
properties and actions that provide additional constraints (timing,
multi-threading and so on), which remain to be defined according to
feedback. The API is designed to be extended without causing ABI breakage.

As for query(), let's see how PMDs handle it first. A race between query()
and create() on a given device is almost unavoidable without locking, same
for queries that reset counters in a given flow rule. Basic parallel queries
should not cause any harm otherwise, although this cannot be guaranteed yet.
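
To illustrate what I mean by application-side locking, here is a minimal
sketch (the wrapper and lock are entirely the application's, nothing here
is part of the API):

#include <rte_spinlock.h>
#include <rte_flow.h>

static rte_spinlock_t flow_lock = RTE_SPINLOCK_INITIALIZER;

/* Serialize control-path calls when several threads may create rules. */
static struct rte_flow *
app_flow_create(uint8_t port_id,
		const struct rte_flow_attr *attr,
		const struct rte_flow_item pattern[],
		const struct rte_flow_action actions[],
		struct rte_flow_error *error)
{
	struct rte_flow *flow;

	rte_spinlock_lock(&flow_lock);
	flow = rte_flow_create(port_id, attr, pattern, actions, error);
	rte_spinlock_unlock(&flow_lock);
	return flow;
}

The same wrapper approach would apply to rte_flow_destroy() and
rte_flow_query() calls targeting the same device.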

> > [...]
> >>> +/**
> >>> + * Flow rule attributes.
> >>> + *
> >>> + * Priorities are set on two levels: per group and per rule within groups.
> >>> + *
> >>> + * Lower values denote higher priority, the highest priority for both levels
> >>> + * is 0, so that a rule with priority 0 in group 8 is always matched after a
> >>> + * rule with priority 8 in group 0.
> >>> + *
> >>> + * Although optional, applications are encouraged to group similar rules as
> >>> + * much as possible to fully take advantage of hardware capabilities
> >>> + * (e.g. optimized matching) and work around limitations (e.g. a single
> >>> + * pattern type possibly allowed in a given group).
> >>> + *
> >>> + * Group and priority levels are arbitrary and up to the application, they
> >>> + * do not need to be contiguous nor start from 0, however the maximum number
> >>> + * varies between devices and may be affected by existing flow rules.
> >>> + *
> >>> + * If a packet is matched by several rules of a given group for a given
> >>> + * priority level, the outcome is undefined. It can take any path, may be
> >>> + * duplicated or even cause unrecoverable errors.
> >>
> >> I get what you are trying to do here wrt supporting multiple
> >> pmds/hardware implementations and it's a good idea to keep it flexible.
> >>
> >> Given that the outcome is undefined, it would be nice that the
> >> application has a way of finding the specific effects for verification
> >> and debugging.
> > 
> > Right, however it was deemed a bit difficult to manage in many cases hence
> > the vagueness.
> > 
> > For example, suppose two rules with the same group and priority, one
> > matching any IPv4 header, the other one any UDP header:
> > 
> > - TCPv4 packets => rule #1.
> > - UDPv6 packets => rule #2.
> > - UDPv4 packets => both?
> > 
> > That last one is perhaps invalid; checking that some unspecified protocol
> > combination does not overlap is expensive and may miss corner cases. Even
> > assuming this is not an issue, what if the application guarantees that no
> > UDPv4 packets can ever hit that rule?
> 
> that's fine - I don't expect the software to be able to know what the
> hardware will do with those rules. It's more about trying to get a dump
> from the hardware if something goes wrong. Anyway covered in comment later.
> 
> > 
> > Suggestions are welcome though; perhaps we can refine the description.
> > 
> >>> + *
> >>> + * Note that support for more than a single group and priority level is not
> >>> + * guaranteed.
> >>> + *
> >>> + * Flow rules can apply to inbound and/or outbound traffic (ingress/egress).
> >>> + *
> >>> + * Several pattern items and actions are valid and can be used in both
> >>> + * directions. Those valid for only one direction are described as such.
> >>> + *
> >>> + * Specifying both directions at once is not recommended but may be valid in
> >>> + * some cases, such as incrementing the same counter twice.
> >>> + *
> >>> + * Not specifying any direction is currently an error.
> >>> + */
> >>> +struct rte_flow_attr {
> >>> +	uint32_t group; /**< Priority group. */
> >>> +	uint32_t priority; /**< Priority level within group. */
> >>> +	uint32_t ingress:1; /**< Rule applies to ingress traffic. */
> >>> +	uint32_t egress:1; /**< Rule applies to egress traffic. */
> >>> +	uint32_t reserved:30; /**< Reserved, must be zero. */
> >>> +};
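
As a quick side note, filling these attributes for a typical ingress-only
rule with the default group and priority would look like this (only a
sketch):

struct rte_flow_attr attr = {
	.group = 0,    /* first group */
	.priority = 0, /* highest priority within that group */
	.ingress = 1,  /* match inbound traffic only */
};
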
> > [...]
> >>> +/**
> >>> + * RTE_FLOW_ITEM_TYPE_VF
> >>> + *
> >>> + * Matches packets addressed to a virtual function ID of the device.
> >>> + *
> >>> + * If the underlying device function differs from the one that would
> >>> + * normally receive the matched traffic, specifying this item prevents it
> >>> + * from reaching that device unless the flow rule contains a VF
> >>> + * action. Packets are not duplicated between device instances by default.
> >>> + *
> >>> + * - Likely to return an error or never match any traffic if this causes a
> >>> + *   VF device to match traffic addressed to a different VF.
> >>> + * - Can be specified multiple times to match traffic addressed to several
> >>> + *   specific VFs.
> >>> + * - Can be combined with a PF item to match both PF and VF traffic.
> >>> + *
> >>> + * A zeroed mask can be used to match any VF.
> >>
> >> can you refer explicitly to id
> > 
> > If you mean "VF" to "VF ID" then yes, will do it for v2.
> > 
> >>> + */
> >>> +struct rte_flow_item_vf {
> >>> +	uint32_t id; /**< Destination VF ID. */
> >>> +};
> > [...]
> >>> +/**
> >>> + * Matching pattern item definition.
> >>> + *
> >>> + * A pattern is formed by stacking items starting from the lowest protocol
> >>> + * layer to match. This stacking restriction does not apply to meta items
> >>> + * which can be placed anywhere in the stack with no effect on the meaning
> >>> + * of the resulting pattern.
> >>> + *
> >>> + * A stack is terminated by a END item.
> >>> + *
> >>> + * The spec field should be a valid pointer to a structure of the related
> >>> + * item type. It may be set to NULL in many cases to use default values.
> >>> + *
> >>> + * Optionally, last can point to a structure of the same type to define an
> >>> + * inclusive range. This is mostly supported by integer and address fields,
> >>> + * may cause errors otherwise. Fields that do not support ranges must be set
> >>> + * to the same value as their spec counterparts.
> >>> + *
> >>> + * By default all fields present in spec are considered relevant.* This
> >>
> >> typo "*"
> > 
> > No, that's an asterisk for a footnote below. Perhaps it is a bit unusual,
> > would something like "[1]" look better?
> 
> oh, I thought it was the start of a comment line gone astray. Maybe "See
> note below", no big deal though.

OK, will change it anyway for clarity.

> >>> + * behavior can be altered by providing a mask structure of the same type
> >>> + * with applicable bits set to one. It can also be used to partially filter
> >>> + * out specific fields (e.g. as an alternate means to match ranges of IP
> >>> + * addresses).
> >>> + *
> >>> + * Note this is a simple bit-mask applied before interpreting the contents
> >>> + * of spec and last, which may yield unexpected results if not used
> >>> + * carefully. For example, if for an IPv4 address field, spec provides
> >>> + * 10.1.2.3, last provides 10.3.4.5 and mask provides 255.255.0.0, the
> >>> + * effective range is 10.1.0.0 to 10.3.255.255.
> >>> + *
> > 
> > See footnote below:
> > 
> >>> + * * The defaults for data-matching items such as IPv4 when mask is not
> >>> + *   specified actually depend on the underlying implementation since only
> >>> + *   recognized fields can be taken into account.
> >>> + */
> >>> +struct rte_flow_item {
> >>> +	enum rte_flow_item_type type; /**< Item type. */
> >>> +	const void *spec; /**< Pointer to item specification structure. */
> >>> +	const void *last; /**< Defines an inclusive range (spec to last). */
> >>> +	const void *mask; /**< Bit-mask applied to spec and last. */
> >>> +};
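
As an example of how spec and mask interact, here is a sketch of a pattern
matching any IPv4 traffic from 10.1.0.0/16 (assuming, as in this patch, that
rte_flow_item_ipv4 simply wraps an ipv4_hdr; needs rte_ip.h and
rte_byteorder.h):

struct rte_flow_item_ipv4 ip_spec = {
	.hdr = { .src_addr = rte_cpu_to_be_32(IPv4(10, 1, 0, 0)) },
};
struct rte_flow_item_ipv4 ip_mask = {
	.hdr = { .src_addr = rte_cpu_to_be_32(0xffff0000) },
};
struct rte_flow_item pattern[] = {
	/* NULL spec/mask: any Ethernet header. */
	{ .type = RTE_FLOW_ITEM_TYPE_ETH },
	/* Only the src_addr bits set in the mask are taken into account. */
	{
		.type = RTE_FLOW_ITEM_TYPE_IPV4,
		.spec = &ip_spec,
		.mask = &ip_mask,
	},
	{ .type = RTE_FLOW_ITEM_TYPE_END },
};
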
> >>> +
> >>> +/**
> >>> + * Action types.
> >>> + *
> >>> + * Each possible action is represented by a type. Some have associated
> >>> + * configuration structures. Several actions combined in a list can be
> >>> + * affected to a flow rule. That list is not ordered.
> >>> + *
> >>> + * They fall in three categories:
> >>> + *
> >>> + * - Terminating actions (such as QUEUE, DROP, RSS, PF, VF) that prevent
> >>> + *   processing matched packets by subsequent flow rules, unless overridden
> >>> + *   with PASSTHRU.
> >>> + *
> >>> + * - Non terminating actions (PASSTHRU, DUP) that leave matched packets up
> >>> + *   for additional processing by subsequent flow rules.
> >>> + *
> >>> + * - Other non terminating meta actions that do not affect the fate of
> >>> + *   packets (END, VOID, MARK, FLAG, COUNT).
> >>> + *
> >>> + * When several actions are combined in a flow rule, they should all have
> >>> + * different types (e.g. dropping a packet twice is not possible). The
> >>> + * defined behavior is for PMDs to only take into account the last action of
> >>> + * a given type found in the list. PMDs still perform error checking on the
> >>> + * entire list.
> >>
> >> why do you define that the pmd will interpret multiple same type rules
> >> in this way...would it not make more sense for the pmd to just return
> >> EINVAL for an invalid set of rules? It seems more transparent for the
> >> application.
> > 
> > Well, I had to define something as a default. The reason is that any number
> > of VOID actions may be specified and I did not want that to be a special case in
> > order to keep PMD parsers as simple as possible. I'll settle for EINVAL (or
> > some other error) if at least one PMD maintainer other than Nelio who
> > intends to implement this API is not convinced by this explanation, all
> > right?
> 
> From an API perspective I think it's cleaner to pass or fail with the
> input rather than change it. But yes, please take pmd maintainers input
> as to what is reasonable to check also.
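
Just to illustrate, a valid list combining several action types could look
like this (a sketch assuming the structures defined in this patch):

struct rte_flow_action_mark mark = { .id = 42 };
struct rte_flow_action_queue queue = { .index = 3 };
struct rte_flow_action actions[] = {
	/* Meta action: tag matched packets with an application value. */
	{ .type = RTE_FLOW_ACTION_TYPE_MARK, .conf = &mark },
	/* Terminating action: deliver matched packets to queue 3. */
	{ .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
	{ .type = RTE_FLOW_ACTION_TYPE_END },
};

Each type appears at most once here, which is the recommended usage.
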
> 
> > 
> > [...]
> >>> +/**
> >>> + * RTE_FLOW_ACTION_TYPE_MARK
> >>> + *
> >>> + * Attaches a 32 bit value to packets.
> >>> + *
> >>> + * This value is arbitrary and application-defined. For compatibility with
> >>> + * FDIR it is returned in the hash.fdir.hi mbuf field. PKT_RX_FDIR_ID is
> >>> + * also set in ol_flags.
> >>> + */
> >>> +struct rte_flow_action_mark {
> >>> +	uint32_t id; /**< 32 bit value to return with packets. */
> >>> +};
> >>
> >> One use case I thought we would be able to support for OVS is classification
> >> in hardware, with the unique flow id sent with the packet to software.
> >> But in OVS the ufid is 128 bits, so it means we can't, and there is still
> >> the miniflow extract overhead. I'm not sure if there is a practical way
> >> around this.
> >>
> >> Sugesh (cc'd) has looked at this before and may be able to comment or
> >> correct me.
> > 
> > Yes, we settled on 32 bit because currently no known hardware implementation
> > supports more than this. If that changes, another action with a larger type
> > shall be provided (no ABI breakage).
> > 
> > Also since even 64 bit would not be enough for the use case you mention,
> > there is no choice but to use this as an indirect value (such as an array or
> > hash table index/value).
> 
> ok, cool. I think Sugesh has other ideas anyway!
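
For the OVS case, something along these lines could serve as the indirection
(entirely hypothetical application code, including all the names; needs
rte_mbuf.h):

/* The 32-bit MARK value indexes a table holding the full 128-bit ufid. */
struct app_ufid { uint64_t hi; uint64_t lo; };
static struct app_ufid ufid_table[65536];

static inline const struct app_ufid *
app_ufid_lookup(const struct rte_mbuf *m)
{
	if (!(m->ol_flags & PKT_RX_FDIR_ID))
		return NULL; /* rule had no MARK action */
	return &ufid_table[m->hash.fdir.hi & 0xffff];
}
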
> 
> > 
> > [...]
> >>> +/**
> >>> + * RTE_FLOW_ACTION_TYPE_RSS
> >>> + *
> >>> + * Similar to QUEUE, except RSS is additionally performed on packets to
> >>> + * spread them among several queues according to the provided parameters.
> >>> + *
> >>> + * Note: RSS hash result is normally stored in the hash.rss mbuf field,
> >>> + * however it conflicts with the MARK action as they share the same
> >>> + * space. When both actions are specified, the RSS hash is discarded and
> >>> + * PKT_RX_RSS_HASH is not set in ol_flags. MARK has priority. The mbuf
> >>> + * structure should eventually evolve to store both.
> >>> + *
> >>> + * Terminating by default.
> >>> + */
> >>> +struct rte_flow_action_rss {
> >>> +	const struct rte_eth_rss_conf *rss_conf; /**< RSS parameters. */
> >>> +	uint16_t queues; /**< Number of entries in queue[]. */
> >>> +	uint16_t queue[]; /**< Queues indices to use. */
> >>
> >> I'd try and avoid queue and queues - someone will say "huh?" when
> >> reading code. s/queues/num ?
> > 
> > Agreed, will update for v2.
> > 
> >>> +};
> >>> +
> >>> +/**
> >>> + * RTE_FLOW_ACTION_TYPE_VF
> >>> + *
> >>> + * Redirects packets to a virtual function (VF) of the current device.
> >>> + *
> >>> + * Packets matched by a VF pattern item can be redirected to their original
> >>> + * VF ID instead of the specified one. This parameter may not be available
> >>> + * and is not guaranteed to work properly if the VF part is matched by a
> >>> + * prior flow rule or if packets are not addressed to a VF in the first
> >>> + * place.
> >>
> >> Not clear what you mean by "not guaranteed to work if...". Please return
> >> fail when this action is used if this is not going to work.
> > 
> > Again, this is a case where it is difficult for a PMD to determine if the
> > entire list of flow rules makes sense. Perhaps it does; perhaps whatever
> > goes through has already been screened for possible issues.
> > 
> > Here the documentation states the precautions an application should take to
> > guarantee it will work as intended. Perhaps it can be reworded (any
> > suggestion?), but a PMD can certainly not provide any strong guarantee.
> 
> I see your point. Maybe for things that are easy to check the pmd could
> return failure, but for more complex cases I agree it's too difficult.
> 
> > 
> >>> + *
> >>> + * Terminating by default.
> >>> + */
> >>> +struct rte_flow_action_vf {
> >>> +	uint32_t original:1; /**< Use original VF ID if possible. */
> >>> +	uint32_t reserved:31; /**< Reserved, must be zero. */
> >>> +	uint32_t id; /**< VF ID to redirect packets to. */
> >>> +};
> > [...]
> >>> +/**
> >>> + * Check whether a flow rule can be created on a given port.
> >>> + *
> >>> + * While this function has no effect on the target device, the flow rule is
> >>> + * validated against its current configuration state and the returned value
> >>> + * should be considered valid by the caller for that state only.
> >>> + *
> >>> + * The returned value is guaranteed to remain valid only as long as no
> >>> + * successful calls to rte_flow_create() or rte_flow_destroy() are made in
> >>> + * the meantime and no device parameters affecting flow rules in any way are
> >>> + * modified, due to possible collisions or resource limitations (although in
> >>> + * such cases EINVAL should not be returned).
> >>> + *
> >>> + * @param port_id
> >>> + *   Port identifier of Ethernet device.
> >>> + * @param[in] attr
> >>> + *   Flow rule attributes.
> >>> + * @param[in] pattern
> >>> + *   Pattern specification (list terminated by the END pattern item).
> >>> + * @param[in] actions
> >>> + *   Associated actions (list terminated by the END action).
> >>> + * @param[out] error
> >>> + *   Perform verbose error reporting if not NULL.
> >>> + *
> >>> + * @return
> >>> + *   0 if flow rule is valid and can be created. A negative errno value
> >>> + *   otherwise (rte_errno is also set), the following errors are defined:
> >>> + *
> >>> + *   -ENOSYS: underlying device does not support this functionality.
> >>> + *
> >>> + *   -EINVAL: unknown or invalid rule specification.
> >>> + *
> >>> + *   -ENOTSUP: valid but unsupported rule specification (e.g. partial
> >>> + *   bit-masks are unsupported).
> >>> + *
> >>> + *   -EEXIST: collision with an existing rule.
> >>> + *
> >>> + *   -ENOMEM: not enough resources.
> >>> + *
> >>> + *   -EBUSY: action cannot be performed due to busy device resources, may
> >>> + *   succeed if the affected queues or even the entire port are in a stopped
> >>> + *   state (see rte_eth_dev_rx_queue_stop() and rte_eth_dev_stop()).
> >>> + */
> >>> +int
> >>> +rte_flow_validate(uint8_t port_id,
> >>> +		  const struct rte_flow_attr *attr,
> >>> +		  const struct rte_flow_item pattern[],
> >>> +		  const struct rte_flow_action actions[],
> >>> +		  struct rte_flow_error *error);
> >>
> >> Why not just use rte_flow_create() and get an error? Is it less
> >> disruptive to do a validate and find the rule cannot be created, than
> >> using a create directly?
> > 
> > The rationale can be found in the original RFC, which I'll convert to actual
> > documentation in v2. In short:
> > 
> > - Calling rte_flow_validate() before rte_flow_create() is useless since
> >   rte_flow_create() also performs validation.
> > 
> > - We cannot possibly express a full static set of allowed flow rules, even
> >   if we could, it usually depends on the current hardware configuration
> >   therefore would not be static.
> > 
> > - rte_flow_validate() is thus provided as a replacement for capability
> >   flags. It can be used to determine during initialization if the underlying
> >   device can support the typical flow rules an application might want to
> >   provide later and do something useful with that information (e.g. always
> >   use software fallback due to HW limitations).
> > 
> > - rte_flow_validate() being a subset of rte_flow_create(), it is essentially
> >   free to expose.
> 
> make sense now, thanks.
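
For the record, a typical initialization-time use would be something like
this (sketch only, assuming attr/pattern/actions describe a representative
rule the application will need later):

struct rte_flow_error error;
int ret = rte_flow_validate(port_id, &attr, pattern, actions, &error);

if (ret) {
	/* Device cannot handle this kind of rule; remember to use a
	 * software fallback instead of rte_flow_create() later on. */
	printf("port %u: no HW offload (%s)\n", (unsigned int)port_id,
	       error.message ? error.message : rte_strerror(-ret));
}
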
> 
> > 
> >>> +
> >>> +/**
> >>> + * Create a flow rule on a given port.
> >>> + *
> >>> + * @param port_id
> >>> + *   Port identifier of Ethernet device.
> >>> + * @param[in] attr
> >>> + *   Flow rule attributes.
> >>> + * @param[in] pattern
> >>> + *   Pattern specification (list terminated by the END pattern item).
> >>> + * @param[in] actions
> >>> + *   Associated actions (list terminated by the END action).
> >>> + * @param[out] error
> >>> + *   Perform verbose error reporting if not NULL.
> >>> + *
> >>> + * @return
> >>> + *   A valid handle in case of success, NULL otherwise and rte_errno is set
> >>> + *   to the positive version of one of the error codes defined for
> >>> + *   rte_flow_validate().
> >>> + */
> >>> +struct rte_flow *
> >>> +rte_flow_create(uint8_t port_id,
> >>> +		const struct rte_flow_attr *attr,
> >>> +		const struct rte_flow_item pattern[],
> >>> +		const struct rte_flow_action actions[],
> >>> +		struct rte_flow_error *error);
> >>
> >> General question - are these functions threadsafe? In the OVS example
> >> you could have several threads wanting to create flow rules at the same
> >> time for same or different ports.
> > 
> > No they aren't, applications have to perform their own locking. The RFC (to
> > be converted to actual documentation in v2) says that:
> > 
> > - API operations are synchronous and blocking (``EAGAIN`` cannot be
> >   returned).
> > 
> > - There is no provision for reentrancy/multi-thread safety, although nothing
> >   should prevent different devices from being configured at the same
> >   time. PMDs may protect their control path functions accordingly.
> 
> other comment above wrt locking.
> 
> > 
> >>> +
> >>> +/**
> >>> + * Destroy a flow rule on a given port.
> >>> + *
> >>> + * Failure to destroy a flow rule handle may occur when other flow rules
> >>> + * depend on it, and destroying it would result in an inconsistent state.
> >>> + *
> >>> + * This function is only guaranteed to succeed if handles are destroyed in
> >>> + * reverse order of their creation.
> >>
> >> How can the application find this information out on error?
> > 
> > Without maintaining a list, they cannot. The specified case is the only
> > possible guarantee. That does not mean PMDs should not do their best to
> > destroy flow rules, only that ordering must remain consistent in case of
> > inability to destroy one.
> > 
> > What do you suggest?
> 
> I think if the app cannot remove a specific rule it may want to remove
> all rules and deal with flows in software for a time. So once the app
> knows it failed, that should be enough.

OK then, since destroying a rule can already return an error, is that enough?
Applications may call rte_flow_flush() (which is not supposed to fail unless
there is a serious issue, in which case abort()) and switch to a SW fallback.
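
In other words, the fallback path I have in mind looks like this
(application code, switch_to_sw_fallback() being a hypothetical helper):

if (rte_flow_destroy(port_id, flow, &error)) {
	/* This specific rule cannot be removed: drop all of them and
	 * handle flows in software for a while. */
	if (rte_flow_flush(port_id, &error))
		rte_panic("port %u: cannot flush flow rules (%s)\n",
			  (unsigned int)port_id,
			  error.message ? error.message : "unspecified");
	switch_to_sw_fallback(port_id);
}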

> >>> + *
> >>> + * @param port_id
> >>> + *   Port identifier of Ethernet device.
> >>> + * @param flow
> >>> + *   Flow rule handle to destroy.
> >>> + * @param[out] error
> >>> + *   Perform verbose error reporting if not NULL.
> >>> + *
> >>> + * @return
> >>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> >>> + */
> >>> +int
> >>> +rte_flow_destroy(uint8_t port_id,
> >>> +		 struct rte_flow *flow,
> >>> +		 struct rte_flow_error *error);
> >>> +
> >>> +/**
> >>> + * Destroy all flow rules associated with a port.
> >>> + *
> >>> + * In the unlikely event of failure, handles are still considered destroyed
> >>> + * and no longer valid but the port must be assumed to be in an inconsistent
> >>> + * state.
> >>> + *
> >>> + * @param port_id
> >>> + *   Port identifier of Ethernet device.
> >>> + * @param[out] error
> >>> + *   Perform verbose error reporting if not NULL.
> >>> + *
> >>> + * @return
> >>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> >>> + */
> >>> +int
> >>> +rte_flow_flush(uint8_t port_id,
> >>> +	       struct rte_flow_error *error);
> >>
> >> rte_flow_destroy_all() would be more descriptive (but breaks your style)
> > 
> > There are enough underscores as it is. I like flush; if enough people
> > complain we'll change it, but it has to happen before the first public
> > release.
> > 
> >>> +
> >>> +/**
> >>> + * Query an existing flow rule.
> >>> + *
> >>> + * This function allows retrieving flow-specific data such as counters.
> >>> + * Data is gathered by special actions which must be present in the flow
> >>> + * rule definition.
> >>
> >> re last sentence, it would be good if you can put a link to
> >> RTE_FLOW_ACTION_TYPE_COUNT
> > 
> > Will do, I did not know how until very recently.
> > 
> >>> + *
> >>> + * @param port_id
> >>> + *   Port identifier of Ethernet device.
> >>> + * @param flow
> >>> + *   Flow rule handle to query.
> >>> + * @param action
> >>> + *   Action type to query.
> >>> + * @param[in, out] data
> >>> + *   Pointer to storage for the associated query data type.
> >>
> >> can this be anything other than rte_flow_query_count?
> > 
> > Likely in the future. I've only defined this one as a counterpart for
> > existing API functionality and because we wanted to expose it in mlx5.
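
A counter query would then look like this (a sketch assuming the
rte_flow_query_count layout from this patch; PRIu64 comes from inttypes.h):

struct rte_flow_query_count count = { .reset = 1 }; /* read and reset */
struct rte_flow_error error;

if (!rte_flow_query(port_id, flow, RTE_FLOW_ACTION_TYPE_COUNT,
		    &count, &error) && count.hits_set)
	printf("hits: %" PRIu64 "\n", count.hits);
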
> > 
> >>> + * @param[out] error
> >>> + *   Perform verbose error reporting if not NULL.
> >>> + *
> >>> + * @return
> >>> + *   0 on success, a negative errno value otherwise and rte_errno is set.
> >>> + */
> >>> +int
> >>> +rte_flow_query(uint8_t port_id,
> >>> +	       struct rte_flow *flow,
> >>> +	       enum rte_flow_action_type action,
> >>> +	       void *data,
> >>> +	       struct rte_flow_error *error);
> >>> +
> >>> +#ifdef __cplusplus
> >>> +}
> >>> +#endif
> >>
> >> I don't see a way to dump all the rules for a port out. I think this is
> >> neccessary for degbugging. You could have a look through dpif.h in OVS
> >> and see how dpif_flow_dump_next() is used, it might be a good reference.
> > 
> > DPDK does not maintain flow rules and, depending on hardware capabilities
> > and level of compliance, PMDs do not necessarily do it either, particularly
> > since it requires space and applications probably have a better method to
> > store these pointers for their own needs.
> 
> understood
> 
> > 
> > What you see here is only a PMD interface. Depending on applications needs,
> > generic helper functions built on top of these may be added to manage flow
> > rules in the future.
> 
> I'm thinking of the case where something goes wrong and I want to get a
> dump of all the flow rules from hardware, not query the rules I think I
> have. I don't see a way to do it or something to build a helper on top of?

Generic helper functions would exist on top of this API and would likely
maintain a list of flow rules themselves. The dump in that case would be
entirely implemented in software. I think that recovering flow rules from HW
may be complicated in many cases (even without taking storage allocation and
rule conversion issues into account). If there is really a need for it, we
could perhaps add a dump() function that PMDs are free to implement later.
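
Such a helper layer could be as simple as this (hypothetical application
code using sys/queue.h lists and stdio.h, nothing PMD-specific):

struct app_rule {
	LIST_ENTRY(app_rule) next;
	struct rte_flow *flow;
	struct rte_flow_attr attr; /* copy kept for debugging */
	/* ... copies of pattern/actions if needed ... */
};
static LIST_HEAD(, app_rule) app_rules = LIST_HEAD_INITIALIZER(app_rules);

/* Dump everything the application created, without touching the HW. */
static void
app_rules_dump(FILE *f)
{
	const struct app_rule *rule;

	LIST_FOREACH(rule, &app_rules, next)
		fprintf(f, "flow %p: group %u priority %u\n",
			(void *)rule->flow, rule->attr.group,
			rule->attr.priority);
}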

-- 
Adrien Mazarguil
6WIND

