[dpdk-dev] [PATCH v2] ethdev: extend flow metadata

Olivier Matz olivier.matz at 6wind.com
Thu Oct 24 11:22:49 CEST 2019


Hi Slava,

On Thu, Oct 24, 2019 at 06:49:41AM +0000, Slava Ovsiienko wrote:
> Hi, Olivier
> 
> > > [snip]
> > >
> > > > > +int
> > > > > +rte_flow_dynf_metadata_register(void)
> > > > > +{
> > > > > +	int offset;
> > > > > +	int flag;
> > > > > +
> > > > > +	static const struct rte_mbuf_dynfield desc_offs = {
> > > > > +		.name = MBUF_DYNF_METADATA_NAME,
> > > > > +		.size = MBUF_DYNF_METADATA_SIZE,
> > > > > +		.align = MBUF_DYNF_METADATA_ALIGN,
> > > > > +		.flags = MBUF_DYNF_METADATA_FLAGS,
> > > > > +	};
> > > > > +	static const struct rte_mbuf_dynflag desc_flag = {
> > > > > +		.name = MBUF_DYNF_METADATA_NAME,
> > > > > +	};
> > > >
> > > > I don't see think we need #defines.
> > > > You can directly use the name, sizeof() and __alignof__() here.
> > > > If the information is used externally, the structure shall be made
> > > > global non- static.
> > >
> > > The intention was to gather all dynamic fields definitions in one
> > > place (in rte_mbuf_dyn.h).
> > 
> > If the dynamic field is only going to be used inside rte_flow, I think there is no
> > need to expose it in rte_mbuf_dyn.h.
> > The other reason is I think the #define are just "passthrough", and do not
> > really bring added value, just an indirection.
> > 
> > > It would be easy to see all fields in one sight (some might be shared,
> > > some might be mutual exclusive, estimate mbuf space, required by
> > > various features, etc.). So, we can't just fill structure fields with
> > > simple sizeof() and alignof() instead of definitions (the field
> > > parameters must be defined once).
> > >
> > > I do not see the reasons to make table global. I would prefer the
> > definitions.
> > > - the definitions are compile time processing (table fields are
> > > runtime), it provides code optimization and better performance.
> > 
> > There is indeed no need to make the table global if the field is private to
> > rte_flow. About better performance, my understanding is that it would only
> > impact registration, am I missing something?
> 
> OK, I thought about some opportunity to allow application to register
> field directly, bypassing rte_flow_dynf_metadata_register(). So either
> definitions or field description table was supposed to be global. 
> I agree, let's do not complicate the matter, I'll will make global the
> metadata field name definition only - in the rte_mbuf_dyn.h in order
> just to have some centralizing point.

By reading your mail, things are also clearer to me about which
parts need access to this field.

To summarize what I understand:
- dyn field registration is done in rte_flow lib when configuring
  a flow using META
- the dynamic field will never be get/set in a mbuf by a PMD or rte_flow
  before a flow using META is added

One question then: why would you need the dyn field name to be exported?
Does the PMD need to know if the field is registered with a lookup or
something like that? If yes, can you detail why?


> 
> > >
> > > > > +
> > > > > +	offset = rte_mbuf_dynfield_register(&desc_offs);
> > > > > +	if (offset < 0)
> > > > > +		goto error;
> > > > > +	flag = rte_mbuf_dynflag_register(&desc_flag);
> > > > > +	if (flag < 0)
> > > > > +		goto error;
> > > > > +	rte_flow_dynf_metadata_offs = offset;
> > > > > +	rte_flow_dynf_metadata_mask = (1ULL << flag);
> > > > > +	return 0;
> > > > > +
> > > > > +error:
> > > > > +	rte_flow_dynf_metadata_offs = -1;
> > > > > +	rte_flow_dynf_metadata_mask = 0ULL;
> > > > > +	return -rte_errno;
> > > > > +}
> > > > > +
> > > > >  static int
> > > > >  flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
> > > > > { diff --git a/lib/librte_ethdev/rte_flow.h
> > > > > b/lib/librte_ethdev/rte_flow.h index 391a44a..a27e619 100644
> > > > > --- a/lib/librte_ethdev/rte_flow.h
> > > > > +++ b/lib/librte_ethdev/rte_flow.h
> > > > > @@ -27,6 +27,8 @@
> > > > >  #include <rte_udp.h>
> > > > >  #include <rte_byteorder.h>
> > > > >  #include <rte_esp.h>
> > > > > +#include <rte_mbuf.h>
> > > > > +#include <rte_mbuf_dyn.h>
> > > > >
> > > > >  #ifdef __cplusplus
> > > > >  extern "C" {
> > > > > @@ -417,7 +419,8 @@ enum rte_flow_item_type {
> > > > >  	/**
> > > > >  	 * [META]
> > > > >  	 *
> > > > > -	 * Matches a metadata value specified in mbuf metadata field.
> > > > > +	 * Matches a metadata value.
> > > > > +	 *
> > > > >  	 * See struct rte_flow_item_meta.
> > > > >  	 */
> > > > >  	RTE_FLOW_ITEM_TYPE_META,
> > > > > @@ -1213,9 +1216,17 @@ struct
> > rte_flow_item_icmp6_nd_opt_tla_eth {
> > > > > #endif
> > > > >
> > > > >  /**
> > > > > - * RTE_FLOW_ITEM_TYPE_META.
> > > > > + * @warning
> > > > > + * @b EXPERIMENTAL: this structure may change without prior
> > > > > + notice
> > > > >   *
> > > > > - * Matches a specified metadata value.
> > > > > + * RTE_FLOW_ITEM_TYPE_META
> > > > > + *
> > > > > + * Matches a specified metadata value. On egress, metadata can be
> > > > > + set either by
> > > > > + * mbuf tx_metadata field with PKT_TX_METADATA flag or
> > > > > + * RTE_FLOW_ACTION_TYPE_SET_META. On ingress,
> > > > > + RTE_FLOW_ACTION_TYPE_SET_META sets
> > > > > + * metadata for a packet and the metadata will be reported via
> > > > > + mbuf metadata
> > > > > + * dynamic field with PKT_RX_DYNF_METADATA flag. The dynamic
> > mbuf
> > > > > + field must be
> > > > > + * registered in advance by rte_flow_dynf_metadata_register().
> > > > >   */
> > > > >  struct rte_flow_item_meta {
> > > > >  	rte_be32_t data;
> > > > > @@ -1813,6 +1824,13 @@ enum rte_flow_action_type {
> > > > >  	 * undefined behavior.
> > > > >  	 */
> > > > >  	RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK,
> > > > > +
> > > > > +	/**
> > > > > +	 * Set metadata on ingress or egress path.
> > > > > +	 *
> > > > > +	 * See struct rte_flow_action_set_meta.
> > > > > +	 */
> > > > > +	RTE_FLOW_ACTION_TYPE_SET_META,
> > > > >  };
> > > > >
> > > > >  /**
> > > > > @@ -2300,6 +2318,43 @@ struct rte_flow_action_set_mac {
> > > > >  	uint8_t mac_addr[RTE_ETHER_ADDR_LEN];  };
> > > > >
> > > > > +/**
> > > > > + * @warning
> > > > > + * @b EXPERIMENTAL: this structure may change without prior
> > > > > +notice
> > > > > + *
> > > > > + * RTE_FLOW_ACTION_TYPE_SET_META
> > > > > + *
> > > > > + * Set metadata. Metadata set by mbuf tx_metadata field with
> > > > > + * PKT_TX_METADATA flag on egress will be overridden by this action.
> > > > > +On
> > > > > + * ingress, the metadata will be carried by mbuf metadata dynamic
> > > > > +field
> > > > > + * with PKT_RX_DYNF_METADATA flag if set.  The dynamic mbuf field
> > > > > +must be
> > > > > + * registered in advance by rte_flow_dynf_metadata_register().
> > > > > + *
> > > > > + * Altering partial bits is supported with mask. For bits which
> > > > > +have never
> > > > > + * been set, unpredictable value will be seen depending on driver
> > > > > + * implementation. For loopback/hairpin packet, metadata set on
> > > > > +Rx/Tx may
> > > > > + * or may not be propagated to the other path depending on HW
> > > > capability.
> > > > > + *
> > > > > + * RTE_FLOW_ITEM_TYPE_META matches metadata.
> > > > > + */
> > > > > +struct rte_flow_action_set_meta {
> > > > > +	rte_be32_t data;
> > > > > +	rte_be32_t mask;
> > > > > +};
> > > > > +
> > > > > +/* Mbuf dynamic field offset for metadata. */ extern int
> > > > > +rte_flow_dynf_metadata_offs;
> > > > > +
> > > > > +/* Mbuf dynamic field flag mask for metadata. */ extern uint64_t
> > > > > +rte_flow_dynf_metadata_mask;
> > > > > +
> > > > > +/* Mbuf dynamic field pointer for metadata. */ #define
> > > > > +RTE_FLOW_DYNF_METADATA(m) \
> > > > > +	RTE_MBUF_DYNFIELD((m), rte_flow_dynf_metadata_offs, uint32_t
> > > > *)
> > > > > +
> > > > > +/* Mbuf dynamic flag for metadata. */ #define
> > > > > +PKT_RX_DYNF_METADATA
> > > > > +(rte_flow_dynf_metadata_mask)
> > > > > +
> > > >
> > > > I wonder if helpers like this wouldn't be better, because they
> > > > combine the flag and the field:
> > > >
> > > > /**
> > > >  * Set metadata dynamic field and flag in mbuf.
> > > >  *
> > > >  * rte_flow_dynf_metadata_register() must have been called first.
> > > >  */
> > > > __rte_experimental
> > > > static inline void rte_mbuf_dyn_metadata_set(struct rte_mbuf *m,
> > > >                                        uint32_t metadata) {
> > > >        *RTE_MBUF_DYNFIELD(m, rte_flow_dynf_metadata_offs,
> > > >                        uint32_t *) = metadata;
> > > >        m->ol_flags |= rte_flow_dynf_metadata_mask; }
> > > Setting flag looks redundantly.
> > > What if driver just replaces the metadata and flag is already set?
> > > The other option - the flags (for set of fields) might be set in combinations.
> > > mbuf field is supposed to be engaged in datapath, performance is very
> > > critical, adding one more abstraction layer seems not to be relevant.
> > 
> > Ok, that was just a suggestion. Let's use your accessors if you fear a
> > performance impact.
> The simple example - mlx5 PMD has the rx_burst routine implemented
> with vector instructions, and it processes four packets at once. No need
> to check field availability four times, and the storing the metadata
> is the subject for further optimization with vector instructions.
> It is a bit difficult to provide common helpers to handle the metadata
> field due to extremely high optimization requirements.

ok, got it

> > Nevertheless I suggest to use static inline functions in place of macros if
> > possible. For RTE_MBUF_DYNFIELD(), I used a macro because it's the only
> > way to provide a type to cast the result. But in your case, you know it's a
> > uint32_t *.
> What If one needs to specify the address of field? Macro allows to do that,
> inline functions - do not. Packets may be processed in bizarre ways,
> for example in a batch, with vector instructions. OK, I'll provide
> the set/get routines, but I'm not sure whether will use ones in mlx5 code.
> In my opinion it just obscures the field nature. Field is just a field, AFAIU, 
> it is main idea of your patch, the way to handle dynamic field should be close
> to handling usual static fields, I think. Macro pointer follows this approach,
> routines - does not.

Well, I just think that:
  rte_mbuf_set_timestamp(m, 1234);
is more readable than:
  *RTE_MBUF_TIMESTAMP(m) = 1234;

Anyway, in your case, if you need to use vector instructions in the PMD,
I guess you will directly use the offset.

> > > Also, metadata is not feature of mbuf. It should have rte_flow prefix.
> > 
> > Yes, sure. The example derives from a test I've done, and I forgot to change
> > it.
> > 
> > 
> > > > /**
> > > >  * Get metadata dynamic field value in mbuf.
> > > >  *
> > > >  * rte_flow_dynf_metadata_register() must have been called first.
> > > >  */
> > > > __rte_experimental
> > > > static inline int rte_mbuf_dyn_metadata_get(const struct rte_mbuf *m,
> > > >                                        uint32_t *metadata) {
> > > >        if ((m->ol_flags & rte_flow_dynf_metadata_mask) == 0)
> > > >                return -1;
> > > What if metadata is 0xFFFFFFFF ?
> > > The checking of availability might embrace larger code block, so this
> > > might be not the best place to check availability.
> > >
> > > >        *metadata = *RTE_MBUF_DYNFIELD(m,
> > rte_flow_dynf_metadata_offs,
> > > >                                uint32_t *);
> > > >        return 0;
> > > > }
> > > >
> > > > /**
> > > >  * Delete the metadata dynamic flag in mbuf.
> > > >  *
> > > >  * rte_flow_dynf_metadata_register() must have been called first.
> > > >  */
> > > > __rte_experimental
> > > > static inline void rte_mbuf_dyn_metadata_del(struct rte_mbuf *m) {
> > > >        m->ol_flags &= ~rte_flow_dynf_metadata_mask; }
> > > >
> > > Sorry, I do not see the practical usecase for these helpers. In my opinion it
> > is just some kind of obscuration.
> > > They do replace the very simple code and introduce some risk of
> > performance impact.
> > >
> > > >
> > > > >  /*
> > > > >   * Definition of a single action.
> > > > >   *
> > > > > @@ -2533,6 +2588,32 @@ enum rte_flow_conv_op {  };
> > > > >
> > > > >  /**
> > > > > + * Check if mbuf dynamic field for metadata is registered.
> > > > > + *
> > > > > + * @return
> > > > > + *   True if registered, false otherwise.
> > > > > + */
> > > > > +__rte_experimental
> > > > > +static inline int
> > > > > +rte_flow_dynf_metadata_avail(void) {
> > > > > +	return !!rte_flow_dynf_metadata_mask; }
> > > >
> > > > _registered() instead of _avail() ?
> > > Accepted, sounds better.
> 
> Hmm, I changed my opinion - we already have
> rte_flow_dynf_metadata_register(void). Is it OK to have
> rte_flow_dynf_metadata_registerED(void) ?
> It would be easy to mistype. 

what about xxx_is_registered() ?
if you feel it's too long, ok, let's keep avail()

> 
> > >
> > > >
> > > > > +
> > > > > +/**
> > > > > + * Register mbuf dynamic field and flag for metadata.
> > > > > + *
> > > > > + * This function must be called prior to use SET_META action in
> > > > > +order to
> > > > > + * register the dynamic mbuf field. Otherwise, the data cannot be
> > > > > +delivered to
> > > > > + * application.
> > > > > + *
> > > > > + * @return
> > > > > + *   0 on success, a negative errno value otherwise and rte_errno is
> > set.
> > > > > + */
> > > > > +__rte_experimental
> > > > > +int
> > > > > +rte_flow_dynf_metadata_register(void);
> > > > > +
> > > > > +/**
> > > > >   * Check whether a flow rule can be created on a given port.
> > > > >   *
> > > > >   * The flow rule is validated for correctness and whether it
> > > > > could be accepted diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > > b/lib/librte_mbuf/rte_mbuf_dyn.h index 6e2c816..4ff33ac 100644
> > > > > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > > @@ -160,4 +160,12 @@ int rte_mbuf_dynflag_lookup(const char
> > *name,
> > > > >   */
> > > > >  #define RTE_MBUF_DYNFIELD(m, offset, type) ((type)((uintptr_t)(m)
> > > > > +
> > > > > (offset)))
> > > > >
> > > > > +/**
> > > > > + * Flow metadata dynamic field definitions.
> > > > > + */
> > > > > +#define MBUF_DYNF_METADATA_NAME "flow-metadata"
> > > > > +#define MBUF_DYNF_METADATA_SIZE sizeof(uint32_t) #define
> > > > > +MBUF_DYNF_METADATA_ALIGN __alignof__(uint32_t) #define
> > > > > +MBUF_DYNF_METADATA_FLAGS 0
> > > >
> > > > If this flag is only to be used in rte_flow, it can stay in rte_flow.
> > > > The name should follow the function name conventions, I suggest
> > > > "rte_flow_metadata".
> > >
> > > The definitions:
> > > MBUF_DYNF_METADATA_NAME,
> > > MBUF_DYNF_METADATA_SIZE,
> > > MBUF_DYNF_METADATA_ALIGN
> > > are global. rte_flow proposes only minimal set tyo check and access
> > > the metadata. By knowing the field names applications would have the
> > > more flexibility in processing the fields, for example it allows to
> > > optimize the handling of multiple dynamic fields . The definition of
> > > metadata size allows to generate optimized code:
> > > #if MBUF_DYNF_METADATA_SIZE == sizeof(uint32)
> > > 	*RTE_MBUF_DYNFIELD(m) = get_metadata_32bit() #else
> > > 	*RTE_MBUF_DYNFIELD(m) = get_metadata_64bit() #endif
> > 
> > I don't see any reason why the same dynamic field could have different sizes,
> > I even think it could be dangerous. Your accessors suppose that the
> > metadata is a uint32_t. Having a compile-time option for that does not look
> > desirable.
> 
> I tried to provide maximal flexibility and It was just an example of the thing
> we could do with global definitions. If you think we do not need it - OK,
> let's do things simpler.
> 
> > 
> > Just a side note: we have to take care when adding a new *public* dynamic
> > field that it won't change in the future: the accessors are macros or static
> > inline functions, so they are embedded in the binaries.
> > This is probably something we should discuss and may not be when updating
> > the dpdk (as shared lib).
> 
> Yes, agree, defines just will not work correct in correct way and even break an ABI.
> As we decided - global metadata defines MBUF_DYNF_METADATA_xxxx
> should be removed.
> 
> > 
> > > MBUF_DYNF_METADATA_FLAGS flag is not used by rte_flow, this flag is
> > > related exclusively to dynamic mbuf  " Reserved for future use, must be 0".
> > > Would you like to drop this definition?
> > >
> > > >
> > > > If the flag is going to be used in several places in dpdk (rte_flow,
> > > > pmd, app, ...), I wonder if it shouldn't be defined it in rte_mbuf_dyn.c. I
> > mean:
> > > >
> > > > ====
> > > > /* rte_mbuf_dyn.c */
> > > > const struct rte_mbuf_dynfield rte_mbuf_dynfield_flow_metadata = {
> > > >    ...
> > > > };
> > > In this case we would make this descriptor global.
> > > It is no needed, because there Is no supposed any usage, but by
> > > rte_flow_dynf_metadata_register() only. The
> > 
> > Yes, in my example I wasn't sure it was going to be private to rte_flow (see
> > "If the flag is going to be used in several places in dpdk (rte_flow, pmd, app,
> > ...)").
> > 
> > So yes, I agree the struct should remain private.
> OK.
> 
> > 
> > 
> > > > int rte_mbuf_dynfield_flow_metadata_offset = -1; const struct
> > > > rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata = {
> > > >    ...
> > > > };
> > > > int rte_mbuf_dynflag_flow_metadata_bitnum = -1;
> > > >
> > > > int rte_mbuf_dyn_flow_metadata_register(void)
> > > > {
> > > > ...
> > > > }
> > > >
> > > > /* rte_mbuf_dyn.h */
> > > > extern const struct rte_mbuf_dynfield
> > > > rte_mbuf_dynfield_flow_metadata; extern int
> > > > rte_mbuf_dynfield_flow_metadata_offset;
> > > > extern const struct rte_mbuf_dynflag rte_mbuf_dynflag_flow_metadata;
> > > > extern int rte_mbuf_dynflag_flow_metadata_bitnum;
> > > >
> > > > ...helpers to set/get metadata...
> > > > ===
> > > >
> > > > Centralizing the definitions of non-private dynamic fields/flags in
> > > > rte_mbuf_dyn may help other people to reuse a field that is well
> > > > described if it match their use-case.
> > >
> > > Yes, centralizing is important, that's why MBUF_DYNF_METADATA_xxx
> > > placed in rte_mbuf_dyn.h. Do you think we should share the descriptors
> > either?
> > > I have no idea why someone (but rte_flow_dynf_metadata_register())
> > > might register metadata field directly.
> > 
> > If the field is private to rte_flow, yes, there is no need to share the "struct
> > rte_mbuf_dynfield". Even the rte_flow_dynf_metadata_register() could be
> > marked as internal, right?
> rte_flow_dynf_metadata_register() is intended to be called by application.
> Some applications might wish to engage metadata feature, some ones - not.
> 
> > 
> > One more question: I see the registration is done by
> > parse_vc_action_set_meta(). My understanding is that this function is not in
> > datapath, and is called when configuring rte_flow. Do you confirm?
> Rather it is called to configure application in general. If user sets metadata 
> (by issuing the appropriate command) it is assumed he/she would like
> the metadata on Rx side either. This is just for test purposes and it is not brilliant
> example of rte_flow_dynf_metadata_register() use case.
> 
> 
> > 
> > > > In your case, what is carried by metadata? Could it be reused by
> > > > others? I think some more description is needed.
> > > In my case, metadata is just opaquie rte_flow related 32-bit unsigned
> > > value provided by
> > > mlx5 hardrware in rx datapath. I have no guess whether someone wishes
> > to reuse.
> > 
> > What is the user supposed to do with this value? If it is hw-specific data, I
> > think the name of the mbuf field should include "MLX", and it should be
> > described.
> 
> Metadata are not HW specific at all - they neither control nor are produced
> by HW (abstracting from the flow engine is implemented in HW).
> Metadata are some opaque data, it is some kind of link between data
> path and flow space.  With metadata application may provide some per packet
> information to flow engine and get back some information from flow engine.
> it is generic concept, supposed to be neither HW-related nor vendor specific.

ok, understood, it's like a mark or tag.

> > Are these rte_flow actions somehow specific to mellanox drivers ?
> 
> AFAIK, currently it is going to be supported by mlx5 PMD only,
> but concept is common and is not vendor specific.
> 
> > 
> > > Brief summary of you comment (just to make sure I understood your
> > proposal in correct way):
> > > 1. drop all definitions MBUF_DYNF_METADATA_xxx, leave
> > > MBUF_DYNF_METADATA_NAME only 2. move the descriptor const struct
> > > rte_mbuf_dynfield desc_offs = {} to rte_mbuf_dyn.c and make it global
> > > 3. provide helpers to access metadata
> > >
> > > [1] and [2] look OK in general. Although I think these ones make code less
> > flexible, restrict the potential compile time options.
> > > For now it is rather theoretical question, if you insist on your
> > > approach - please, let me know, I'll address [1] and [2] and update.my
> > patch.
> > 
> > [1] I think the #define only adds an indirection, and I didn't see any
> >     perf constraint here.
> > [2] My previous comment was surely not clear, sorry. The code can stay
> >     in rte_flow.
> > 
> > > As for [3] - IMHO, the extra abstraction layer is not useful, and might be
> > even harmful.
> > > I tend not to complicate the code, at least, for now.
> > 
> > [3] ok for me
> > 
> > 
> > Thanks,
> > Olivier
> 
> With best regards, Slava

Thanks


More information about the dev mailing list