[PATCH v7 2/4] mbuf: remove rte marker fields
Morten Brørup
mb at smartsharesystems.com
Thu Mar 21 17:19:25 CET 2024
> From: Tyler Retzlaff [mailto:roretzla at linux.microsoft.com]
> Sent: Thursday, 21 March 2024 16.31
>
> On Thu, Mar 21, 2024 at 10:32:02AM +0000, Bruce Richardson wrote:
> > On Wed, Mar 20, 2024 at 03:01:36PM -0700, Tyler Retzlaff wrote:
> > > RTE_MARKER typedefs are a GCC extension unsupported by MSVC. Remove
> > > RTE_MARKER fields from rte_mbuf struct.
> > >
> > > Maintain alignment of fields after removed cacheline1 marker by placing
> > > C11 alignas(RTE_CACHE_LINE_MIN_SIZE).
> > >
> > > Provide new rearm_data and rx_descriptor_fields1 fields in anonymous
> > > unions as single element arrays of with types matching the original
> > > markers to maintain API compatibility.
> > >
> > > Signed-off-by: Tyler Retzlaff <roretzla at linux.microsoft.com>
> > > ---
> > > doc/guides/rel_notes/release_24_03.rst | 2 +
> > > lib/mbuf/rte_mbuf.h | 4 +-
> > > lib/mbuf/rte_mbuf_core.h | 188 ++++++++++++++++++----------
> -----
> > > 3 files changed, 104 insertions(+), 90 deletions(-)
> > >
> > > diff --git a/doc/guides/rel_notes/release_24_03.rst
> b/doc/guides/rel_notes/release_24_03.rst
> > > index 14826ea..4f18cca 100644
> > > --- a/doc/guides/rel_notes/release_24_03.rst
> > > +++ b/doc/guides/rel_notes/release_24_03.rst
> > > @@ -216,6 +216,8 @@ Removed Items
> > >
> > > * acc101: Removed obsolete code for non productized HW variant.
> > >
> > > +* mbuf: ``RTE_MARKER`` fields ``cacheline0`` and ``cacheline1``
> > > + have been removed from ``struct rte_mbuf``.
> > >
> > > API Changes
> > > -----------
> > > diff --git a/lib/mbuf/rte_mbuf.h b/lib/mbuf/rte_mbuf.h
> > > index 286b32b..4c4722e 100644
> > > --- a/lib/mbuf/rte_mbuf.h
> > > +++ b/lib/mbuf/rte_mbuf.h
> > > @@ -108,7 +108,7 @@
> > > static inline void
> > > rte_mbuf_prefetch_part1(struct rte_mbuf *m)
> > > {
> > > - rte_prefetch0(&m->cacheline0);
> > > + rte_prefetch0(m);
> > > }
> > >
> > > /**
> > > @@ -126,7 +126,7 @@
> > > rte_mbuf_prefetch_part2(struct rte_mbuf *m)
> > > {
> > > #if RTE_CACHE_LINE_SIZE == 64
> > > - rte_prefetch0(&m->cacheline1);
> > > + rte_prefetch0(RTE_PTR_ADD(m, RTE_CACHE_LINE_MIN_SIZE));
> > > #else
> > > RTE_SET_USED(m);
> > > #endif
> > > diff --git a/lib/mbuf/rte_mbuf_core.h b/lib/mbuf/rte_mbuf_core.h
> > > index 9f58076..665213c 100644
> > > --- a/lib/mbuf/rte_mbuf_core.h
> > > +++ b/lib/mbuf/rte_mbuf_core.h
> > > @@ -465,8 +465,6 @@ enum {
> > > * The generic rte_mbuf, containing a packet mbuf.
> > > */
> > > struct __rte_cache_aligned rte_mbuf {
> > > - RTE_MARKER cacheline0;
> > > -
> > > void *buf_addr; /**< Virtual address of segment buffer. */
> > > #if RTE_IOVA_IN_MBUF
> > > /**
> > > @@ -488,116 +486,130 @@ struct __rte_cache_aligned rte_mbuf {
> > > #endif
> > >
> > > /* next 8 bytes are initialised on RX descriptor rearm */
> > > - RTE_MARKER64 rearm_data;
> > > - uint16_t data_off;
> > > -
> > > - /**
> > > - * Reference counter. Its size should at least equal to the size
> > > - * of port field (16 bits), to support zero-copy broadcast.
> > > - * It should only be accessed using the following functions:
> > > - * rte_mbuf_refcnt_update(), rte_mbuf_refcnt_read(), and
> > > - * rte_mbuf_refcnt_set(). The functionality of these functions (atomic,
> > > - * or non-atomic) is controlled by the RTE_MBUF_REFCNT_ATOMIC flag.
> > > - */
> > > - RTE_ATOMIC(uint16_t) refcnt;
> > > + union {
> > > + uint64_t rearm_data[1];
> > > + __extension__
> > > + struct {
> > > + uint16_t data_off;
> > > +
> > > + /**
> > > + * Reference counter. Its size should at least equal to
> the size
> > > + * of port field (16 bits), to support zero-copy
> broadcast.
> > > + * It should only be accessed using the following
> functions:
> > > + * rte_mbuf_refcnt_update(), rte_mbuf_refcnt_read(), and
> > > + * rte_mbuf_refcnt_set(). The functionality of these
> functions (atomic,
> > > + * or non-atomic) is controlled by the
> RTE_MBUF_REFCNT_ATOMIC flag.
> > > + */
> > > + RTE_ATOMIC(uint16_t) refcnt;
> > >
> > > - /**
> > > - * Number of segments. Only valid for the first segment of an mbuf
> > > - * chain.
> > > - */
> > > - uint16_t nb_segs;
> > > + /**
> > > + * Number of segments. Only valid for the first segment of
> an mbuf
> > > + * chain.
> > > + */
> > > + uint16_t nb_segs;
> > >
> > > - /** Input port (16 bits to support more than 256 virtual ports).
> > > - * The event eth Tx adapter uses this field to specify the output port.
> > > - */
> > > - uint16_t port;
> > > + /** Input port (16 bits to support more than 256 virtual
> ports).
> > > + * The event eth Tx adapter uses this field to specify the
> output port.
> > > + */
> > > + uint16_t port;
> > > + };
> > > + };
> > >
> > > uint64_t ol_flags; /**< Offload features. */
> > >
> > > /* remaining bytes are set on RX when pulling packet from descriptor */
> > > - RTE_MARKER rx_descriptor_fields1;
> > > -
> > > - /*
> > > - * The packet type, which is the combination of outer/inner L2, L3, L4
> > > - * and tunnel types. The packet_type is about data really present in the
> > > - * mbuf. Example: if vlan stripping is enabled, a received vlan packet
> > > - * would have RTE_PTYPE_L2_ETHER and not RTE_PTYPE_L2_VLAN because the
> > > - * vlan is stripped from the data.
> > > - */
> > > union {
> > > - uint32_t packet_type; /**< L2/L3/L4 and tunnel information. */
> > > + void *rx_descriptor_fields1[1];
> >
> > Can we make this array the actual size of all the fields, rather than just
> > an 8-byte value? That would allow the right think to be done if assigning
> > the descriptor fields from one mbuf to another, or when using memset or
> > memcpy on them.
>
> Morten pointed out in a previous version that the marker being an array of
> void * was a bug to begin with.
>
> The other field of the union is 24 bytes. I suppose it would be possible
> to conditionally compile the array to be either 3 or 6 elements. I guess
> this would be an improvement over what the marker is doing now.
Agree; it would be an improvement to give it the same size as the other struct in the union.
>
> Just a reminder that we cannot 'correct' the type since that would
> require adaptation of calling code.
I considered the following:
Only drivers should be using rx_descriptor_fields1.
We could probably change it to an array of uint64_t (or uint32_t, or even uint8_t) without breaking anything, because the drivers should only be using the address of rx_descriptor_fields1, not the value of it.
However, keeping it an array of void* is certain to avoid any 32/64 bit CPU alignment related issues, because void* is the natural size to any CPU.
>
> What do others think? Keep it as a single element array or conditional
> compile based on sizeof(void *)?
Going for an array of 3/6 void pointers should be safe. I'm in favor of this.
At your discretion, if you think it clarifies anything, consider adding a comment that the type void* is used for historical reasons (or something similar).
More information about the dev
mailing list