[dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half
Ananyev, Konstantin
konstantin.ananyev at intel.com
Thu Oct 29 15:15:41 CET 2020
>
> 29/10/2020 11:50, Andrew Rybchenko:
> > On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> > > The mempool pointer in the mbuf struct is moved
> > > from the second to the first half.
> > > It should increase performance on most systems, which have 64-byte
> > > cache lines, i.e. where the mbuf is split across two cache lines.
> > > On such system, the first half (also called first cache line) is hotter
> > > than the second one where the pool pointer was.
> > >
> > > Moving this field gives more space to dynfield1.
> > >
> > > This is how the mbuf layout looks (pahole-style):
> > >
> > > word type name byte size
> > > 0 void * buf_addr; /* 0 + 8 */
> > > 1 rte_iova_t buf_iova; /* 8 + 8 */
> > > /* --- RTE_MARKER64 rearm_data; */
> > > 2 uint16_t data_off; /* 16 + 2 */
> > > uint16_t refcnt; /* 18 + 2 */
> > > uint16_t nb_segs; /* 20 + 2 */
> > > uint16_t port; /* 22 + 2 */
> > > 3 uint64_t ol_flags; /* 24 + 8 */
> > > /* --- RTE_MARKER rx_descriptor_fields1; */
> > > 4 uint32_t union packet_type; /* 32 + 4 */
> > > uint32_t pkt_len; /* 36 + 4 */
> > > 5 uint16_t data_len; /* 40 + 2 */
> > > uint16_t vlan_tci; /* 42 + 2 */
> > > 5.5 uint64_t union hash; /* 44 + 8 */
> > > 6.5 uint16_t vlan_tci_outer; /* 52 + 2 */
> > > uint16_t buf_len; /* 54 + 2 */
> > > 7 struct rte_mempool * pool; /* 56 + 8 */
> > > /* --- RTE_MARKER cacheline1; */
> > > 8 struct rte_mbuf * next; /* 64 + 8 */
> > > 9 uint64_t union tx_offload; /* 72 + 8 */
> > > 10 uint16_t priv_size; /* 80 + 2 */
> > > uint16_t timesync; /* 82 + 2 */
> > > uint32_t seqn; /* 84 + 4 */
> > > 11 struct rte_mbuf_ext_shared_info * shinfo; /* 88 + 8 */
> > > 12 uint64_t dynfield1[4]; /* 96 + 32 */
> > > 16 /* --- END 128 */
> > >
> > > Signed-off-by: Thomas Monjalon <thomas at monjalon.net>
> >
> > I'd like to understand why pool is chosen instead of, for
> > example, next pointer.
> >
> > Pool is used on housekeeping when driver refills Rx ring or
> > free completed Tx mbufs. Free thresholds try to avoid it on
> > every Rx/Tx burst (if possible).
> >
> > Next is used for multi-segment Tx and scattered (and buffer
> > split) Rx. IMHO the key question here is we consider these
> > use cases as common and priority to optimize. If yes, I'd
> > vote to have next on the first cacheline.
Between these two I would also probably lean towards *next*
(after all, _free_ also has to access/update next).
As another alternative to consider: tx_offload.
It is also used quite widely.
> >
> > I'm not sure. Just trying to hear a bit more about it.
>
> That's a good question.
> Clearly pool and next are good options.
> The best would be to have some benchmarks.
> If one use case shows no benefit, the decision is easier.
>
> If you prefer, we can leave this last patch for -rc3.
>