[dpdk-dev] [PATCH 15/15] mbuf: move pool pointer in hotter first half

Thomas Monjalon thomas at monjalon.net
Thu Oct 29 11:56:51 CET 2020


29/10/2020 11:50, Andrew Rybchenko:
> On 10/29/20 12:27 PM, Thomas Monjalon wrote:
> > The mempool pointer in the mbuf struct is moved
> > from the second to the first half.
> > It should increase performance on most systems having a 64-byte cache line,
> > i.e. where the mbuf is split across two cache lines.
> > On such systems, the first half (also called the first cache line) is hotter
> > than the second one, where the pool pointer was.
> > 
> > Moving this field gives more space to dynfield1.
> > 
> > This is how the mbuf layout looks (pahole-style):
> > 
> > word  type                              name                byte  size
> >  0    void *                            buf_addr;         /*   0 +  8 */
> >  1    rte_iova_t                        buf_iova;         /*   8 +  8 */
> >       /* --- RTE_MARKER64               rearm_data;                   */
> >  2    uint16_t                          data_off;         /*  16 +  2 */
> >       uint16_t                          refcnt;           /*  18 +  2 */
> >       uint16_t                          nb_segs;          /*  20 +  2 */
> >       uint16_t                          port;             /*  22 +  2 */
> >  3    uint64_t                          ol_flags;         /*  24 +  8 */
> >       /* --- RTE_MARKER                 rx_descriptor_fields1;        */
> >  4    uint32_t             union        packet_type;      /*  32 +  4 */
> >       uint32_t                          pkt_len;          /*  36 +  4 */
> >  5    uint16_t                          data_len;         /*  40 +  2 */
> >       uint16_t                          vlan_tci;         /*  42 +  2 */
> >  5.5  uint64_t             union        hash;             /*  44 +  8 */
> >  6.5  uint16_t                          vlan_tci_outer;   /*  52 +  2 */
> >       uint16_t                          buf_len;          /*  54 +  2 */
> >  7    struct rte_mempool *              pool;             /*  56 +  8 */
> >       /* --- RTE_MARKER                 cacheline1;                   */
> >  8    struct rte_mbuf *                 next;             /*  64 +  8 */
> >  9    uint64_t             union        tx_offload;       /*  72 +  8 */
> > 10    uint16_t                          priv_size;        /*  80 +  2 */
> >       uint16_t                          timesync;         /*  82 +  2 */
> >       uint32_t                          seqn;             /*  84 +  4 */
> > 11    struct rte_mbuf_ext_shared_info * shinfo;           /*  88 +  8 */
> > 12    uint64_t                          dynfield1[4];     /*  96 + 32 */
> > 16    /* --- END                                             128      */
> > 
> > Signed-off-by: Thomas Monjalon <thomas at monjalon.net>
> 
> I'd like to understand why pool is chosen instead of, for
> example, next pointer.
> 
> Pool is used for housekeeping when the driver refills the Rx ring or
> frees completed Tx mbufs. Free thresholds try to avoid doing it on
> every Rx/Tx burst (if possible).
> 
> Next is used for multi-segment Tx and scattered (and buffer
> split) Rx. IMHO the key question here is whether we consider these
> use cases common and a priority to optimize. If yes, I'd
> vote to have next on the first cacheline.
> 
> I'm not sure. Just trying to hear a bit more about it.

That's a good question.
Clearly pool and next are good options.
The best would be to have some benchmarks.
If one use case shows no benefit, the decision is easier.
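
To make the two cases concrete, this is roughly what such a benchmark
would exercise (just a sketch, not code from any specific driver;
the helper names below are made up for illustration):

#include <rte_mbuf.h>
#include <rte_mempool.h>

/* Case 1: Tx completion housekeeping - m->pool is read when giving
 * completed mbufs back to their mempool. Real drivers also handle
 * refcounts and bulk-free, skipped here. */
static void
tx_housekeeping(struct rte_mbuf **completed, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++) {
		struct rte_mbuf *m = completed[i];

		/* m->pool was in the second cacheline before this patch */
		rte_mempool_put(m->pool, m);
	}
}

/* Case 2: multi-segment Tx - m->next is read for every segment of
 * every packet when filling the descriptor ring. */
static unsigned int
count_tx_descs(struct rte_mbuf *m)
{
	unsigned int descs = 0;

	for (; m != NULL; m = m->next)
		descs++;
	return descs;
}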

If you prefer, we can leave this last patch for -rc3.
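
As a side note, whatever we decide, the intended placement can be
pinned down at build time with a couple of static asserts, e.g.
(offsets taken from the table above, assuming this patch is applied;
only an illustration, not part of the patch):

#include <stddef.h>
#include <rte_mbuf_core.h>

/* pool must stay in the first 64-byte cache line */
_Static_assert(offsetof(struct rte_mbuf, pool) < 64,
	"pool expected in the first cache line");
/* next starts the second cache line */
_Static_assert(offsetof(struct rte_mbuf, next) == 64,
	"next expected at the start of the second cache line");
/* dynfield1 grows, starting at offset 96 */
_Static_assert(offsetof(struct rte_mbuf, dynfield1) == 96,
	"dynfield1 expected at offset 96");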



