[dpdk-dev] [PATCH] rte_mbuf: scattered pktmbufs freeing optimization

Vadim Suraev vadim.suraev at gmail.com
Fri Feb 27 18:09:32 CET 2015


Hi, Olivier, Hi Konstantin,

>Indeed, this function looks useful, and I also have a work in progress
>on this topic, but currently it is not well tested.
I'm sorry, I didn't know. I'll not interfere with my patch))

>About the inlining, I have no objection now, although Stephen may be
>right. I think we can consider un-inline some functions, based on
>performance measurements.
I've also noticed in many cases it makes no difference. Seems to be some
trade-off.

>- clarify the difference between raw_alloc/raw_free and
>  mempool_get/mempool_put: For instance, I think that the reference
>  counter initialization should be managed by rte_pktmbuf_reset() like
>  the rest of the fields, therefore raw_alloc/raw_free could be replaced
It looks useful to me since not all of the fields are used in every
particular application so
if the allocation is decoupled from reset, one can save some cycles.

Regards,
 Vadim.

On Fri, Feb 27, 2015 at 3:20 PM, Olivier MATZ <olivier.matz at 6wind.com>
wrote:

> Hi Vadim, Hi Konstantin,
>
> On 02/27/2015 01:18 PM, Vadim Suraev wrote:
> > Hi, Konstantin,
> >
> >> Seems really useful.
>
> Indeed, this function looks useful, and I also have a work in progress
> on this topic, but currently it is not well tested.
>
> As we are on the mbuf subject, for 2.1, I would like to discuss some
> points of mbuf API:
>
> - support mbuf and segment bulk allocation/free
> - clarify the difference between raw_alloc/raw_free and
>   mempool_get/mempool_put: For instance, I think that the reference
>   counter initialization should be managed by rte_pktmbuf_reset() like
>   the rest of the fields, therefore raw_alloc/raw_free could be replaced
>   by mempool functions (this would imply some modifications in drivers).
> - clarify which functions are internal and which are not. Some could be
>   made public and hence should be documented: from what I see, the
>   vhost example currently uses internal APIs and I suspect it may
>   not work when MBUF_DEBUG=y.
> - factorize rte_rxmbuf_alloc() which is duplicated in almost any driver
> - discuss if driver should be calling pktmbuf_reset().
>   It would avoid some bugs where fields are not properly initialized.
> - check if rte_pktmbuf_free() is called on mbufs with m->next or
>   refcount not initialized
> - support clones of clones
> - support attach/detach of mbufs embedding a private part
> - move m->next in the first cache line as it's initialized in rx
> - rename "__rte_pktmbuf_prefree_seg" in something clearer like "decref".
>
> To avoid doing the work twice, please note that I already have an
> implementation for:
> - rte_mbuf_raw_free_bulk(): each mbuf return in its own pool
> - rte_mbuf_free_seg_bulk(): similar to call rte_pktmbuf_free_seg()
>   on each mbuf
> but they are not tested at all now.
>
> I will submit a RFC series in the coming weeks that addresses these
> issues. It could be used as a basis for a discussion.
>
> Sorry Vadim for polluting the topic, but I think the subjects are
> quite linked together. Please find some comments on your patch below.
>
> >> One thought - why to introduce the limitation that all mbufs have to be
> > from the same mempool?
> >> I think you can reorder it a bit, so it can handle situation when
> chained
> > mbufs belong to different mempools.
> >
> > I had a doubt, my concern was how practical is that (multiple mempools)
> > case?
>
> I think having several mempool is common on multi-socket machines.
>
> > Do you think there should be two versions: lightweight (with the
> > restriction) and generic?
>
> I think having the version that supports different mempools is more
> important. Checking if the mempool is the same may not cost too much
> compared to the rest of the work. But if we find a good use case
> where having the lightweight version is preferable, we could have
> 2 versions (ex: if performance is really different).
>
> >>> +/* This macro defines the size of max bulk of mbufs to free for
> >> rte_pktmbuf_free_bulk */
> >>> +#define MAX_MBUF_FREE_SIZE 32
>
> I wonder if it's required to have this macro.
>
> In my implementation, I use a dynamic-sized table:
>
> static inline void
> rte_mbuf_free_seg_bulk(struct rte_mbuf **m_table, unsigned n)
> {
>         void *obj_table[n];
> [...]
>
>
> >>> + *
> >>> + * @param head
> >>> + *   The head of mbufs to be freed chain
> >>> + */
> >>> +
> >>> +static inline void __attribute__((always_inline))
> >>> +rte_pktmbuf_free_bulk(struct rte_mbuf *head)
>
> About the inlining, I have no objection now, although Stephen may be
> right. I think we can consider un-inline some functions, based on
> performance measurements.
>
> >>> +{
> >>> +    void *mbufs[MAX_MBUF_FREE_SIZE];
> >>> +    unsigned mbufs_count = 0;
> >>> +    struct rte_mbuf *next;
> >>> +
> >>> +    RTE_MBUF_MEMPOOL_CHECK1(head);
> >>> +
> >>> +    while(head) {
> >>> +        next = head->next;
> >>> +        head->next = NULL;
> >>> +        if(__rte_pktmbuf_prefree_seg(head)) {
> >>> +            RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(head) == 0);
> >>> +            RTE_MBUF_MEMPOOL_CHECK2(head);
> >>> +            mbufs[mbufs_count++] = head;
> >>> +        }
> >>> +        head = next;
> >>> +        if(mbufs_count == MAX_MBUF_FREE_SIZE) {
> >>> +            rte_mempool_put_bulk(((struct rte_mbuf
> >> *)mbufs[0])->pool,mbufs,mbufs_count);
> >>> +            mbufs_count = 0;
> >>> +        }
> >>> +    }
> >>> +    if(mbufs_count > 0) {
> >>> +        rte_mempool_put_bulk(((struct rte_mbuf
> >> *)mbufs[0])->pool,mbufs,mbufs_count);
> >>> +    }
> >>> +}
> >>> +
>
> Also, I think a unit test should be added.
>
> Thanks,
> Olivier
>


More information about the dev mailing list