[dpdk-dev] [PATCH 2/2] Adding the routines rte_pktmbuf_alloc_bulk() and rte_pktmbuf_free_bulk()

Ananyev, Konstantin konstantin.ananyev at intel.com
Tue Oct 7 17:42:47 CEST 2014


Hi Keith,

> -----Original Message-----
> From: Wiles, Roger Keith [mailto:keith.wiles at windriver.com]
> Sent: Tuesday, October 07, 2014 3:22 PM
> To: Ananyev, Konstantin
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/2] Adding the routines rte_pktmbuf_alloc_bulk() and rte_pktmbuf_free_bulk()
> 
> 
> On Oct 7, 2014, at 4:09 AM, Ananyev, Konstantin <konstantin.ananyev at intel.com> wrote:
> 
> >
> >
> >> -----Original Message-----
> >> From: Wiles, Roger Keith [mailto:keith.wiles at windriver.com]
> >> Sent: Monday, October 06, 2014 9:08 PM
> >> To: Ananyev, Konstantin
> >> Cc: dev at dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH 2/2] Adding the routines rte_pktmbuf_alloc_bulk() and rte_pktmbuf_free_bulk()
> >>
> >> Attaching to the list does not work. If you want the code, let me know; it is only about 5K in size.
> >>
> >> On Oct 6, 2014, at 2:45 PM, Wiles, Roger Keith <keith.wiles at windriver.com> wrote:
> >>
> >>>
> >>> On Oct 6, 2014, at 11:13 AM, Wiles, Roger Keith <keith.wiles at windriver.com> wrote:
> >>>
> >>>>
> >>>> On Oct 6, 2014, at 10:54 AM, Ananyev, Konstantin <konstantin.ananyev at intel.com> wrote:
> >>>>
> >>>>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> >>>>>> Sent: Monday, October 06, 2014 3:54 PM
> >>>>>> To: Wiles, Roger Keith (Wind River)
> >>>>>> Cc: dev at dpdk.org
> >>>>>> Subject: Re: [dpdk-dev] [PATCH 2/2] Adding the routines rte_pktmbuf_alloc_bulk() and rte_pktmbuf_free_bulk()
> >>>>>>
> >>>>>> On Mon, Oct 06, 2014 at 03:50:38PM +0100, Wiles, Roger Keith wrote:
> >>>>>>> Hi Bruce,
> >>>>>>>
> >>>>>>> Do I need to reject the patch for the new routines, or just make sure the vector driver does not get updated to use those routines?
> >>>>>>>
> >>>>>>
> >>>>>> The new routines are probably useful in the general case. I see no issue
> >>>>>> with having them in the code, so long as the vector driver is not modified
> >>>>>> to use them.
> >>>>>
> >>>>> I'd say the same thing for the non-vector RX/TX PMD code paths too.
> >>>>>
> >>>>> BTW, are the new functions' comments valid?
> >>>>>
> >>>>> + * @return
> >>>>> + *   - 0 if the number of mbufs allocated was ok
> >>>>> + *   - <0 is an ERROR.
> >>>>> + */
> >>>>> +static inline int __rte_mbuf_raw_alloc_bulk(
> >>>>>
> >>>>> Though, as far as I can see, __rte_mbuf_raw_alloc_bulk() returns either:
> >>>>> - the number of allocated mbufs (cnt)
> >>>>> - a negative error code
> >>>>
> >>>> Let me fix up the comments.
> >>>>>
> >>>>> And:
> >>>>> + * @return
> >>>>> + *   - The number of valid mbufs pointers in the m_list array.
> >>>>> + *   - Zero if the request cnt could not be allocated.
> >>>>> + */
> >>>>> +static inline int __attribute__((always_inline))
> >>>>> +rte_pktmbuf_alloc_bulk(struct rte_mempool *mp, struct rte_mbuf *m_list[], int16_t cnt)
> >>>>> +{
> >>>>> +     return __rte_mbuf_raw_alloc_bulk(mp, m_list, cnt);
> >>>>> +}
> >>>>>
> >>>>> Shouldn't it be "less than zero if the requested cnt could not be allocated"?
> >>>>>
> >>>>> BTW, is there any point in having __rte_mbuf_raw_alloc_bulk() at all?
> >>>>> After all, as you are calling rte_pktmbuf_reset() inside it, it doesn't look __raw__ any more.
> >>>>> You might just put its content into rte_pktmbuf_alloc_bulk() and get rid of it.
> >>>>>
> >>>> I was just following the non-bulk routine style __rte_mbuf_raw_alloc(), but I can pull that into a single routine.
> >>>>
> >>>>> I also wonder: what is the advantage of having multiple counters inside the same loop?
> >>>>> i.e:
> >>>>> +             for(i = 0; i < cnt; i++) {
> >>>>> +                     m = *m_list++;
> >>>>>
> >>>>> Why not just:
> >>>>>
> >>>>> for(i = 0; i < cnt; i++) {
> >>>>> m = &m_list[i];
> >>>>>
> >>>>> Same for free:
> >>>>> +     while(npkts--)
> >>>>> +             rte_pktmbuf_free(*m_list++);
> >>>>>
> >>>>> Why not just:
> >>>>> for (i = 0; i < npkts; i++)
> >>>>>   rte_pktmbuf_free(&m_list[i]);
> >>>>
> >>>> Maybe I have it wrong or the compilers are doing the right thing now, but at one point the &m_list[i] would cause the compiler to generate a shift or multiply of 'i' and then add it to the base of m_list. If that is not the case anymore then I can update the code as you suggested. Using the *m_list++ just adds the size of a pointer to a register and continues.
> >>>
> >>> I compared the clang assembler (.s file) output from some example test code I wrote to see whether the two styles produce different code; I found no difference and the code looked the same. I am not an Intel assembler expert and I would suggest someone else determine if it generates different code. I tried to compare the GCC outputs and they did look the same to me.
> >
> > That was my question:
> > Modern compilers are able to generate good code for a simple loop like the one above.
> > So what is the point of using 2 iterators inside the loop when just one is enough?
> > Nothing wrong technically, but it makes the code a bit harder to follow.
> > Plus, in general, it is good practice to minimise the number of iterators inside the loop when possible.
> >
> > Konstantin
> 
> Hi Konstantin,
> 
> I really do not understand the concern if the code is the same, as it appears to me the current patch is very clean and simple. Maybe you have not seen the v2 patch and now the v3 patch I sent this morning to address Bruce's comment.
> 
> For the free routine, your suggestion would require an extra counter variable, a bit more code, and a 'for' loop instead of a 'while' loop.

My point was that a single iterator is enough for each of the two loops.
In general, it is good practice to minimise the number of iterators per loop where possible:
in some cases the compiler might get confused and fail to eliminate the redundant iterators itself.
Though yes, technically there is nothing wrong with your approach.
So if you prefer to keep it as it is, I won't insist.
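
As a rough illustration of the two loop styles discussed here, the sketch below frees an array of buffers once with the pointer-increment loop from the patch and once with a single index. The names fake_mbuf, fake_free, free_bulk_ptr, free_bulk_idx and demo_freed are hypothetical stand-ins and not part of DPDK; the real routines operate on struct rte_mbuf and call rte_pktmbuf_free(). Note that the indexed form passes m_list[i], since each array element is already a pointer.

```c
/* Hypothetical stand-in for struct rte_mbuf; only a "freed" flag is tracked. */
struct fake_mbuf {
	int freed;
};

/* Hypothetical stand-in for rte_pktmbuf_free(). */
static void fake_free(struct fake_mbuf *m)
{
	m->freed = 1;
}

/* Style used in the patch: both the count and the array pointer change.
 * "> 0" is used here (an assumption, not in the patch) so that a negative
 * count does nothing. */
static void free_bulk_ptr(struct fake_mbuf *m_list[], int npkts)
{
	while (npkts-- > 0)
		fake_free(*m_list++);
}

/* Single-index style: one loop variable, m_list itself is left untouched. */
static void free_bulk_idx(struct fake_mbuf *m_list[], int npkts)
{
	int i;

	for (i = 0; i < npkts; i++)
		fake_free(m_list[i]);
}

/* Frees four buffers with the chosen style and returns how many were freed. */
static int demo_freed(int use_index)
{
	struct fake_mbuf bufs[4] = { {0}, {0}, {0}, {0} };
	struct fake_mbuf *list[4] = { &bufs[0], &bufs[1], &bufs[2], &bufs[3] };
	int i, total = 0;

	if (use_index)
		free_bulk_idx(list, 4);
	else
		free_bulk_ptr(list, 4);

	for (i = 0; i < 4; i++)
		total += bufs[i].freed;
	return total;
}
```

Both variants visit the same elements in the same order; the difference is only in how many variables the reader (and the compiler) has to track.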

Konstantin

> +static inline void __attribute__((always_inline))
> +rte_pktmbuf_free_bulk(struct rte_mbuf *m_list[], int16_t npkts)
> +{
> +	while(npkts--)
> +		rte_pktmbuf_free(*m_list++);
> +}
> 
> For the alloc routine I did remove the rte_mbuf *m variable and now I believe it is very clean; changing it to use index variables is just a personal preference. A change based on personal preference of this type is not useful IMO, and the current code does not cause any harm. Unless you can suggest a good technical reason to change, I am going to leave the patch as is.
> 
> +static inline int __attribute__((always_inline))
> +rte_pktmbuf_alloc_bulk(struct rte_mempool *mp, struct rte_mbuf *m_list[], int16_t cnt)
> +{
> +   int     ret;
> +
> +   ret = rte_mempool_get_bulk(mp, (void **)m_list, cnt);
> +   if ( ret == 0 ) {
> +       ret = cnt;
> +       while(cnt--) {
> +#ifdef RTE_MBUF_REFCNT
> +           rte_mbuf_refcnt_set(*m_list, 1);
> +#endif /* RTE_MBUF_REFCNT */
> +           rte_pktmbuf_reset(*m_list++);
> +       }
> +   }
> +   return ret;
> +}
> 
> >>>
> >>> I have attached the code and output, please let me know if I did something wrong, but as it stands using the original style is what I want to go with.
> >>>
> >>>>>
> >>>>> Konstantin
> >>>>>
> >>>>>>
> >>>>>> /Bruce
> >>>>>>
> >>>>>>> Thanks
> >>>>>>> ++Keith
> >>>>>>>
> >>>>>>> On Oct 6, 2014, at 3:56 AM, Richardson, Bruce <bruce.richardson at intel.com> wrote:
> >>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Keith Wiles
> >>>>>>>>> Sent: Sunday, October 05, 2014 12:10 AM
> >>>>>>>>> To: dev at dpdk.org
> >>>>>>>>> Subject: [dpdk-dev] [PATCH 2/2] Adding the routines rte_pktmbuf_alloc_bulk() and rte_pktmbuf_free_bulk()
> >>>>>>>>>
> >>>>>>>>> Minor helper routines to mirror the mempool routines and remove the code
> >>>>>>>>> from applications. The ixgbe_rxtx_vec.c routine could be changed to use
> >>>>>>>>> the rte_pktmbuf_alloc_bulk() routine in place of rte_mempool_get_bulk().
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> I believe such a change would cause a performance regression, as the extra init code in the alloc_bulk() function would take additional cycles and is not needed. The vector routines use the mempool function directly, so that there is no overhead of mbuf initialization, as the vector routines use their additional "knowledge" of what the mbufs will be used for to init them in a faster manner than can be done inside the mbuf library.
> >>>>>>>>
> >>>>>>>> /Bruce
> >>>>>>>>
> >>>>>>>>> Signed-off-by: Keith Wiles <keith.wiles at windriver.com>
> >>>>>>>>> ---
> >>>>>>>>> lib/librte_mbuf/rte_mbuf.h | 77
> >>>>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>>>>> 1 file changed, 77 insertions(+)
> >>>>>>>>>
> >>>>>>>>> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> >>>>>>>>> index 1c6e115..f298621 100644
> >>>>>>>>> --- a/lib/librte_mbuf/rte_mbuf.h
> >>>>>>>>> +++ b/lib/librte_mbuf/rte_mbuf.h
> >>>>>>>>> @@ -546,6 +546,41 @@ static inline void rte_pktmbuf_reset(struct rte_mbuf *m)
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> /**
> >>>>>>>>> + * @internal Allocate a list of mbufs from mempool *mp*.
> >>>>>>>>> + * The use of that function is reserved for RTE internal needs.
> >>>>>>>>> + * Please use rte_pktmbuf_alloc_bulk().
> >>>>>>>>> + *
> >>>>>>>>> + * @param mp
> >>>>>>>>> + *   The mempool from which mbuf is allocated.
> >>>>>>>>> + * @param m_list
> >>>>>>>>> + *   The array to place the allocated rte_mbufs pointers.
> >>>>>>>>> + * @param cnt
> >>>>>>>>> + *   The number of mbufs to allocate
> >>>>>>>>> + * @return
> >>>>>>>>> + *   - 0 if the number of mbufs allocated was ok
> >>>>>>>>> + *   - <0 is an ERROR.
> >>>>>>>>> + */
> >>>>>>>>> +static inline int __rte_mbuf_raw_alloc_bulk(struct rte_mempool *mp, struct rte_mbuf *m_list[], int cnt)
> >>>>>>>>> +{
> >>>>>>>>> +     struct rte_mbuf *m;
> >>>>>>>>> +     int             ret;
> >>>>>>>>> +
> >>>>>>>>> +     ret = rte_mempool_get_bulk(mp, (void **)m_list, cnt);
> >>>>>>>>> +     if ( ret == 0 ) {
> >>>>>>>>> +             int             i;
> >>>>>>>>> +             for(i = 0; i < cnt; i++) {
> >>>>>>>>> +                     m = *m_list++;
> >>>>>>>>> +#ifdef RTE_MBUF_REFCNT
> >>>>>>>>> +                     rte_mbuf_refcnt_set(m, 1);
> >>>>>>>>> +#endif /* RTE_MBUF_REFCNT */
> >>>>>>>>> +                     rte_pktmbuf_reset(m);
> >>>>>>>>> +             }
> >>>>>>>>> +             ret = cnt;
> >>>>>>>>> +     }
> >>>>>>>>> +     return ret;
> >>>>>>>>> +}
> >>>>>>>>> +
> >>>>>>>>> +/**
> >>>>>>>>> * Allocate a new mbuf from a mempool.
> >>>>>>>>> *
> >>>>>>>>> * This new mbuf contains one segment, which has a length of 0. The pointer
> >>>>>>>>> @@ -671,6 +706,32 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> /**
> >>>>>>>>> + * Allocate a list of mbufs from a mempool into a mbufs array.
> >>>>>>>>> + *
> >>>>>>>>> + * This mbuf list contains one segment per mbuf, which has a length of 0. The pointer
> >>>>>>>>> + * to data is initialized to have some bytes of headroom in the buffer
> >>>>>>>>> + * (if buffer size allows).
> >>>>>>>>> + *
> >>>>>>>>> + * The routine is just a simple wrapper routine to reduce code in the application and
> >>>>>>>>> + * provide a cleaner API for multiple mbuf requests.
> >>>>>>>>> + *
> >>>>>>>>> + * @param mp
> >>>>>>>>> + *   The mempool from which the mbuf is allocated.
> >>>>>>>>> + * @param m_list
> >>>>>>>>> + *   An array of mbuf pointers, cnt must be less than or equal to the size of the list.
> >>>>>>>>> + * @param cnt
> >>>>>>>>> + *   Number of slots in the m_list array to fill.
> >>>>>>>>> + * @return
> >>>>>>>>> + *   - The number of valid mbufs pointers in the m_list array.
> >>>>>>>>> + *   - Zero if the request cnt could not be allocated.
> >>>>>>>>> + */
> >>>>>>>>> +static inline int __attribute__((always_inline))
> >>>>>>>>> +rte_pktmbuf_alloc_bulk(struct rte_mempool *mp, struct rte_mbuf *m_list[], int16_t cnt)
> >>>>>>>>> +{
> >>>>>>>>> +     return __rte_mbuf_raw_alloc_bulk(mp, m_list, cnt);
> >>>>>>>>> +}
> >>>>>>>>> +
> >>>>>>>>> +/**
> >>>>>>>>> * Free a segment of a packet mbuf into its original mempool.
> >>>>>>>>> *
> >>>>>>>>> * Free an mbuf, without parsing other segments in case of chained
> >>>>>>>>> @@ -708,6 +769,22 @@ static inline void rte_pktmbuf_free(struct rte_mbuf *m)
> >>>>>>>>>  }
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> +/**
> >>>>>>>>> + * Free a list of packet mbufs back into its original mempool.
> >>>>>>>>> + *
> >>>>>>>>> + * Free a list of mbufs by calling rte_pktmbuf_free() in a loop as a wrapper function.
> >>>>>>>>> + *
> >>>>>>>>> + * @param m_list
> >>>>>>>>> + *   An array of rte_mbuf pointers to be freed.
> >>>>>>>>> + * @param npkts
> >>>>>>>>> + *   Number of packets to free in list.
> >>>>>>>>> + */
> >>>>>>>>> +static inline void rte_pktmbuf_free_bulk(struct rte_mbuf *m_list[], int16_t npkts)
> >>>>>>>>> +{
> >>>>>>>>> +     while(npkts--)
> >>>>>>>>> +             rte_pktmbuf_free(*m_list++);
> >>>>>>>>> +}
> >>>>>>>>> +
> >>>>>>>>> #ifdef RTE_MBUF_REFCNT
> >>>>>>>>>
> >>>>>>>>> /**
> >>>>>>>>> --
> >>>>>>>>> 2.1.0
> >>>>>>>>
> >>>>>>>
> >>>>>>> Keith Wiles, Principal Technologist with CTO office, Wind River mobile 972-213-5533
> 
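
The return-value question raised in the quoted review (the comment documents zero on failure, while the code returns either cnt or a negative error) can be illustrated with a small self-contained sketch. Everything here is a hypothetical stand-in: fake_pool, fake_mbuf, fake_get_bulk, fake_alloc_bulk and demo_alloc are not DPDK names. The only behaviour borrowed from DPDK is that rte_mempool_get_bulk() is all-or-nothing, returning 0 on success and -ENOENT when not enough objects are available.

```c
#include <errno.h>

#define POOL_SIZE 8

/* Hypothetical stand-in for struct rte_mbuf. */
struct fake_mbuf {
	int data_len;
};

/* Hypothetical stand-in for a mempool: a fixed array handed out in order. */
struct fake_pool {
	struct fake_mbuf objs[POOL_SIZE];
	int next; /* index of the first object not yet handed out */
};

/* All-or-nothing get, mimicking the rte_mempool_get_bulk() contract:
 * 0 on success, -ENOENT if the pool cannot satisfy the whole request. */
static int fake_get_bulk(struct fake_pool *p, struct fake_mbuf **tbl, int n)
{
	int i;

	if (n < 0 || n > POOL_SIZE - p->next)
		return -ENOENT;
	for (i = 0; i < n; i++)
		tbl[i] = &p->objs[p->next + i];
	p->next += n;
	return 0;
}

/* Bulk alloc with the contract described in the review:
 * returns cnt on success, a negative error code on failure. */
static int fake_alloc_bulk(struct fake_pool *p, struct fake_mbuf **m_list, int cnt)
{
	int i;
	int ret = fake_get_bulk(p, m_list, cnt);

	if (ret != 0)
		return ret;            /* negative, never zero */
	for (i = 0; i < cnt; i++)
		m_list[i]->data_len = 0; /* stands in for rte_pktmbuf_reset() */
	return cnt;
}

/* Requests cnt buffers from a fresh pool and returns the result code. */
static int demo_alloc(int cnt)
{
	struct fake_pool pool = { .next = 0 };
	struct fake_mbuf *list[POOL_SIZE * 2];

	return fake_alloc_bulk(&pool, list, cnt);
}
```

With this contract a caller should test for a negative result rather than for zero, which is why the "@return" text matters.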


