[dpdk-dev] [PATCH v8 1/3] mempool: support external mempool operations

Jan Viktorin viktorin at rehivetech.com
Thu Jun 9 15:09:18 CEST 2016


On Thu, 9 Jun 2016 10:39:46 +0100
"Hunt, David" <david.hunt at intel.com> wrote:

> Hi Shreyansh,
> 
> On 8/6/2016 2:48 PM, Shreyansh Jain wrote:
> > Hi David,
> >
> > Thanks for explanation. I have some comments inline...
> >  
> >> -----Original Message-----
> >> From: Hunt, David [mailto:david.hunt at intel.com]
> >> Sent: Tuesday, June 07, 2016 2:56 PM
> >> To: Shreyansh Jain <shreyansh.jain at nxp.com>; dev at dpdk.org
> >> Cc: olivier.matz at 6wind.com; viktorin at rehivetech.com;
> >> jerin.jacob at caviumnetworks.com
> >> Subject: Re: [dpdk-dev] [PATCH v8 1/3] mempool: support external mempool
> >> operations
> >>
> >> Hi Shreyansh,
> >>
> >> On 6/6/2016 3:38 PM, Shreyansh Jain wrote:  
> >>> Hi,
> >>>
> >>> (Apologies for the overly-eager email sent on this thread earlier.
> >>> Will be more careful in future.)
> >>> This is more of a question/clarification than a comment. (And I have
> >>> taken only some snippets from the original mail to keep it cleaner.)
> >>> <snip>  
> >>>> +MEMPOOL_REGISTER_OPS(ops_mp_mc);
> >>>> +MEMPOOL_REGISTER_OPS(ops_sp_sc);
> >>>> +MEMPOOL_REGISTER_OPS(ops_mp_sc);
> >>>> +MEMPOOL_REGISTER_OPS(ops_sp_mc);  
> >>> <snip>
> >>>
> >>> From the above, what I understand is that multiple packet pool
> >>> handlers can be created.
> >>> I have a use-case where the application has multiple pools but only
> >>> the packet pool is hardware-backed. Using the hardware for general
> >>> buffer requirements would prove costly.
> >>> From what I understand from the patch, selection of the pool is
> >>> based on the flags below.
> >>
> >> The flags are only used to select one of the default handlers, for
> >> backward compatibility, through the rte_mempool_create call. If you
> >> wish to use a mempool handler that is not one of the defaults (i.e. a
> >> new hardware handler), you would use rte_mempool_create_empty
> >> followed by the rte_mempool_set_ops_byname call.
> >> So, for the external handlers, you create an empty mempool, then set
> >> the operations (ops) for that particular mempool.
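> >> Something like this, for example (a minimal sketch; "my_hw_handler"
> >> is a hypothetical registered ops name, and error handling plus pool
> >> population are elided):
> >>
> >>     struct rte_mempool *mp;
> >>
> >>     mp = rte_mempool_create_empty("hw_pool", n, elt_size, cache_size,
> >>             sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
> >>     if (mp == NULL)
> >>         return NULL;
> >>     /* hypothetical handler name registered via MEMPOOL_REGISTER_OPS */
> >>     rte_mempool_set_ops_byname(mp, "my_hw_handler");
> >>     /* then populate, e.g. with rte_mempool_populate_default(mp) */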
> > I am concerned about the existing applications (for example, l3fwd).
> > An explicit 'rte_mempool_create_empty -> rte_mempool_set_ops_byname'
> > model would require modifications to these applications.
> > Ideally, without any modifications, these applications should be able
> > to use packet pools (backed by hardware) and buffer pools (backed by
> > ring/others) transparently.
> >
> > If I go by your suggestion, then doing the above without modifying the applications would be equivalent to:
> >
> >    struct rte_mempool_ops custom_hw_allocator = {...}
> >
> > thereafter, in config/common_base:
> >
> >    CONFIG_RTE_MBUF_DEFAULT_MEMPOOL_OPS="custom_hw_allocator"
> >
> > calls to rte_pktmbuf_pool_create would use the new allocator.  
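> >
> > For illustration, such an ops struct might look roughly like this (a
> > sketch only; the hw_* callbacks are hypothetical, and the exact ops
> > field names should be checked against the patch):
> >
> >    static struct rte_mempool_ops custom_hw_allocator = {
> >        .name      = "custom_hw_allocator",
> >        .alloc     = hw_alloc,      /* allocate hw-backed pool storage */
> >        .free      = hw_free,       /* release it */
> >        .enqueue   = hw_enqueue,    /* return objects to the pool */
> >        .dequeue   = hw_dequeue,    /* take objects from the pool */
> >        .get_count = hw_get_count,  /* objects currently available */
> >    };
> >
> >    MEMPOOL_REGISTER_OPS(custom_hw_allocator);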
> 
> Yes, correct. But only for calls to rte_pktmbuf_pool_create(). Calls to
> rte_mempool_create will continue to use the default handlers (ring-based).
> 
> > But, another problem arises here.
> >
> > There are two distinct paths for allocations of a memory pool:
> > 1. A 'pkt' pool:
> >     rte_pktmbuf_pool_create
> >       \- rte_mempool_create_empty
> >       |   \- rte_mempool_set_ops_byname(..ring_mp_mc..)
> >       |
> >       `- rte_mempool_set_ops_byname
> >             (...RTE_MBUF_DEFAULT_MEMPOOL_OPS..)
> >             /* Override default 'ring_mp_mc' of
> >              * rte_mempool_create */
> >
> > 2. Through generic mempool create API
> >     rte_mempool_create
> >       \- rte_mempool_create_empty
> >             (passing pktmbuf and pool constructors)
> >    
> > I found various instances in example applications where
> > rte_mempool_create() is called directly for packet pools, bypassing
> > the semantically correct rte_pktmbuf_* calls.
> >
> > In control path (2), RTE_MBUF_DEFAULT_MEMPOOL_OPS wouldn't be able to
> > install the custom handler operations for packet buffer allocations.
> >
> > From a performance point of view, applications should be able to
> > select between packet pools and non-packet pools.
> 
> This is intended for backward compatibility and API consistency. Any
> applications that use rte_mempool_create directly will continue to use
> the default mempool handlers. If they need to use a custom handler, they
> will need to be modified to call the newer API,
> rte_mempool_create_empty and rte_mempool_set_ops_byname.
> 
> 
> >>> <snip>  
> >>>> +	/*
> >>>> +	 * Since we have 4 combinations of the SP/SC/MP/MC examine the flags to
> >>>> +	 * set the correct index into the table of ops structs.
> >>>> +	 */
> >>>> +	if (flags & (MEMPOOL_F_SP_PUT | MEMPOOL_F_SC_GET))
> >>>> +		rte_mempool_set_ops_byname(mp, "ring_sp_sc");
> >>>> +	else if (flags & MEMPOOL_F_SP_PUT)
> >>>> +		rte_mempool_set_ops_byname(mp, "ring_sp_mc");
> >>>> +	else if (flags & MEMPOOL_F_SC_GET)
> >>>> +		rte_mempool_set_ops_byname(mp, "ring_mp_sc");
> >>>> +	else
> >>>> +		rte_mempool_set_ops_byname(mp, "ring_mp_mc");
> >>>> +  
> > My suggestion is to have an additional flag, 'MEMPOOL_F_PKT_ALLOC', which, if specified, would:
> >
> > ...
> > #define MEMPOOL_F_SC_GET    0x0008
> > #define MEMPOOL_F_PKT_ALLOC 0x0010
> > ...
> >
> > in rte_mempool_create_empty:
> >     ... after checking the other MEMPOOL_F_* flags...
> >
> >      if (flags & MEMPOOL_F_PKT_ALLOC)
> >          rte_mempool_set_ops_byname(mp, RTE_MBUF_DEFAULT_MEMPOOL_OPS);
> >
> > And removing the redundant call to rte_mempool_set_ops_byname() in
> > rte_pktmbuf_pool_create().
> >
> > Thereafter, rte_pktmbuf_pool_create can be changed to:
> >
> >        ...
> >      mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
> > -        sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
> > +        sizeof(struct rte_pktmbuf_pool_private), socket_id,
> > +        MEMPOOL_F_PKT_ALLOC);
> >      if (mp == NULL)
> >          return NULL;  
> 
> Yes, this would somewhat simplify the creation of a pktmbuf pool, in
> that it replaces the rte_mempool_set_ops_byname call with a flag bit.
> However, I'm not sure we want to introduce a third method of creating a
> mempool to the developers. If we introduced this, we would then have:
> 1. rte_pktmbuf_pool_create()
> 2. rte_mempool_create_empty() with MEMPOOL_F_PKT_ALLOC set (which would
>     use the configured custom handler)
> 3. rte_mempool_create_empty() with MEMPOOL_F_PKT_ALLOC __not__ set,
>     followed by a call to rte_mempool_set_ops_byname() (which would
>     allow several different custom handlers to be used in one
>     application)
> 
> Does anyone else have an opinion on this? Olivier, Jerin, Jan?

I am quite careful about this topic as I don't feel very involved in all the
use cases. My opinion is that the _new API_ should be able to cover all cases
and the _old API_ should be backwards compatible, however, built on top of
the _new API_.

I.e. I think the flags MEMPOOL_F_SP_PUT and MEMPOOL_F_SC_GET (relics of the
old API) should be accepted by the old API ONLY. rte_mempool_create_empty
should not process them. Similarly for a potential MEMPOOL_F_PKT_ALLOC: I
would not pollute rte_mempool_create_empty with it either.

Overall, we would get exactly two approaches (and not more), sketched below:

* using rte_mempool_create with flags, calling rte_mempool_create_empty and
  rte_mempool_set_ops_byname internally (so this layer can be marked as
  deprecated and removed in the future)

* using rte_mempool_create_empty + rte_mempool_set_ops_byname, allowing any
  customizations but with the necessity to change the applications (the new,
  preferred API)
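
To be concrete, the two would look roughly like this (a sketch only; error
handling is omitted and the handler name is illustrative):

   /* Old API: flags select one of the built-in ring handlers. */
   mp = rte_mempool_create(name, n, elt_size, cache_size, priv_size,
           NULL, NULL, NULL, NULL, socket_id,
           MEMPOOL_F_SP_PUT | MEMPOOL_F_SC_GET);

   /* New API: create empty, then choose any registered handler. */
   mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
           priv_size, socket_id, 0);
   rte_mempool_set_ops_byname(mp, "custom_hw_allocator");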

So, the old applications can stay as they are (OK, with a possible new flag
MEMPOOL_F_PKT_ALLOC) and the new ones can do the same, but they have to set
the ops explicitly.

The more different ways of using those APIs we have, the greater the
maintenance hell.

Regards
Jan

> 
> Regards,
> Dave.
> 
> 



-- 
   Jan Viktorin                  E-mail: Viktorin at RehiveTech.com
   System Architect              Web:    www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic

