[dpdk-dev] [PATCH v2 1/3] mempool: add stack (lifo) mempool handler

Hunt, David david.hunt at intel.com
Mon Jun 20 14:59:23 CEST 2016


Hi Olivier,

On 20/6/2016 9:17 AM, Olivier Matz wrote:
> Hi David,
>
> On 06/17/2016 04:18 PM, Hunt, David wrote:
>>> After reading it, I realize that it's nearly exactly the same code as
>>> in "app/test: test external mempool handler".
>>> http://patchwork.dpdk.org/dev/patchwork/patch/12896/
>>>
>>> We should drop one of them. If this stack handler is really useful for
>>> a performance use-case, it could go in librte_mempool. At the first
>>> read, the code looks like a demo example : it uses a simple spinlock for
>>> concurrent accesses to the common pool. Maybe the mempool cache hides
>>> this cost, in this case we could also consider removing the use of the
>>> rte_ring.
>> While I agree that the code is similar, the handler in the test is a
>> ring based handler, whereas this patch adds an array based handler.
> Not sure I'm getting what you are saying. Do you mean stack instead
> of array?

Yes, apologies, stack.
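
To make the "simple spinlock around a common pool" idea concrete, here is a
minimal sketch of a LIFO stack of object pointers guarded by a test-and-set
spinlock. It is not the code from the patch: the struct, the function names
and the fixed STACK_CAPACITY are all made up for illustration, and it uses
C11 atomics rather than the DPDK spinlock API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define STACK_CAPACITY 8   /* hypothetical size, for illustration only */

/* Hypothetical stand-in for a stack handler's private data. */
struct obj_stack {
    atomic_flag lock;            /* simple test-and-set spinlock */
    size_t len;                  /* number of objects currently stored */
    void *objs[STACK_CAPACITY];  /* object pointers, top is objs[len - 1] */
};

static void stack_init(struct obj_stack *s)
{
    atomic_flag_clear(&s->lock);
    s->len = 0;
}

static void stack_lock(struct obj_stack *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ; /* spin until the current holder releases the lock */
}

static void stack_unlock(struct obj_stack *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Push n objects; returns 0 on success, -1 if there is not enough room. */
static int stack_put(struct obj_stack *s, void * const *objs, size_t n)
{
    int ret = -1;

    stack_lock(s);
    assert(s->len <= STACK_CAPACITY);
    if (n <= STACK_CAPACITY - s->len) {
        for (size_t i = 0; i < n; i++)
            s->objs[s->len++] = objs[i];
        ret = 0;
    }
    stack_unlock(s);
    return ret;
}

/* Pop n objects, most recently pushed first; 0 on success, -1 if too few. */
static int stack_get(struct obj_stack *s, void **objs, size_t n)
{
    int ret = -1;

    stack_lock(s);
    if (n <= s->len) {
        for (size_t i = 0; i < n; i++)
            objs[i] = s->objs[--s->len];
        ret = 0;
    }
    stack_unlock(s);
    return ret;
}
```

Every put and get takes the same lock, so without a per-core cache in front
of it all cores contend on that one lock.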

> Actually, both are stacks when talking about bulks of objects. If we
> consider each object one by one, it's true the order will differ.
> But as discussed in [1], the cache code already reverses the order of
> objects when doing a mempool_get(). I'd say the reversing in cache code
> is not really needed (only the order of object bulks should remain the
> same). A rte_memcpy() looks to be faster, but it would require some
> real-life tests to validate or invalidate this theory.
>
> So to conclude, I still think both code in app/test and lib/mempool are
> quite similar, and only one of them should be kept.
>
> [1] http://www.dpdk.org/ml/archives/dev/2016-May/039873.html

OK, so we will probably remove the test app portion in the future if it
is not needed, and if we apply the stack handler proposed in this patch set.
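
On the ordering point, a small sketch may help show the difference between
taking objects from a cache one by one (which reverses them) and taking the
same objects with a single copy (which keeps their stored order). This is
only an illustration under assumptions: the cache is modelled as a flat
array of pointers, both function names are made up, and plain memcpy()
stands in for rte_memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Take n pointers from the top of the cache, newest first (reversed). */
static void cache_get_reversed(void **cache, size_t *len, void **out, size_t n)
{
    assert(n <= *len);
    for (size_t i = 0; i < n; i++)
        out[i] = cache[--(*len)];
}

/* Take the same n pointers with one copy, keeping their stored order. */
static void cache_get_memcpy(void **cache, size_t *len, void **out, size_t n)
{
    assert(n <= *len);
    *len -= n;
    memcpy(out, &cache[*len], n * sizeof(void *));
}
```

Both variants hand out the same set of objects from the top of the cache;
only the order inside the returned bulk differs.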

>> I think that the case for leaving it in as a test for the standard
>> handler as part of the previous mempool handler is valid, but maybe
>> there is a case for removing it if we add the stack handler. Maybe a
>> future patch?
>>
>>> Do you have some performance numbers? Do you know if it scales
>>> with the number of cores?
>> For the mempool_perf_autotest, I'm seeing a 30% increase in performance
>> for the local cache use-case for 1 - 36 cores (results vary within those
>> tests between 10-45% gain, but with an average of 30% gain over all the
>> tests).
>>
>> However, for the tests with no local cache configured, throughput of the
>> enqueue/dequeue drops by about 30%, with the 36-core case yielding the
>> largest drop of 40%. So this handler would not be recommended in no-cache
>> applications.
> Interesting, thanks. If you also have real-life (I mean network)
> performance tests, I'd be interested too.

I'm afraid I don't currently have any real-life performance tests.

> Ideally, we should have a documentation explaining in which cases a
> handler or another should be used. However, if we don't know this
> today, I'm not opposed to add this new handler in 16.07, and let people
> do their tests and comment, then describe it properly for 16.11.
>
> What do you think?

I agree. Add it in 16.07, and let people develop use cases for it, as
well as possibly coming up with new handlers for 16.11. There's talk of
hardware-based handlers, and I would also hope to see some of those
contributed soon.

Regards,
David.




