[dpdk-dev] [PATCH] ring: guarantee ordering of cons/prod loading when doing enqueue/dequeue

Zhao, Bing ilovethull at 163.com
Thu Oct 19 13:18:38 CEST 2017


Hi,

On 2017/10/19 18:02, Ananyev, Konstantin wrote:
> 
> Hi Jia,
> 
>>
>> Hi
>>
>>
>> On 10/13/2017 9:02 AM, Jia He Wrote:
>>> Hi Jerin
>>>
>>>
>>> On 10/13/2017 1:23 AM, Jerin Jacob Wrote:
>>>> -----Original Message-----
>>>>> Date: Thu, 12 Oct 2017 17:05:50 +0000
>>>>>
>> [...]
>>>> On the same lines,
>>>>
>>>> Jia He, jie2.liu, bing.zhao,
>>>>
>>>> Is this patch based on code review, or did you see this issue on any
>>>> of the arm/ppc targets? arm64 will take a performance hit with this
>>>> change.
>> Sorry, I missed one important piece of information:
>> our platform is an aarch64 server with 46 CPUs.
>> When we reduced the number of CPUs involved, the bug occurred less frequently.
>>
>> Yes, the mb barrier impacts performance, but correctness is more
>> important, isn't it ;-)
>> Maybe we can find some other, more lightweight barrier here?
>>
>> Cheers,
>> Jia
>>> With mbuf_autotest, rte_panic is triggered within seconds.
>>>
>>> PANIC in test_refcnt_iter():
>>> (lcore=0, iter=0): after 10s only 61 of 64 mbufs left free
>>> 1: [./test(rte_dump_stack+0x38) [0x58d868]]
>>> Aborted (core dumped)
>>>
> 
> So is it only reproducible with the mbuf refcnt test?
> Could it be reproduced with some 'pure' ring test
> (no mempools/mbufs refcnt, etc.)?
> The reason I am asking: in that test we also have mbuf refcnt updates
> (that's what the test was created for), and we are doing some optimizations here too
> to avoid excessive atomic updates.
> BTW, if the problem is not reproducible without mbuf refcnt,
> can I suggest extending the test to:
>    - add a check that the enqueue() operation was successful
>    - walk through the pool and check/printf the refcnt of each mbuf.
> Hopefully that would give us some extra information about what is going wrong here.
> Konstantin
>    
> 
So far the issue has only been seen in this case, on the ARM platform;
we are not sure how it behaves on x86_64. In another mail in this
thread we described a simple test based on this case, together with
some information we captured (I pasted the patch there as well :-)).
It also seems that Juhamatti & Jacob turned up a revert in this area
from several months ago.
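
To make the ordering under discussion concrete, here is a minimal
sketch in plain C11 atomics. It is not the DPDK ring code and not the
patch; every name in it (sketch_ring, sketch_enqueue, and so on) is
hypothetical, and the acquire fence is only my assumption of what a
lighter barrier could look like. The point it illustrates is that the
load of the opposite side's tail must not be reordered before the load
of our own head.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RING_SIZE 8u                       /* hypothetical; power of two */
#define SKETCH_RING_MASK (SKETCH_RING_SIZE - 1u)  /* usable capacity is SIZE - 1 */

struct sketch_ring {
    _Atomic uint32_t prod_head;
    _Atomic uint32_t prod_tail;
    _Atomic uint32_t cons_head;
    _Atomic uint32_t cons_tail;
    void *slots[SKETCH_RING_SIZE];
};

static void sketch_ring_init(struct sketch_ring *r)
{
    atomic_init(&r->prod_head, 0);
    atomic_init(&r->prod_tail, 0);
    atomic_init(&r->cons_head, 0);
    atomic_init(&r->cons_tail, 0);
}

/* Enqueue one object; returns 0 on success, -1 if the ring is full. */
static int sketch_enqueue(struct sketch_ring *r, void *obj)
{
    uint32_t head, cons_tail;

    do {
        head = atomic_load_explicit(&r->prod_head, memory_order_relaxed);
        /* Assumed fix: keep the cons_tail load below from being
         * reordered ahead of the prod_head load above. */
        atomic_thread_fence(memory_order_acquire);
        cons_tail = atomic_load_explicit(&r->cons_tail, memory_order_acquire);
        if (SKETCH_RING_MASK + cons_tail - head == 0)
            return -1;                            /* no free entries */
    } while (!atomic_compare_exchange_weak_explicit(&r->prod_head, &head,
                 head + 1u, memory_order_relaxed, memory_order_relaxed));

    r->slots[head & SKETCH_RING_MASK] = obj;

    /* Wait for earlier producers, then publish the slot. */
    while (atomic_load_explicit(&r->prod_tail, memory_order_relaxed) != head)
        ;
    atomic_store_explicit(&r->prod_tail, head + 1u, memory_order_release);
    return 0;
}

/* Dequeue one object; returns 0 on success, -1 if the ring is empty. */
static int sketch_dequeue(struct sketch_ring *r, void **obj)
{
    uint32_t head, prod_tail;

    do {
        head = atomic_load_explicit(&r->cons_head, memory_order_relaxed);
        /* Same ordering requirement, mirrored for the consumer side. */
        atomic_thread_fence(memory_order_acquire);
        prod_tail = atomic_load_explicit(&r->prod_tail, memory_order_acquire);
        if (prod_tail - head == 0)
            return -1;                            /* nothing to dequeue */
    } while (!atomic_compare_exchange_weak_explicit(&r->cons_head, &head,
                 head + 1u, memory_order_relaxed, memory_order_relaxed));

    *obj = r->slots[head & SKETCH_RING_MASK];

    /* Wait for earlier consumers, then release the slot. */
    while (atomic_load_explicit(&r->cons_tail, memory_order_relaxed) != head)
        ;
    atomic_store_explicit(&r->cons_tail, head + 1u, memory_order_release);
    return 0;
}
```

Run single-threaded, this only exercises the full/empty logic. Whether
an acquire fence is sufficient for the failure seen on our platform is
not something the sketch establishes; that still needs the multi-core
test.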

BR. Bing


