[dpdk-dev] [PATCH] eal: fix threads block on barrier

Tan, Jianfeng jianfeng.tan at intel.com
Sat Apr 28 06:22:28 CEST 2018



On 4/28/2018 9:24 AM, Stephen Hemminger wrote:
> On Fri, 27 Apr 2018 21:52:26 +0200
> Thomas Monjalon <thomas at monjalon.net> wrote:
>
>> 27/04/2018 19:45, Shreyansh Jain:
>>> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
>>>> Shreyansh Jain <shreyansh.jain at nxp.com> wrote:
>>>>> From: Jianfeng Tan
>>>>>> The commit below introduced a pthread barrier for synchronization.
>>>>>> But two IPC threads block on the barrier and never wake up.
>>>>>>
>>>>>>    (gdb) bt
>>>>>>    #0  futex_wait (private=0, expected=0, futex_word=0x7fffffffcff4)
>>>>>>        at ../sysdeps/unix/sysv/linux/futex-internal.h:61
>>>>>>    #1  futex_wait_simple (private=0, expected=0, futex_word=0x7fffffffcff4)
>>>>>>        at ../sysdeps/nptl/futex-internal.h:135
>>>>>>    #2  __pthread_barrier_wait (barrier=0x7fffffffcff0) at pthread_barrier_wait.c:184
>>>>>>    #3  rte_thread_init (arg=0x7fffffffcfe0)
>>>>>>        at ../dpdk/lib/librte_eal/common/eal_common_thread.c:160
>>>>>>    #4  start_thread (arg=0x7ffff6ecf700) at pthread_create.c:333
>>>>>>    #5  clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
>>>>>>
>>>>>> Through analysis, we find that defining the barrier on the stack
>>>>>> could be the root cause. This patch allocates the barrier from
>>>>>> heap memory instead.
>>>>>>
>>>>>> Fixes: d651ee4919cd ("eal: set affinity for control threads")
>>>>>>
>>>>>> Cc: Olivier Matz <olivier.matz at 6wind.com>
>>>>>> Cc: Anatoly Burakov <anatoly.burakov at intel.com>
>>>>>>
>>>>>> Signed-off-by: Jianfeng Tan <jianfeng.tan at intel.com>
>>>>> Though I have seen Stephen's comment on this (possibly a library
>>>>> bug), this at least fixes an issue which was dogging dpaa and dpaa2 -
>>>>> generating bus errors and futex errors with variation in core masks
>>>>> provided to applications.
>>>>> Thanks a lot for this.
>>>>>
>>>>> Acked-by: Shreyansh Jain <shreyansh.jain at nxp.com>
>> Applied, thanks Jianfeng.
>>
>>>> Could you verify there is not a use-after-free by using valgrind or
>>>> some library that poisons memory on free?
>>> I will probably do that soon - but for the time being I don't want
>>> this issue to block the dpaa/dpaa2 for RC1 - these drivers were
>>> completely unusable without this patch.
>> Please Shreyansh, continue the analysis of this bug.
>> Thanks
>>
>>
> The pthread_barrier should also be destroyed when it is no longer needed.

I tried this; destroying the barrier could also kick the sleeping thread.
But because of "The effect of subsequent use of the barrier is undefined",
I did not go that way.

Anyway, I agree that destroy() should be called for completeness.
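
For illustration only, a minimal sketch of a heap-allocated barrier that
is destroyed once it is no longer needed. This is not the code in
eal_common_thread.c: the names start_sync, start_sync_put, trampoline,
ctrl_thread_create and demo_worker are hypothetical, and the reference
count is an assumption added here so that destroy() runs only after both
threads have returned from pthread_barrier_wait().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical start-up handshake shared by the creator and the new thread.
 * It lives on the heap, so it outlives the creator's stack frame. */
struct start_sync {
	pthread_barrier_t barrier;
	atomic_int users;            /* creator + new thread */
	void *(*fn)(void *);
	void *arg;
};

/* Drop one reference; the last user destroys the barrier and frees it. */
static void start_sync_put(struct start_sync *s)
{
	if (atomic_fetch_sub(&s->users, 1) == 1) {
		int rc = pthread_barrier_destroy(&s->barrier);

		assert(rc == 0); /* nobody can still be waiting here */
		(void)rc;
		free(s);
	}
}

static void *trampoline(void *opaque)
{
	struct start_sync *s = opaque;
	/* Copy what we need: s may be freed once we drop our reference. */
	void *(*fn)(void *) = s->fn;
	void *arg = s->arg;

	pthread_barrier_wait(&s->barrier);
	start_sync_put(s);
	return fn(arg);
}

/* Returns 0 on success, a non-zero error number otherwise. */
int ctrl_thread_create(pthread_t *tid, void *(*fn)(void *), void *arg)
{
	struct start_sync *s = malloc(sizeof(*s));
	int ret;

	if (s == NULL)
		return -1;
	s->fn = fn;
	s->arg = arg;
	atomic_init(&s->users, 2);

	ret = pthread_barrier_init(&s->barrier, NULL, 2);
	if (ret != 0) {
		free(s);
		return ret;
	}
	ret = pthread_create(tid, NULL, trampoline, s);
	if (ret != 0) {
		pthread_barrier_destroy(&s->barrier);
		free(s);
		return ret;
	}
	/* Per-thread setup (e.g. affinity) would go here, before release. */
	pthread_barrier_wait(&s->barrier);
	start_sync_put(s);
	return 0;
}

/* Hypothetical worker, only to exercise the sketch. */
static void *demo_worker(void *arg)
{
	*(int *)arg = 42;
	return arg;
}
```

A caller would invoke ctrl_thread_create(&tid, demo_worker, &value) and
later pthread_join(tid, NULL). Neither side touches the barrier after
dropping its reference, so the "subsequent use is undefined" case never
arises.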

Thanks,
Jianfeng

