[Bug 1107] [22.11-rc1][meson test] seqlock_autotest test failed
    Jiang, YuX 
    yux.jiang at intel.com
       
    Tue Oct 18 12:01:29 CEST 2022
    
    
  
> -----Original Message-----
> From: Mattias Rönnblom <mattias.ronnblom at ericsson.com>
> Sent: Tuesday, October 18, 2022 5:50 PM
> To: Jiang, YuX <yux.jiang at intel.com>; bugzilla at dpdk.org; dev at dpdk.org
> Cc: David Marchand <david.marchand at redhat.com>; Liang, Cunming
> <cunming.liang at intel.com>
> Subject: Re: [Bug 1107] [22.11-rc1][meson test] seqlock_autotest test failed
> 
> On 2022-10-18 11:08, Jiang, YuX wrote:
> >> -----Original Message-----
> >> From: Mattias Rönnblom <mattias.ronnblom at ericsson.com>
> >> Sent: Tuesday, October 18, 2022 4:44 PM
> >> To: bugzilla at dpdk.org; dev at dpdk.org
> >> Cc: David Marchand <david.marchand at redhat.com>; Liang, Cunming
> >> <cunming.liang at intel.com>
> >> Subject: Re: [Bug 1107] [22.11-rc1][meson test] seqlock_autotest test
> >> failed
> >>
> >> On 2022-10-18 07:57, bugzilla at dpdk.org wrote:
> >>> https://bugs.dpdk.org/show_bug.cgi?id=1107
> >>>
> >>>               Bug ID: 1107
> >>>              Summary: [22.11-rc1][meson test] seqlock_autotest test failed
> >>>              Product: DPDK
> >>>              Version: 22.11
> >>>             Hardware: All
> >>>                   OS: All
> >>>               Status: UNCONFIRMED
> >>>             Severity: normal
> >>>             Priority: Normal
> >>>            Component: meson
> >>>             Assignee: dev at dpdk.org
> >>>             Reporter: yux.jiang at intel.com
> >>>     Target Milestone: ---
> >>>
> >>> [Environment]
> >>> DPDK version: dpdk22.11.0rc1 a74b1b25136a592c275afbfa6b70771469750aee
> >>> OS: CentOS7.9/3.10.0-1160.62.1.el7.x86_64 or 3.10.0-1160.71.1.el7.x86_64
> >>> Compiler: gcc version 4.8.5 20150623
> >>
> >> Have you tried with a different compiler? Preferably one supported by
> >> DPDK, unlike 4.8.5.
> >>
> >> Some versions of GCC had problems with C11 release-type thread fences.
> >> GCC 7.2, for example, could reorder non-atomic stores across the fence.
> >> (That mightily confused me, when I came across this in my very first
> >> program using C11-style atomics.)
> >>
> >> It might be worth disassembling the code to make sure that didn't
> >> happen in your case.
> >>
> >> Also, you could try to replace the release barrier and/or the acquire
> >> barrier with a rte_compiler_barrier(), just to see if this problem is
> >> indeed related to the barriers. On a TSO machine, a compiler barrier
> >> should do the job. Or you use __sync_synchronize(). (Just for
> >> exploration, not as a bug fix or workaround.)
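
The barrier-swapping experiment suggested above could look roughly like the following. This is a minimal standalone sketch, not the DPDK seqlock code: the names demo_seqlock, demo_write, demo_try_read and the DEMO_FENCE_KIND switch are hypothetical and exist only for illustration. It assumes a GCC-compatible compiler, and, as in the suggestion above, the compiler-barrier variant is only meaningful on a TSO machine such as x86.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical build-time switch (not a DPDK option):
 * 0 = C11-style thread fences, 1 = compiler barrier only,
 * 2 = __sync_synchronize() full barrier. */
#ifndef DEMO_FENCE_KIND
#define DEMO_FENCE_KIND 0
#endif

#if DEMO_FENCE_KIND == 0
#define DEMO_FENCE_RELEASE() __atomic_thread_fence(__ATOMIC_RELEASE)
#define DEMO_FENCE_ACQUIRE() __atomic_thread_fence(__ATOMIC_ACQUIRE)
#elif DEMO_FENCE_KIND == 1
#define DEMO_FENCE_RELEASE() __asm__ __volatile__("" ::: "memory")
#define DEMO_FENCE_ACQUIRE() __asm__ __volatile__("" ::: "memory")
#else
#define DEMO_FENCE_RELEASE() __sync_synchronize()
#define DEMO_FENCE_ACQUIRE() __sync_synchronize()
#endif

struct demo_seqlock {
	uint32_t sn;      /* sequence number; odd while a write is in progress */
	uint64_t a, b, c; /* protected data; a writer sets all three to one value */
};

static void demo_write(struct demo_seqlock *sl, uint64_t v)
{
	uint32_t sn = __atomic_load_n(&sl->sn, __ATOMIC_RELAXED);

	/* Mark the write as in progress (odd sequence number). */
	__atomic_store_n(&sl->sn, sn + 1, __ATOMIC_RELAXED);
	/* The data stores below must not move above the store to sn. */
	DEMO_FENCE_RELEASE();
	sl->a = v;
	sl->b = v;
	sl->c = v;
	/* Publish: the data stores must be visible before sn becomes even. */
	__atomic_store_n(&sl->sn, sn + 2, __ATOMIC_RELEASE);
}

/* Copies the data into out[] and returns true if the snapshot is
 * consistent, false if a writer was active and the caller should retry. */
static bool demo_try_read(struct demo_seqlock *sl, uint64_t out[3])
{
	uint32_t begin = __atomic_load_n(&sl->sn, __ATOMIC_ACQUIRE);

	out[0] = sl->a;
	out[1] = sl->b;
	out[2] = sl->c;
	/* The data loads above must not move below the second load of sn. */
	DEMO_FENCE_ACQUIRE();
	uint32_t end = __atomic_load_n(&sl->sn, __ATOMIC_RELAXED);

	return !(begin & 1) && begin == end;
}
```

Building the same file with -DDEMO_FENCE_KIND=1 or -DDEMO_FENCE_KIND=2 and comparing the disassembly of demo_write() against the default build is one way to see whether the compiler moved the data stores across the fence.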
> >>
> > Thanks.
> > Currently, this failure has only been found on CentOS 7.9.
> > The test passes on 22.04.1 LTS (Jammy Jellyfish)/5.15.0-27-generic with gcc version 11.2.0.
> > So you think this is a gcc compiler problem rather than a DPDK bug, right?
> 
> That would be my *guess*.
> 
> Did it pass on 22.04.1 LTS on that Atom CPU?
Currently, we have only one machine with that CPU, and it has CentOS installed.
The failure may not be related to the CPU, because it also reproduces on another CentOS machine with an Intel(R) Xeon(R) Platinum 8280M CPU @ 2.70GHz.
I also ran the test on other platforms with different gcc versions (9.4.0/8.5.0/11.2.1/12.2.1/7.5.0) and have not seen this failure there so far.
Best regards,
Yu Jiang
> 
> > And will not fix, right?
> >
> 
> If someone cares about this platform, I don't see why we can't fix it, provided
> there's a small and clean fix. If it happens to be the thread barrier that's the
> problem, and the compiler happens to be the root cause, we could either warn
> about that compiler (on x86?) in the build, disallow it, or have a workaround in
> <rte_atomic.h> for it. Fortunately, there is a DPDK wrapper for the
> __atomic_thread_fence() intrinsic, so a fix would take effect across the board.
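
A wrapper-level workaround of the kind described above might be shaped like this. It is only a sketch under stated assumptions: demo_thread_fence and DEMO_OLD_GCC are hypothetical names, the version cut-off (GCC older than 5) is a guess chosen to cover 4.8.5, and nothing in this thread confirms that an extra compiler barrier actually cures the failure.

```c
/* Hypothetical detection of the suspect compiler; the cut-off is an
 * assumption, not something established in this thread. */
#if defined(__GNUC__) && !defined(__clang__) && (__GNUC__ < 5)
#define DEMO_OLD_GCC 1
#else
#define DEMO_OLD_GCC 0
#endif

/* Hypothetical stand-in for a project-wide fence wrapper. On the suspect
 * compiler it brackets the fence with compiler barriers so that plain
 * (non-atomic) loads and stores cannot be moved across it at compile time. */
static inline void demo_thread_fence(int memorder)
{
#if DEMO_OLD_GCC
	__asm__ __volatile__("" ::: "memory");
#endif
	__atomic_thread_fence(memorder);
#if DEMO_OLD_GCC
	__asm__ __volatile__("" ::: "memory");
#endif
}
```

Because every caller goes through the one wrapper, a change like this would apply to all users of the fence at once, which is the "across the board" property mentioned above. The alternative of warning about or rejecting the compiler at build time would live in the build system rather than in a header.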
> 
> I think it would be of value to everyone if you or someone else went to look for
> why this test case fails on that particular compiler and CPU combination. It
> could be a bug in the seqlock implementation, that could potentially affect
> everyone, although only triggered by the seqlock unit tests on your particular
> platform.
> 
> >>> Hardware platform: Intel(R) Atom(TM) CPU C3758 @ 2.20GHz
> >>>
> >>>
> >>> [Test Setup]
> >>> Steps to reproduce
> >>> 1. Use the following command to build DPDK:
> >>> CC=gcc meson -Denable_kmods=True -Dlibdir=lib --default-library=static x86_64-native-linuxapp-gcc/
> >>> ninja -C x86_64-native-linuxapp-gcc/
> >>>
> >>> 2. Execute the following command in the dpdk directory.
> >>> meson test -C x86_64-native-linuxapp-gcc/ seqlock_autotest
> >>>
> >>> [Show the output from the previous commands]
> >>> 2/2 DPDK:fast-tests / seqlock_autotest      FAIL             2.51s   (exit status 255 or signal 127 SIGinvalid)
> >>> 04:23:38 MALLOC_PERTURB_=139 DPDK_TEST=seqlock_autotest /root/dpdk/x86_64-native-linuxapp-gcc/app/test/dpdk-test --file-prefix=seqlock_autotest
> >>> ----------------------------------- output -----------------------------------
> >>> stdout:
> >>> RTE>>seqlock_autotest
> >>> Reader observed inconsistent data values 10856068477537484964 9973142773974991064 9973142773974991064
> >>> Test Failed
> >>> RTE>>
> >>> stderr:
> >>> EAL: Detected CPU lcores: 8
> >>> EAL: Detected NUMA nodes: 1
> >>> EAL: Detected static linkage of DPDK
> >>> EAL: Multi-process socket /var/run/dpdk/seqlock_autotest/mp_socket
> >>> EAL: Selected IOVA mode 'VA'
> >>> EAL: 1024 hugepages of size 2097152 reserved, but no mounted hugetlbfs found for that size
> >>> APP: HPET is not enabled, using TSC as default timer
> >>>
> >>> [Expected Result]
> >>> Test ok.
> >>>
> >
    
    