[dpdk-dev] [PATCH] build: avoid --as-needed as it causes overlinking

Christian Ehrhardt christian.ehrhardt at canonical.com
Thu Aug 29 12:18:41 CEST 2019


On Wed, Aug 28, 2019 at 5:23 PM Aaron Conole <aconole at redhat.com> wrote:
>
> Aaron Conole <aconole at redhat.com> writes:
>
> > Christian Ehrhardt <christian.ehrhardt at canonical.com> writes:
> >
> >> On Wed, Aug 28, 2019 at 3:53 PM Aaron Conole <aconole at redhat.com> wrote:
> >>>
> >>> Christian Ehrhardt <christian.ehrhardt at canonical.com> writes:
> >>>
> >>> > A while ago telemetry was added in 57ae0ec6 and it also added as-needed
> >>> > to config/meson.build. This no longer seems necessary, since due to other
> >>> > build changes the ordering in build logs is:
> >>> >   [...] -lrte_telemetry [...] -Wl,--no-as-needed [...]
> >>> > Which means telemetry no longer benefits from --no-as-needed anyway.
> >>> >
> >>> > Overlinking problems get triggered by the meson generated pkgconfig which
> >>> > will have:
> >>> >    [...] -Wl,--no-as-needed <somelibsusedbydpdk>
> >>> > This will overlink <somelibs> and in addition anything that follows,
> >>> > as it doesn't switch back to --as-needed. So if a project includes
> >>> > dpdk libs + <other> it will also link <other> with --no-as-needed.
> >>> >
> >>> > Fixes: https://bugs.launchpad.net/ubuntu/+source/dpdk/+bug/1841759
> >>> >
> >>> > Signed-off-by: Christian Ehrhardt <christian.ehrhardt at canonical.com>
> >>> > ---
> >>>
> >>> Hi Christian,
> >>>
> >>> I agree this is something to be fixed.  It will need additional work,
> >>> though:
> >>>
> >>>   https://travis-ci.com/ovsrobot/dpdk/builds/124909245
> >>>
> >>
> >> Thanks for the Link Aaron, yet I'm puzzled what to do there atm.
> >>
> >> The kind of error I found in the failing logs were misleading at first:
> >> - linker can't find -lvirt / -lpqos / ...
> >>   well the test env needs to install them, maybe it was added as
> >> dependency by accident before?
> >
> > Not sure about this.  It's strange to require that we *install* the
> > libraries before we can unit test them.  After all, if I'm going to
> > potentially replace my previously installed libraries, I definitely want
> > to know that the unit tests are passing.
> >
> >>   I'd understand (due to the change) if it would complain about missing symbols
> >>   (no more added due to as-needed, but then for some reason needed)
> >>   But this is vice versa, it just doesn't find the libs in the build env
> >> - error: unrecognized command line option '-Wformat-truncation'
> >>   I don't see how I'd cause this ...
> >> => Maybe this is just an artifact that is even part of the normal/good tests?
> >
> > I don't think so - but there's a simple change.  I've pushed to my own
> > branch and you can see the builds:
> >
> >   https://travis-ci.org/orgcandman/dpdk/branches using the same
> >   series_6154 branch name.
> >
> >> Comparing former logs - last good test was
> >> https://travis-ci.com/ovsrobot/dpdk/builds/124875383
> >> This first seemed more helpful.
> >>
> >> DPDK:fast-tests / eal_flags_w_opt_autotest  FAIL
> >> DPDK:fast-tests / func_reentrancy_autotest  FAIL
> >> DPDK:fast-tests / mbuf_autotest         FAIL
> >> DPDK:fast-tests / mempool_autotest      FAIL
> >> DPDK:fast-tests / ring_pmd_autotest     FAIL
> >> DPDK:fast-tests / sched_autotest        FAIL
> >> DPDK:fast-tests / table_autotest        FAIL
> >> [...]
> >> Overall about 14/60 of the tests failed with no recognizable pattern
> >> why just those and not the others.
> >
> > Good question :)
> >
> >> I only see "Full log written ... on_error", so I can't directly
> >> compare how a good run would look in the configure/build stage.
> >> Looking just at the bad case there are plenty of messages like
> >> - "no available hugepages"
> >> - "cannot reserve memory", ..
> >> But all those indicate more a flaky test(-env) than an error in the
> >> commit, there must be more to it.
> >
> > Okay.  Fair enough.
> >
> >> @Aaron is there a good way to get the rest of the log for a good case
> >> to compare?
> >
> > Let's wait for https://travis-ci.org/orgcandman/dpdk/builds/577910388 to
> > spit out some details.
>
> Oops - forgot to push the revert:
>
> https://travis-ci.org/orgcandman/dpdk/builds/577918381 is the correct
> build.
>
> Sorry.

Thanks Aaron. Breaking down the test results that actually differ:

Tests: eal_flags_w_opt_autotest eal_flags_b_opt_autotest
  EAL: failed to parse device "00FF:09:0B.3"
  EAL: Unable to parse device '00FF:09:0B.3'

Test: func_reentrancy_autotest
  mempool create/lookup: common object allocated 0 times (should be 1)

Tests: flow_classify_autotest ring_pmd_autotest sched_autotest
table_autotest bitratestats_autotest distributor_autotest
latencystats_autotest reorder_autotest
  MBUF: error setting mempool handler
  Cannot init mbuf pool on socket 0

Test: mbuf_autotest
  MBUF: error setting mempool handler
  cannot allocate mbuf pool

Test: mempool_autotest
  cannot allocate mp_nocache mempool

Test: eventdev_common_autotest
  Tests not executed


I built DPDK from my commit locally and tried to run the tests, but
they work fine:


# app/test/dpdk-test --no-huge -l 0-1 --pci-whitelist 00FF:09:0B.3
EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Static memory layout is selected, amount of reserved memory can be adjusted with -m or --socket-mem
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: Probing VFIO support...
EAL:   cannot open VFIO container, error 2 (No such file or directory)
EAL: VFIO support could not be initialized
APP: HPET is not enabled, using TSC as default timer

Works just fine?!
And if I really break it, it fails as expected:

# app/test/dpdk-test --no-huge -l 0-1 --pci-whitelist 00FF:09:0B.3xxxx
EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Static memory layout is selected, amount of reserved memory can be adjusted with -m or --socket-mem
EAL: failed to parse device "00FF:09:0B.3xxxx"
EAL: Unable to parse device '00FF:09:0B.3xxxx'
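
As a rough cross-check that the address itself is well-formed, here is a
minimal Python sketch. It assumes the usual domain:bus:device.function
layout (4, 2 and 2 hex digits, then a function number 0-7). This is only
an illustration; it is not EAL's actual parser, and the helper name
looks_like_pci_address is made up for this example.

```python
import re

# Assumed textual layout: DDDD:BB:DD.F (hex domain, bus, device; function 0-7).
# This mirrors the common PCI address notation, not EAL's real parsing code.
PCI_BDF = re.compile(r"[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]")


def looks_like_pci_address(text):
    """Return True if text matches the assumed DDDD:BB:DD.F layout exactly."""
    return PCI_BDF.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ("00FF:09:0B.3", "00FF:09:0B.3xxxx"):
        print(candidate, looks_like_pci_address(candidate))
```

Under that assumption the address used by the test is syntactically fine
and only the deliberately broken variant is rejected, which matches what I
see locally.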

Trying one of the most common failure cases ("error setting mempool
handler") works fine as well:

# app/test/dpdk-test --no-huge -l 0-1
EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Static memory layout is selected, amount of reserved memory can be adjusted with -m or --socket-mem
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: Probing VFIO support...
EAL:   cannot open VFIO container, error 2 (No such file or directory)
EAL: VFIO support could not be initialized
EAL: PCI device 0000:00:1f.6 on NUMA socket -1
EAL:   Invalid NUMA socket, default to 0
EAL:   probe driver: 8086:15d8 net_e1000_em
APP: HPET is not enabled, using TSC as default timer
RTE>>flow_classify_autotest
Created table_acl for for IPv4 five tuple packets
Allocated mbuf pool on socket 0
Set up IPv4 UDP traffic
ETH  pktlen 14
ETH + IPv4 pktlen 34
ETH + IPv4 + UDP pktlen 42

Set up IPv4 TCP traffic
ETH  pktlen 14
ETH + IPv4 pktlen 34
ETH + IPv4 + TCP pktlen 54

Set up IPv4 SCTP traffic
ETH  pktlen 14
ETH + IPv4 pktlen 34
ETH + IPv4 + SCTP pktlen 42

Test OK


/me is still not seeing what would be wrong with it in this test environment :-/
Especially since the change affects linking, I would expect a do-or-die
breakage and not a (seemingly random) subset of failing tests.
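
For reference, the flag-ordering effect described in the commit message
can be sketched in a few lines of Python. This is a simplified model, not
how the linker is implemented: it only tracks the positional
--as-needed / --no-as-needed state across a link line and records which
state applies to each -l option. The function name classify_libs and the
library names somelib and other are placeholders for this example.

```python
def classify_libs(link_args, default_as_needed=False):
    """Walk a link line left to right and pair each -l<name> with the
    --as-needed state in effect at that position.

    default_as_needed is the state before any flag is seen; toolchains
    differ here, so it is a parameter rather than a fixed value.
    """
    as_needed = default_as_needed
    result = []
    for arg in link_args:
        # -Wl,a,b passes "a" and "b" through to the linker.
        flags = arg[4:].split(",") if arg.startswith("-Wl,") else [arg]
        for flag in flags:
            if flag == "--as-needed":
                as_needed = True
            elif flag == "--no-as-needed":
                as_needed = False
            elif flag.startswith("-l"):
                result.append((flag[2:], as_needed))
    return result


if __name__ == "__main__":
    # Hypothetical consumer link line: dpdk flags followed by an unrelated lib.
    line = ["-Wl,--as-needed", "-lrte_telemetry",
            "-Wl,--no-as-needed", "-lsomelib", "-lother"]
    for lib, needed_only in classify_libs(line):
        print(lib, "as-needed" if needed_only else "no-as-needed")
```

In this model -lrte_telemetry comes before the switch and so gains nothing
from --no-as-needed, while both somelib and the unrelated other end up
linked with --no-as-needed because nothing switches the state back.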

@Aaron, thanks for your help so far. Is there an easy way to run
exactly my code once again to see if it works this time?
Or was https://travis-ci.org/orgcandman/dpdk/builds/577910388 exactly
that already (and it failed again at the same tests)?

-- 
Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd

