[dpdk-dev] [PATCH 0/8] Improve build process

Neil Horman nhorman at tuxdriver.com
Mon Feb 23 19:23:19 CET 2015


On Mon, Feb 23, 2015 at 02:58:30PM +0000, Gonzalez Monroy, Sergio wrote:
> On 23/02/2015 13:52, Neil Horman wrote:
> >On Mon, Feb 23, 2015 at 10:25:01AM +0000, Gonzalez Monroy, Sergio wrote:
> >>On 22/02/2015 23:37, Neil Horman wrote:
> >>>On Fri, Feb 20, 2015 at 02:31:36PM +0000, Gonzalez Monroy, Sergio wrote:
> >>>>On 13/02/2015 12:51, Neil Horman wrote:
> >>>>>On Fri, Feb 13, 2015 at 11:08:02AM +0000, Gonzalez Monroy, Sergio wrote:
> >>>>>>On 13/02/2015 10:14, Panu Matilainen wrote:
> >>>>>>>On 02/12/2015 05:52 PM, Neil Horman wrote:
> >>>>>>>>On Thu, Feb 12, 2015 at 04:07:50PM +0200, Panu Matilainen wrote:
> >>>>>>>>>On 02/12/2015 02:23 PM, Neil Horman wrote:
> >>>>>>>[...snip...]
> >>>>>>>>>>>>>So I just realized that I was not having into account a possible
> >>>>>>>>>>>>>scenario, where
> >>>>>>>>>>>>>we have an app built with static dpdk libs then loading a dso
> >>>>>>>>>>>>>with -d
> >>>>>>>>>>>>>option.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>In such case, because the pmd would have DT_NEEDED entries,
> >>>>>>>>>>>>>dlopen will
> >>>>>>>>>>>>>fail.
> >>>>>>>>>>>>>So to enable such scenario we would need to build PMDs without
> >>>>>>>>>>>>>DT_NEEDED
> >>>>>>>>>>>>>entries.
> >>>>>>>>>>>>Hmm, for that to be a problem you'd need to have the PMD built
> >>>>>>>>>>>>against
> >>>>>>>>>>>>shared dpdk libs and while the application is built against
> >>>>>>>>>>>>static dpdk
> >>>>>>>>>>>>libs. I dont think that's a supportable scenario in any case.
> >>>>>>>>>>>>
> >>>>>>>>>>>>Or is there some other scenario that I'm not seeing?
> >>>>>>>>>>>>
> >>>>>>>>>>>>    - Panu -
> >>>>>>>>>>>>
> >>>>>>>>>>>I agree with you. I suppose it comes down to, do we want to
> >>>>>>>>>>>support such
> >>>>>>>>>>>scenario?
> >>>>>>>>>>>
> >>>>>>>>>>> From what I can see, it seems that we do currently support such
> >>>>>>>>>>>scenario by
> >>>>>>>>>>>building dpdk apps against all static dpdk libs using
> >>>>>>>>>>>--whole-archive (all
> >>>>>>>>>>>libs and not only PMDs).
> >>>>>>>>>>>http://dpdk.org/browse/dpdk/commit/?id=20afd76a504155e947c770783ef5023e87136ad8
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>Am I misunderstanding this?
> >>>>>>>>>>>
> >>>>>>>>>>Shoot, you're right, I missed the static build aspect to this.  Yes,
> >>>>>>>>>>if we do the following:
> >>>>>>>>>>
> >>>>>>>>>>1) Build the DPDK as a static library
> >>>>>>>>>>2) Link an application against (1)
> >>>>>>>>>>3) Use the dlopen mechanism to load a PMD built as a DSO
> >>>>>>>>>>
> >>>>>>>>>>Then the DT_NEEDED entries in the DSO will go unsatisfied, because
> >>>>>>>>>>the shared
> >>>>>>>>>>objects on which it (the PMD) depends will not exist in the file
> >>>>>>>>>>system.
> >>>>>>>>>I think its even more twisty:
> >>>>>>>>>
> >>>>>>>>>1) Build the DPDK as a static library
> >>>>>>>>>2) Link an application against (1)
> >>>>>>>>>3) Do another build of DPDK as a shared library
> >>>>>>>>>4) In app 2), use the dlopen mechanism to load a PMD built as a part
> >>>>>>>>>of or
> >>>>>>>>>against 3)
> >>>>>>>>>
> >>>>>>>>>Somehow I doubt this would work very well.
> >>>>>>>>>
> >>>>>>>>Ideally it should, presuming the ABI is preserved between (1) and (3),
> >>>>>>>>though I
> >>>>>>>>agree, up until recently, that was an assumption that was unreliable.
> >>>>>>>Versioning is a big and important step towards reliability but there are
> >>>>>>>more issues to solve. This of course getting pretty far from the original
> >>>>>>>topic, but at least one such issue is that there are some cases where a
> >>>>>>>config value affects what are apparently public structs (rte_mbuf wrt
> >>>>>>>RTE_MBUF_REFCNT for example), which really is a no-go.
> >>>>>>>
> >>>>>>Agree, the RTE_MBUF_REFCNT is something that needs to be dealt with asap.
> >>>>>>I'll look into it.
> >>>>>>
> >>>>>>>>>>I think the problem is a little bit orthogonal to the libdpdk_core
> >>>>>>>>>>problem you
> >>>>>>>>>>were initially addressing.  That is to say, this problem of
> >>>>>>>>>>dlopen-ed PMD's
> >>>>>>>>>>exists regardless of weather you build the DPDK as part of a static
> >>>>>>>>>>or dynamic
> >>>>>>>>>>library.  The problems just happen to intersect in their
> >>>>>>>>>>manipulation of the
> >>>>>>>>>>DT_NEEDED entries.
> >>>>>>>>>>
> >>>>>>>>>>Ok, so, given the above, I would say your approach is likely
> >>>>>>>>>>correct, just
> >>>>>>>>>>prevent DT_NEEDED entries from getting added to PMD's. Doing so will
> >>>>>>>>>>sidestep
> >>>>>>>>>>loading issue for libraries that may not exist in the filesystem,
> >>>>>>>>>>but thats ok,
> >>>>>>>>>>because by all rights, the symbols codified in those needed
> >>>>>>>>>>libraries should
> >>>>>>>>>>already be present in the running application (either made available
> >>>>>>>>>>by the
> >>>>>>>>>>application having statically linked them, or having the linker load
> >>>>>>>>>>them from
> >>>>>>>>>>the proper libraries at run time).
> >>>>>>>>>My 5c is that I'd much rather see the common case (all static or all
> >>>>>>>>>shared)
> >>>>>>>>>be simple and reliable, which in case of DSOs includes no lying
> >>>>>>>>>(whether by
> >>>>>>>>>omission or otherwise) about DT_NEEDED, ever. That way the issue is
> >>>>>>>>>dealt
> >>>>>>>>>once where it belongs. If somebody wants to go down the rabbit hole of
> >>>>>>>>>mixed
> >>>>>>>>>shared + static linkage, let them dig the hole by themselves :)
> >>>>>>>>>
> >>>>>>>>This is a fair point.  Can DT_NEEDED sections be stripped via tools like
> >>>>>>>>objcopy
> >>>>>>>>after the build is complete?  If so, end users can hack this corner case
> >>>>>>>>to work
> >>>>>>>>as needed.
> >>>>>>>Patchelf (http://nixos.org/patchelf.html) appears to support that, but
> >>>>>>>given that source is available it'd be easier to just modify the makefiles
> >>>>>>>if that's really needed.
> >>>>>>>
> >>>>>>I think we agree on the issue.
> >>>>>>
> >>>>>>So I'll be sending a patch to add DT_NEEDED entries to all libraries and
> >>>>>>PMDs. The only exception would be librte_eal, which would not have proper
> >>>>>>NEEDED entries.
> >>>>>>Do we bother adding a linker script for librte_eal that would include
> >>>>>>dependent libraries?
> >>>>>>
> >>>>>I say yes to the linker script, but will happily bow to an alternate consensus
> >>>>>Neil
> >>>>>
> >>>>So the case we want to solve is the following circular dependencies:
> >>>>eal             -> mempool, malloc
> >>>>mempool -> eal , malloc, ring
> >>>>malloc      -> eal
> >>>>ring           -> eal, malloc
> >>>>
> >>>>We cannot write/create the proposed (below) linker script at least until we
> >>>>have built mempool and malloc.
> >>>>INPUT ( -lrte_eal.so -lrte_mempool -lrte_malloc )
> >>>>
> >>>Not sure I understand why you have a build time dependency on this.  Link time
> >>>perhaps, but not build time.  Or am I reading too much into your use of the term
> >>>'built' above?
> >>I meant 'built' as compiled + linked. Am I misusing the term?
> >No, you're not (though I misused the term link time above, I meant to say load
> >time).  So you're saying that when you build shared libraries, you get linker
> >errors indicating that, during the build, you're missing symbols, is that
> >correct?  I guess I'm confused because I don't see how thats not happening for
> >everyone, right now.  In other words, I'm not sure what about your changes is
> >giving rise to that problem.
> >
> >>>>Few ways I have thought about implementing this (not particularly fond of
> >>>>any of them) :
> >>>>  - Have the linker script file in the repo (scripts/ ?) in a fixed location
> >>>>and just copy it to $(RTE_OUTPUT)/lib/ once all libs have finished building.
> >>>>  - Generate the file on build time from a defined make variable once all
> >>>>libs have finished
> >>>>
> >>>I'm still not sure I understand.  Why does this dependency exist at build time?
> >>>The dependency between malloc and eal shouldn't be a problem during the build,
> >>>as symbols from each other should just remain undefined, and get resolved at
> >>>load time.
> >>Is that not the way it is currently implemented?
> >>I get the impression that we are talking about different goals (correct me
> >>if it is not the case)
> >>
> >We may well be, I'm not sure yet.
> >
> >>I thought that the agreed solution was to:
> >>1) NOT to create/generate a 'core' library
> >>2) Add DT_NEEDED entries for all libraries (except eal which is the first
> >>library we link)
> >>3) Use linker script for eal
> >>
> >Ok, we're definately on the same page, as thats what I thought the goal was as
> >well.
> >
> >>Given the previously mentioned circular dependencies between eal, mempool,
> >>malloc and ring:
> >>- eal would not be linked against other libraries (no NEEDED entries)
> >>- malloc is linked against eal (previously built), so malloc would have a
> >>NEEDED entry for eal.
> >>
> >>In that scenario, if the linker script is setup/created after we build eal,
> >>then when we try to link malloc
> >>against eal, the linker will pull mempool and malloc too (because we
> >>included them in the linker script).
> >>Therefore, the link fails as none of those libraries (malloc and mempool)
> >>have been built yet.
> >>
> >Ah, I see now, I wasn't thinking about the extra requirements that DT_NEEDED
> >entries placed on the build conditions.
> >
> >I see now, apologies for being dense previously.  Given what you indicate I
> >would say that the solution here is to link the libraries against individual
> >other specific libraries, not the core library that you generate as a linker
> >script.  That way you avoid the circular dependency, and the core library just
> >becomes a convienience for application developers looking to link to a single
> >library.
> >
> I'm not sure I quite understand your suggestion.
> 
> Could you roughly describe steps for building eal, malloc and mempool libs ?
> For example, something like this?
> 1) build eal, which creates librte_eal.so.1
> 2) write linker script for librte_eal.so
> 3) build malloc against eal (-lrte_eal )
> etc

Hm, so I spent a bit of time looking at this, and your right, I thought this was
really just a artifact of the introduction of --as-needed to the build to force
DT_NEEDED entries, and my suggestion was that you simply not link the libraries
that were causing the circular dependency.  I had assumed that the link
directives included -lrte_malloc -lrte_mempool for the eal library, but they
weren't really needed, so you could remove them and it would work out.

Unfortunately that turns out to not be the case.  librte_eal does explicitly use
calls in librte_malloc, and vice versa.  The current use of -no-as-needed in the
build system (which I was previously unaware of), is a hack to avoid having to
address that problem.

That throws a monkey wrench into this plan.  I would see 3 ways forward:

1) Fix the problem - That is to say, remove the use of --no-as-needed from the
build, and address the circular dependencies that arise.  This could/will mean
actually merging libraries with circular dependencies into a single library, as
they should be so that they can completely resolve all of the symbols they use
at link time

2) Ignore the problem.  If we just keep the lack of DT_NEEDED entries in place,
I think the problem goes away, and we can continue on.

I think option 1 is likely the more correct approach, as removing DT_NEEDED to
avoid a circuar depenency is a hack, but it may not be the most pragmatic
approach.  just living without DT_NEEDED entries and documenting link time needs
will certainly be faster, and might be the better course of action, especially
if we provide a 'core' pseudo library/linker script that embodies that action
for the end user.

Neil


> I suppose that another way to go about this, instead of creating the linker
> script that pulls
> dependent libraries, is to always link (using --no-as-needed in case gcc
> adds it by default)
> against these libraries (eal, mempool, malloc, and ring) with necessary doc
> about how to build apps.
> 
> Sergio
> >Neil
> >
> >>Was your suggestion to leave all of these libraries (eal, mempool, malloc,
> >>ring) without NEEDED entries?
> >>
> >No, you can add NEEDED entries there, they will just be for the individual
> >libraries, not the core linker script library.
> >
> >Best
> >Neil
> >
> >>Regards,
> >>Sergio
> >>>What is the error you are getting?
> >>>
> >>>Best
> >>>Neil
> >>>
> >>>>Thoughts? any other approached is more than welcome!
> >>>>
> >>>>Sergio
> >>>>
> >>>>PS: Thinking again on the core library and the issue of having multiple
> >>>>version.map files, we could have a core_version.map instead instead of
> >>>>multiple files per core library (eal, mempool, etc)
> >>>>
> >>>>
> >>
> >>
> 
> 
> 


More information about the dev mailing list