[dpdk-dev] [dpdk-stable] [PATCH 2/2] eal: fix loading of shared libs from driver plugin directories

Ferruh Yigit ferruh.yigit at intel.com
Wed Nov 25 12:05:38 CET 2020


On 11/25/2020 10:01 AM, David Marchand wrote:
> On Tue, Nov 24, 2020 at 4:15 PM Timothy Redaelli <tredaelli at redhat.com> wrote:
>>
>> Commit 49b536fc3060 ("eal: load only shared libs from driver plugin directories")
>> introduced a check that any shared library must ends with .so, but it can't
>> work, at least, on Fedora/CentOS/RHEL since .so symlinks are not installed
>> when you install dpdk package, but only when you install dpdk-devel package.
>>
>> This commit adds also a check for .so.ABI_VERSION to check for shared
>> lib.
>>
>> See Fedora Packaging Guidelines for more informations:
>> https://docs.fedoraproject.org/en-US/packaging-guidelines/#_devel_packages
>>
>> Fixes: 49b536fc3060 ("eal: load only shared libs from driver plugin directories")
>> Cc: bruce.richardson at intel.com
>> Signed-off-by: Timothy Redaelli <tredaelli at redhat.com>
>> ---
>>   lib/librte_eal/common/eal_common_options.c | 6 ++++--
>>   1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c
>> index e6f77ad217..e9e2362c53 100644
>> --- a/lib/librte_eal/common/eal_common_options.c
>> +++ b/lib/librte_eal/common/eal_common_options.c
>> @@ -402,8 +402,10 @@ eal_plugindir_init(const char *path)
>>                  struct stat sb;
>>                  int nlen = strnlen(dent->d_name, sizeof(dent->d_name));
>>
>> -               /* check if name ends in .so */
>> -               if (strcmp(&dent->d_name[nlen - 3], ".so") != 0)
>> +               /* check if name ends in .so or .so.ABI_VERSION */
>> +               if (strcmp(&dent->d_name[nlen - 3], ".so") != 0 &&
>> +                   strcmp(&dent->d_name[nlen - 4 - strlen(ABI_VERSION)],
>> +                          ".so."ABI_VERSION) != 0)
>>                          continue;
>>
>>                  snprintf(sopath, sizeof(sopath), "%s/%s", path, dent->d_name);
> 
> A first doubt I had about this patch is about multiple loading of the
> same driver, and its effects.
> So I first had a look at the main branch current state (using
> devtools/test-null.sh) and I can see multiple loading already happens
> for statically linked drivers like the bond driver.
> 
> $ LD_LIBRARY_PATH=build/lib gdb build/app/dpdk-testpmd -ex "b main"
> -ex "run -c 3 --no-huge -m 20 -d build/drivers '--log-level=*:debug'
> -a 0:0.0 --vdev net_null1 --vdev net_null2 -- --no-mlockall
> --total-num-mbufs=2048 -ia"
> ...
> (gdb) info sharedlibrary
> ...
> 0x00007ffff796d5b0  0x00007ffff797d2d6  Yes
> /home/dmarchan/dpdk/build/app/../drivers/librte_net_bond.so.21
> ...
> (gdb) b rte_eal_init
> (gdb) c
> (gdb) finish
> ...
> EAL: open shared lib build/drivers/librte_net_avp.so
> EAL: open shared lib build/drivers/librte_net_bond.so
> EAL: open shared lib build/drivers/librte_net_ixgbe.so
> ...
> (gdb) set $elm=*(struct shared_driver *)solib_list.tqh_first
> (gdb) while $elm
> p $elm
> set $elm=*(struct shared_driver *)($elm.next.tqe_next)
> end
> ...
> $67 = {next = {tqe_next = 0x5c3350, tqe_prev = 0x5c1310}, name =
> "build/drivers/librte_net_avp.so", '\000' <repeats 4064 times>,
> lib_handle = 0x616870}
> $68 = {next = {tqe_next = 0x5c4370, tqe_prev = 0x5c2330}, name =
> "build/drivers/librte_net_bond.so", '\000' <repeats 4063 times>,
> lib_handle = 0x7ffff7995a80}
> $69 = {next = {tqe_next = 0x5c5390, tqe_prev = 0x5c3350}, name =
> "build/drivers/librte_net_ixgbe.so", '\000' <repeats 4062 times>,
> lib_handle = 0x7ffff7942fc0}
> ...
> 
> I could not confirm the 0x7ffff7995a80 handle points at the first load
> of the bond driver.
> gdb does not seem to know of the linker internal structures, a bit
> hard to tell but the addresses are in the same range, so likely the
> same object.
> 
> I confirmed the bond constructor only being called once when the bond
> driver is loaded because of static link:
> 
> $ LD_LIBRARY_PATH=build/lib gdb build/app/dpdk-testpmd -ex "b
> vdrvinitfn_pmd_bond_drv" -ex "run -c 3 --no-huge -m 20 -d
> build/drivers '--log-level=*:debug' -a 0:0.0 --vdev net_null1 --vdev
> net_null2 -- --no-mlockall --total-num-mbufs=2048 -ia"
> GNU gdb (GDB) Fedora 8.3.50.20190824-30.fc31
> ...
> Reading symbols from build/app/dpdk-testpmd...
> Function "vdrvinitfn_pmd_bond_drv" not defined.
> Make breakpoint pending on future shared library load? (y or [n]) y
> Breakpoint 1 (vdrvinitfn_pmd_bond_drv) pending.
> Starting program: /home/dmarchan/dpdk/build/app/dpdk-testpmd -c 3
> --no-huge -m 20 -d build/drivers '--log-level=*:debug' -a 0:0.0 --vdev
> net_null1 --vdev net_null2 -- --no-mlockall --total-num-mbufs=2048 -ia
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/usr/lib64/libthread_db.so.1".
> 
> Breakpoint 1, vdrvinitfn_pmd_bond_drv () at
> ../drivers/net/bonding/rte_eth_bond_pmd.c:3755
> 3755    RTE_PMD_REGISTER_VDEV(net_bonding, pmd_bond_drv);
> Missing separate debuginfos, use: dnf debuginfo-install
> elfutils-libelf-0.179-2.fc31.x86_64 glibc-2.30-13.fc31.x86_64
> jansson-2.12-4.fc31.x86_64 libbsd-0.9.1-4.fc31.x86_64
> libfdt-1.6.0-1.fc31.x86_64 numactl-libs-2.0.12-3.fc31.x86_64
> zlib-1.2.11-20.fc31.x86_64
> 
> (gdb) c
> Continuing.
> ...
> EAL: open shared lib build/drivers/librte_net_avp.so
> EAL: open shared lib build/drivers/librte_net_bond.so
> EAL: open shared lib build/drivers/librte_net_ixgbe.so
> ...
> testpmd>
> 
> 
> Now, about this patch.
> In this test, it has the effect of loading all drivers twice (or
> thrice if statically linked).
> I see no problem with it since the linker is intelligent enough and
> constructors are only called once.
> 
> 
> My only remaining doubt is whether we should be looking for
> .ABI_VERSION or .ABI_MAJOR extension.
> This could be discussed, but in its current form, this patch lgtm.
> 
> Acked-by: David Marchand <david.marchand at redhat.com>
> 
> 

I also checked the same, yes same file processed twice, with actual name and 
symlink, but since eal using 'realpath()' before 'dlopen()' the path passed to 
the 'dlopen()' is exact same for both symlink and actual file.

And 'dlopen()' has following in the man page:
"
  If  the  same  shared object is opened again with dlopen(), the same object 
handle is returned.  The dynamic linker maintains reference counts for object 
handles, so a dynamically loaded shared object is not deallocated until 
dlclose() has been called on it as many times as dlopen() has succeeded on it. 
Constructors (see below) are called only when the object is actually loaded into 
memory (i.e., when the reference count increases to 1).
"

As I run/debug with or without the symlinks, it looks OK with patch,
Tested-by: Ferruh Yigit <ferruh.yigit at intel.com>


More information about the dev mailing list