Should we try to be more graceful in library init on old Hardware?
Bruce Richardson
bruce.richardson at intel.com
Thu Mar 30 15:28:00 CEST 2023
On Thu, Mar 30, 2023 at 02:15:42PM +0100, Bruce Richardson wrote:
> On Thu, Mar 30, 2023 at 02:53:41PM +0200, Christian Ehrhardt wrote:
> > Hi,
> > I recently got the kind of bug report I had been expecting for many years.
> > In fact I wondered if it would still come up, as each year made it less likely.
> > But it happened: I got a crash report from someone using DPDK on
> > rather old pre-SSE4.2 hardware.
> > => https://bugs.launchpad.net/ubuntu/+source/dpdk/+bug/2009635/comments/9
> >
> > The reporter was nice and tried the newer 22.11, but that is just as affected.
> >
> > I understand that DPDK, as a project, has set this as the minimal
> > accepted hardware capability.
> > But some programs - in this case UHD - can do many other things
> > besides DPDK, so UHD or anything else may simply link to DPDK
> > (as it could be used with it) and therefore crash when
> > loading. In theory other tools like collectd, which has DPDK support,
> > would be affected in the same way.
> >
> > Example:
> > root@1bee22d20ca0:/# uhd_usrp_probe
> > Illegal instruction (core dumped)
> >
> > (gdb) bt
> > #0 0x00007f4b2d3a3374 in rte_srand () from
> > /lib/x86_64-linux-gnu/librte_eal.so.23
> > #1 0x00007f4b2d3967ec in ?? () from /lib/x86_64-linux-gnu/librte_eal.so.23
> > #2 0x00007f4b2e5d1fbe in call_init (l=<optimized out>,
> > argc=argc@entry=1, argv=argv@entry=0x7ffeabf5b488,
> > env=env@entry=0x7ffeabf5b498)
> > at ./elf/dl-init.c:70
> > #3 0x00007f4b2e5d20a8 in call_init (env=0x7ffeabf5b498,
> > argv=0x7ffeabf5b488, argc=1, l=<optimized out>) at ./elf/dl-init.c:33
> > #4 _dl_init (main_map=0x7f4b2e6042e0, argc=1, argv=0x7ffeabf5b488,
> > env=0x7ffeabf5b498) at ./elf/dl-init.c:117
> > #5 0x00007f4b2e5ea8b0 in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
> > #6 0x0000000000000001 in ?? ()
> > #7 0x00007ffeabf5c844 in ?? ()
> > #8 0x0000000000000000 in ?? ()
> >
> > Right now all we could do is:
> > a) say bad luck, old hardware (not nice)
> > b) make super complex alternative builds with and without DPDK support
> > c) ask the DPDK project to support non-SSE4.2 hardware (unlikely and too late
> > in 2023 I guess)
> > d) somehow make the initialization graceful (that is what this RFC is about)
> >
> > If we could get DPDK to ensure that the library loading paths
> > are SSE4.2-free,
> > then we could check the capabilities at actual initialization and
> > return a proper error instead of crashing.
> > That way only real users of DPDK would be required to have
> > sufficiently new hardware,
> > and OTOH users of software that links to DPDK, but in the current config would
> > not use it, would suffer less.
> >
> > WDYT?
> > Maybe it has already been discussed and I neither remembered nor found it?
> >
> It certainly hasn't been discussed previously, but there is meant to be
> support for this in EAL init itself. Almost the first function called
> from eal_init() is "rte_cpu_is_supported()" [1] which checks the build-time
> CPU flags against those of the current system.
> Unfortunately, from the error message you are getting, that doesn't seem to
> be working ok in the case of SSE4.2. It seems the compiler is inserting
> SSE4 instructions before we even get to that point. :-(
>
> Perhaps we need to move eal init to a new file, and compile it (and the
> cpuflag checks) with very minimal CPU flags.
>
Following up on my own mail...
I believe we may be able to solve this more easily by using the "target"
attribute for those functions. For x86 builds I don't see why eal init
cannot be compiled for an earlier SSE version (march=core2, perhaps). It's
not a performance-sensitive function.
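
As a rough illustration of the "target" attribute idea, here is a minimal
sketch, not actual DPDK code. The names demo_cpu_is_supported() and
demo_lib_init() are hypothetical stand-ins for the cpuflag check and the init
entry point. It assumes GCC or clang, and assumes that "arch=core2" is an
acceptable baseline. The runtime check uses the compiler builtin
__builtin_cpu_supports() rather than the real rte_cpu_is_supported().

```c
#include <stdio.h>

/* On x86, ask the compiler to generate code for a pre-SSE4.2 baseline for
 * the annotated functions only. "arch=core2" is an assumption taken from
 * the suggestion above; other targets get no attribute. */
#if defined(__x86_64__) || defined(__i386__)
#define DEMO_BASELINE __attribute__((target("arch=core2")))
#else
#define DEMO_BASELINE
#endif

/* Hypothetical helper, not a DPDK API: returns 1 if the running CPU has
 * SSE4.2 (or if we are not on x86), 0 otherwise. */
DEMO_BASELINE
static int demo_cpu_is_supported(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse4.2") != 0;
#else
    return 1;
#endif
}

/* Hypothetical init entry point: fails cleanly instead of running into an
 * illegal instruction further down the init path. */
DEMO_BASELINE
static int demo_lib_init(void)
{
    if (!demo_cpu_is_supported()) {
        fprintf(stderr, "demo: CPU lacks SSE4.2, refusing to initialise\n");
        return -1;
    }
    /* ... code built with the normal, newer CPU flags would be called here ... */
    return 0;
}
```

Note that the backtrace quoted above shows the fault in rte_srand() called
from the dynamic loader (call_init), i.e. before eal init runs, so anything
reached from load-time constructors would presumably need the same treatment.
Whether flags given explicitly on the compiler command line are overridden by
the attribute would also need checking per compiler.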
Thoughts?
/Bruce