[PATCH v4] net/af_xdp: re-enable secondary process support

Loftus, Ciara ciara.loftus at intel.com
Wed Feb 16 12:23:15 CET 2022


> Subject: RE: [PATCH v4] net/af_xdp: re-enable secondary process support
> 
> >
> > On 2/11/2022 1:01 PM, Loftus, Ciara wrote:
> > >>
> > >> On 2/11/2022 9:26 AM, Loftus, Ciara wrote:
> > >>>>>
> > >>>>> On 2/10/2022 5:47 PM, Loftus, Ciara wrote:
> > >>>>>>> Subject: Re: [PATCH v4] net/af_xdp: re-enable secondary process
> > >>>> support
> > >>>>>>>
> > >>>>>>> On 2/10/2022 3:40 PM, Loftus, Ciara wrote:
> > >>>>>>>>> Subject: Re: [PATCH v4] net/af_xdp: re-enable secondary
> > process
> > >>>>> support
> > >>>>>>>>>
> > >>>>>>>>> On 2/9/2022 9:48 AM, Ciara Loftus wrote:
> > >>>>>>>>>> Secondary process support had been disabled for the
> AF_XDP
> > >> PMD
> > >>>>>>>>> because
> > >>>>>>>>>> there was no logic in place to share the AF_XDP socket file
> > >>>> descriptors
> > >>>>>>>>>> between the processes. This commit introduces this logic
> using
> > >> the
> > >>>>> IPC
> > >>>>>>>>>> APIs.
> > >>>>>>>>>>
> > >>>>>>>>>> Rx and Tx are disabled in the secondary process due to
> memory
> > >>>>> mapping
> > >>>>>>> of
> > >>>>>>>>>> the AF_XDP rings being assigned by the kernel in the primary
> > >>>> process
> > >>>>>>> only.
> > >>>>>>>>>> However other operations including retrieval of stats are
> > >> permitted.
> > >>>>>>>>>>
> > >>>>>>>>>> Signed-off-by: Ciara Loftus <ciara.loftus at intel.com>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> Hi Ciara,
> > >>>>>>>>>
> > >>>>>>>>> When I tried to test the patch I got the following error [1]. It
> > >>>>>>>>> doesn't look related to this patch, but can you help to fix the
> > >>>>>>>>> issue? Thanks.
> > >>>>>>>>>
> > >>>>>>>>> [1]
> > >>>>>>>>> libxdp: Couldn't find a BPF file with name xsk_def_xdp_prog.o
> > >>>>>>>>> xsk_configure(): Failed to create xsk socket.
> > >>>>>>>>> eth_rx_queue_setup(): Failed to configure xdp socket
> > >>>>>>>>> Fail to configure port 2 rx queues
> > >>>>>>>>> EAL: Error - exiting with code: 1
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Hi Ferruh,
> > >>>>>>>>
> > >>>>>>>> This file should be generated when libxdp is compiled.
> > >>>>>>>> Mine is located @ /usr/local/lib/bpf/xsk_def_xdp_prog.o
> > >>>>>>>> Can you check if that file is there for you? It could be in
> > >>>>>>> /usr/local/lib64/bpf/ on your machine.
> > >>>>>>>> What kernel are you running on?
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>> It is in: /usr/local/lib64/bpf/xsk_def_xdp_prog.o
> > >>>>>>>
> > >>>>>>> I had to compile libxdp from source because the OS package version
> > >>>>>>> was too old to work with af_xdp.
> > >>>>>>> Is something required to point the af_xdp PMD to the location of
> > >>>>>>> this file?
> > >>>>>>>
> > >>>>>>> I run kernel:
> > >>>>>>> 5.15.16-200.fc35.x86_64
> > >>>>>>
> > >>>>>> I read through the libxdp code to figure out what happens when
> > >>>> searching
> > >>>>> for the file:
> > >>>>>> https://github.com/xdp-project/xdp-tools/blob/v1.2.2/lib/libxdp/libxdp.c#L1055
> > >>>>>>
> > >>>>>> secure_getenv(XDP_OBJECT_ENVVAR) is called which according to
> > the
> > >>>>> README "defaults to /usr/lib/bpf (or /usr/lib64/bpf on systems
> using
> > a
> > >> split
> > >>>>> library path)".
> > >>>>>> If that fails, BPF_OBJECT_PATH will be searched, which points to
> > >>>>> /usr/lib/bpf
> > >>>>>>
> > >>>>>> I discovered that on my system the getenv() call fails, but the file is
> > >>>>> eventually found because luckily BPF_OBJECT_PATH points to the
> > >>>>> appropriate place for me (lib):
> > >>>>>> https://github.com/xdp-project/xdp-tools/blob/v1.2.2/lib/util/util.h#L24
> > >>>>>> I suspect the same failure is happening for you, but since
> > >>>>> BPF_OBJECT_PATH points to lib and not lib64, the file is not found.
> > >>>>>> As a temporary measure can you create a symlink in
> > /usr/local/lib/bpf/
> > >> to
> > >>>>> point to /usr/local/lib64/bpf/xsk_def_xdp_prog.o
> > >>>>>> I will investigate the libxdp issue further. Maybe a change is
> needed
> > in
> > >>>> the
> > >>>>> library. If a change or setup recommendation is needed in DPDK I
> will
> > >>>> create a
> > >>>>> patch.
> > >>>>>>
> > >>>>>
> > >>>>>
> > >>>>> I don't have XDP_OBJECT_ENVVAR or BPF_OBJECT_PATH
> > environment
> > >>>>> variables set,
> > >>>>> if they should be we should document them.
> > >>>>>
> > >>>>> When I created the '/usr/local/lib/bpf/' link, the BPF file was found.
> > >>>>> This should be clarified/documented for users.
> > >>>>
> > >>>> Ok. Ideally we shouldn't have to create the symlink. I will look for a
> > better
> > >>>> solution and submit a patch.
> > >>>> The symlink might be a temporary solution if another solution is not
> > >> found.
> > >>>
> > >>> Can you please try setting the environment variable
> > >> LIBXDP_OBJECT_PATH=/usr/local/lib64/bpf/
> > >>> And see if your test works without the symlink?
> > >>> This worked for me and the getenv succeeded.
> > >>> If it works for you too, I'll create a patch for the docs instructing users
> to
> > do
> > >> the same.
> > >>>
> > >>
> > >> I confirm it works, and +1 to document it.
> > >>
> > >>
> > >> btw, when this environment variable is not set (and no symlink), af_xdp
> > fails
> > >> and testpmd crashes. I think af_xdp failure shouldn't cause a crash in
> > >> testpmd,
> > >> most probably some error checks are needed in the af_xdp driver.
> > >
> > > When I trigger the error case in my environment I get a graceful exit:
> > >
> > > libxdp: Couldn't find a BPF file with name xsk_def_xdp_prog.o
> > > xsk_configure(): Failed to create xsk socket.
> > > eth_rx_queue_setup(): Failed to configure xdp socket
> > > Fail to configure port 0 rx queues
> > > EAL: Error - exiting with code: 1
> > >    Cause: Start ports failed
> > >
> > > Can you please provide more info on your crash?
> > >
> >
> > $ ./build/app/dpdk-testpmd --vdev net_af_xdp0,iface=enp24s0f1 --vdev
> > net_af_xdp1,iface=enp94s0f1 -- -i
> > ...
> > Configuring Port 2 (socket 0)
> > libxdp: Couldn't find a BPF file with name xsk_def_xdp_prog.o
> > xsk_configure(): Failed to create xsk socket.
> > eth_rx_queue_setup(): Failed to configure xdp socket
> > Fail to configure port 2 rx queues
> > EAL: Error - exiting with code: 1
> >    Cause: Start ports failed
> > Segmentation fault (core dumped)
> >
> > (I have two physical interfaces too, af_xdp ports are port 2 & 3)
> 
> Thanks for providing more info. I have reproduced this issue.
> It doesn't happen when --no-pci is used.
> The issue is not related to the libxdp patch. It just occurs when
> xsk_configure() fails. As you said, some extra error checks/handling are
> probably required.
> Thanks for reporting. I'll work on a fix.
> 

I reproduced the issue with some other vdevs (af_packet and null) by forcing an error return from their respective rx_queue_setup functions.
I traced the issue back to dfbc61a2f9a6 ("mem: detach memsegs on cleanup"). Before this commit there is no SEGV after the vdev setup fails.
The SEGVs occur in the physical NIC driver code:
For i40e it occurs in i40e_dev_alarm_handler()
For ixgbe it occurs in ixgbe_dev_setup_link_thread_handler()
I'm not sure exactly why they occur, though. Anatoly, you authored the commit mentioned above. Can you think of any reason why it might cause this behaviour?
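
For reference, the BPF object lookup order discussed earlier in this thread can be sketched roughly as follows. This is a minimal Python illustration, not libxdp's actual C code. The function name find_bpf_file_sketch and the fallback_dir parameter are hypothetical, and falling through to the fallback directory when the variable is set but the file is missing there is an assumption of this sketch. The only names taken from the thread are LIBXDP_OBJECT_PATH and xsk_def_xdp_prog.o.

```python
import os
from typing import Mapping, Optional

# Environment variable name as used in this thread.
ENV_VAR = "LIBXDP_OBJECT_PATH"


def find_bpf_file_sketch(
    name: str,
    fallback_dir: str,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the path to `name`, or None if it cannot be found.

    Lookup order (illustrative only):
      1. the directory named by LIBXDP_OBJECT_PATH, if that variable is set
      2. fallback_dir, standing in for the library's built-in default path
    """
    env = os.environ if env is None else env

    candidates = []
    override = env.get(ENV_VAR)
    if override:
        candidates.append(override)
    # Assumption of this sketch: the fallback is also tried when the
    # override is set but does not contain the file.
    candidates.append(fallback_dir)

    for directory in candidates:
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            return path
    return None
```

With the object file present only under a lib64/bpf directory and a fallback of lib/bpf, this returns None until LIBXDP_OBJECT_PATH points at the lib64 directory, which matches the behaviour reported above.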

