[dpdk-dev] kernel: BUG: soft lockup - CPU#1 stuck for 22s! [kni_single:1782]

Jay Rolette rolette at infiniteio.com
Mon Feb 16 17:33:52 CET 2015


On Tue, Feb 10, 2015 at 7:33 PM, Jay Rolette <rolette at infiniteio.com> wrote:

> Environment:
>   * DPDK 1.6.0r2
>   * Ubuntu 14.04 LTS
>   * kernel: 3.13.0-38-generic
>
> When we exercise KNI heavily (transferring files across it, both
> sending and receiving), I see quite a few of these kernel
> lockups:
>
> kernel: BUG: soft lockup - CPU#1 stuck for 22s! [kni_single:1782]
>
> Frequently I can't do much other than get a screenshot of the error
> message coming across the console session once we get into this state, so
> debugging what is happening is "interesting"...
>
> I've seen this on multiple hardware platforms (so not box specific) as
> well as virtual machines.
>
> Are there any known issues with KNI that would cause kernel lockups in
> DPDK 1.6? Really hoping someone that knows KNI well can point me in the
> right direction.
>
> KNI in the 1.8 tree is significantly different, so it didn't look
> straight-forward to back-port it, although I do see a few changes that
> might be relevant.
>

Found the problem. No patch to submit since it's already fixed in later
versions of DPDK, but thought I'd follow up with the details since I'm sure
we aren't the only ones trying to use bleeding-edge versions of DPDK...

kni_net_rx_normal() was calling netif_receive_skb() instead of
netif_rx(). The source for netif_receive_skb() points out that it should
only be called from soft-irq context, which isn't the case for KNI: its RX
path runs in a kernel thread (kni_single in the lockup message above).

As is typical, it was a simple fix once tracked down.

Yao-Po Wang's fix:  commit 41a6ebded53982107c1adfc0652d6cc1375a7db9.
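
For anyone who wants the context rule spelled out, here is a rough
user-space model of the difference between the two calls. It is a sketch
only, not the KNI source and not kernel code: struct rx_model, the
model_* functions, and BACKLOG_MAX are all hypothetical stand-ins, and the
real netif_rx() / netif_receive_skb() do far more than this. The only
point is that one call processes the packet immediately in the caller's
context, while the other queues it so that processing happens later from
soft-irq.

```c
#include <stddef.h>

/* User-space model only: these are stand-ins, NOT the kernel functions. */
struct sk_buff { int id; };

enum exec_ctx { CTX_PROCESS, CTX_SOFTIRQ };

#define BACKLOG_MAX 8   /* arbitrary size chosen for the model */

struct rx_model {
    enum exec_ctx ctx;                    /* context the caller runs in */
    struct sk_buff *backlog[BACKLOG_MAX]; /* models the input queue */
    size_t backlog_len;
    int delivered;                        /* packets handed to the "stack" */
    int ctx_violations;                   /* deliveries made outside softirq */
};

/* Models netif_receive_skb(): processes the packet immediately and
 * assumes the caller is already in softirq context. */
static void model_netif_receive_skb(struct rx_model *m, struct sk_buff *skb)
{
    (void)skb;
    if (m->ctx != CTX_SOFTIRQ)
        m->ctx_violations++;   /* what the pre-fix KNI RX path was doing */
    m->delivered++;
}

/* Models netif_rx(): only queues the packet; delivery happens later
 * from softirq. Returns 0 on success, -1 if the backlog is full. */
static int model_netif_rx(struct rx_model *m, struct sk_buff *skb)
{
    if (m->backlog_len == BACKLOG_MAX)
        return -1;
    m->backlog[m->backlog_len++] = skb;
    return 0;
}

/* Models the softirq pass that drains the backlog. */
static void model_run_softirq(struct rx_model *m)
{
    enum exec_ctx saved = m->ctx;

    m->ctx = CTX_SOFTIRQ;
    for (size_t i = 0; i < m->backlog_len; i++)
        model_netif_receive_skb(m, m->backlog[i]);
    m->backlog_len = 0;
    m->ctx = saved;
}
```

In this model, calling model_netif_receive_skb() with ctx set to
CTX_PROCESS counts as a violation, whereas going through model_netif_rx()
followed by model_run_softirq() delivers the same packets with none.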

Cheers,
Jay

