[dpdk-dev] [PATCH] kni: fix kernel deadlock when using mlx devices

Thomas Monjalon thomas at monjalon.net
Wed Mar 18 16:17:57 CET 2020


17/01/2020 17:43, Ferruh Yigit:
> On 12/22/2019 5:55 PM, Stephen Hemminger wrote:
> > This fixes a deadlock when using KNI with bifurcated drivers.
> > Bringing a KNI device up always times out when using Mellanox
> > devices.
> > 
> > The kernel KNI driver sends a message to userspace to complete
> > the request. In the case of a bifurcated driver, this may involve
> > an additional request to the kernel to change state. This request
> > would deadlock because KNI was holding the RTNL mutex.
> > 
> > This was a bad design which goes back to the original code.
> > A workaround is for the KNI driver to drop RTNL while waiting.
> > To prevent the device from disappearing while the operation
> > is in progress, it needs to hold a reference to the network
> > device while waiting.
> > 
> > As an added benefit, a useless error check can also be removed.
> > 
> > Fixes: 3fc5ca2f6352 ("kni: initial import")
> > Cc: stable at dpdk.org
> > Signed-off-by: Stephen Hemminger <stephen at networkplumber.org>
> > ---
> 
> This patch causes a hang on my server. I am not sure what exactly the problem
> was, but the kernel log was continuously printing "Cannot send to req_q".
> Will dig more.

Ferruh, did you have a chance to check what is hanging?
Stephen, is there any news on your side?
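
For anyone following the thread, the workaround described in the quoted
commit message (take a device reference, drop RTNL while waiting, retake it
afterwards) can be sketched as a userspace model. This is only an
illustration under stated assumptions, not the code from the patch: the
names rtnl, model_dev, model_req, model_userspace and model_process_request
are hypothetical, a pthread mutex stands in for the RTNL mutex, and an
atomic counter stands in for the device reference count. In the kernel the
corresponding steps would presumably be dev_hold()/rtnl_unlock() before the
wait and rtnl_lock()/dev_put() after it.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model: "rtnl" stands in for the kernel RTNL mutex. */
static pthread_mutex_t rtnl = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for a network device with a reference count. */
struct model_dev {
	atomic_int refcnt;
	bool up;
};

/* One in-flight request from the "kernel" side to the "userspace" side. */
struct model_req {
	struct model_dev *dev;
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

/*
 * Models the userspace handler of a bifurcated driver: completing the
 * request needs another state change that itself takes RTNL. If the
 * requester still held RTNL here, this lock would never be granted.
 */
static void *model_userspace(void *arg)
{
	struct model_req *req = arg;

	pthread_mutex_lock(&rtnl);
	req->dev->up = true;
	pthread_mutex_unlock(&rtnl);

	pthread_mutex_lock(&req->lock);
	req->done = true;
	pthread_cond_signal(&req->cond);
	pthread_mutex_unlock(&req->lock);
	return NULL;
}

/* Called with rtnl held; returns with rtnl held. 0 on success, -1 on error. */
static int model_process_request(struct model_dev *dev)
{
	struct model_req req = { .dev = dev, .done = false };
	pthread_t tid;
	int ret = 0;

	pthread_mutex_init(&req.lock, NULL);
	pthread_cond_init(&req.cond, NULL);

	/* Pin the device so it cannot go away while RTNL is dropped. */
	atomic_fetch_add(&dev->refcnt, 1);
	pthread_mutex_unlock(&rtnl);

	if (pthread_create(&tid, NULL, model_userspace, &req) != 0) {
		ret = -1;
	} else {
		/* Wait for the response without holding RTNL. */
		pthread_mutex_lock(&req.lock);
		while (!req.done)
			pthread_cond_wait(&req.cond, &req.lock);
		pthread_mutex_unlock(&req.lock);
		pthread_join(tid, NULL);
	}

	/* Retake RTNL for the caller, then release the pinned reference. */
	pthread_mutex_lock(&rtnl);
	atomic_fetch_sub(&dev->refcnt, 1);

	pthread_cond_destroy(&req.cond);
	pthread_mutex_destroy(&req.lock);
	return ret;
}
```

A caller would lock rtnl, call model_process_request() on a device, and
unlock rtnl afterwards. If the unlock before the wait is removed, the model
hangs in model_userspace(), which mirrors the deadlock described above.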



