[EXTERNAL] Re: [PATCH v3] net/netvsc: fix number Tx queues > Rx queues
Alan Elder
alan.elder at microsoft.com
Fri Apr 12 18:50:55 CEST 2024
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit at amd.com>
> Sent: Friday, April 12, 2024 6:23 AM
> To: Alan Elder <alan.elder at microsoft.com>; Long Li <longli at microsoft.com>;
> Andrew Rybchenko <andrew.rybchenko at oktetlabs.ru>
> Cc: dev at dpdk.org; stephen <stephen at networkplumber.org>
> Subject: Re: [EXTERNAL] Re: [PATCH v3] net/netvsc: fix number Tx queues > Rx
> queues
>
> On 4/11/2024 9:45 PM, Alan Elder wrote:
> >> -----Original Message-----
> >> From: Ferruh Yigit <ferruh.yigit at amd.com>
> >> Sent: Thursday, April 11, 2024 7:38 AM
> >> To: Alan Elder <alan.elder at microsoft.com>; Long Li
> >> <longli at microsoft.com>; Andrew Rybchenko
> >> <andrew.rybchenko at oktetlabs.ru>
> >> Cc: dev at dpdk.org; stephen <stephen at networkplumber.org>
> >> Subject: [EXTERNAL] Re: [PATCH v3] net/netvsc: fix number Tx queues >
> >> Rx queues
> >>
> >> On 3/19/2024 2:16 PM, Alan Elder wrote:
> >>> The previous code allowed the number of Tx queues to be set higher
> >>> than the number of Rx queues. If a packet was sent on a Tx queue
> >>> with index >= number Rx queues there was a segfault.
> >>> This commit fixes the issue by creating an Rx queue for every Tx
> >>> queue meaning that an event buffer is allocated to handle receiving
> >>> Tx completion messages.
> >>>
> >>> mbuf pool and Rx ring are not allocated for these additional Rx
> >>> queues and RSS configuration ensures that no packets are received on
> >>> them.
> >>>
> >>> Fixes: 4e9c73e96e83 ("net/netvsc: add Hyper-V network device")
> >>> Cc: sthemmin at microsoft.com
> >>> Cc: stable at dpdk.org
> >>>
> >>> Signed-off-by: Alan Elder <alan.elder at microsoft.com>
> >>>
> >>
> >> Hi Alan,
> >>
> >> What is the root cause of the crash, is it in driver scope or application?
> >
> > Hi Ferruh,
> >
> > The root cause of the crash was in the driver - a packet received on a Tx
> > queue that had no corresponding Rx queue would cause the
> > dev->data->rx_queues[] array to be accessed past the length of the array.
> >
> > https://github.com/DPDK/dpdk/blob/main/drivers/net/netvsc/hn_rxtx.c#L1071
> >
> >
>
> Why is there an access to the Rx queue when processing a Tx queue?
>
> A backtrace of the crash can help to understand the issue, can you please
> include this in commit log, plus some explanation why crash happens?
>
> Thanks,
> Ferruh
Hi Ferruh,
The netvsc slow path needs to handle Tx completion messages (to know when it can reclaim Tx buffers). Tx completion messages are received on the Rx queue, which is why the Rx queue is accessed as part of transmit processing.
An example call stack is:
#6 rte_spinlock_trylock (sl=0x20) at /include/rte_spinlock.h
#7 hn_process_events (hv=, queue_id=2, tx_limit=) at /drivers/net/netvsc/hn_rxtx.c
#8 hn_xmit_pkts (ptxq=, tx_pkts=, nb_pkts=1) at /drivers/net/netvsc/hn_rxtx.c
This leads to the SEGV, as 0x20 is not a valid address.
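To illustrate the idea behind the fix, here is a minimal sketch and not the driver's actual code. It assumes a simplified queue table in which every Tx queue looks up the receive-side context with the same index in order to poll for Tx completions. All names (sketch_dev, sketch_rxq, sketch_setup and so on) are hypothetical and exist only for this example. The setup allocates a context for max(nb_rx, nb_tx) queues and marks the extra ones as having no Rx ring:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_QUEUES 8

/* Hypothetical, simplified stand-in for a per-queue receive context. */
struct sketch_rxq {
    int lock;        /* placeholder for the lock taken while polling events */
    int has_rx_ring; /* 1 = real Rx queue, 0 = only handles Tx completions */
};

/* Hypothetical, simplified stand-in for the device's queue table. */
struct sketch_dev {
    uint16_t nb_rx_queues;
    uint16_t nb_tx_queues;
    struct sketch_rxq *rx_queues[SKETCH_MAX_QUEUES];
};

/* Every Tx queue needs a context to poll, so we need max(nb_rx, nb_tx). */
static uint16_t sketch_event_queue_count(uint16_t nb_rx, uint16_t nb_tx)
{
    return nb_rx > nb_tx ? nb_rx : nb_tx;
}

/*
 * Populate the table from caller-provided storage.
 * Returns 0 on success, -1 if more queues are requested than the table holds.
 */
static int sketch_setup(struct sketch_dev *dev, struct sketch_rxq *storage,
                        uint16_t nb_rx, uint16_t nb_tx)
{
    uint16_t n = sketch_event_queue_count(nb_rx, nb_tx);
    uint16_t i;

    if (n > SKETCH_MAX_QUEUES)
        return -1;

    dev->nb_rx_queues = nb_rx;
    dev->nb_tx_queues = nb_tx;
    for (i = 0; i < SKETCH_MAX_QUEUES; i++)
        dev->rx_queues[i] = NULL;

    for (i = 0; i < n; i++) {
        storage[i].lock = 0;
        /* Queues at index >= nb_rx exist only for Tx completions. */
        storage[i].has_rx_ring = (i < nb_rx);
        dev->rx_queues[i] = &storage[i];
    }
    return 0;
}

/* Transmit-side lookup: the context polled for this Tx queue's completions. */
static struct sketch_rxq *
sketch_event_queue_for_tx(const struct sketch_dev *dev, uint16_t txq_id)
{
    assert(txq_id < dev->nb_tx_queues);
    return dev->rx_queues[txq_id];
}
```

With 2 Rx queues and 4 Tx queues, the lookup for Tx queue 2 then returns a valid context with has_rx_ring set to 0, rather than an entry that was never populated.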
I'll update the commit messages and resubmit the patch.
Thanks,
Alan