[dpdk-dev] [PATCH] bus/vmbus: fix race in sub channel creation

Mohammed Gamal mgamal at redhat.com
Tue Dec 4 12:59:46 CET 2018


On Mon, 2018-12-03 at 08:48 -0800, Stephen Hemminger wrote:
> On Mon, 03 Dec 2018 07:02:55 +0100
> Mohammed Gamal <mgamal at redhat.com> wrote:
> 
> > On Fri, 2018-11-30 at 12:24 -0800, Stephen Hemminger wrote:
> > > When using multiple queues, there was a race with the kernel
> > > in setting up the second channel. This is do due to a kernel
> > > change
> > > whiche does not allow accessing sysfs files for Hyper-V
> > > channels that are not opened.
> > > 
> > > The fix is simple, just move the logic to detect not ready
> > > sub channels earlier in the existing loop.
> > > 
> > > Fixes: 831dba47bd36 ("bus/vmbus: add Hyper-V virtual bus
> > > support")
> > > Reported-by:Mohammed Gamal <mgamal at redhat.com>
> > > Signed-off-by: Stephen Hemminger <sthemmin at microsoft.com>
> > > ---
> > >  drivers/bus/vmbus/linux/vmbus_uio.c | 12 ++++++------
> > >  1 file changed, 6 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/drivers/bus/vmbus/linux/vmbus_uio.c
> > > b/drivers/bus/vmbus/linux/vmbus_uio.c
> > > index 12e97e3a420a..38df4d724ed5 100644
> > > --- a/drivers/bus/vmbus/linux/vmbus_uio.c
> > > +++ b/drivers/bus/vmbus/linux/vmbus_uio.c
> > > @@ -357,6 +357,12 @@ int vmbus_uio_get_subchan(struct
> > > vmbus_channel
> > > *primary,
> > >  			continue;
> > >  		}
> > >  
> > > +		if (!vmbus_isnew_subchannel(primary, relid))
> > > +			continue;	/* Already know about
> > > you
> > > */
> > > +
> > > +		if (!vmbus_uio_ring_present(dev, relid))
> > > +			continue;	/* Ring may not be
> > > ready
> > > yet */
> > > +
> > >  		snprintf(subchan_path, sizeof(subchan_path),
> > > "%s/%lu",
> > >  			 chan_path, relid);
> > >  		err = vmbus_uio_sysfs_read(subchan_path,
> > > "subchannel_id",
> > > @@ -370,12 +376,6 @@ int vmbus_uio_get_subchan(struct
> > > vmbus_channel
> > > *primary,
> > >  		if (subid == 0)
> > >  			continue;	/* skip primary channel
> > > */
> > >  
> > > -		if (!vmbus_isnew_subchannel(primary, relid))
> > > -			continue;
> > > -
> > > -		if (!vmbus_uio_ring_present(dev, relid))
> > > -			continue;	/* Ring may not be
> > > ready
> > > yet */
> > > -
> > >  		err = vmbus_uio_sysfs_read(subchan_path,
> > > "monitor_id",
> > >  					   &monid, UINT8_MAX);
> > >  		if (err) {  
> > 
> > With this patch I am now getting the following error:
> > [...]
> > Configuring Port 0 (socket 0)
> > hn_dev_configure():  >>  
> > hn_rndis_link_status(): link status 0x40020006
> > hn_subchan_configure(): open 1 subchannels
> > hn_subchan_configure(): open subchannel failed: -2
> > hn_dev_configure(): subchannel configuration failed
> > Port0 dev_configure = -2
> > hn_dev_rx_queue_release():  >>
> > hn_dev_rx_queue_release():  >>
> > hn_dev_tx_queue_release():  >>
> > hn_dev_tx_queue_release():  >>  
> > Fail to configure port 0
> > EAL: Error - exiting with code: 1
> >   Cause: Start ports failed
> > 
> > Apparently, no subchannels were ready. Anything I may have missed
> > or
> > misconfigured?
> > 
> > Regards,
> > Mohammed
> 
> Could you check the kernel log?
> 
I did. No relevant messages seem to be there.

> The way sub channel configuration works is that the userspace code in
> DPDK
> sends a message to the hypervisor that it would like N subchannels,
> then
> the response from the hypervisor is processed by the kernel causing
> sysfs
> files to be created. Meanwhile the userspace is polling waiting for
> the
> sysfs files to show up (for 10 seconds). You could increas the
> timeout or
> go looking in the sysfs directory  to see what is present.

Tried increasing that up to 100 seconds, still nothing. Could it be a
problem on my host? The VM I am using is on a local hyper-v instance.
 
> 
> There is no good way to handle errors here, the hypervisor doesn't
> really
> give much feedback.


More information about the dev mailing list