[dpdk-dev] [PATCH 2/2] virtio: fix PCI config err handling

Luca Boccassi bluca at debian.org
Thu Aug 16 12:27:33 CEST 2018


On Thu, 2018-08-16 at 14:46 +0800, Tiwei Bie wrote:
> On Wed, Aug 15, 2018 at 10:50:57AM +0100, Luca Boccassi wrote:
> > On Wed, 2018-08-15 at 11:11 +0800, Tiwei Bie wrote:
> > > On Tue, Aug 14, 2018 at 03:30:35PM +0100, Luca Boccassi wrote:
> > > > From: Brian Russell <brussell at brocade.com>
> > > > 
> > > > In virtio_read_caps, rte_pci_read_config returns the number of bytes
> > > > read from PCI config or < 0 on error.
> > > > If less than the expected number of bytes are read then log the
> > > > failure and return rather than carrying on with garbage.
> > > 
> > > Is this a fix or an improvement?
> > > Or did you see anything broken without this patch?
> > > If so, we may need a fixes line and Cc stable.
> > 
> > It is a fix, as it was creating problems in production due to the
> > constant flux of errors in the logs.
> 
> Could you be a bit more specific about which errors
> were logged if possible?
> 
> If my understanding is correct, you mean the errors
> were logged because less than the required amount of
> bytes were read?

Yes - rte_pci_read_config on Linux does not just return 0/-1, it returns
the actual number of bytes read. If that is less than the requested
amount, the code carries on and parses garbage, which causes errors later
in the execution. Checking that we actually got the amount of data we
asked for fixes this issue.
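
To make the check concrete, here is a minimal standalone sketch of the
idea. It does not use the DPDK API: fake_cfg, fake_read_config and
read_exact are hypothetical names made up for illustration, and the fake
reader only mimics the "bytes read, or negative on error" convention
described above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a config space with only `avail` readable bytes. */
struct fake_cfg {
	const unsigned char *data;
	size_t avail;
};

/* Mimics the convention above: returns bytes read, or -1 on error. */
static int
fake_read_config(const struct fake_cfg *cfg, void *buf, size_t len, size_t off)
{
	size_t n;

	if (cfg == NULL || buf == NULL || off > cfg->avail)
		return -1;

	/* Hand back at most what is left after `off`: this is the short read. */
	n = cfg->avail - off;
	if (n > len)
		n = len;
	memcpy(buf, cfg->data + off, n);
	return (int)n;
}

/* Returns 0 only if exactly `len` bytes were read; a short read is a failure. */
static int
read_exact(const struct fake_cfg *cfg, void *buf, size_t len, size_t off)
{
	int ret = fake_read_config(cfg, buf, len, off);

	if (ret < 0 || (size_t)ret != len)
		return -1;
	return 0;
}
```

A caller that only tests "ret < 0" would accept the short read and go on
to parse a partially filled buffer, which is the situation the patch is
meant to avoid.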

> > But given patch 1/2 is effectively doing a small change in the BSD bus
> > API, and it's a requirement for 2/2, I don't think we can include it in
> > the stable releases unfortunately.
> 
> If it's a fix, we need a fixes line.

Sure, will send a v2.

> > 
> > > > 
> 
> [...]
> > > > @@ -567,16 +567,18 @@ virtio_read_caps(struct rte_pci_device *dev, struct virtio_hw *hw)
> > > >  	}
> > > >  
> > > >  	ret = rte_pci_read_config(dev, &pos, 1, PCI_CAPABILITY_LIST);
> > > > -	if (ret < 0) {
> > > > -		PMD_INIT_LOG(DEBUG, "failed to read pci capability list");
> > > > +	if (ret != 1) {
> > > > +		PMD_INIT_LOG(DEBUG,
> > > > +			     "failed to read pci capability list, ret %d", ret);
> > > >  		return -1;
> > > >  	}
> > > >  
> > > >  	while (pos) {
> > > >  		ret = rte_pci_read_config(dev, &cap, sizeof(cap), pos);
> > > > -		if (ret < 0) {
> > > > -			PMD_INIT_LOG(ERR,
> > > > -				"failed to read pci cap at pos: %x", pos);
> > > > +		if (ret != sizeof(cap)) {
> 
> Above code has to successfully read a full virtio
> PCI capability during each read, otherwise it will
> give up reading other capabilities and may fallback
> to the legacy mode. In which case it will fail to
> read the requested amount of bytes? Should we try
> to read the generic PCI fields first?

I do not know exactly what causes fewer than the required bytes to be
read, but we have seen it happen in production (not 100% of the time
though, so I think it's worth keeping the structure as-is). As you said,
in that case it falls back to legacy mode which, in our experience in
production deployments, then succeeds. That's why the error-level print
is undesired: the code will actually work via the fallback, but the
customers will see scary errors in the logs and open escalations :-)
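
A rough sketch of that reasoning, again standalone and not taken from the
driver: pick_mode, log_at and the enums below are hypothetical names, and
the "legacy_ok" flag simply stands in for whatever the legacy probe would
report.

```c
#include <stdio.h>

enum log_level { LOG_LVL_ERR, LOG_LVL_DEBUG };
enum dev_mode { DEV_MODE_NONE, DEV_MODE_MODERN, DEV_MODE_LEGACY };

/* Level of the most recent message, kept only so the sketch can be inspected. */
static int last_log_level = -1;

static void
log_at(enum log_level lvl, const char *msg)
{
	last_log_level = lvl;
	fprintf(stderr, "%s: %s\n", lvl == LOG_LVL_ERR ? "ERR" : "DEBUG", msg);
}

/*
 * cap_ret:   return value of the (hypothetical) capability read
 * cap_len:   number of bytes that read was expected to return
 * legacy_ok: non-zero if the legacy probe succeeds
 */
static enum dev_mode
pick_mode(int cap_ret, int cap_len, int legacy_ok)
{
	if (cap_ret == cap_len)
		return DEV_MODE_MODERN;

	/* Recoverable: the legacy path may still work, so stay at DEBUG. */
	log_at(LOG_LVL_DEBUG, "short capability read, trying legacy mode");
	if (legacy_ok)
		return DEV_MODE_LEGACY;

	/* Nothing worked: this is the case that deserves an error. */
	log_at(LOG_LVL_ERR, "device could not be initialised");
	return DEV_MODE_NONE;
}
```

The point is only that the short read on its own is logged quietly, and
an error is reserved for the case where the fallback fails as well.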

> Besides, you also need to update other calls to
> rte_pci_read_config(), e.g.:
> 
> https://github.com/DPDK/dpdk/blob/76b9d9de5c7d/drivers/net/virtio/virtio_pci.c#L696
> 
> Thanks

Sure, I will apply the same changes in v2.

-- 
Kind regards,
Luca Boccassi

