|SUCCESS| [v2, 7/7] net/netvsc: handle VF recovery events for service reset

dpdklab at iol.unh.edu dpdklab at iol.unh.edu
Wed May 6 05:32:54 CEST 2026


Test-Label: ai-code-review
Test-Status: SUCCESS
https://dpdk.org/patch/163616

_AI Code Review_

=== Patch Review: series.patch (via Claude) ===
Review date: 2026-05-06

Reviewing the patch series for DPDK net/netvsc driver against the coding
standards and guidelines.

commit 20260506020529.281654-1-longli at microsoft.com
Author: Long Li <longli at microsoft.com>
Subject: [PATCH v2 1/7] net/netvsc: retry VF hotplug indefinitely until PCI device disappears

The patch replaces a fixed retry limit with indefinite retry for VF
hotplug, detecting PCI device removal via sysfs to terminate the loop.

> diff --git a/drivers/net/netvsc/hn_ethdev.c b/drivers/net/netvsc/hn_ethdev.c
> index b8880edb4c..34040b3e57 100644
> --- a/drivers/net/netvsc/hn_ethdev.c
> +++ b/drivers/net/netvsc/hn_ethdev.c

[ ... ]

> @@ -622,19 +622,38 @@ static void netvsc_hotplug_retry(void *args)
>  	PMD_DRV_LOG(DEBUG, "%s: retry count %d",
>  		    __func__, hot_ctx->eal_hot_plug_retry);
>  
> -	if (hot_ctx->eal_hot_plug_retry++ > NETVSC_MAX_HOTADD_RETRY) {
> -		PMD_DRV_LOG(NOTICE, "Failed to parse PCI device retry=%d",
> -			    hot_ctx->eal_hot_plug_retry);
> +	hot_ctx->eal_hot_plug_retry++;
> +
> +	/* Check if PCI device still exists -- if it disappeared, give up.
> +	 * Otherwise keep retrying indefinitely until the net directory
> +	 * appears.  This is safe because:
> +	 * - MANA driver probe can take >100s after PCI rescan
> +	 * - The retry uses rte_eal_alarm callbacks serialized on the
> +	 *   EAL interrupt thread, preventing races with device close
> +	 * - Device close cancels pending alarms and frees the context
> +	 * - If the PCI device is removed, the access() check below
> +	 *   detects the missing sysfs path and stops immediately
> +	 */
> +	snprintf(buf, sizeof(buf), "/sys/bus/pci/devices/%s", d->name);
> +	if (access(buf, F_OK) != 0) {

Does this code check the return value of snprintf for potential truncation?
The buffer sizing relies on d->name not exceeding available space. Consider
using snprintf return value validation.

> +		PMD_DRV_LOG(NOTICE,
> +			    "PCI device %s no longer exists, giving up after %d retries",
> +			    d->name, hot_ctx->eal_hot_plug_retry);
>  		goto free_hotadd_ctx;
>  	}
>  
>  	snprintf(buf, sizeof(buf), "/sys/bus/pci/devices/%s/net", d->name);

Same snprintf truncation question applies here.

>  	di = opendir(buf);
>  	if (!di) {

Implicit NULL comparison: should be if (di == NULL) per coding style.

> -		PMD_DRV_LOG(DEBUG, "%s: can't open directory %s, "
> -			    "retrying in 1 second", __func__, buf);
> -		/* The device is still being initialized, retry after 1 second */
> -		rte_eal_alarm_set(1000000, netvsc_hotplug_retry, hot_ctx);
> +		if (hot_ctx->eal_hot_plug_retry % 30 == 0)
> +			PMD_DRV_LOG(NOTICE,
> +				    "%s: waiting for %s (retry %d, %ds elapsed)",
> +				    __func__, buf, hot_ctx->eal_hot_plug_retry,
> +				    hot_ctx->eal_hot_plug_retry);

Does the elapsed time calculation assume NETVSC_HOTADD_RETRY_INTERVAL is
always 1 second? This creates a dependency on the macro value matching the
log message.


---

commit 20260506020529.281654-2-longli at microsoft.com
Author: Long Li <longli at microsoft.com>
Subject: [PATCH v2 2/7] net/netvsc: retry on SIOCGIFHWADDR failure during VF hotplug

The patch retries when SIOCGIFHWADDR ioctl fails, assuming the interface
is being renamed by udev.

> diff --git a/drivers/net/netvsc/hn_ethdev.c b/drivers/net/netvsc/hn_ethdev.c
> index 34040b3e57..1fa64cab18 100644
> --- a/drivers/net/netvsc/hn_ethdev.c
> +++ b/drivers/net/netvsc/hn_ethdev.c
> @@ -673,10 +673,18 @@ static void netvsc_hotplug_retry(void *args)
>  		ret = ioctl(s, SIOCGIFHWADDR, &req);
>  		close(s);
>  		if (ret == -1) {
> -			PMD_DRV_LOG(ERR,
> -				    "Failed to send SIOCGIFHWADDR for device %s",
> +			/* Interface may be renamed by udev (e.g. eth1 - ens1).
> +			 * Retry from the top -- the PCI device check above
> +			 * ensures we stop if the device disappears.
> +			 */
> +			PMD_DRV_LOG(DEBUG,
> +				    "Failed to send SIOCGIFHWADDR for device %s, "
> +				    "interface may be renaming, retrying",
>  				    dir->d_name);
> -			break;
> +			closedir(di);
> +			rte_eal_alarm_set(NETVSC_HOTADD_RETRY_INTERVAL,
> +					  netvsc_hotplug_retry, hot_ctx);
> +			return;

Does this path reset hot_ctx->mac_retry counter before returning? The retry
assumes mac_retry is per-MAC-matching attempt (patch 5/7), but here we
restart the entire loop without resetting any counters.


---

commit 20260506020529.281654-3-longli at microsoft.com
Author: Long Li <longli at microsoft.com>
Subject: [PATCH v2 3/7] net/netvsc: retry full probe when IB device not ready during hotplug

The patch restarts the full retry loop when rte_eal_hotplug_add returns
-ENODEV, assuming the IB device is not ready.

> diff --git a/drivers/net/netvsc/hn_ethdev.c b/drivers/net/netvsc/hn_ethdev.c
> index 1fa64cab18..130fea38ab 100644
> --- a/drivers/net/netvsc/hn_ethdev.c
> +++ b/drivers/net/netvsc/hn_ethdev.c
> @@ -716,17 +716,36 @@ static void netvsc_hotplug_retry(void *args)
>  			 * parent device, restore its args.
>  			 */
>  			ret = rte_eal_hotplug_add(d->bus->name, d->name, drv_str ? drv_str : "");

Ternary operator with string: should be explicit NULL check. Consider
if (drv_str != NULL) pattern instead of implicit boolean conversion.

> -			if (ret) {
> -				PMD_DRV_LOG(ERR,
> -					    "Failed to add PCI device %s",
> +			free(drv_str);
> +
> +			if (ret == -ENODEV) {
> +				/* IB device not ready yet (mana_ib not probed).
> +				 * Restart the full retry from the PCI device
> +				 * check so we re-verify the device exists and
> +				 * get fresh interface names after any renames.
> +				 * This retries indefinitely -- the PCI sysfs
> +				 * check at the top of this function ensures
> +				 * we stop if the device disappears.
> +				 */
> +				PMD_DRV_LOG(NOTICE,
> +					    "IB device not ready for %s, "
> +					    "restarting probe in 1 second",
>  					    d->name);
> +				closedir(di);
> +				rte_eal_alarm_set(NETVSC_HOTADD_RETRY_INTERVAL,
> +						  netvsc_hotplug_retry,
> +						  hot_ctx);
> +				return;

Does this code properly handle the case where closedir is called on a
directory that was never successfully opened? The variable di could be
NULL here if opendir failed earlier.


---

commit 20260506020529.281654-5-longli at microsoft.com
Author: Long Li <longli at microsoft.com>
Subject: [PATCH v2 5/7] net/netvsc: retry when no matching MAC found in net directory

The patch adds retry logic when no matching MAC address is found in the
net directory, using a separate counter for this condition.

> diff --git a/drivers/net/netvsc/hn_ethdev.c b/drivers/net/netvsc/hn_ethdev.c
> index d1c12ca9d5..08465489f2 100644
> --- a/drivers/net/netvsc/hn_ethdev.c
> +++ b/drivers/net/netvsc/hn_ethdev.c

[ ... ]

> @@ -767,8 +773,36 @@ static void netvsc_hotplug_retry(void *args)
>  			    RTE_ETHER_ADDR_BYTES(dev->data->mac_addrs));
>  	}
>  
> +	/* If we opened the net directory but didn't find a matching MAC,
> +	 * the VF interface may not have appeared yet (e.g. on a multi-NIC
> +	 * PCI device, the second VF registers later). Retry.
> +	 */
> +	if (di != NULL) {
> +		closedir(di);
> +		di = NULL;
> +		if (dir == NULL) {
> +			/* readdir returned NULL -- loop ended without match */
> +			hot_ctx->mac_retry++;
> +			if (hot_ctx->mac_retry < NETVSC_MAX_MAC_RETRY) {
> +				PMD_DRV_LOG(DEBUG,
> +					    "%s: no matching MAC found in %s, "
> +					    "retrying in 1 second (mac_retry %d/%d)",
> +					    __func__, buf,
> +					    hot_ctx->mac_retry,
> +					    NETVSC_MAX_MAC_RETRY);
> +				rte_eal_alarm_set(NETVSC_HOTADD_RETRY_INTERVAL,
> +						  netvsc_hotplug_retry,
> +						  hot_ctx);
> +				return;

Does this logic properly distinguish between readdir returning NULL due to
end-of-directory versus readdir returning NULL due to an error? errno should
be checked to differentiate these cases.

> +			}
> +			PMD_DRV_LOG(NOTICE,
> +				    "%s: no matching MAC found after %d retries, giving up",
> +				    __func__, hot_ctx->mac_retry);
> +		}
> +	}

Does the mac_retry counter get reset when restarting from earlier retry
paths (SIOCGIFHWADDR failure in patch 2, -ENODEV in patch 3)? These restarts
use the same hot_ctx but reset the directory scan.


---

commit 20260506020529.281654-6-longli at microsoft.com
Author: Long Li <longli at microsoft.com>
Subject: [PATCH v2 6/7] net/netvsc: forward per-queue stats from VF device

The patch calls the VF device stats_get directly to obtain per-queue
statistics.

> diff --git a/drivers/net/netvsc/hn_vf.c b/drivers/net/netvsc/hn_vf.c
> index 1fcc65a712..49e6a5b283 100644
> --- a/drivers/net/netvsc/hn_vf.c
> +++ b/drivers/net/netvsc/hn_vf.c
> @@ -749,7 +749,7 @@ void hn_vf_rx_queue_release(struct hn_data *hv, uint16_t queue_id)
>  
>  int hn_vf_stats_get(struct rte_eth_dev *dev,
>  		    struct rte_eth_stats *stats,
> -		    struct eth_queue_stats *qstats __rte_unused)
> +		    struct eth_queue_stats *qstats)
>  {
>  	struct hn_data *hv = dev->data->dev_private;
>  	struct rte_eth_dev *vf_dev;
> @@ -757,8 +757,22 @@ int hn_vf_stats_get(struct rte_eth_dev *dev,
>  
>  	rte_rwlock_read_lock(&hv->vf_lock);
>  	vf_dev = hn_get_vf_dev(hv);
> -	if (vf_dev)
> -		ret = rte_eth_stats_get(vf_dev->data->port_id, stats);
> +	if (vf_dev) {

Implicit boolean check: should be if (vf_dev != NULL).

> +		/* Call dev_ops->stats_get directly instead of the public
> +		 * rte_eth_stats_get API because we need to forward the
> +		 * per-queue stats (qstats) which the public API does not
> +		 * support.  Fall back to the public API if the VF driver
> +		 * does not implement stats_get or returns -ENOTSUP.
> +		 */
> +		if (vf_dev->dev_ops->stats_get != NULL) {
> +			ret = vf_dev->dev_ops->stats_get(vf_dev, stats, qstats);
> +			if (ret == -ENOTSUP)
> +				ret = rte_eth_stats_get(vf_dev->data->port_id,
> +							stats);
> +		} else {
> +			ret = rte_eth_stats_get(vf_dev->data->port_id, stats);
> +		}
> +	}

Does this code handle the case where vf_dev->dev_ops itself is NULL? The
dereference vf_dev->dev_ops->stats_get assumes dev_ops is valid.


---

commit 20260506020529.281654-7-longli at microsoft.com
Author: Long Li <longli at microsoft.com>
Subject: [PATCH v2 7/7] net/netvsc: handle VF recovery events for service reset

The patch registers callbacks for VF recovery events to handle MANA service
resets without full device removal.

> diff --git a/drivers/net/netvsc/hn_vf.c b/drivers/net/netvsc/hn_vf.c
> index 49e6a5b283..52e1bb7413 100644
> --- a/drivers/net/netvsc/hn_vf.c
> +++ b/drivers/net/netvsc/hn_vf.c

[ ... ]

> @@ -111,7 +118,56 @@ static int hn_vf_attach(struct rte_eth_dev *dev, struct hn_data


More information about the test-report mailing list