[PATCH] common/mlx5: skip ROCE disable for auxiliary SF devices
Max Tottenham
mtottenh at akamai.com
Mon Feb 9 14:01:51 CET 2026
On Thu Feb 5, 2026 at 2:59 PM GMT, Dariusz Sosnowski wrote:
> Thank you for reporting the issue and the patch.
> Please see comments below.
Responses inline.
>
> On Sat, Jan 10, 2026 at 11:15:10PM +0000, Max Tottenham wrote:
> > When probing an SF as a vDPA device, mlx5_roce_disable() targets the
> > parent PF address (via mlx5_dev_to_pci_str). This incorrectly attempts
> > to disable ROCE on the parent PF rather than the SF itself.
> >
> > This causes vDPA probe failures when the parent PF already has an open
> > IB context (e.g., probed for uplink ports or SF representors).
> >
> > For SubFunctions, ROCE is configured via devlink parameters
> > (enable_roce) before device creation. Skip the runtime ROCE disable
> > for auxiliary devices since the devlink configuration is already in
> > effect and targeting the parent PF is incorrect.
> >
> > Signed-off-by: Max Tottenham <mtottenh at akamai.com>
> > ---
> > drivers/common/mlx5/linux/mlx5_common_os.c | 13 +++++++++++++
> > 1 file changed, 13 insertions(+)
> >
> > diff --git a/drivers/common/mlx5/linux/mlx5_common_os.c b/drivers/common/mlx5/linux/mlx5_common_os.c
> > index 2867e21618..6fa12e06eb 100644
> > --- a/drivers/common/mlx5/linux/mlx5_common_os.c
> > +++ b/drivers/common/mlx5/linux/mlx5_common_os.c
> > @@ -690,6 +690,19 @@ mlx5_roce_disable(const struct rte_device *dev)
> >  {
> >  	char pci_addr[PCI_PRI_STR_SIZE] = { 0 };
> >
> > +	/*
> > +	 * For auxiliary devices (SFs), ROCE is configured via devlink
> > +	 * parameters (enable_roce) before device creation. Skip runtime
> > +	 * ROCE disable since mlx5_dev_to_pci_str() returns the parent PF
> > +	 * address, not the SF - disabling ROCE on the parent PF is both
> > +	 * incorrect and may fail if the PF already has an active IB context.
> > +	 */
> > +	if (!mlx5_dev_is_pci(dev)) {
> > +		DRV_LOG(INFO, "Skipping ROCE disable for auxiliary device \"%s\"",
> > +			dev->name);
> > +		return 0;
> > +	}
>
> The logic has a bug as you mentioned, but I don't think
> it would be a good idea to not disable ROCE automatically for SFs.
> Especially since, IIUC, enable_roce option value is not inherited
> from PF when new SF is created and probed.
> It'll move more responsibility to the user regarding
> port configuration.
OK. I can see arguments both ways here: my recollection is that for SFs
you are already required to disable ROCE yourself when not using the
hotplug API. But attempting a disable here is a small enough change.
>
> In my opinion mlx5_roce_disable() should have split logic like so:
>
>     if mlx5_dev_is_pci(dev)
>         // for PCI devices continue as usual:
>         // try disabling ROCE through netlink or sysfs
>     else
>         // for SFs: try disabling ROCE through netlink
>
> This would require some adjustments in mlx5_nl_roce_disable()
> and related code.
> Specifically, devlink attributes should be adjusted when disabling ROCE
> for auxiliary devices:
>
> - DEVLINK_ATTR_BUS_NAME = "auxiliary"
> - DEVLINK_ATTR_DEV_NAME = device name from rte_device->name
>
> Would you be able to make the necessary changes?
>
Sure. I also found a few other SF hotplug-related bugs, which I'll send
along with the v2.
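
For reference, a minimal sketch of how the devlink handle could be chosen
for the split suggested above. This is not DPDK code: the sketch_* names
are hypothetical stand-ins, and it assumes the PCI path keeps sending bus
name "pci" with the PCI address while the auxiliary path sends bus name
"auxiliary" with the name taken from rte_device->name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the parts of struct rte_device used here. */
struct sketch_device {
	const char *name; /* e.g. "mlx5_core.sf.4" for an SF */
	bool is_pci;      /* stand-in for mlx5_dev_is_pci(dev) */
};

/* The pair sent as DEVLINK_ATTR_BUS_NAME / DEVLINK_ATTR_DEV_NAME. */
struct sketch_devlink_handle {
	char bus_name[16];
	char dev_name[64];
};

/*
 * Fill the devlink handle for a ROCE disable request.
 * pci_addr is only consulted for PCI devices; for auxiliary devices the
 * device's own name is used, so the request never targets the parent PF.
 * Returns 0 on success, -1 on a missing or oversized name.
 */
static int
sketch_devlink_handle_fill(const struct sketch_device *dev,
			   const char *pci_addr,
			   struct sketch_devlink_handle *h)
{
	const char *bus = dev->is_pci ? "pci" : "auxiliary";
	const char *name = dev->is_pci ? pci_addr : dev->name;
	int n;

	if (name == NULL)
		return -1;
	snprintf(h->bus_name, sizeof(h->bus_name), "%s", bus);
	n = snprintf(h->dev_name, sizeof(h->dev_name), "%s", name);
	if (n < 0 || (size_t)n >= sizeof(h->dev_name))
		return -1;
	return 0;
}
```

With this shape the netlink helper would take the filled handle instead of
a bare PCI address, and the sysfs fallback would stay on the PCI branch
only.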
> > +
> >  	if (mlx5_dev_to_pci_str(dev, pci_addr, sizeof(pci_addr)) < 0)
> >  		return -rte_errno;
> >  	/* Firstly try to disable ROCE by Netlink and fallback to sysfs. */
> > --
> > 2.51.2
> >
>
> Best regards,
> Dariusz Sosnowski