patch 'net/mlx5: fix port event cleaning order' has been queued to stable release 20.11.7

Michael Baum michaelba at nvidia.com
Sun Nov 20 08:28:42 CET 2022


On Fri, 2022-11-18 at 16:28, Luca Boccassi wrote:
> 
> On Fri, 2022-11-18 at 12:53 +0000, Michael Baum wrote:
> > Hi Luca,
> >
> > This patch causes another issue, so I have sent another patch to squash into it.
> >
> > The title of this patch is: "[PATCH 20.11] net/mlx5: fix invalid memory access
> > in port closing"
> >
> > Thanks,
> > Michael Baum
> 
> Is it a backport gone wrong (or an issue specific to 20.11), or is it an issue also in
> the same change that went in main? If the latter, what's the commit id of the
> fix?

The issue is also present in the same change that went into main.
I have sent the fix, but it isn't merged yet. It is available in patchwork: https://patchwork.dpdk.org/project/dpdk/patch/20221117152807.1259256-1-michaelba@nvidia.com/

> 
> > > -----Original Message-----
> > > From: luca.boccassi at gmail.com <luca.boccassi at gmail.com>
> > > Sent: Friday, 18 November 2022 1:09
> > > To: Michael Baum <michaelba at nvidia.com>
> > > Cc: Matan Azrad <matan at nvidia.com>; dpdk stable <stable at dpdk.org>
> > > Subject: patch 'net/mlx5: fix port event cleaning order' has been
> > > queued to stable release 20.11.7
> > >
> > > Hi,
> > >
> > > FYI, your patch has been queued to stable release 20.11.7
> > >
> > > Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
> > > It will be pushed if I get no objections before 11/19/22. So please
> > > shout if anyone has objections.
> > >
> > > Also note that after the patch there's a diff of the upstream commit
> > > vs the patch applied to the branch. This will indicate if there was
> > > any rebasing needed to apply to the stable branch. If there were
> > > code changes for rebasing
> > > (ie: not only metadata diffs), please double check that the rebase
> > > was correctly done.
> > >
> > > Queued patches are on a temporary branch at:
> > > https://github.com/kevintraynor/dpdk-stable
> > >
> > > This queued commit can be viewed at:
> > > https://github.com/kevintraynor/dpdk-stable/commit/79c37d65d2ff68ccd8dd2ad99340f54c80232918
> > >
> > > Thanks.
> > >
> > > Luca Boccassi
> > >
> > > ---
> > > From 79c37d65d2ff68ccd8dd2ad99340f54c80232918 Mon Sep 17 00:00:00 2001
> > > From: Michael Baum <michaelba at nvidia.com>
> > > Date: Thu, 10 Nov 2022 00:29:38 +0200
> > > Subject: [PATCH] net/mlx5: fix port event cleaning order
> > >
> > > [ upstream commit 13c5c093905c09bb6207ee1c6a4f05d39f8badcd ]
> > >
> > > The shared IB device (sh) has per-port data with a field for the interrupt
> > > handler port_id. It is used by the shared interrupt handler to find the
> > > corresponding rte_eth device by IB port index.
> > > If the value is equal to or greater than RTE_MAX_ETHPORTS, it means there is
> > > no subhandler installed for the specified IB port index.
> > >
> > > When a few ports are created under the same sh, the sh is created with
> > > the first port and the interrupt handler port_id is initialized to
> > > RTE_MAX_ETHPORTS for each port.
> > > In port creation, the interrupt handler port_id is updated with the correct
> > > value.
> > > After this update, the mlx5_dev_interrupt_nl_cb function uses this
> > > port and its priv structure.
> > > However, when the ports are closed, this field isn't updated and the
> > > interrupt handler continues working until it is uninstalled in SH destruction.
> > > If mlx5_dev_interrupt_nl_cb is called between port closing and SH
> > > destruction, it uses an invalid port, causing a crash.
> > >
> > > This patch adds interrupt handler port_id updating to the close
> > > function and adds a memory barrier to make sure it is done before priv reset.
> > >
> > > Fixes: 655c3c26c11e ("net/mlx5: fix initial link status detection")
> > >
> > > Signed-off-by: Michael Baum <michaelba at nvidia.com>
> > > Acked-by: Matan Azrad <matan at nvidia.com>
> > > ---
> > >  drivers/net/mlx5/linux/mlx5_os.c | 3 +++
> > >  drivers/net/mlx5/mlx5.c          | 6 ++++++
> > >  2 files changed, 9 insertions(+)
> > >
> > > diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
> > > index af19b54b7e..e79b1a275c 100644
> > > --- a/drivers/net/mlx5/linux/mlx5_os.c
> > > +++ b/drivers/net/mlx5/linux/mlx5_os.c
> > > @@ -1640,6 +1640,9 @@ err_secondary:
> > >         return eth_dev;
> > >  error:
> > >         if (priv) {
> > > +               priv->sh->port[priv->dev_port - 1].nl_ih_port_id =
> > > +                                                              RTE_MAX_ETHPORTS;
> > > +               rte_io_wmb();
> > >                 if (priv->mreg_cp_tbl)
> > >                         mlx5_hlist_destroy(priv->mreg_cp_tbl);
> > >                 if (priv->sh)
> > > diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> > > index 90985479de..22d3ecace2 100644
> > > --- a/drivers/net/mlx5/mlx5.c
> > > +++ b/drivers/net/mlx5/mlx5.c
> > > @@ -1453,6 +1453,12 @@ mlx5_dev_close(struct rte_eth_dev *dev)
> > >                 if (!c)
> > >                         claim_zero(rte_eth_switch_domain_free(priv->domain_id));
> > >         }
> > > +       priv->sh->port[priv->dev_port - 1].nl_ih_port_id = RTE_MAX_ETHPORTS;
> > > +       /*
> > > +        * The interrupt handler port id must be reset before priv is reset
> > > +        * since 'mlx5_dev_interrupt_nl_cb' uses priv.
> > > +        */
> > > +       rte_io_wmb();
> > >         memset(priv, 0, sizeof(*priv));
> > >         priv->domain_id = RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID;
> > >         /*
> > > --
> > > 2.34.1
> > >
> > > ---
> > >   Diff of the applied patch vs upstream commit (please double-check if non-empty:
> > > ---
> > > --- -   2022-11-17 23:07:56.193400526 +0000
> > > +++ 0016-net-mlx5-fix-port-event-cleaning-order.patch   2022-11-17 23:07:55.492330367 +0000
> > > @@ -1 +1 @@
> > > -From 13c5c093905c09bb6207ee1c6a4f05d39f8badcd Mon Sep 17 00:00:00 2001
> > > +From 79c37d65d2ff68ccd8dd2ad99340f54c80232918 Mon Sep 17 00:00:00 2001
> > > @@ -5,0 +6,2 @@
> > > +[ upstream commit 13c5c093905c09bb6207ee1c6a4f05d39f8badcd ]
> > > +
> > > @@ -28 +29,0 @@
> > > -Cc: stable at dpdk.org
> > > @@ -38 +39 @@
> > > -index 2b6741396d..a71474c90a 100644
> > > +index af19b54b7e..e79b1a275c 100644
> > > @@ -41 +42 @@
> > > -@@ -1676,6 +1676,9 @@ err_secondary:
> > > +@@ -1640,6 +1640,9 @@ err_secondary:
> > > @@ -48,3 +49,3 @@
> > > - #ifdef HAVE_MLX5_HWS_SUPPORT
> > > -               if (eth_dev &&
> > > -                   priv->sh &&
> > > +               if (priv->mreg_cp_tbl)
> > > +                       mlx5_hlist_destroy(priv->mreg_cp_tbl);
> > > +               if (priv->sh)
> > > @@ -52 +53 @@
> > > -index 1cf6df6049..95b0151fbc 100644
> > > +index 90985479de..22d3ecace2 100644
> > > @@ -55 +56 @@
> > > -@@ -2137,6 +2137,12 @@ mlx5_dev_close(struct rte_eth_dev *dev)
> > > +@@ -1453,6 +1453,12 @@ mlx5_dev_close(struct rte_eth_dev *dev)
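
For readers following the commit message quoted above, the ordering it describes can be sketched outside the driver. This is only an illustration under stated assumptions, not the mlx5 code: MAX_ETHPORTS, struct port_priv, struct shared_ctx, on_port_event(), port_open() and port_close() are all hypothetical names, and a C11 release fence stands in for rte_io_wmb(). It shows only the "invalidate the port id, then barrier, then reset priv" order, and is not a complete synchronization scheme for a handler that is already running.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define MAX_ETHPORTS 32u        /* hypothetical stand-in for RTE_MAX_ETHPORTS */
#define MAX_IB_PORTS 2u         /* hypothetical number of IB ports per shared ctx */

struct port_priv {              /* hypothetical stand-in for the per-port priv */
	uint32_t link_up;
};

struct shared_ctx {             /* hypothetical stand-in for the shared device (sh) */
	/* A value >= MAX_ETHPORTS means "no subhandler installed". */
	_Atomic uint32_t ih_port_id[MAX_IB_PORTS];
};

static struct port_priv privs[MAX_ETHPORTS];

/* Shared context creation: every IB port starts without a subhandler. */
static void shared_ctx_init(struct shared_ctx *sh)
{
	for (unsigned int i = 0; i < MAX_IB_PORTS; i++)
		atomic_store_explicit(&sh->ih_port_id[i], MAX_ETHPORTS,
				      memory_order_relaxed);
}

/* Port creation: publish the ethdev port id for this IB port (1-based). */
static void port_open(struct shared_ctx *sh, unsigned int ib_port, uint32_t id)
{
	atomic_store_explicit(&sh->ih_port_id[ib_port - 1], id,
			      memory_order_release);
}

/* Event callback: returns 1 if the event reached a port, 0 if it was skipped. */
static int on_port_event(struct shared_ctx *sh, unsigned int ib_port)
{
	uint32_t id = atomic_load_explicit(&sh->ih_port_id[ib_port - 1],
					   memory_order_acquire);

	if (id >= MAX_ETHPORTS)
		return 0;       /* no subhandler: do not touch any priv */
	privs[id].link_up = 1;
	return 1;
}

/* Port close: invalidate the port id first, then reset priv. */
static void port_close(struct shared_ctx *sh, unsigned int ib_port, uint32_t id)
{
	atomic_store_explicit(&sh->ih_port_id[ib_port - 1], MAX_ETHPORTS,
			      memory_order_relaxed);
	/* Order the store above before the priv reset below. */
	atomic_thread_fence(memory_order_release);
	memset(&privs[id], 0, sizeof(privs[id]));
}
```

In this sketch, an event arriving after port_close() sees a port id of MAX_ETHPORTS and is skipped, so the zeroed priv is not touched.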
