patch 'ethdev: fix device init without socket-local memory' has been queued to stable release 23.11.2

Xueming Li xuemingl at nvidia.com
Mon Aug 12 14:50:11 CEST 2024


Hi,

FYI, your patch has been queued to stable release 23.11.2

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 08/14/24. So please
shout if anyone has objections.

Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. This will indicate if there was any rebasing
needed to apply to the stable branch. If there were code changes for rebasing
(i.e. not only metadata diffs), please double-check that the rebase was
correctly done.

Queued patches are on a temporary branch at:
https://git.dpdk.org/dpdk-stable/log/?h=23.11-staging

This queued commit can be viewed at:
https://git.dpdk.org/dpdk-stable/commit/?h=23.11-staging&id=7ff9eeeb710f2d1b441d7db1efca0809c20bebae

Thanks.

Xueming Li <xuemingl at nvidia.com>

---
From 7ff9eeeb710f2d1b441d7db1efca0809c20bebae Mon Sep 17 00:00:00 2001
From: Bruce Richardson <bruce.richardson at intel.com>
Date: Mon, 22 Jul 2024 11:02:28 +0100
Subject: [PATCH] ethdev: fix device init without socket-local memory
Cc: Xueming Li <xuemingl at nvidia.com>

[ upstream commit ed34d87d9cfbae8b908159f60df2008e45e4c39f ]

When allocating memory for an ethdev, the rte_malloc_socket call used
only allocates memory on the NUMA node/socket local to the device. This
means that even if the user wanted to, they could never use a remote NIC
without also having memory on that NIC's socket.

For example, if we change examples/skeleton/basicfwd.c to have
SOCKET_ID_ANY as the socket_id parameter for Rx and Tx rings, we should
be able to run the app cross-numa e.g. as below, where the two PCI
devices are on socket 1, and core 1 is on socket 0:

 ./build/examples/dpdk-skeleton -l 1 --legacy-mem --socket-mem=1024,0 \
		-a a8:00.0 -a b8:00.0

This fails however, with the error:

  ETHDEV: failed to allocate private data
  PCI_BUS: Requested device 0000:a8:00.0 cannot be used

We can remove this restriction by doing a fallback call to general
rte_malloc after a call to rte_malloc_socket fails. This should be safe
to do because the later ethdev calls to setup Rx/Tx queues all take a
socket_id parameter, which can be used by applications to enforce the
requirement for local-only memory for a device, if so desired. [If
device-local memory is present it will be used as before, while if not
present the rte_eth_dev_configure call will now pass, but the subsequent
queue setup calls requesting local memory will fail].

Fixes: e489007a411c ("ethdev: add generic create/destroy ethdev APIs")
Fixes: dcd5c8112bc3 ("ethdev: add PCI driver helpers")

Signed-off-by: Bruce Richardson <bruce.richardson at intel.com>
Signed-off-by: Padraig Connolly <padraig.j.connolly at intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit at amd.com>
---
 lib/ethdev/ethdev_driver.c | 20 +++++++++++++++-----
 lib/ethdev/ethdev_pci.h    | 20 +++++++++++++++++---
 2 files changed, 32 insertions(+), 8 deletions(-)

diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
index 55a9dcc565..47659d5e8c 100644
--- a/lib/ethdev/ethdev_driver.c
+++ b/lib/ethdev/ethdev_driver.c
@@ -297,15 +297,25 @@ rte_eth_dev_create(struct rte_device *device, const char *name,
 			return -ENODEV;
 
 		if (priv_data_size) {
+			/* try alloc private data on device-local node. */
 			ethdev->data->dev_private = rte_zmalloc_socket(
 				name, priv_data_size, RTE_CACHE_LINE_SIZE,
 				device->numa_node);
 
-			if (!ethdev->data->dev_private) {
-				RTE_ETHDEV_LOG(ERR,
-					"failed to allocate private data\n");
-				retval = -ENOMEM;
-				goto probe_failed;
+			/* fall back to alloc on any socket on failure */
+			if (ethdev->data->dev_private == NULL) {
+				ethdev->data->dev_private = rte_zmalloc(name,
+						priv_data_size, RTE_CACHE_LINE_SIZE);
+
+				if (ethdev->data->dev_private == NULL) {
+					RTE_ETHDEV_LOG(ERR, "failed to allocate private data\n");
+					retval = -ENOMEM;
+					goto probe_failed;
+				}
+				/* got memory, but not local, so issue warning */
+				RTE_ETHDEV_LOG(WARNING,
+					       "Private data for ethdev '%s' not allocated on local NUMA node %d\n",
+					       device->name, device->numa_node);
 			}
 		}
 	} else {
diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
index ddb559aa95..c40bc2ed02 100644
--- a/lib/ethdev/ethdev_pci.h
+++ b/lib/ethdev/ethdev_pci.h
@@ -93,12 +93,26 @@ rte_eth_dev_pci_allocate(struct rte_pci_device *dev, size_t private_data_size)
 			return NULL;
 
 		if (private_data_size) {
+			/* Try and alloc the private-data structure on socket local to the device */
 			eth_dev->data->dev_private = rte_zmalloc_socket(name,
 				private_data_size, RTE_CACHE_LINE_SIZE,
 				dev->device.numa_node);
-			if (!eth_dev->data->dev_private) {
-				rte_eth_dev_release_port(eth_dev);
-				return NULL;
+
+			/* if cannot allocate memory on the socket local to the device
+			 * use rte_malloc to allocate memory on some other socket, if available.
+			 */
+			if (eth_dev->data->dev_private == NULL) {
+				eth_dev->data->dev_private = rte_zmalloc(name,
+						private_data_size, RTE_CACHE_LINE_SIZE);
+
+				if (eth_dev->data->dev_private == NULL) {
+					rte_eth_dev_release_port(eth_dev);
+					return NULL;
+				}
+				/* got memory, but not local, so issue warning */
+				RTE_ETHDEV_LOG(WARNING,
+					       "Private data for ethdev '%s' not allocated on local NUMA node %d\n",
+					       dev->device.name, dev->device.numa_node);
 			}
 		}
 	} else {
-- 
2.34.1
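
For readers who want to see the local-first allocation with fallback in
isolation, here is a minimal standalone sketch. It does not use DPDK:
sim_zmalloc_node(), sim_zmalloc_any(), alloc_private_data() and the
node_free_bytes table are hypothetical stand-ins invented for illustration,
loosely playing the roles of rte_zmalloc_socket(), rte_zmalloc() and the
per-socket memory given by --socket-mem=1024,0 in the commit message. It only
illustrates the control flow of the patch, under those assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_NODES 2

/* Hypothetical per-node budget in bytes: node 0 has memory, node 1 has none
 * (an assumption chosen to resemble --socket-mem=1024,0).
 */
static size_t node_free_bytes[NUM_NODES] = { 1024, 0 };

/* Stand-in for a node-restricted zeroed allocation (not a DPDK function). */
static void *
sim_zmalloc_node(size_t size, int node)
{
	if (node < 0 || node >= NUM_NODES || node_free_bytes[node] < size)
		return NULL;
	node_free_bytes[node] -= size;
	return calloc(1, size);
}

/* Stand-in for an "any node" zeroed allocation (not a DPDK function). */
static void *
sim_zmalloc_any(size_t size)
{
	for (int node = 0; node < NUM_NODES; node++) {
		void *p = sim_zmalloc_node(size, node);

		if (p != NULL)
			return p;
	}
	return NULL;
}

/* Try the device-local node first, then fall back to any node.
 * *was_local is set to 1 if the local attempt succeeded, 0 otherwise.
 */
static void *
alloc_private_data(size_t size, int dev_node, int *was_local)
{
	void *p;

	assert(size > 0);

	p = sim_zmalloc_node(size, dev_node);
	*was_local = (p != NULL);

	if (p == NULL) {
		p = sim_zmalloc_any(size);
		/* got memory, but not local, so issue a warning */
		if (p != NULL)
			fprintf(stderr,
				"private data not allocated on local NUMA node %d\n",
				dev_node);
	}
	return p;
}
```

With this table, a request on behalf of a device on node 1 succeeds through
the fallback path and prints the warning, while a request larger than the
memory left on every node still returns NULL, which corresponds to the
-ENOMEM / NULL failure paths kept by the patch.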

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty):
---
--- -	2024-08-12 20:44:06.766476861 +0800
+++ 0134-ethdev-fix-device-init-without-socket-local-memory.patch	2024-08-12 20:44:02.545069383 +0800
@@ -1 +1 @@
-From ed34d87d9cfbae8b908159f60df2008e45e4c39f Mon Sep 17 00:00:00 2001
+From 7ff9eeeb710f2d1b441d7db1efca0809c20bebae Mon Sep 17 00:00:00 2001
@@ -4,0 +5,3 @@
+Cc: Xueming Li <xuemingl at nvidia.com>
+
+[ upstream commit ed34d87d9cfbae8b908159f60df2008e45e4c39f ]
@@ -35 +37,0 @@
-Cc: stable at dpdk.org
@@ -46 +48 @@
-index f48c0eb8bc..c335a25a82 100644
+index 55a9dcc565..47659d5e8c 100644
@@ -49 +51 @@
-@@ -303,15 +303,25 @@ rte_eth_dev_create(struct rte_device *device, const char *name,
+@@ -297,15 +297,25 @@ rte_eth_dev_create(struct rte_device *device, const char *name,
@@ -59,2 +61,2 @@
--				RTE_ETHDEV_LOG_LINE(ERR,
--					"failed to allocate private data");
+-				RTE_ETHDEV_LOG(ERR,
+-					"failed to allocate private data\n");
@@ -69 +71 @@
-+					RTE_ETHDEV_LOG_LINE(ERR, "failed to allocate private data");
++					RTE_ETHDEV_LOG(ERR, "failed to allocate private data\n");
@@ -74,3 +76,3 @@
-+				RTE_ETHDEV_LOG_LINE(WARNING,
-+						"Private data for ethdev '%s' not allocated on local NUMA node %d",
-+						device->name, device->numa_node);
++				RTE_ETHDEV_LOG(WARNING,
++					       "Private data for ethdev '%s' not allocated on local NUMA node %d\n",
++					       device->name, device->numa_node);
@@ -81 +83 @@
-index 737fff1833..ec4f731270 100644
+index ddb559aa95..c40bc2ed02 100644
@@ -108,3 +110,3 @@
-+				RTE_ETHDEV_LOG_LINE(WARNING,
-+						"Private data for ethdev '%s' not allocated on local NUMA node %d",
-+						dev->device.name, dev->device.numa_node);
++				RTE_ETHDEV_LOG(WARNING,
++					       "Private data for ethdev '%s' not allocated on local NUMA node %d\n",
++					       dev->device.name, dev->device.numa_node);

