[dpdk-dev] rte_eth_dev_socket_id() vs KVM/AWS/...
Burakov, Anatoly
anatoly.burakov at intel.com
Mon May 14 10:09:36 CEST 2018
On 09-May-18 6:08 PM, Mike Stolarchuk wrote:
> Hello Dpdk,
>
> rte_eth_dev_socket_id() describes a -1 return value as:
>
> *Returns*
>
> The NUMA socket id to which the Ethernet device is connected or a default
> of zero if the socket could not be determined. -1 is returned is the
> port_id value is out of range.
>
> But, rte_eth_dev_socket_id() is implemented as:
>
> int
> rte_eth_dev_socket_id(uint16_t port_id)
> {
> 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -1);
> 	return rte_eth_devices[port_id].data->numa_node;
> }
>
> And numa_node here is set from /sys/bus/pci/devices/<device>/numa_node.
> And https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-bus-pci
> documents numa_node as:
>
> What: /sys/bus/pci/devices/.../numa_node
> Date: Oct 2014
> Contact: Prarit Bhargava <prarit at redhat.com>
> Description:
> This file contains the NUMA node to which the PCI device is
> attached, or -1 if the node is unknown. The initial value
> comes from an ACPI _PXM method or a similar firmware
> source. If that is missing or incorrect, this file can be
> written to override the node. In that case, please report
> a firmware bug to the system vendor. Writing to this file
> taints the kernel with TAINT_FIRMWARE_WORKAROUND, which
> reduces the supportability of your system.
>
> In other words, a value of -1 for numa_node means the socket the PCI
> device is attached to is unknown.
> For example, in a KVM guest with e1000 devices,
> /sys/bus/pci/devices/<d>/numa_node can return -1.
>
> This means that rte_eth_dev_socket_id() returns -1 in situations other than
> 'port_id value is out of range'.
> And it's not possible to tell whether the port_id is invalid or whether
> the base system didn't announce the numa_node association.
>
> Perhaps a -1 return value should be an indication that the numa_node
> association isn't known, and a different return value, say -2, should
> indicate that the port_id value is out of range.
>
>
> mts.
>
For cases like these, we have rte_errno - we could set it to EINVAL in
case of an invalid port_id, and e.g. ENODEV (?) when the NUMA node is
unknown.
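
A minimal sketch of that idea, written as standalone C so it compiles
without DPDK. Everything prefixed sketch_ is hypothetical: sketch_errno
stands in for rte_errno, and the port table stands in for
rte_eth_devices[]. ENODEV for the unknown-node case is only the tentative
choice floated above, not existing DPDK behaviour.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for RTE_MAX_ETHPORTS. */
#define SKETCH_MAX_PORTS 4

/* Hypothetical stand-in for rte_errno (which is per-lcore in DPDK). */
static int sketch_errno;

/* Hypothetical stand-in for the per-port device data. */
struct sketch_port {
	int attached;  /* non-zero if the port_id refers to a real port */
	int numa_node; /* value read from sysfs; -1 means "unknown" */
};

static struct sketch_port sketch_ports[SKETCH_MAX_PORTS] = {
	{ 1,  0 }, /* port 0: attached, NUMA node 0 */
	{ 1, -1 }, /* port 1: attached, node unknown (e.g. a KVM guest) */
	/* ports 2 and 3: not attached (zero-initialised) */
};

/*
 * Still returns -1 in both failure cases, so existing callers keep
 * working, but sets sketch_errno so the two cases can be told apart.
 */
static int
sketch_eth_dev_socket_id(uint16_t port_id)
{
	sketch_errno = 0;

	if (port_id >= SKETCH_MAX_PORTS || !sketch_ports[port_id].attached) {
		sketch_errno = EINVAL; /* invalid port_id */
		return -1;
	}
	if (sketch_ports[port_id].numa_node < 0) {
		sketch_errno = ENODEV; /* valid port, NUMA node not reported */
		return -1;
	}
	return sketch_ports[port_id].numa_node;
}
```

A caller would check for -1 first and only then look at the errno value:
EINVAL means the port_id itself was bad, while ENODEV means the port is
fine and the application can fall back to a default socket of its choice.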
--
Thanks,
Anatoly