[PATCH] Skip vfio in the scenario of non-privileged mode
Yang Ming
ming.1.yang at nokia-sbell.com
Wed Jan 22 09:15:03 CET 2025
On 2025/1/18 00:47, Stephen Hemminger wrote:
>
> On Fri, 17 Jan 2025 15:28:47 +0800
> Yang Ming <ming.1.yang at nokia-sbell.com> wrote:
>
>> DPDK detects the VFIO container based on the existence of the vfio
>> module. But for a container in non-privileged mode, VFIO_DIR
>> (/dev/vfio) may not be mapped from the host into the container,
>> for example when the host has both Intel and Mellanox NICs but
>> this container is only allocated VFs from the Mellanox NIC.
>> In this case, the vfio kernel module has already been loaded on
>> the host.
>> This scenario causes the following errors to be logged in the DPDK
>> primary process:
>> 'EAL: cannot open VFIO container, error 2 (No such file or
>> directory)'
>> 'EAL: VFIO support could not be initialized'
>> This happens because `rte_vfio_enable()` calls
>> `rte_vfio_get_container_fd()`, which executes
>> `vfio_container_fd = open(VFIO_CONTAINER_PATH, O_RDWR);`,
>> but VFIO_CONTAINER_PATH (/dev/vfio/vfio) doesn't exist
>> in this container.
>> This scenario also delays the DPDK secondary process: because
>> `default_vfio_cfg->vfio_enabled = 0` and
>> `default_vfio_cfg->vfio_container_fd = -1`, a socket error is
>> set in the DPDK primary process when it syncs this info to
>> the secondary process.
>> This patch skips this useless detection in this
>> scenario.
>>
>> Signed-off-by: Yang Ming <ming.1.yang at nokia-sbell.com>
>> ---
>> lib/eal/linux/eal_vfio.c | 11 +++++++++++
>> 1 file changed, 11 insertions(+)
>>
>> diff --git a/lib/eal/linux/eal_vfio.c b/lib/eal/linux/eal_vfio.c
>> index 7132e24cba..1679d29263 100644
>> --- a/lib/eal/linux/eal_vfio.c
>> +++ b/lib/eal/linux/eal_vfio.c
>> @@ -7,6 +7,7 @@
>> #include <fcntl.h>
>> #include <unistd.h>
>> #include <sys/ioctl.h>
>> +#include <dirent.h>
>>
>> #include <rte_errno.h>
>> #include <rte_log.h>
>> @@ -1083,6 +1084,7 @@ rte_vfio_enable(const char *modname)
>> /* initialize group list */
>> int i, j;
>> int vfio_available;
>> + DIR *dir;
>> const struct internal_config *internal_conf =
>> eal_get_internal_configuration();
>>
>> @@ -1119,6 +1121,15 @@ rte_vfio_enable(const char *modname)
>> return 0;
>> }
>>
>> + /* return 0 if VFIO directory not exist for container with non-privileged mode */
>> + dir = opendir(VFIO_DIR);
>> + if (dir == NULL) {
>> + EAL_LOG(DEBUG,
>> + "VFIO directory not exist, skipping VFIO support...");
>> + return 0;
>> + }
>> + closedir(dir);
> You need to test the non-container cases.
> If vfio is loaded /dev/vfio is a character device (not a directory)
>
> Also looks suspicious that VFIO_DIR is defined but never used currently.
>
Hi Stephen,
In the non-container case, the character device is /dev/vfio/vfio, not
/dev/vfio; /dev/vfio itself is still a directory.
Here is the command output from my test environment with an Intel NIC.
[root at computer-1 testuser]# ls -l /dev/vfio
total 0
crw-rw-rw-. 1 root root 10, 196 Jan 22 01:50 vfio
[root at computer-1 testuser]# dpdk-devbind.py -b vfio-pci 0000:04:10.2
[root at computer-1 testuser]# ls -l /dev/vfio
total 0
crw-------. 1 root root 239, 0 Jan 22 01:52 59
crw-rw-rw-. 1 root root 10, 196 Jan 22 01:50 vfio
[root at computer-1 testuser]# dpdk-devbind.py -b ixgbevf 0000:04:10.2
[root at computer-1 testuser]# ls -l /dev/vfio
total 0
crw-rw-rw-. 1 root root 10, 196 Jan 22 01:50 vfio
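
The same distinction shown in the listings above can also be checked
programmatically. Below is a minimal sketch, not part of the patch:
`classify_path()` and `enum path_kind` are hypothetical names used only
for illustration, and the code relies only on POSIX `stat()`.

```c
#include <errno.h>
#include <sys/stat.h>

/* Hypothetical helper for illustration only; this is not a DPDK API. */
enum path_kind {
	PATH_MISSING,	/* stat() failed with ENOENT */
	PATH_DIR,	/* path is a directory */
	PATH_CHARDEV,	/* path is a character device */
	PATH_OTHER	/* any other file type, or another stat() error */
};

static enum path_kind
classify_path(const char *path)
{
	struct stat st;

	/* stat() follows symlinks, so the target's type is reported. */
	if (stat(path, &st) != 0)
		return errno == ENOENT ? PATH_MISSING : PATH_OTHER;
	if (S_ISDIR(st.st_mode))
		return PATH_DIR;
	if (S_ISCHR(st.st_mode))
		return PATH_CHARDEV;
	return PATH_OTHER;
}
```

On a host like the one above, I would expect classify_path("/dev/vfio")
to return PATH_DIR and classify_path("/dev/vfio/vfio") to return
PATH_CHARDEV. In a non-privileged container without the /dev/vfio
mapping, the first call would return PATH_MISSING.
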
Can you confirm your test scenario?