mlx5: imissed / out_of_buffer counter always 0
Daniel Östman
daniel.ostman at ericsson.com
Wed Nov 8 13:55:35 CET 2023
Hi,
Any input from Nvidia on this? Matan perhaps?
The question here is whether the SYS_RAWIO capability is expected to be required just to read the out-of-buffer counter.
If so, are there any plans to change that?
Best regards,
Daniel
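
For anyone who wants to confirm whether a containerized process actually holds SYS_RAWIO before chasing the counter problem, here is a minimal sketch. It assumes a Linux /proc filesystem and relies on CAP_SYS_RAWIO being bit 17 in <linux/capability.h>; the function names are made up for illustration and are not part of DPDK or the mlx5 PMD.

```python
CAP_SYS_RAWIO = 17  # bit index of CAP_SYS_RAWIO in <linux/capability.h>


def parse_cap_eff(status_text):
    """Return the effective capability mask found in /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The value is a hexadecimal bitmask without a 0x prefix.
            return int(line.split(":", 1)[1].strip(), 16)
    raise ValueError("no CapEff line found")


def has_sys_rawio(cap_eff):
    """True if the CAP_SYS_RAWIO bit is set in the given capability mask."""
    return bool(cap_eff & (1 << CAP_SYS_RAWIO))


def current_process_has_sys_rawio(status_path="/proc/self/status"):
    """Check the calling process; status_path is a parameter only to ease testing."""
    with open(status_path) as f:
        return has_sys_rawio(parse_cap_eff(f.read()))
```

Calling current_process_has_sys_rawio() inside the pod should return True only when the capability was added to the container or the container is privileged and running as UID 0.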
> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin at redhat.com>
> Sent: Wednesday, 4 October 2023 15:49
> To: Daniel Östman <daniel.ostman at ericsson.com>; Erez Ferber
> <erezferber at gmail.com>; Slava Ovsiienko <viacheslavo at nvidia.com>
> Cc: users at dpdk.org; Matan Azrad <matan at nvidia.com>;
> david.marchand at redhat.com
> Subject: Re: mlx5: imissed / out_of_buffer counter always 0
>
> Hi Daniel, Erez & Slava,
>
> My turn to apologize: I missed this email when coming back from vacation.
>
> On 8/18/23 14:04, Daniel Östman wrote:
> > Hi Maxime,
> >
> > Sorry for the late reply, I've been on vacation.
> > Please see my answer below.
> >
> > / Daniel
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin <maxime.coquelin at redhat.com>
> >> Sent: Thursday, 22 June 2023 17:48
> >> To: Daniel Östman <daniel.ostman at ericsson.com>; Erez Ferber
> >> <erezferber at gmail.com>; Slava Ovsiienko <viacheslavo at nvidia.com>
> >> Cc: users at dpdk.org; Matan Azrad <matan at nvidia.com>;
> >> david.marchand at redhat.com
> >> Subject: Re: mlx5: imissed / out_of_buffer counter always 0
> >>
> >> Hi,
> >>
> >> On 6/21/23 22:22, Maxime Coquelin wrote:
> >>> Hi Daniel, all,
> >>>
> >>> On 6/5/23 16:00, Daniel Östman wrote:
> >>>> Hi Slava and Erez and thanks for your answers,
> >>>>
> >>>> Regarding the firmware, I’ve also deployed in a different OpenShift
> >>>> cluster where I see the exact same issue but with a different
> >>>> Mellanox NIC:
> >>>>
> >>>> Mellanox Technologies MT2892 Family - ConnectX-6 DX 2-port 100GbE
> >>>> QSFP56 PCIe Adapter
> >>>>
> >>>> driver: mlx5_core
> >>>>
> >>>> version: 5.0-0
> >>>> firmware-version: 22.36.1010 (DEL0000000027)
> >>>>
> >>>> From what I can see the firmware is relatively new on that one?
> >>>
> >>> With below configuration:
> >>> - ConnectX-6 Dx MT2892
> >>> - Kernel: 6.4.0-rc6
> >>> - FW version: 22.35.1012 (MT_0000000528)
> >>>
> >>> The out-of-buffer counter is fetched via
> >>> mlx5_devx_cmd_queue_counter_query():
> >>>
> >>> [pid 2942] ioctl(17, RDMA_VERBS_IOCTL, 0x7ffcb15bcd10) = 0
> >>> [pid 2942] write(1, "\n ######################## NIC "..., 80) = 80
> >>> [pid 2942] write(1, " RX-packets: 630997736 RX-miss"..., 70) = 70
> >>> [pid 2942] write(1, " RX-errors: 0\n", 15) = 15
> >>> [pid 2942] write(1, " RX-nombuf: 0 \n", 25) = 25
> >>> [pid 2942] write(1, " TX-packets: 0 TX-erro"..., 60) = 60
> >>> [pid 2942] write(1, "\n", 1) = 1
> >>> [pid 2942] write(1, " Throughput (since last show)\n", 31) = 31
> >>> [pid 2942] write(1, " Rx-pps: 0 "..., 106) = 106
> >>> [pid 2942] write(1, " ##############################"..., 79) = 79
> >>>
> >>> It looks like we may be missing some mlx5 kernel patches needed to
> >>> use mlx5_devx_cmd_queue_counter_query() with RHEL?
> >>>
> >>> Erez, Slava, any idea on the patches that could be missing?
> >>
> >> The above test was on bare metal as root; I get the same "working"
> >> behaviour on RHEL as root.
> >>
> >> We managed to reproduce Daniel's issue by running the same test within
> >> a container. With debug logs enabled, we get this warning:
> >>
> >> mlx5_common: DevX create q counter set failed errno=121 status=0x2
> >> syndrome=0x8975f1
> >> mlx5_net: Port 0 queue counter object cannot be created by DevX -
> >> fall-back to use the kernel driver global queue counter.
> >>
> >> Running the container as privileged solves the issue, and so does
> >> adding the SYS_RAWIO capability to the container.
> >>
> >> Erez, Slava, is it expected that SYS_RAWIO is required just to get a
> >> stat counter?
>
> Erez & Slava, would it be possible to get the stats counters via DevX without
> requiring SYS_RAWIO?
>
> >>
> >> Daniel, could you try adding SYS_RAWIO to your pod to confirm you
> >> face the same issue?
> >
> > Yes, I can confirm what you are seeing when running in a cluster with
> > OpenShift 4.12 (RHEL 8.6), either with SYS_RAWIO or running as privileged.
> > But with a privileged container I also need to run with UID 0 for it to
> > work. Is that what you are doing as well?
>
> I don't have an OCP setup at hand right now to test it, but IIRC yes we ran it
> with UID 0.
>
> > In both these cases the counter can be successfully retrieved through the
> > DevX interface.
>
> Ok.
>
> > However, when running in a cluster with OpenShift 4.10 (RHEL 8.4) I cannot
> > get it to work with either of these two approaches.
>
> I'm not sure this is kernel related, as I tested on both RHEL-8.4.0 and latest
> RHEL_8.4 and I can get the q counters via ioctl().
>
> Maxime
>
> >> Thanks in advance,
> >> Maxime
> >>> Regards,
> >>> Maxime
> >>>
> >>>>
> >>>> I tried setting dv_flow_en=0 (and saw that it was propagated to
> >>>> config->dv_flow_en) but it didn’t seem to help.
> >>>>
> >>>> Erez, I’m not sure what you mean by shared or non-shared mode in
> >>>> this case. However, it seems it could be related to the fact that
> >>>> the container is running in a separate network namespace, because
> >>>> the hw_counters directory is available on the host (cluster node)
> >>>> but not in the pod container.
> >>>>
> >>>> Best regards,
> >>>>
> >>>> Daniel
> >>>>
> >>>> *From:*Erez Ferber <erezferber at gmail.com>
> >>>> *Sent:* Monday, 5 June 2023 12:29
> >>>> *To:* Slava Ovsiienko <viacheslavo at nvidia.com>
> >>>> *Cc:* Daniel Östman <daniel.ostman at ericsson.com>; users at dpdk.org;
> >>>> Matan Azrad <matan at nvidia.com>; maxime.coquelin at redhat.com;
> >>>> david.marchand at redhat.com
> >>>> *Subject:* Re: mlx5: imissed / out_of_buffer counter always 0
> >>>>
> >>>> Hi Daniel,
> >>>>
> >>>> Is the container running in shared or non-shared mode?
> >>>>
> >>>> For shared mode, I assume the kernel sysfs counters which DPDK
> >>>> relies on for imissed/out_of_buffer are not exposed.
> >>>>
> >>>> Best regards,
> >>>>
> >>>> Erez
> >>>>
> >>>> On Fri, 2 Jun 2023 at 18:07, Slava Ovsiienko
> >>>> <viacheslavo at nvidia.com> wrote:
> >>>>
> >>>> Hi, Daniel
> >>>>
> >>>> I would recommend the following actions:
> >>>>
> >>>> - Update the firmware; 16.33.xxxx looks a little outdated.
> >>>> Please try 16.35.1012 or later.
> >>>> mlx5_glue->devx_obj_create might succeed with the newer FW.
> >>>>
> >>>> - Try specifying the dv_flow_en=0 devarg. It forces the mlx5 PMD to
> >>>> use the rdma_core library for queue management, so the kernel
> >>>> driver will be aware of the Rx queues being created and
> >>>> attach them to the kernel counter set.
> >>>>
> >>>> With best regards,
> >>>> Slava
> >>>>
> >>>> *From:* Daniel Östman <daniel.ostman at ericsson.com>
> >>>> *Sent:* Friday, June 2, 2023 3:59 PM
> >>>> *To:* users at dpdk.org
> >>>> *Cc:* Matan Azrad <matan at nvidia.com>;
> >>>> Slava Ovsiienko <viacheslavo at nvidia.com>; maxime.coquelin at redhat.com;
> >>>> david.marchand at redhat.com
> >>>> *Subject:* mlx5: imissed / out_of_buffer counter always 0
> >>>>
> >>>> Hi,
> >>>>
> >>>> I’m deploying a containerized DPDK application in an OpenShift
> >>>> Kubernetes environment using DPDK 21.11.3.
> >>>>
> >>>> The application uses a Mellanox ConnectX-5 100G NIC through VFs.
> >>>>
> >>>> The problem I have is that the ETH stats counter imissed (which
> >>>> seems to be mapped to “out_of_buffer” internally in the mlx5 PMD
> >>>> driver) is 0 when I don’t expect it to be, i.e. when the application
> >>>> doesn’t read the packets fast enough.
> >>>>
> >>>> Using GDB I can see that it tries to access the counter through
> >>>>
> >>>> /sys/class/infiniband/mlx5_99/ports/1/hw_counters/out_of_buffer
> >>>>
> >>>> but the hw_counters directory is missing, so it will just return a
> >>>> zero value. I don’t know why it is missing.
> >>>>
> >>>> When looking at mlx5_os_read_dev_stat() I can see that there is an
> >>>> alternative way of reading the counter, through
> >>>> mlx5_devx_cmd_queue_counter_query(), but only under the condition
> >>>> that priv->q_counters is set.
> >>>>
> >>>> It doesn’t get set in my case because mlx5_glue->devx_obj_create()
> >>>> fails (errno 22) in mlx5_devx_cmd_queue_counter_alloc().
> >>>>
> >>>> Have I missed something?
> >>>>
> >>>> NIC info:
> >>>>
> >>>> Mellanox Technologies MT27800 Family [ConnectX-5] - 100Gb 2-port
> >>>> QSFP28 MCX516A-CCHT
> >>>> driver: mlx5_core
> >>>> version: 5.0-0
> >>>> firmware-version: 16.33.1048 (MT_0000000417)
> >>>>
> >>>> Please let me know if I need to provide more information.
> >>>>
> >>>> Best regards,
> >>>>
> >>>> Daniel
> >>>>
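
The original report describes the counter being read from a sysfs file under hw_counters, with a silent zero when that directory is absent. The sketch below illustrates that behaviour from outside the PMD so the missing directory can be told apart from a genuine zero. It is not the mlx5 PMD's code: the function name, the (value, found) return shape and the sysfs_root parameter are assumptions made for illustration, and only the path layout is taken from the report.

```python
import os


def read_out_of_buffer(ibdev, port=1, sysfs_root="/sys/class/infiniband"):
    """Return (value, found) for the out_of_buffer hardware counter.

    found is False when the counter file cannot be opened, which is the
    situation that shows up as a counter stuck at 0.
    """
    path = os.path.join(sysfs_root, ibdev, "ports", str(port),
                        "hw_counters", "out_of_buffer")
    try:
        with open(path) as f:
            return int(f.read().strip()), True
    except (FileNotFoundError, NotADirectoryError, PermissionError):
        # Mirror the reported behaviour (value 0) but flag that it is not real.
        return 0, False
```

Running read_out_of_buffer("mlx5_99") on the host and again inside the pod would show whether the zero comes from the counter itself or from the hw_counters directory not being visible in the container.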