Net/igc's rte_eth_rx_burst() never returns packets

Ivan Malov ivan.malov at arknetworks.am
Fri Aug 2 17:14:14 CEST 2024


Hi Fabio,

How about running
find /sys/kernel/iommu_groups/ -type l
to identify devices that are in the same IOMMU group as 0000:09:00.0 ?

Thank you.

On Fri, 2 Aug 2024, Fabio Fernandes wrote:

> Hi Ivan,
>
> I'm using igb_uio because it's the one recommended for my target network card net/ena.
>
> I've tried both vfio-pci and uio_pci_generic, but they fail for different reasons.
>
> With vfio-pci, EAL tells me:
>
> ```
> EAL: PCI device 0000:09:00.0 on NUMA socket -1
> EAL:   probe driver: 8086:15f3 net_igc
> EAL: 0000:09:00.0 VFIO group is not viable! Not all devices in IOMMU group bound to VFIO or unbound
> EAL: Requested device 0000:09:00.0 cannot be used
> ```
>
> I tried adding the kernel boot parameter `iommu=on`, with no luck.
> I also tried unbinding my other cards:
>
> ```
> Network devices using DPDK-compatible driver
> ============================================
> 0000:09:00.0 'Ethernet Controller I225-V 15f3' drv=vfio-pci unused=igc,uio_pci_generic
>
> Other Network devices
> =====================
> 0000:08:00.0 'MT7922 802.11ax PCI Express Wireless Network Adapter 0616' unused=mt7921e,vfio-pci,uio_pci_generic
> 0000:0a:00.0 'AQtion AQC113CS NBase-T/IEEE 802.3an Ethernet Controller [Antigua 10G] 94c0' unused=atlantic,vfio-pci,uio_pci_generic
> ```
>
> Resulting rte_eth_dev_count_total() == 0, so nothing starts.
>
>
> Finally, I also tried `uio_pci_generic`:
>
> ```
> Network devices using DPDK-compatible driver
> ============================================
> 0000:09:00.0 'Ethernet Controller I225-V 15f3' drv=uio_pci_generic unused=igc,vfio-pci
> ```
>
> This time DPDK accepts the device; however, I see the same dmesg error as before:
>
> ```
> [ 1449.570184] uio_pci_generic 0000:09:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0011 address=0x13397ff80 flags=0x0000]
> ```
>
> If you do have any further suggestions, please let me know.
> In any case, thank you for your feedback so far!
>
> Regards,
> Fabio
>
>
> Sent with Proton Mail secure email.
>
> On Thursday, August 1st, 2024 at 10:16 PM, Ivan Malov <ivan.malov at arknetworks.am> wrote:
>
>> Hi Fabio,
>>
>> With regard to endianness conversion, I'd rather expect that line
>> to be something like rte_le_to_cpu_32 as the source value is
>> declared __le32. But, as I noted before, this is likely a
>> don't care as your machine is probably little-endian, and
>> rte_cpu_to_le_32 thus might simply do nothing.
>>
>> Whereas your observation of the error in dmesg is indeed a
>> valuable clue. Since it comes from igb_uio, my question is:
>> why use igb_uio at all? People say it's an outdated driver.
>> Have you considered using vfio-pci or uio_pci_generic
>> instead? I suggest you try binding to vfio-pci and
>> re-check with unmodified PMD source first.
>>
>> Thank you.
>>
>> On Thu, 1 Aug 2024, Fabio Fernandes wrote:
>>
>>> Hi Ivan,
>>>
>>> Thank you for your response.
>>>
>>> I've run it with the flags you suggested and attached the produced log.
>>>
>>> { sudo ./dpdk-testpmd --log-level=pmd.net.igc,debug 2>&1; } > testpmd_with_debug_and_rx_print.log;
>>>
>>> testpmd_with_debug_and_rx_print.log.zip
>>>
>>> However, the driver never reaches point [1] (nor [2]), and this debug line never gets logged. I've placed breakpoints to confirm that the loop always exits just before [1], at this check:
>>> `if (!(staterr & IGC_RXD_STAT_DD)) break;`
>>>
>>> I've also instrumented testpmd.h as below, to confirm in the log file that RX is called many times and never returns anything but zeros:
>>> ```
>>> static inline uint16_t
>>> common_fwd_stream_receive(struct fwd_stream *fs, struct rte_mbuf **burst,
>>>                           unsigned int nb_pkts)
>>> {
>>>     uint16_t nb_rx;
>>>
>>>     nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue, burst, nb_pkts);
>>>
>>>     // Instrumentation Begin
>>>     {
>>>         static uint64_t g_call_count = 0;
>>>         static uint64_t g_rx_sum = 0;
>>>         g_rx_sum += nb_rx;
>>>         ++g_call_count;
>>>         if (nb_rx)
>>>             fprintf(stderr, "rte_eth_rx_burst: %u\n", nb_rx);
>>>         if ((g_call_count % 100000000UL) == 0)
>>>             fprintf(stderr, "g_rx_sum: %lu, g_call_count: %lu\n",
>>>                     g_rx_sum, g_call_count);
>>>     }
>>>     // Instrumentation End
>>>
>>>     if (record_burst_stats)
>>>         fs->rx_burst_stats.pkt_burst_spread[nb_rx]++;
>>>     fs->rx_packets += nb_rx;
>>>     return nb_rx;
>>> }
>>>
>>> ```
>>>
>>> With regard to [3], I've changed that to use rte_cpu_to_be_32() instead and rebuilt DPDK, but with the same results: the loop still always exits there.
>>>
>>> I did, however, notice something strange, and this is probably a clue:
>>>
>>> Every time I step over this line of `igc_rx_init()` in the debugger:
>>> https://github.com/DPDK/dpdk/blob/v24.03/drivers/net/igc/igc_txrx.c#L1204
>>>
>>> `IGC_WRITE_REG(hw, IGC_RDT(rxq->reg_idx), rxq->nb_rx_desc - 1);`
>>>
>>> I get this in the kernel log (`dmesg`), coming from the igb_uio kernel module I've bound to the device I'm testing:
>>>
>>> `[26185.005945] igb_uio 0000:09:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0011 address=0x116141a00 flags=0x0000]`
>>>
>>> The address matches this in the debugger:
>>>
>>> ```
>>> rxq->rx_ring_phys_addr
>>> $1 = 0x116141a00
>>>
>>> rxq->reg_idx
>>> 0
>>>
>>> rxq->nb_rx_desc
>>> 1024
>>> ```
>>>
>>> What do you think?
>>>
>>> For more info, I'm on this exact DPDK commit:
>>> commit eeb0605f118dae66e80faa44f7b3e88748032353 (HEAD -> v23.11, tag: v23.11
>>>
>>> Thanks,
>>> Fabio
>>>
>>> On Thursday, August 1st, 2024 at 3:24 PM, Ivan Malov ivan.malov at arknetworks.am wrote:
>>>
>>>> Hi Fabio,
>>>>
>>>> Have you tried to specify EAL option --log-level="pmd.net.igc,debug"
>>>> or --log-level='.*',8 when running the application? Perhaps doing
>>>> so can trigger printouts [1], [2]. See if you can't observe those.
>>>>
>>>> Perhaps consider posting a brief excerpt of your code where
>>>> rte_eth_rx_burst() is invoked and return value is verified.
>>>>
>>>> Also, albeit unrelated, it's rather peculiar that the code
>>>> does CPU-to-LE conversion [3] of descriptor status, but
>>>> the field itself is declared as __le32 already: [4].
>>>>
>>>> [1] https://github.com/DPDK/dpdk/blob/v24.03/drivers/net/igc/igc_txrx.c#L296
>>>> [2] https://github.com/DPDK/dpdk/blob/v24.03/drivers/net/igc/igc_txrx.c#L455
>>>> [3] https://github.com/DPDK/dpdk/blob/v24.03/drivers/net/igc/igc_txrx.c#L264
>>>> [4] https://github.com/DPDK/dpdk/blob/v24.03/drivers/net/igc/base/igc_base.h#L109
>>>>
>>>> Thank you.
>>>>
>>>> On Thu, 1 Aug 2024, Fabio Fernandes wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I have an issue with rte_eth_rx_burst() for IGC poll mode driver never returning any packets and need some advice.
>>>>> I have this network port:
>>>>> 09:00.0 Ethernet controller: Intel Corporation Ethernet Controller I225-V (rev 03)
>>>>>
>>>>> Bound to igb_uio:
>>>>> Network devices using DPDK-compatible driver
>>>>> ============================================
>>>>> 0000:09:00.0 'Ethernet Controller I225-V 15f3' drv=igb_uio unused=igc
>>>>>
>>>>> I'm testing this both with testpmd and my own app, which works fine with other drivers such as net/ena and net/i40e. I'm using single RX/TX queue pair with default configs
>>>>> with rte_eth_promiscuous_enable() and rte_eth_allmulticast_enable().
>>>>>
>>>>> rte_eth_dev_start() seems to succeed for the device, and rte_eth_stats_get() seems to be detecting inbound packets. Below is the output from testpmd:
>>>>>
>>>>> Press enter to exiteth_igc_interrupt_action(): Port 0: Link Up - speed 1000 Mbps - full-duplex
>>>>>
>>>>> Port 0: link state change event
>>>>> ^CTelling cores to stop...
>>>>> Waiting for lcores to finish...
>>>>>
>>>>> ---------------------- Forward statistics for port 0 ----------------------
>>>>> RX-packets: 129 RX-dropped: 800 RX-total: 929
>>>>> TX-packets: 0 TX-dropped: 0 TX-total: 0
>>>>> ----------------------------------------------------------------------------
>>>>>
>>>>> +++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++
>>>>> RX-packets: 129 RX-dropped: 800 RX-total: 929
>>>>> TX-packets: 0 TX-dropped: 0 TX-total: 0
>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>
>>>>> Done.
>>>>>
>>>>> However, rte_eth_rx_burst() never returns anything, neither in testpmd nor in my own app.
>>>>>
>>>>> In my own app, I log both rte_eth_stats_get() and non-zero xstats from rte_eth_xstats_get_by_id():
>>>>>
>>>>> 07:02:13.406873186 INF stats.rx : 0
>>>>> 07:02:13.406892616 INF dev_stats.ipackets : 78
>>>>> 07:02:13.406903636 INF dev_stats.opackets : 0
>>>>> 07:02:13.406914166 INF dev_stats.imissed : 0
>>>>> 07:02:13.406924536 INF dev_stats.ierrors : 0
>>>>> 07:02:13.406934116 INF dev_stats.oerrors : 0
>>>>> 07:02:13.406943956 INF dev_stats.rx_nombuf : 0
>>>>> 07:02:13.407247777 INF xstats rx_good_packets : 78
>>>>> 07:02:13.407257147 INF xstats rx_good_bytes : 17205
>>>>> 07:02:13.407265267 INF xstats rx_size_64_packets : 6
>>>>> 07:02:13.407274627 INF xstats rx_size_65_to_127_packets : 31
>>>>> 07:02:13.407285757 INF xstats rx_size_128_to_255_packets : 22
>>>>> 07:02:13.407297537 INF xstats rx_size_256_to_511_packets : 16
>>>>> 07:02:13.407309127 INF xstats rx_size_512_to_1023_packets : 3
>>>>> 07:02:13.407321327 INF xstats rx_broadcast_packets : 8
>>>>> 07:02:13.407331597 INF xstats rx_multicast_packets : 64
>>>>> 07:02:13.407346357 INF xstats rx_total_packets : 78
>>>>> 07:02:13.407355547 INF xstats rx_total_bytes : 17205
>>>>> 07:02:13.407364127 INF xstats rx_sent_to_host_packets : 78
>>>>> 07:02:13.407375347 INF xstats interrupt_assert_count : 1
>>>>>
>>>>> Still, rte_eth_rx_burst() never returns anything.
>>>>>
>>>>> It's worthwhile to note that rte_eth_rx_burst() works fine when I, instead of net/igc, use net/ena (with ENA card) or net/i40e (Intel x710 card).
>>>>>
>>>>> The debug log from EAL and net/igc is attached, in case that helps.
>>>>> There's a warning "igc_rx_init(): forcing scatter mode"; I've already tried changing my mbuf sizes so that the warning goes away, but that didn't help either.
>>>>>
>>>>> Any advice?
>>>>>
>>>>> Thanks,
>>>>> Fabio
>

