Hairpin Queues Throughput ConnectX-6
Dmitry Kozlyuk
dmitry.kozliuk at gmail.com
Thu Jul 4 23:03:15 CEST 2024
2024-07-04 13:08 (UTC+0200), Mário Kuka:
[...]
> So I can't achieve my goal: ensuring that traffic from the hairpin queues
> is not dropped when the CPU queue is overloaded.
> Any idea how to achieve this in example 4?
> Is the problem that packet buffers/memory in the device are full because
> they are shared between the hairpin and CPU queues?
>
> Any guidance or suggestions on how to achieve this would be greatly
> appreciated.
So you want priority traffic to use a dedicated HW buffer pool.
Good news: QoS is the mechanism to do it.
Bad news: flow rules cannot be used to determine priority,
so you need to mark packets with VLAN PCP or IPv4 DSCP.
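For reference, PCP is just the top 3 bits of the VLAN TCI, so sender-side
marking is a one-liner. A minimal untested sketch with DPDK headers
(the function name is mine; it assumes the frame is already VLAN-tagged):

    #include <rte_byteorder.h>
    #include <rte_ether.h>
    #include <rte_mbuf.h>

    /* Set the PCP field (top 3 bits of the TCI) in a frame that already
     * carries a VLAN header right after the Ethernet header. */
    static void
    set_vlan_pcp(struct rte_mbuf *m, uint8_t pcp)
    {
        struct rte_ether_hdr *eth =
            rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
        struct rte_vlan_hdr *vlan = (struct rte_vlan_hdr *)(eth + 1);
        uint16_t tci = rte_be_to_cpu_16(vlan->vlan_tci);

        tci = (tci & 0x1FFF) | ((uint16_t)pcp << 13); /* keep DEI and VID */
        vlan->vlan_tci = rte_cpu_to_be_16(tci);
    }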
I've roughly reproduced your results with 60 Mpps @ PCP 0
(10 Mpps with MAC for hairpin, 50 Mpps with MAC for the normal RxQ),
then switched to 10 Mpps @ PCP 0 + 50 Mpps @ PCP 1,
which solved the issue (as does 10 Mpps @ PCP 1 + 50 Mpps @ PCP 0).
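The MAC-based steering above is nothing special, just a plain rte_flow rule
with a QUEUE action; pointing the same rule at a hairpin RxQ index sends
matching traffic down the hairpin path. An untested sketch (queue indices
and hairpin queue setup via rte_eth_rx_hairpin_queue_setup() are assumed
to be done elsewhere):

    #include <rte_ethdev.h>
    #include <rte_flow.h>

    /* Steer ingress frames with the given destination MAC to one RxQ. */
    static struct rte_flow *
    steer_dmac_to_queue(uint16_t port_id, const struct rte_ether_addr *dmac,
                        uint16_t queue_id, struct rte_flow_error *error)
    {
        struct rte_flow_attr attr = { .ingress = 1 };
        struct rte_flow_item_eth eth_spec = { .hdr.dst_addr = *dmac };
        struct rte_flow_item_eth eth_mask = {
            .hdr.dst_addr.addr_bytes = { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF },
        };
        struct rte_flow_item pattern[] = {
            { .type = RTE_FLOW_ITEM_TYPE_ETH,
              .spec = &eth_spec, .mask = &eth_mask },
            { .type = RTE_FLOW_ITEM_TYPE_END },
        };
        struct rte_flow_action_queue queue = { .index = queue_id };
        struct rte_flow_action actions[] = {
            { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
            { .type = RTE_FLOW_ACTION_TYPE_END },
        };

        return rte_flow_create(port_id, &attr, pattern, actions, error);
    }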
I expected that --buffer_size and --prio2buffer would need tuning
with mlnx_qos [1], but this appears to be unnecessary.
I have no idea why this works, nor what buffer size is used for PCP 1.
[1]: https://enterprise-support.nvidia.com/s/article/mlnx-qos