[RFC v2] net/af_packet: make stats reset reliable

Mattias Rönnblom hofors at lysator.liu.se
Wed May 8 09:48:23 CEST 2024


On 2024-05-07 18:00, Morten Brørup wrote:
>> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
>> Sent: Tuesday, 7 May 2024 16.51
> 
>> I would prefer that the SW statistics be handled generically by ethdev
>> layers and used by all such drivers.
> 
> I agree.
> 
> Please note that maintaining counters in the ethdev layer might cause more cache misses than maintaining them in the hot parts of the individual drivers' data structures, so it's not all that simple. ;-)
> 
> Until then, let's find a short term solution, viable to implement across all software NIC drivers without API/ABI breakage.
> 
>>
>> The most complete version of SW stats now is in the virtio driver.
> 
> It looks like the virtio PMD maintains the counters; they are not retrieved from the host.
> 
> Considering a DPDK application running as a virtual machine (guest) on a host server...
> 
> If the host is unable to put a packet onto the guest's virtio RX queue - like when a HW NIC is out of RX descriptors - is it counted somewhere visible to the guest?
> 
> Similarly, if the guest is unable to put a packet onto its virtio TX queue, is it counted somewhere visible to the host?
> 
>> If reset needs to be reliable (debatable), then it needs to be done without
>> atomics.
> 
> Let's modify that slightly: Without performance degradation in the fast path.
> I'm not sure that all atomic operations are slow.

Relaxed atomic loads from and stores to naturally aligned addresses are
essentially free on ARM and x86_64 for sizes up to at least 64 bits:
they compile to ordinary load and store instructions.

"Free" is not entirely true, since both C11 relaxed stores and stores
through volatile may prevent vectorization in GCC. I don't see why they
should, but in practice that seems to be the case. That said, it is very
much a corner case.

Also, as mentioned before, a C11 atomic store effectively has volatile
semantics, which in turn may prevent some compiler optimizations.

On 32-bit x86, 64-bit atomic stores use xmm registers, but those are 
going to be used anyway, since you'll have a 64-bit add.
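
To make the relaxed load/store point concrete, here is a minimal sketch
of a single-writer counter. The names (struct sw_counter,
sw_counter_add, sw_counter_read) are hypothetical and not taken from any
DPDK driver; the sketch assumes exactly one thread updates a given
counter while any thread may read it.

```c
#include <assert.h>	/* only used by the usage sketch at the end */
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-queue SW counter, for illustration only. */
struct sw_counter {
	_Atomic uint64_t pkts;
};

/*
 * Fast path, single writer assumed: a relaxed load, a plain add and a
 * relaxed store. No atomic read-modify-write is requested, so on
 * x86_64 and 64-bit ARM this should not need a locked/exclusive
 * instruction.
 */
static inline void
sw_counter_add(struct sw_counter *c, uint64_t n)
{
	uint64_t v = atomic_load_explicit(&c->pkts, memory_order_relaxed);

	atomic_store_explicit(&c->pkts, v + n, memory_order_relaxed);
}

/* Any thread: the relaxed load guarantees an untorn 64-bit value. */
static inline uint64_t
sw_counter_read(struct sw_counter *c)
{
	return atomic_load_explicit(&c->pkts, memory_order_relaxed);
}

/* Usage sketch: the writer adds, a reader observes the sum. */
static inline void
sw_counter_example(void)
{
	struct sw_counter c = { 0 };

	sw_counter_add(&c, 3);
	sw_counter_add(&c, 2);
	assert(sw_counter_read(&c) == 5);
}
```

With more than one writer per counter this pattern loses updates, and
an atomic read-modify-write would be needed instead.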

> But you are right that it needs to be done without _Atomic counters; they seem to be slow.
> 

_Atomic is not slower than atomic operations on non-_Atomic objects,
when you actually need atomic operations.
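
As a rough illustration, the sketch below performs the same relaxed
atomic add twice: once on an _Atomic-qualified object through
<stdatomic.h>, and once on a plain object through the GCC/Clang
__atomic builtins. The function names are hypothetical. Both request the
same operation with the same memory order, so they would be expected to
generate the same code. The last function shows one possible source of
the "_Atomic is slow" impression: in C11, ++ on an _Atomic object is a
sequentially consistent read-modify-write, even where a relaxed load and
store would have been enough.

```c
#include <assert.h>	/* only used by the usage sketch at the end */
#include <stdatomic.h>
#include <stdint.h>

/* Relaxed atomic add on an _Atomic object; returns the new value. */
static inline uint64_t
c11_add(_Atomic uint64_t *cnt, uint64_t n)
{
	return atomic_fetch_add_explicit(cnt, n, memory_order_relaxed) + n;
}

/* The same operation on a plain object, via the GCC/Clang builtin. */
static inline uint64_t
builtin_add(uint64_t *cnt, uint64_t n)
{
	return __atomic_fetch_add(cnt, n, __ATOMIC_RELAXED) + n;
}

/*
 * Pitfall: the ++ operator on an _Atomic object is an atomic
 * read-modify-write with memory_order_seq_cst.
 */
static inline void
c11_implicit_inc(_Atomic uint64_t *cnt)
{
	(*cnt)++;
}

/* Usage sketch: both variants produce the same result. */
static inline void
atomic_add_example(void)
{
	_Atomic uint64_t a = 0;
	uint64_t b = 0;

	assert(c11_add(&a, 7) == builtin_add(&b, 7));
}
```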

