[RFC v3] net/af_packet: make stats reset reliable

Mattias Rönnblom hofors at lysator.liu.se
Sun May 26 09:21:55 CEST 2024

Previous message (by thread): [RFC v3] net/af_packet: make stats reset reliable
Next message (by thread): [RFC v3] net/af_packet: make stats reset reliable
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 2024-05-08 17:23, Stephen Hemminger wrote:
> On Wed, 8 May 2024 09:19:02 +0200
> Mattias Rönnblom <hofors at lysator.liu.se> wrote:
> 
>> On 2024-05-04 00:00, Stephen Hemminger wrote:
>>> On Fri, 3 May 2024 16:45:47 +0100
>>> Ferruh Yigit <ferruh.yigit at amd.com> wrote:
>>>    
>>>> For stats reset, use an offset instead of zeroing out actual stats values,
>>>> get_stats() displays diff between stats and offset.
>>>> This way stats only updated in datapath and offset only updated in stats
>>>> reset function. This makes stats reset function more reliable.
>>>>
>>>> As stats only written by single thread, we can remove 'volatile' qualifier
>>>> which should improve the performance in datapath.
>>>>
>>>> While updating around, 'igb_stats' parameter renamed as 'stats'.
>>>>
>>>> Signed-off-by: Ferruh Yigit <ferruh.yigit at amd.com>
>>>> ---
>>>> Cc: Mattias Rönnblom <mattias.ronnblom at ericsson.com>
>>>> Cc: Stephen Hemminger <stephen at networkplumber.org>
>>>> Cc: Morten Brørup <mb at smartsharesystems.com>
>>>>
>>>> This update triggered by mail list discussion [1].
>>>>
>>>> [1]
>>>> https://inbox.dpdk.org/dev/3b2cf48e-2293-4226-b6cd-5f4dd3969f99@lysator.liu.se/
>>>
>>>
>>> NAK
>>>
>>> I did not hear a good argument why atomic or volatile was necessary in the first place.
>>> Why?
>>>    
>>
>> On the reader side, loads should be atomic.
>> On the writer side, stores should be atomic.
>>
>> Updates (stores) should actually occur in a timely manner. The complete
>> read-modify-write cycle need not be atomic, since we only have a single
>> writer. All this for the per-lcore counter case.
>>
>> If load or store tearing occurs, the counter values may occasionally
>> take totally bogus values. I think that should be avoided. Especially
>> since it will likely come at a very reasonable cost.
>>
>>   From what it seems to me, load or store tearing may well occur. GCC may
>> generate two 32-bit stores for a program-level 64-bit store on 32-bit
>> x86. If you have constant and immediate-data store instructions,
>> constant writes may also be end up teared. The kernel documentation has
>> some example of this. Add LTO, it's not necessarily going to be all that
>> clear what is storing-a-constant and what is not.
>>
>> Maybe you care a little less if statistics are occasionally broken, or
>> some transient, inconsistent state, but generally they should work, and
>> they should never have some totally bogus values. So, statistics aren't
>> snow flakes, mostly just business as usual.
>>
>> We can't both have a culture that promotes C11-style parallel
>> programming, or, at the extreme, push the C11 APIs as-is, and the say
>> "and btw you don't have to care about the standard when it comes to
>> statistics".
>>
>> We could adopt the Linux kernel's rules, programming model, and APIs
>> (ignoring legal issues). That would be very old school, maybe somewhat
>> over-engineered for our purpose, include a fair amount of inline
>> assembler, and also and may well depend on GCC or GCC-like compilers,
>> just like what I believe the kernel does.
>>
>> We could use something in-between, heavily inspired by C11 but still
>> with an opportunity to work around compiler issues, library issues, and
>> extend the API for our use case.
>>
>> I agree we shouldn't have to mark statistics _Atomic, or RTE_ATOMIC(),
>> rte_atomic64_t, or rte_sometimes_atomic_and_sometimes_not64_t. Just
>> keeping the usual C integer types seems like a better option to me.
>>
>>> Why is this driver special (a snowflake) compared to all the other drivers doing software
>>> statistics (tap, virtio, xdp, ring, memif, netvsc, vmware)?
>>
>> If a broken piece of code has been copied around, one place is going to
>> be the first to be fixed.
> 
> 
> I dislike when any driver does something completely different than valid precedent.
> No other driver in DPDK, Vpp, FreeBSD, Linux (and probably Windows) uses atomic for
> updating statistics. We even got performance benefit at MS from removing atomic
> increment of staistics in internal layers.
> 

All of those are using atomic stores when updating the statistics, I'm 
sure. Assuring a store being atomic is one thing, and assuring the whole 
read-modify-write cycle is atomic is a completely different (and very 
much more expensive) thing.

> The idea of load tearing is crazy talk of integral types. It would break so many things.
> It is the kind of stupid compiler thing that would send Linus on a rant and get
> the GCC compiler writers in trouble.
> 

On 32-bit x86, store tearing for 64-bit integral types is the order of 
the day.

For <64 bit types, I agree. The only cases (like the one listed in the 
kernel documentation), are going to be rare indeed.

> The DPDK has always favored performance over strict safety guard rails everywhere.
> Switching to making every statistic an atomic operation is not in the spirit of
> what is required. There is no strict guarantee necessary here.
> 

I think we agree but just use different terminology.

Previous message (by thread): [RFC v3] net/af_packet: make stats reset reliable
Next message (by thread): [RFC v3] net/af_packet: make stats reset reliable
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

More information about the dev mailing list