[PATCH] net/af_packet: cache align Rx/Tx structs

Mattias Rönnblom hofors at lysator.liu.se
Tue Apr 23 22:56:51 CEST 2024


On 2024-04-23 13:15, Ferruh Yigit wrote:
> On 4/23/2024 10:08 AM, Mattias Rönnblom wrote:
>> Cache align Rx and Tx queue struct to avoid false sharing.
>>
>> RX struct happens to be 64 bytes on x86_64 already, so cache alignment
>> makes no change there, but it does on 32-bit ISAs.
>>
>> TX struct is 56 bytes on x86_64.
>>
> 
> Hi Mattias,
> 
> No objection to the patch. Is the improvement theoretical or do you
> measure any improvement practically, if so how much is the improvement?
> 

I didn't run any benchmarks.

Two cores storing to a (falsely) shared cache line on a per-packet basis 
is going to be very expensive, at least for "light touch" applications.

>> Both structs keep counters, and in the RX case they are updated even
>> for empty polls.
>>
> 
> Do you think does it help if move 'rx_pkts' & 'rx_bytes' update within
> the loop?
> 

No, why? Wouldn't that be worse? Especially since rx_pkts and rx_bytes 
are declared volatile, so you are forcing a load-modify-store cycle for 
every increment.

I would drop "volatile", or replace it with an atomic (although *not* 
use an atomic add for incrementing, but rather atomic load + <n> 
non-atomic adds + atomic store).

>> Signed-off-by: Mattias Rönnblom <mattias.ronnblom at ericsson.com>
>> ---
>>   drivers/net/af_packet/rte_eth_af_packet.c | 5 +++--
>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c
>> index 397a32db58..28aeb7d08e 100644
>> --- a/drivers/net/af_packet/rte_eth_af_packet.c
>> +++ b/drivers/net/af_packet/rte_eth_af_packet.c
>> @@ -6,6 +6,7 @@
>>    * All rights reserved.
>>    */
>>   
>> +#include <rte_common.h>
>>   #include <rte_string_fns.h>
>>   #include <rte_mbuf.h>
>>   #include <ethdev_driver.h>
>> @@ -53,7 +54,7 @@ struct pkt_rx_queue {
>>   
>>   	volatile unsigned long rx_pkts;
>>   	volatile unsigned long rx_bytes;
>> -};
>> +} __rte_cache_aligned;
>>   
> 
> Latest location for '__rte_cache_aligned' tag is at the beginning of the
> struct [1], so something like:
> `struct __rte_cache_aligned pkt_rx_queue {`
> 
> [1]
> https://patchwork.dpdk.org/project/dpdk/list/?series=31746&state=%2A&archive=both


More information about the dev mailing list