[PATCH v2 3/4] hash: reduce architecture special cases
    Stephen Hemminger 
    stephen at networkplumber.org
       
    Tue Aug 26 16:22:10 CEST 2025
    
    
  
On Tue, 26 Aug 2025 16:13:29 +0200
Morten Brørup <mb at smartsharesystems.com> wrote:
> > From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> > Sent: Tuesday, 26 August 2025 15.42
> > 
> > On Tue, 26 Aug 2025 08:55:23 +0200
> > Morten Brørup <mb at smartsharesystems.com> wrote:
> >   
> > > > +static int
> > > > +rte_hash_k64_cmp_eq(const void *key1, const void *key2, size_t  
> > key_len)  
> > > > +{
> > > > +	return rte_hash_k32_cmp_eq(key1, key2, key_len) |  
> > >
> > > Is the "|" instead of "||", to compare in blocks of 64 bytes instead  
> > of 32, intentional?
> > 
> > The cost of the conditional branch is usually higher than the cost of
> > doing
> > a few more instructions on cached data.  
> 
> I agree the key being looked up is very likely cached.
> 
> But the key in the hash table might not be, and the 64 byte comparison might cross a cache line.
> I think using half a cache line as the breakpoint (for using conditional branch instead of unconditional load and compare) seems like a better tradeoff than a full cache line. I have no data to back this up, just a hunch.
> In reality, it depends on the use case. If a cache hit is more likely than a cache miss, then the unconditional comparison is preferable.
> No strong opinion. I mainly wanted to ensure this was intentional.
> 
> 
> While looking into this, I noticed a few instances where your assumption about key1 being aligned is wrong, and the key1/key2 parameters should be swapped:
> https://elixir.bootlin.com/dpdk/v25.07/source/lib/hash/rte_cuckoo_hash.c#L763
> https://elixir.bootlin.com/dpdk/v25.07/source/lib/hash/rte_cuckoo_hash.c#L1319
> https://elixir.bootlin.com/dpdk/v25.07/source/lib/hash/rte_cuckoo_hash.c#L1357
> E.g. when calling rte_hash_add_key_with_hash(), the key parameter can be unaligned, and it ripples down and becomes key1:
> https://elixir.bootlin.com/dpdk/v25.07/source/lib/hash/rte_cuckoo_hash.c#L1259
Ok, lets make 32 byte the watershed as comprimise.
Good catch, best to mark all keys as unaligned, but it won't matter on x86 or Arm64
    
    
More information about the dev
mailing list