[RFC 3/3] hash: add support for common small key sizes
Morten Brørup
mb at smartsharesystems.com
Fri Aug 22 20:57:25 CEST 2025
> >>> +static inline int
> >>> +rte_hash_k2_cmp_eq(const void *key1, const void *key2, size_t key_len
> >> __rte_unused)
> >>> +{
> >>> + const uint16_t *k1 = key1;
> >>> + const uint16_t *k2 = key2;
> >>> +
> >>
> >> What we do now is to require the keys are 16-bit aligned (which wasn't
> >> the case before).
> >>
> >> You could
> >>
> >> uint16_t k1;
> >> memcpy(&k1, key1, sizeof(uint16_t));
> >> instead.
> >>
> >> Would generate the same code, but be safe from any future alignment issues.
> >
> > Or use the unaligned types, e.g.:
> > const unaligned_uint16_t *k1 = key1;
> > const unaligned_uint16_t *k2 = key2;
> >
>
> Could you explain why that is safe? Doesn't
> __attribute__((__aligned__(1)))
> just say specify the object doesn't have any alignment requirements,
> without asking the compiler to deal with it?
It is safe because the compiler does deal with it.
Here's a simple example:
https://godbolt.org/z/r39zdeEcx
I don't know how MSVC deals with it, but it doesn't support alignment sensitive architectures, so no real problem.
>
> >>
> >> Anyway, maybe it's safe to assume the keys are aligned, so this is not
> >> an issue.
> >
> > Lots of DPDK code ignores alignment issues.
> >
> >>
> >>> + return k1[0] ^ k2[0];
> >>> +}
> >>
> >> Haven't you implemented "neq" rather than "eq" here? If the keys are
> >> equal, the result is 0. Should be != 0.
> >
> > Not a bug.
>
> Well, the function body doesn't do what the function name tells it. :)
Agree. They really should be renamed to _cmp_neq.
And _cmp_eq wrappers could be kept for backwards API compatibility. Possibly marked deprecated.
>
> > These hash compare functions are in fact "neq", not "eq".
> > Having "_cmp_eq" in the function names is extremely unfortunate and
> misleading.
> >
> >>
> >> Would it be worth adding a comment like "use XOR to make this
> >> branch-free"? It may not be obvious to all readers.
> >>
> >> That said, I’m not sure this trick will actually change the generated
> >> object code - especially if the result of the eq function is still used
> >> in a conditional afterward. Anyway, keeping it seems like a good
> >> conservative approach.
> >
> > I wonder if any compiler is clever enough to use a different memcmp
> implementation if we inform the compiler at build time that we don't care if
> key1 is less than or greater key2, only if they differ or not.
>
> All what is needed is a constant-size length. (Only tested with the most
> recent GCC and clang.)
Yes. But that was not what I was thinking about... I was wondering if "memcmp()!=0" compiles to code that calls some other memcmp implementation that doesn't check which of the two strings is lower, but only tests if they differ.
>
> At least GCC will emit a cmp instruction though (so not branch free), if
> that matters.
>
> > If so, the OTHER_BYTES handler shouldn't call memcmp() directly, but a
> wrapper around it:
> >
> > rte_hash_k_cmp_eq(const void *key1, const void *key2, size_t key_len
> __rte_unused)
> > {
> > return memcmp(key1, key2, key_len) != 0;
> > }
> >
>
> Should work. (Remove __rte_unused.)
I just tested it in Compiler Explorer, and the function calls the same memcmp(), regardless of the !=0.
So no benefit compared to calling memcmp() directly from the jump table.
More information about the dev
mailing list