[dpdk-dev] [PATCH 1/5] hash: add new toeplitz hash implementation
Ananyev, Konstantin
konstantin.ananyev at intel.com
Fri Oct 15 12:55:09 CEST 2021
> >> +/**
> >> + * Calculate Toeplitz hash.
> >> + *
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * @param m
> >> + * Pointer to the matrices generated from the corresponding
> >> + * RSS hash key using rte_thash_complete_matrix().
> >> + * @param tuple
> >> + * Pointer to the data to be hashed. Data must be in network byte order.
> >> + * @param len
> >> + * Length of the data to be hashed.
> >> + * @return
> >> + * Calculated Toeplitz hash value.
> >> + */
> >> +__rte_experimental
> >> +static inline uint32_t
> >> +rte_thash_gfni(uint64_t *m, uint8_t *tuple, int len)
> >> +{
> >> + uint32_t val, val_zero;
> >> +
> >> + __m512i xor_acc = __rte_thash_gfni(m, tuple, NULL, len);
> >> + __rte_thash_xor_reduce(xor_acc, &val, &val_zero);
> >> +
> >> + return val;
> >> +}
> >> +
> >> +/**
> >> + * Calculate Toeplitz hash for two independent data buffers.
> >> + *
> >> + * @warning
> >> + * @b EXPERIMENTAL: this API may change without prior notice.
> >> + *
> >> + * @param m
> >> + * Pointer to the matrices generated from the corresponding
> >> + * RSS hash key using rte_thash_complete_matrix().
> >> + * @param tuple_1
> >> + * Pointer to the data to be hashed. Data must be in network byte order.
> >> + * @param tuple_2
> >> + * Pointer to the data to be hashed. Data must be in network byte order.
> >> + * @param len
> >> + * Length of the largest data buffer to be hashed.
> >> + * @param val_1
> >> + * Pointer to uint32_t where to put calculated Toeplitz hash value for
> >> + * the first tuple.
> >> + * @param val_2
> >> + * Pointer to uint32_t where to put calculated Toeplitz hash value for
> >> + * the second tuple.
> >> + */
> >> +__rte_experimental
> >> +static inline void
> >> +rte_thash_gfni_x2(uint64_t *mtrx, uint8_t *tuple_1, uint8_t *tuple_2, int len,
> >> + uint32_t *val_1, uint32_t *val_2)
> >
> > Why just two?
> > Why not uint8_t *tuple[]
> > ?
> >
>
> x2 version was added because there was unused space inside the ZMM which
> holds input key (input tuple) bytes for a second input key, so it helps
> to improve performance in some cases.
> Bulk version wasn't added because for the vast majority of cases it will
> be used with a single input key.
> Hiding this function inside .c will greatly affect performance, because
> it takes just a few cycles to calculate the hash for the most popular
> key sizes.
OK, but it is still unclear to me why it is limited to 2 only.
What stops you from doing:
static inline void
rte_thash_gfni_bulk(const uint64_t *mtrx, uint32_t len, uint8_t *tuple[],
	uint32_t val[], uint32_t num)
{
	uint32_t i, val_zero;
	__m512i xor_acc;

	/* hash tuples in pairs, two per ZMM */
	for (i = 0; i != (num & ~1); i += 2) {
		xor_acc = __rte_thash_gfni(mtrx, tuple[i], tuple[i + 1], len);
		__rte_thash_xor_reduce(xor_acc, val + i, val + i + 1);
	}
	/* odd tuple left over: hash it alone, second result is discarded */
	if (num & 1) {
		xor_acc = __rte_thash_gfni(mtrx, tuple[i], NULL, len);
		__rte_thash_xor_reduce(xor_acc, val + i, &val_zero);
	}
}
?
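To make the pairing logic easy to check without GFNI hardware, here is a
self-contained sketch of the same loop shape. It is only an illustration:
toeplitz_scalar(), hash_x2(), hash_bulk() and KEY_LEN are hypothetical names,
not part of the patch, and the plain bit-by-bit Toeplitz hash stands in for
__rte_thash_gfni()/__rte_thash_xor_reduce(). It assumes all tuples share the
same length and that the key is at least len + 4 bytes long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KEY_LEN 40 /* assumed RSS key size in bytes, for this sketch only */

/* Scalar Toeplitz hash: for every set input bit (MSB first), XOR in the
 * 32-bit key window that starts at that bit position. */
static uint32_t
toeplitz_scalar(const uint8_t *key, const uint8_t *data, uint32_t len)
{
	uint32_t hash = 0, i;
	uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
		((uint32_t)key[2] << 8) | key[3];
	int b;

	assert(len + 4 <= KEY_LEN);
	for (i = 0; i < len; i++) {
		for (b = 7; b >= 0; b--) {
			if (data[i] & (1u << b))
				hash ^= window;
			/* slide the window one bit further along the key */
			window = (window << 1) | ((key[i + 4] >> b) & 1u);
		}
	}
	return hash;
}

/* Stand-in for the x2 primitive: t2 may be NULL, then *v2 is set to 0. */
static void
hash_x2(const uint8_t *key, const uint8_t *t1, const uint8_t *t2,
	uint32_t len, uint32_t *v1, uint32_t *v2)
{
	*v1 = toeplitz_scalar(key, t1, len);
	*v2 = (t2 != NULL) ? toeplitz_scalar(key, t2, len) : 0;
}

/* Bulk wrapper: consume tuples in pairs, then handle an odd leftover. */
static void
hash_bulk(const uint8_t *key, uint32_t len, const uint8_t *tuple[],
	uint32_t val[], uint32_t num)
{
	uint32_t i, val_zero;

	for (i = 0; i != (num & ~1u); i += 2)
		hash_x2(key, tuple[i], tuple[i + 1], len, &val[i], &val[i + 1]);
	if (num & 1)
		hash_x2(key, tuple[i], NULL, len, &val[i], &val_zero);
}
```

With num = 3 the loop handles tuples 0 and 1 as a pair and the tail branch
handles tuple 2 alone, so val[] ends up with one result per input tuple.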