[dpdk-dev] [PATCH v4] lib/table: fix cache alignment issue
Dumitrescu, Cristian
cristian.dumitrescu at intel.com
Wed Jul 29 15:13:57 CEST 2020
> -----Original Message-----
> From: David Marchand <david.marchand at redhat.com>
> Sent: Wednesday, July 29, 2020 1:01 PM
> To: Xu, Ting <ting.xu at intel.com>; Dumitrescu, Cristian
> <cristian.dumitrescu at intel.com>
> Cc: dev <dev at dpdk.org>; dpdk stable <stable at dpdk.org>
> Subject: Re: [dpdk-dev] [PATCH v4] lib/table: fix cache alignment issue
>
> On Wed, Jul 22, 2020 at 4:13 AM Ting Xu <ting.xu at intel.com> wrote:
> >
> > When create softnic hash table with 16 keys, it failed on 32-bit
> > environment, because the pointer field in structure rte_bucket_4_16
> > is only 32 bits. Add a padding field in 32-bit environment to keep
> > the structure to a multiple of 64 bytes. Apply this to 8-byte and
> > 32-byte key hash function as well.
>
> Please correct me if I am wrong, but it simply means this part of the
> table library never worked for 32-bit.
> It seems more adding 32-bit support rather than a fix and then I
> wonder if it has its place in rc3.
>
Functionally. the code works, but performance is affected.
The only thing that prevents the code from working is the check in the table create function that checks the size of the above structure is 64 bytes, which caught this issue.
>
>
> Now, looking at the details:
>
> For 64-bit on my x86, we have:
>
> struct rte_bucket_4_8 {
> uint64_t signature; /* 0 8 */
> uint64_t lru_list; /* 8 8 */
> struct rte_bucket_4_8 * next; /* 16 8 */
> uint64_t next_valid; /* 24 8 */
> uint64_t key[4]; /* 32 32 */
> /* --- cacheline 1 boundary (64 bytes) --- */
> uint8_t data[]; /* 64 0 */
>
> /* size: 64, cachelines: 1, members: 6 */
> };
>
>
> For 32-bit, we have:
>
> struct rte_bucket_4_8 {
> uint64_t signature; /* 0 8 */
> uint64_t lru_list; /* 8 8 */
> struct rte_bucket_4_8 * next; /* 16 4 */
> uint64_t next_valid; /* 20 8 */
> uint64_t key[4]; /* 28 32 */
> uint8_t data[]; /* 60 0 */
>
> /* size: 60, cachelines: 1, members: 6 */
> /* last cacheline: 60 bytes */
> } __attribute__((__packed__));
>
> ^^ it is interesting that a packed attribute ends up here.
> I saw no such attribute in the library code.
> Compiler black magic at work I guess...
>
Where do you see the packet attribute? I don't see it in the code.
A packet attribute would explain this issue, i.e. why did the compiler decide not to insert an expected padfing of 4 bytes right after the "next" field, that would allow the field "next_valid" to be aligned to its natural boundary of 8 bytes.
>
> >
> > Fixes: 8aa327214c ("table: hash")
> > Cc: stable at dpdk.org
> >
> > Signed-off-by: Ting Xu <ting.xu at intel.com>
> >
> > ---
> > v3->v4: Change design based on comment
> > v2->v3: Rebase
> > v1->v2: Correct patch time
> > ---
> > lib/librte_table/rte_table_hash_key16.c | 17 +++++++++++++++++
> > lib/librte_table/rte_table_hash_key32.c | 17 +++++++++++++++++
> > lib/librte_table/rte_table_hash_key8.c | 16 ++++++++++++++++
> > 3 files changed, 50 insertions(+)
> >
> > diff --git a/lib/librte_table/rte_table_hash_key16.c
> b/lib/librte_table/rte_table_hash_key16.c
> > index 2cca1c924..c4384b114 100644
> > --- a/lib/librte_table/rte_table_hash_key16.c
> > +++ b/lib/librte_table/rte_table_hash_key16.c
> > @@ -33,6 +33,7 @@
> >
> > #endif
> >
> > +#ifdef RTE_ARCH_64
> > struct rte_bucket_4_16 {
> > /* Cache line 0 */
> > uint64_t signature[4 + 1];
> > @@ -46,6 +47,22 @@ struct rte_bucket_4_16 {
> > /* Cache line 2 */
> > uint8_t data[0];
> > };
> > +#else
> > +struct rte_bucket_4_16 {
> > + /* Cache line 0 */
> > + uint64_t signature[4 + 1];
> > + uint64_t lru_list;
> > + struct rte_bucket_4_16 *next;
> > + uint32_t pad;
> > + uint64_t next_valid;
> > +
> > + /* Cache line 1 */
> > + uint64_t key[4][2];
> > +
> > + /* Cache line 2 */
> > + uint8_t data[0];
> > +};
> > +#endif
>
> The change could simply be:
>
> @@ -38,6 +38,9 @@ struct rte_bucket_4_16 {
> uint64_t signature[4 + 1];
> uint64_t lru_list;
> struct rte_bucket_4_16 *next;
> +#ifndef RTE_ARCH_64
> + uint32_t pad;
> +#endif
> uint64_t next_valid;
>
> /* Cache line 1 */
>
> It avoids duplicating the whole structure definition (we could miss
> updating one side of the #ifdef later).
> Idem for the other "8" and "32" structures.
>
>
> --
> David Marchand
More information about the dev
mailing list