[PATCH v11 06/16] eal: use prefetch intrinsics

Morten Brørup mb at smartsharesystems.com
Thu Aug 24 14:46:54 CEST 2023


> From: David Marchand [mailto:david.marchand at redhat.com]
> Sent: Thursday, 24 August 2023 14.06
> 
> On Fri, Aug 11, 2023 at 9:21 PM Tyler Retzlaff
> <roretzla at linux.microsoft.com> wrote:
> >
> > Inline assembly is not supported for MSVC x64 instead use _mm_prefetch
> > and _mm_cldemote intrinsics.
> >
> > Signed-off-by: Tyler Retzlaff <roretzla at linux.microsoft.com>
> > Acked-by: Bruce Richardson <bruce.richardson at intel.com>
> > Acked-by: Morten Brørup <mb at smartsharesystems.com>
> > Acked-by: Konstantin Ananyev <konstantin.v.ananyev at yandex.ru>
> > ---
> >  lib/eal/x86/include/rte_prefetch.h | 25 +++++++++++++++++++++----
> >  1 file changed, 21 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/eal/x86/include/rte_prefetch.h
> b/lib/eal/x86/include/rte_prefetch.h
> > index 7fd01c4..7a6988e 100644
> > --- a/lib/eal/x86/include/rte_prefetch.h
> > +++ b/lib/eal/x86/include/rte_prefetch.h
> > @@ -9,30 +9,38 @@
> >  extern "C" {
> >  #endif
> >
> > +#include <emmintrin.h>
> > +
> >  #include <rte_compat.h>
> >  #include <rte_common.h>
> >  #include "generic/rte_prefetch.h"
> >
> > +#pragma GCC diagnostic push
> > +#pragma GCC diagnostic ignored "-Wcast-qual"
> > +
> >  static inline void rte_prefetch0(const volatile void *p)
> >  {
> > -       asm volatile ("prefetcht0 %[p]" : : [p] "m" (*(const volatile char
> *)p));
> > +       _mm_prefetch((const void *)p, _MM_HINT_T0);
> >  }
> >
> 
> Quite surprisingly, this part (exactly) breaks 32bits build in the
> vhost library:
> 
> ccache cc -Ilib/librte_vhost.a.p -Ilib
> -I../../../git/pub/dpdk.org/main/lib -Ilib/vhost
> -I../../../git/pub/dpdk.org/main/lib/vhost -I.
> -I../../../git/pub/dpdk.org/main -Iconfig
> -I../../../git/pub/dpdk.org/main/config -Ilib/eal/include
> -I../../../git/pub/dpdk.org/main/lib/eal/include
> -Ilib/eal/linux/include
> -I../../../git/pub/dpdk.org/main/lib/eal/linux/include
> -Ilib/eal/x86/include
> -I../../../git/pub/dpdk.org/main/lib/eal/x86/include -Ilib/eal/common
> -I../../../git/pub/dpdk.org/main/lib/eal/common -Ilib/eal
> -I../../../git/pub/dpdk.org/main/lib/eal -Ilib/kvargs
> -I../../../git/pub/dpdk.org/main/lib/kvargs -Ilib/log
> -I../../../git/pub/dpdk.org/main/lib/log -Ilib/metrics
> -I../../../git/pub/dpdk.org/main/lib/metrics -Ilib/telemetry
> -I../../../git/pub/dpdk.org/main/lib/telemetry -Ilib/ethdev
> -I../../../git/pub/dpdk.org/main/lib/ethdev -Ilib/net
> -I../../../git/pub/dpdk.org/main/lib/net -Ilib/mbuf
> -I../../../git/pub/dpdk.org/main/lib/mbuf -Ilib/mempool
> -I../../../git/pub/dpdk.org/main/lib/mempool -Ilib/ring
> -I../../../git/pub/dpdk.org/main/lib/ring -Ilib/meter
> -I../../../git/pub/dpdk.org/main/lib/meter -Ilib/cryptodev
> -I../../../git/pub/dpdk.org/main/lib/cryptodev -Ilib/rcu
> -I../../../git/pub/dpdk.org/main/lib/rcu -Ilib/hash
> -I../../../git/pub/dpdk.org/main/lib/hash -Ilib/pci
> -I../../../git/pub/dpdk.org/main/lib/pci -Ilib/dmadev
> -I../../../git/pub/dpdk.org/main/lib/dmadev -fdiagnostics-color=always
> -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -Werror -std=c11
> -O2 -g -include rte_config.h -Wcast-qual -Wdeprecated -Wformat
> -Wformat-nonliteral -Wformat-security -Wmissing-declarations
> -Wmissing-prototypes -Wnested-externs -Wold-style-definition
> -Wpointer-arith -Wsign-compare -Wstrict-prototypes -Wundef
> -Wwrite-strings -Wno-address-of-packed-member -Wno-packed-not-aligned
> -Wno-missing-field-initializers -Wno-zero-length-bounds
> -Wno-pointer-to-int-cast -D_GNU_SOURCE -m32 -fPIC -march=native -mrtm
> -DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API -Wno-format-truncation
> -DVHOST_GCC_UNROLL_PRAGMA -fno-strict-aliasing -DVHOST_HAS_VDUSE
> -DRTE_LOG_DEFAULT_LOGTYPE=lib.vhost -MD -MQ
> lib/librte_vhost.a.p/vhost_virtio_net.c.o -MF
> lib/librte_vhost.a.p/vhost_virtio_net.c.o.d -o
> lib/librte_vhost.a.p/vhost_virtio_net.c.o -c
> ../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c
> In function ‘mbuf_to_desc’,
>     inlined from ‘vhost_enqueue_async_packed’ at
> ../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c:1948:6,
>     inlined from ‘virtio_dev_rx_async_packed’ at
> ../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c:1965:6,
>     inlined from ‘virtio_dev_rx_async_submit_packed’ at
> ../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c:2127:7:
> ../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c:1230:18: error:
> ‘buf_vec[0].buf_addr’ may be used uninitialized
> [-Werror=maybe-uninitialized]
>  1230 |         buf_addr = buf_vec[vec_idx].buf_addr;
>       |         ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
> ../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c: In function
> ‘virtio_dev_rx_async_submit_packed’:
> ../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c:1963:27: note:
> ‘buf_vec’ declared here
>  1963 |         struct buf_vector buf_vec[BUF_VECTOR_MAX];
>       |                           ^~~~~~~
> In function ‘mbuf_to_desc’,
>     inlined from ‘vhost_enqueue_async_packed’ at
> ../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c:1948:6,
>     inlined from ‘virtio_dev_rx_async_packed’ at
> ../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c:1965:6,
>     inlined from ‘virtio_dev_rx_async_submit_packed’ at
> ../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c:2127:7:
> ../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c:1231:18: error:
> ‘buf_vec[0].buf_iova’ may be used uninitialized
> [-Werror=maybe-uninitialized]
>  1231 |         buf_iova = buf_vec[vec_idx].buf_iova;
>       |         ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
> ../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c: In function
> ‘virtio_dev_rx_async_submit_packed’:
> ../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c:1963:27: note:
> ‘buf_vec’ declared here
>  1963 |         struct buf_vector buf_vec[BUF_VECTOR_MAX];
>       |                           ^~~~~~~
> In function ‘mbuf_to_desc’,
>     inlined from ‘vhost_enqueue_async_packed’ at
> ../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c:1948:6,
>     inlined from ‘virtio_dev_rx_async_packed’ at
> ../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c:1965:6,
>     inlined from ‘virtio_dev_rx_async_submit_packed’ at
> ../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c:2127:7:
> ../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c:1232:35: error:
> ‘buf_vec[0].buf_len’ may be used uninitialized
> [-Werror=maybe-uninitialized]
>  1232 |         buf_len = buf_vec[vec_idx].buf_len;
>       |                   ~~~~~~~~~~~~~~~~^~~~~~~~
> ../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c: In function
> ‘virtio_dev_rx_async_submit_packed’:
> ../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c:1963:27: note:
> ‘buf_vec’ declared here
>  1963 |         struct buf_vector buf_vec[BUF_VECTOR_MAX];
>       |                           ^~~~~~~
> cc1: all warnings being treated as errors
> ninja: build stopped: subcommand failed.
> 
> I had a look at the vhost library and I think the compiler thinks size may be
> 0.

In 32 bit architecture, "size" could be zero, ref. line 1909:

uint64_t size = pkt->pkt_len + sizeof(struct virtio_net_hdr_mrg_rxbuf);

The values on both sides of the addition are 32 bit unsigned, so the addition will be performed as 32 bit unsigned, and the result could in theory be 0. Casting one of them to 64 bit, e.g. (uint64_t)sizeof(struct virtio_net_hdr_mrg_rxbuf), might fix the warning if the compiler is clever enough to know that adding a non-zero 64 bit value to any 32 bit value cannot give zero as the result.

Side note, off track: Why is "size" uint64_t, wouldn't uint32_t suffice?

> Changing the loop on size with a do { } while (size > 0); resolves the
> warning.
> I can post a change for this, as we hit a similar issue in the past
> and the code does not make sense comparing size on the first iteration
> of this loop.
> 
> However, I am a bit puzzled why the prefetch change makes the compiler
> consider this loop differently.
> We have the same constructs everywhere in this library and x86_64
> builds are fine...

That is indeed the relevant question here!

Perhaps the compiler somehow ignores the "const" part of the parameter given to the "asm" (in rte_prefetch0()) for 64 bit arch, but not for 32 bit arch?

> 
> 
> 
> --
> David Marchand



More information about the dev mailing list