[PATCH v11 06/16] eal: use prefetch intrinsics

David Marchand david.marchand at redhat.com
Thu Aug 24 16:18:06 CEST 2023


On Thu, Aug 24, 2023 at 2:47 PM Morten Brørup <mb at smartsharesystems.com> wrote:
> > From: David Marchand [mailto:david.marchand at redhat.com]
> > However, I am a bit puzzled why the prefetch change makes the compiler
> > consider this loop differently.
> > We have the same constructs everywhere in this library and x86_64
> > builds are fine...
>
> That is indeed the relevant question here!
>
> Perhaps the compiler somehow ignores the "const" part of the parameter given to the "asm" (in rte_prefetch0()) for 64 bit arch, but not for 32 bit arch?

It is possible to reproduce the issue with current DPDK tree (with
existing prefetch implementation in asm) and removing all
rte_prefetch0 calls from the async rx path:

diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
index d7624d18c8..6f941cf27d 100644
--- a/lib/vhost/virtio_net.c
+++ b/lib/vhost/virtio_net.c
@@ -748,8 +748,6 @@ map_one_desc(struct virtio_net *dev, struct
vhost_virtqueue *vq,
                if (unlikely(!desc_addr))
                        return -1;

-               rte_prefetch0((void *)(uintptr_t)desc_addr);
-
                buf_vec[vec_id].buf_iova = desc_iova;
                buf_vec[vec_id].buf_addr = desc_addr;
                buf_vec[vec_id].buf_len  = desc_chunck_len;
@@ -1808,8 +1806,6 @@ virtio_dev_rx_async_submit_split(struct
virtio_net *dev, struct vhost_virtqueue
         */
        avail_head = __atomic_load_n(&vq->avail->idx, __ATOMIC_ACQUIRE);

-       rte_prefetch0(&vq->avail->ring[vq->last_avail_idx & (vq->size - 1)]);
-
        async_iter_reset(async);

        for (pkt_idx = 0; pkt_idx < count; pkt_idx++) {
@@ -1997,7 +1993,6 @@ virtio_dev_rx_async_packed_batch_enqueue(struct
virtio_net *dev,
        uint16_t i;

        vhost_for_each_try_unroll(i, 0, PACKED_BATCH_SIZE) {
-               rte_prefetch0((void *)(uintptr_t)desc_addrs[i]);
                desc = vhost_iova_to_vva(dev, vq, desc_addrs[i],
&lens[i], VHOST_ACCESS_RW);
                hdrs[i] = (struct virtio_net_hdr_mrg_rxbuf *)(uintptr_t)desc;
                lens[i] = pkts[i]->pkt_len +
@@ -2106,8 +2101,6 @@ virtio_dev_rx_async_submit_packed(struct
virtio_net *dev, struct vhost_virtqueue
        uint16_t i;

        do {
-               rte_prefetch0(&vq->desc_packed[vq->last_avail_idx]);
-
                if (count - pkt_idx >= PACKED_BATCH_SIZE) {
                        if (!virtio_dev_rx_async_packed_batch(dev, vq,
&pkts[pkt_idx],
                                        dma_id, vchan_id)) {


If any prefetch is left, the warning disappears.
So the asm usage probably impacts/disables some compiler feature, for
this code path.


-- 
David Marchand



More information about the dev mailing list