[dpdk-dev] [PATCH 1/6] ring: change head and tail to pointer-width size
Stephen Hemminger
stephen at networkplumber.org
Fri Jan 11 05:38:33 CET 2019
On Thu, 10 Jan 2019 15:01:17 -0600
Gage Eads <gage.eads at intel.com> wrote:
> For 64-bit architectures, doubling the head and tail index widths greatly
> increases the time it takes for them to wrap-around (with current CPU
> speeds, it won't happen within the author's lifetime). This is important in
> avoiding the ABA problem -- in which a thread mistakes reading the same
> tail index in two accesses to mean that the ring was not modified in the
> intervening time -- in the upcoming non-blocking ring implementation. Using
> a 64-bit index makes the possibility of this occurring effectively zero.
>
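To put rough numbers on the wrap-around claim, here is a back-of-the-envelope sketch. The rate of 1e8 index increments per second is purely an assumed figure for illustration (it is not taken from the patch or the measurements), and wrap_seconds() is a hypothetical helper, not a DPDK function.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed index increment rate, for illustration only. */
#define ASSUMED_OPS_PER_SEC 1e8

/*
 * Seconds until an unsigned index of the given bit width wraps,
 * if it is incremented ops_per_sec times per second.
 */
static double wrap_seconds(unsigned int bits, double ops_per_sec)
{
	double states = 1.0;
	unsigned int i;

	assert(ops_per_sec > 0.0);

	/* 2^bits, computed in floating point to avoid integer overflow. */
	for (i = 0; i < bits; i++)
		states *= 2.0;

	return states / ops_per_sec;
}
```

Under that assumption, wrap_seconds(32, ASSUMED_OPS_PER_SEC) comes to roughly 43 seconds, while wrap_seconds(64, ASSUMED_OPS_PER_SEC) is on the order of several thousand years. Bulk operations advance the index by more than one per call, so real figures would differ, but the gap between the two widths stays at a factor of 2^32.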
> I tested this commit's performance impact with an x86_64 build on a
> dual-socket Xeon E5-2699 v4 using ring_perf_autotest, and the change made
> no significant difference -- the few differences appear to be system noise.
> (The test ran on isolcpus cores using a tickless scheduler, but some
> variation was still observed.) Each test was run three times and the results
> were averaged:
>
>                                      | 64b head/tail cycle cost minus
>              Test                    | 32b head/tail cycle cost
> ------------------------------------------------------------------
> SP/SC single enq/dequeue             | 0.33
> MP/MC single enq/dequeue             | 0.00
> SP/SC burst enq/dequeue (size 8)     | 0.00
> MP/MC burst enq/dequeue (size 8)     | 1.00
> SP/SC burst enq/dequeue (size 32)    | 0.00
> MP/MC burst enq/dequeue (size 32)    | -1.00
> SC empty dequeue                     | 0.01
> MC empty dequeue                     | 0.00
>
> Single lcore:
> SP/SC bulk enq/dequeue (size 8)      | -0.36
> MP/MC bulk enq/dequeue (size 8)      | 0.99
> SP/SC bulk enq/dequeue (size 32)     | -0.40
> MP/MC bulk enq/dequeue (size 32)     | -0.57
>
> Two physical cores:
> SP/SC bulk enq/dequeue (size 8)      | -0.49
> MP/MC bulk enq/dequeue (size 8)      | 0.19
> SP/SC bulk enq/dequeue (size 32)     | -0.28
> MP/MC bulk enq/dequeue (size 32)     | -0.62
>
> Two NUMA nodes:
> SP/SC bulk enq/dequeue (size 8)      | 3.25
> MP/MC bulk enq/dequeue (size 8)      | 1.87
> SP/SC bulk enq/dequeue (size 32)     | -0.44
> MP/MC bulk enq/dequeue (size 32)     | -1.10
>
> An earlier version of this patch changed the head and tail indexes to
> uint64_t, but that caused a performance drop on 32-bit builds. With
> uintptr_t, no performance difference is observed on an i686 build.
>
> Signed-off-by: Gage Eads <gage.eads at intel.com>
> ---
> lib/librte_eventdev/rte_event_ring.h | 6 +++---
> lib/librte_ring/rte_ring.c | 10 +++++-----
> lib/librte_ring/rte_ring.h | 20 ++++++++++----------
> lib/librte_ring/rte_ring_generic.h | 16 +++++++++-------
> 4 files changed, 27 insertions(+), 25 deletions(-)
>
> diff --git a/lib/librte_eventdev/rte_event_ring.h b/lib/librte_eventdev/rte_event_ring.h
> index 827a3209e..eae70f904 100644
> --- a/lib/librte_eventdev/rte_event_ring.h
> +++ b/lib/librte_eventdev/rte_event_ring.h
> @@ -1,5 +1,5 @@
> /* SPDX-License-Identifier: BSD-3-Clause
> - * Copyright(c) 2016-2017 Intel Corporation
> + * Copyright(c) 2016-2019 Intel Corporation
> */
>
> /**
> @@ -88,7 +88,7 @@ rte_event_ring_enqueue_burst(struct rte_event_ring *r,
> const struct rte_event *events,
> unsigned int n, uint16_t *free_space)
> {
> - uint32_t prod_head, prod_next;
> + uintptr_t prod_head, prod_next;
> uint32_t free_entries;
>
> n = __rte_ring_move_prod_head(&r->r, r->r.prod.single, n,
> @@ -129,7 +129,7 @@ rte_event_ring_dequeue_burst(struct rte_event_ring *r,
> struct rte_event *events,
> unsigned int n, uint16_t *available)
> {
> - uint32_t cons_head, cons_next;
> + uintptr_t cons_head, cons_next;
> uint32_t entries;
>
> n = __rte_ring_move_cons_head(&r->r, r->r.cons.single, n,
> diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
> index d215acecc..b15ee0eb3 100644
> --- a/lib/librte_ring/rte_ring.c
> +++ b/lib/librte_ring/rte_ring.c
> @@ -1,6 +1,6 @@
> /* SPDX-License-Identifier: BSD-3-Clause
> *
> - * Copyright (c) 2010-2015 Intel Corporation
> + * Copyright (c) 2010-2019 Intel Corporation
> * Copyright (c) 2007,2008 Kip Macy kmacy at freebsd.org
> * All rights reserved.
> * Derived from FreeBSD's bufring.h
> @@ -227,10 +227,10 @@ rte_ring_dump(FILE *f, const struct rte_ring *r)
> fprintf(f, " flags=%x\n", r->flags);
> fprintf(f, " size=%"PRIu32"\n", r->size);
> fprintf(f, " capacity=%"PRIu32"\n", r->capacity);
> - fprintf(f, " ct=%"PRIu32"\n", r->cons.tail);
> - fprintf(f, " ch=%"PRIu32"\n", r->cons.head);
> - fprintf(f, " pt=%"PRIu32"\n", r->prod.tail);
> - fprintf(f, " ph=%"PRIu32"\n", r->prod.head);
> + fprintf(f, " ct=%"PRIuPTR"\n", r->cons.tail);
> + fprintf(f, " ch=%"PRIuPTR"\n", r->cons.head);
> + fprintf(f, " pt=%"PRIuPTR"\n", r->prod.tail);
> + fprintf(f, " ph=%"PRIuPTR"\n", r->prod.head);
> fprintf(f, " used=%u\n", rte_ring_count(r));
> fprintf(f, " avail=%u\n", rte_ring_free_count(r));
> }
> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> index af5444a9f..12af64e13 100644
> --- a/lib/librte_ring/rte_ring.h
> +++ b/lib/librte_ring/rte_ring.h
> @@ -1,6 +1,6 @@
> /* SPDX-License-Identifier: BSD-3-Clause
> *
> - * Copyright (c) 2010-2017 Intel Corporation
> + * Copyright (c) 2010-2019 Intel Corporation
> * Copyright (c) 2007-2009 Kip Macy kmacy at freebsd.org
> * All rights reserved.
> * Derived from FreeBSD's bufring.h
> @@ -65,8 +65,8 @@ struct rte_memzone; /* forward declaration, so as not to require memzone.h */
>
> /* structure to hold a pair of head/tail values and other metadata */
> struct rte_ring_headtail {
> - volatile uint32_t head; /**< Prod/consumer head. */
> - volatile uint32_t tail; /**< Prod/consumer tail. */
> + volatile uintptr_t head; /**< Prod/consumer head. */
> + volatile uintptr_t tail; /**< Prod/consumer tail. */
> uint32_t single; /**< True if single prod/cons */
> };
Isn't this a major ABI change which will break existing applications?
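To make the concern concrete, here is a minimal sketch that compares the two layouts. The field types are copied from the hunk above; the struct names headtail_u32 and headtail_uptr and the helper layout_changed() are made up for this illustration and are not part of librte_ring.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout before the patch (field types as in the quoted diff). */
struct headtail_u32 {
	volatile uint32_t head;
	volatile uint32_t tail;
	uint32_t single;
};

/* Layout after the patch. */
struct headtail_uptr {
	volatile uintptr_t head;
	volatile uintptr_t tail;
	uint32_t single;
};

/* Returns 1 if the two layouts differ in size or in any field offset. */
static int layout_changed(void)
{
	return sizeof(struct headtail_u32) != sizeof(struct headtail_uptr) ||
	       offsetof(struct headtail_u32, tail) !=
	       offsetof(struct headtail_uptr, tail) ||
	       offsetof(struct headtail_u32, single) !=
	       offsetof(struct headtail_uptr, single);
}
```

On a build where uintptr_t is 8 bytes, tail and single move to different offsets and the structure grows, so layout_changed() returns 1. Since the enqueue/dequeue fast paths in rte_ring.h are inline, an application built against the old header would presumably keep reading the fields at the old offsets until it is recompiled. On a build where uintptr_t is 4 bytes, the two layouts should be identical.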