[dpdk-dev] [PATCH v1 1/2] lib/lpm: memory orderings to avoid race conditions for v1604
Medvedkin, Vladimir
vladimir.medvedkin at intel.com
Wed Jun 5 12:50:16 CEST 2019
Hi Wang,
On 05/06/2019 06:54, Ruifeng Wang wrote:
> While a tbl8 group is being attached to a tbl24 entry, a lookup
> might fail even though the entry is configured in the table.
>
> For example, consider an LPM table configured with 10.10.10.1/24.
> When a new entry 10.10.10.32/28 is being added, a new tbl8
> group is allocated and the tbl24 entry is changed to point to
> the tbl8 group. If the tbl24 entry is written before the tbl8
> group entries are updated, a lookup on 10.10.10.9 will fail.
>
> Correct memory orderings are required to ensure that the
> store to tbl24 does not happen before the stores to tbl8 group
> entries complete.
>
> The orderings have an impact on the LPM performance tests.
> On the Arm A72 platform, the delete operation shows 2.7% degradation,
> while add / lookup show no notable performance change.
> On the x86 E5 platform, the add operation shows 4.3% degradation,
> the delete operation shows 2.2% - 10.2% degradation, and lookup shows
> no performance change.
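To restate the requirement in code, here is a minimal writer-side
sketch (illustrative names and a simplified entry type, not the patch
itself): every store that fills the tbl8 group has to become visible
before the single store that makes the group reachable through tbl24.

#include <stdint.h>

/* Simplified stand-in for struct rte_lpm_tbl_entry. */
struct tbl_entry { uint32_t word; };

static void
publish_tbl8_group(struct tbl_entry *tbl24_slot,
		struct tbl_entry *tbl8_group, unsigned int n,
		struct tbl_entry new_tbl24_entry)
{
	unsigned int i;

	/* Fill the tbl8 group with plain stores first. */
	for (i = 0; i < n; i++)
		tbl8_group[i].word = 0;	/* real entry contents elided */

	/* Release store: a reader that observes the new tbl24 entry
	 * is guaranteed to also observe the tbl8 stores above.
	 */
	__atomic_store(tbl24_slot, &new_tbl24_entry, __ATOMIC_RELEASE);
}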
I think it is possible to avoid the add/del performance degradation,
as sketched after this list:
1. Explicitly mark struct rte_lpm_tbl_entry as 4-byte aligned
2. Cast the value to uint32_t (uint16_t for the 2.0 version) on the memory write
3. Use rte_wmb() after the memory write
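Something like the following rough, untested sketch (the bitfield
layout is the existing little-endian one from rte_lpm.h; the barrier
placement shown is for the add path, the delete path would instead
need the barrier between the tbl24 write and the tbl8 free):

/* 1. Force 4-byte alignment so the compiler can update the whole
 *    entry with a single 32-bit store.
 */
__extension__
struct rte_lpm_tbl_entry {
	uint32_t next_hop	:24;
	uint32_t valid		:1;
	uint32_t valid_group	:1;
	uint32_t depth		:6;
} __rte_aligned(4);

/* 2 + 3. On the write path, order the earlier tbl8 group stores
 * before the plain 32-bit store that publishes the new entry.
 */
rte_wmb();
*(uint32_t *)&lpm->tbl24[tbl24_index] = *(const uint32_t *)&new_tbl24_entry;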
>
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
> Signed-off-by: Ruifeng Wang <ruifeng.wang at arm.com>
> ---
> lib/librte_lpm/rte_lpm.c | 32 +++++++++++++++++++++++++-------
> lib/librte_lpm/rte_lpm.h | 4 ++++
> 2 files changed, 29 insertions(+), 7 deletions(-)
>
> diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c
> index 6b7b28a2e..6ec450a08 100644
> --- a/lib/librte_lpm/rte_lpm.c
> +++ b/lib/librte_lpm/rte_lpm.c
> @@ -806,7 +806,8 @@ add_depth_small_v1604(struct rte_lpm *lpm, uint32_t ip, uint8_t depth,
> /* Setting tbl24 entry in one go to avoid race
> * conditions
> */
> - lpm->tbl24[i] = new_tbl24_entry;
> + __atomic_store(&lpm->tbl24[i], &new_tbl24_entry,
> + __ATOMIC_RELEASE);
>
> continue;
> }
> @@ -1017,7 +1018,11 @@ add_depth_big_v1604(struct rte_lpm *lpm, uint32_t ip_masked, uint8_t depth,
> .depth = 0,
> };
>
> - lpm->tbl24[tbl24_index] = new_tbl24_entry;
> + /* The tbl24 entry must be written only after the
> + * tbl8 entries are written.
> + */
> + __atomic_store(&lpm->tbl24[tbl24_index], &new_tbl24_entry,
> + __ATOMIC_RELEASE);
>
> } /* If valid entry but not extended calculate the index into Table8. */
> else if (lpm->tbl24[tbl24_index].valid_group == 0) {
> @@ -1063,7 +1068,11 @@ add_depth_big_v1604(struct rte_lpm *lpm, uint32_t ip_masked, uint8_t depth,
> .depth = 0,
> };
>
> - lpm->tbl24[tbl24_index] = new_tbl24_entry;
> + /* The tbl24 entry must be written only after the
> + * tbl8 entries are written.
> + */
> + __atomic_store(&lpm->tbl24[tbl24_index], &new_tbl24_entry,
> + __ATOMIC_RELEASE);
>
> } else { /*
> * If it is valid, extended entry calculate the index into tbl8.
> @@ -1391,6 +1400,7 @@ delete_depth_small_v1604(struct rte_lpm *lpm, uint32_t ip_masked,
> /* Calculate the range and index into Table24. */
> tbl24_range = depth_to_range(depth);
> tbl24_index = (ip_masked >> 8);
> + struct rte_lpm_tbl_entry zero_tbl24_entry = {0};
>
> /*
> * Firstly check the sub_rule_index. A -1 indicates no replacement rule
> @@ -1405,7 +1415,8 @@ delete_depth_small_v1604(struct rte_lpm *lpm, uint32_t ip_masked,
>
> if (lpm->tbl24[i].valid_group == 0 &&
> lpm->tbl24[i].depth <= depth) {
> - lpm->tbl24[i].valid = INVALID;
> + __atomic_store(&lpm->tbl24[i],
> + &zero_tbl24_entry, __ATOMIC_RELEASE);
> } else if (lpm->tbl24[i].valid_group == 1) {
> /*
> * If TBL24 entry is extended, then there has
> @@ -1450,7 +1461,8 @@ delete_depth_small_v1604(struct rte_lpm *lpm, uint32_t ip_masked,
>
> if (lpm->tbl24[i].valid_group == 0 &&
> lpm->tbl24[i].depth <= depth) {
> - lpm->tbl24[i] = new_tbl24_entry;
> + __atomic_store(&lpm->tbl24[i], &new_tbl24_entry,
> + __ATOMIC_RELEASE);
> } else if (lpm->tbl24[i].valid_group == 1) {
> /*
> * If TBL24 entry is extended, then there has
> @@ -1713,8 +1725,11 @@ delete_depth_big_v1604(struct rte_lpm *lpm, uint32_t ip_masked,
> tbl8_recycle_index = tbl8_recycle_check_v1604(lpm->tbl8, tbl8_group_start);
>
> if (tbl8_recycle_index == -EINVAL) {
> - /* Set tbl24 before freeing tbl8 to avoid race condition. */
> + /* Set tbl24 before freeing tbl8 to avoid race condition.
> + * Prevent the free of the tbl8 group from hoisting.
> + */
> lpm->tbl24[tbl24_index].valid = 0;
> + __atomic_thread_fence(__ATOMIC_RELEASE);
> tbl8_free_v1604(lpm->tbl8, tbl8_group_start);
> } else if (tbl8_recycle_index > -1) {
> /* Update tbl24 entry. */
> @@ -1725,8 +1740,11 @@ delete_depth_big_v1604(struct rte_lpm *lpm, uint32_t ip_masked,
> .depth = lpm->tbl8[tbl8_recycle_index].depth,
> };
>
> - /* Set tbl24 before freeing tbl8 to avoid race condition. */
> + /* Set tbl24 before freeing tbl8 to avoid race condition.
> + * Prevent the free of the tbl8 group from hoisting.
> + */
> lpm->tbl24[tbl24_index] = new_tbl24_entry;
> + __atomic_thread_fence(__ATOMIC_RELEASE);
> tbl8_free_v1604(lpm->tbl8, tbl8_group_start);
> }
> #undef group_idx
> diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
> index b886f54b4..6f5704c5c 100644
> --- a/lib/librte_lpm/rte_lpm.h
> +++ b/lib/librte_lpm/rte_lpm.h
> @@ -354,6 +354,10 @@ rte_lpm_lookup(struct rte_lpm *lpm, uint32_t ip, uint32_t *next_hop)
> ptbl = (const uint32_t *)(&lpm->tbl24[tbl24_index]);
> tbl_entry = *ptbl;
>
> + /* Memory ordering is not required in lookup. Because dataflow
> + * dependency exists, compiler or HW won't be able to re-order
> + * the operations.
> + */
> /* Copy tbl8 entry (only if needed) */
> if (unlikely((tbl_entry & RTE_LPM_VALID_EXT_ENTRY_BITMASK) ==
> RTE_LPM_VALID_EXT_ENTRY_BITMASK)) {
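The comment on the lookup path looks correct to me: the tbl8 index
below it is computed from the tbl24 value that was just loaded, so the
second load carries an address dependency on the first. A condensed
view of the two dependent loads (identifiers as in rte_lpm.h):

uint32_t tbl_entry, tbl8_index;

/* First load: the tbl24 entry. */
tbl_entry = *(const uint32_t *)&lpm->tbl24[tbl24_index];

if ((tbl_entry & RTE_LPM_VALID_EXT_ENTRY_BITMASK) ==
		RTE_LPM_VALID_EXT_ENTRY_BITMASK) {
	/* Second load: its address is derived from tbl_entry, so
	 * neither the compiler nor a weakly ordered CPU will issue
	 * it ahead of the first load.
	 */
	tbl8_index = (uint8_t)ip +
		((tbl_entry & 0x00FFFFFF) *
		RTE_LPM_TBL8_GROUP_NUM_ENTRIES);
	tbl_entry = *(const uint32_t *)&lpm->tbl8[tbl8_index];
}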
--
Regards,
Vladimir