[dpdk-dev] [PATCH v3 09/12] service: avoid race condition for MT unsafe service
Van Haaren, Harry
harry.van.haaren at intel.com
Fri Apr 3 13:58:08 CEST 2020
> From: Phil Yang <phil.yang at arm.com>
> Sent: Tuesday, March 17, 2020 1:18 AM
> To: thomas at monjalon.net; Van Haaren, Harry <harry.van.haaren at intel.com>;
> Ananyev, Konstantin <konstantin.ananyev at intel.com>;
> stephen at networkplumber.org; maxime.coquelin at redhat.com; dev at dpdk.org
> Cc: david.marchand at redhat.com; jerinj at marvell.com; hemant.agrawal at nxp.com;
> Honnappa.Nagarahalli at arm.com; gavin.hu at arm.com; ruifeng.wang at arm.com;
> joyce.kong at arm.com; nd at arm.com; Honnappa Nagarahalli
> <honnappa.nagarahalli at arm.com>; stable at dpdk.org
> Subject: [PATCH v3 09/12] service: avoid race condition for MT unsafe service
>
> From: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
>
> It is possible that an MT unsafe service might get configured to
> run on another core while the service is running currently. This
> might result in the MT unsafe service running on multiple cores
> simultaneously. Use 'execute_lock' always when the service is
> MT unsafe.
>
> Fixes: e9139a32f6e8 ("service: add function to run on app lcore")
> Cc: stable at dpdk.org
>
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
> Reviewed-by: Phil Yang <phil.yang at arm.com>
> Reviewed-by: Gavin Hu <gavin.hu at arm.com>
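For reference, the interleaving described in the commit message can be replayed single-threaded. The following is only a sketch under stated assumptions: `struct race_sketch`, `race_enter()` and `race_exit()` are hypothetical names, not DPDK APIs, and C11 atomics stand in for `rte_atomic32_*`. It mimics the pre-patch check, where the lock is taken only if more than one core is mapped at the moment of entry.

```c
#include <stdatomic.h>

/* Hypothetical, simplified stand-in for the service state (not DPDK code). */
struct race_sketch {
	atomic_int num_mapped_cores; /* updated by the control path */
	atomic_int execute_lock;     /* 0 = free, 1 = held */
	int cores_inside;            /* cores currently inside the callback */
	int max_inside;              /* highest value cores_inside reached */
};

/* Entry half of the pre-patch logic: lock only if > 1 core is mapped.
 * Returns -1 if the lock is busy, otherwise whether the lock was taken.
 */
static int
race_enter(struct race_sketch *s)
{
	int use_atomics = atomic_load(&s->num_mapped_cores) > 1;

	if (use_atomics) {
		int expected = 0;
		if (!atomic_compare_exchange_strong(&s->execute_lock,
				&expected, 1))
			return -1;
	}
	s->cores_inside++;
	if (s->cores_inside > s->max_inside)
		s->max_inside = s->cores_inside;
	return use_atomics;
}

/* Exit half: leave the callback and release the lock if it was taken. */
static void
race_exit(struct race_sketch *s, int used_lock)
{
	s->cores_inside--;
	if (used_lock)
		atomic_store(&s->execute_lock, 0);
}
```

If core A enters while one core is mapped (no lock), the map is then changed to two cores, and core B enters (takes the free lock), both cores are inside the MT unsafe callback at once.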
We should put "fix" in the title, once converged on an implementation.
Regarding Fixes and stable backport, we should consider whether
fixing this in stable with a performance degradation, fixing it with a
more complex solution, or documenting a known issue is the better option.
This fix (always taking the atomic lock) will have a negative performance
impact on existing code using services. We should investigate a way
to fix it without causing datapath performance degradation.
I think there is a way to achieve this by moving more checks/time
to the control path (lcore updating the map), and not forcing the
datapath lcore to always take an atomic.
In this particular case, we have a counter for the number of iterations
that a service has done. If this increments, we know that the lcore
running the service has re-entered the critical section, and so would
have seen an updated "needs atomic" flag.
This approach may introduce a predictable branch on the datapath,
however a predictable branch is likely order(s?) of magnitude cheaper
than always taking an atomic, so a branch is much preferred.
It must be possible to avoid the datapath overhead using a scheme
like this. It will likely be more complex than your proposed change
below, however if it avoids datapath performance drops I feel that
a more complex solution is worth investigating at least.
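As a rough illustration of such a scheme (a sketch only, not a proposed patch): the control path sets a flag, snapshots the iteration counter, and only adds the second lcore to the map once the counter has moved. All names below (`struct svc_sketch`, `svc_run_once()`, `ctl_request_atomic()`, `ctl_flag_visible()`, `sketch_cb()`) are hypothetical, and C11 atomics with default (sequentially consistent) ordering are assumed in place of the `rte_atomic32_*` calls.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct rte_service_spec_impl. */
struct svc_sketch {
	atomic_int needs_atomic; /* written only by the control path */
	atomic_int execute_lock; /* 0 = free, 1 = held */
	_Atomic uint64_t calls;  /* completed iterations of the service */
};

/* Example service body: counts how often it was invoked. */
static void
sketch_cb(void *arg)
{
	(*(int *)arg)++;
}

/* Datapath: run one iteration of an MT unsafe service. */
static int
svc_run_once(struct svc_sketch *s, void (*cb)(void *), void *arg)
{
	assert(s != NULL && cb != NULL);

	/* Predictable branch: the flag only changes when the map changes. */
	const int use_lock = atomic_load(&s->needs_atomic);

	if (use_lock) {
		int expected = 0;
		if (!atomic_compare_exchange_strong(&s->execute_lock,
				&expected, 1))
			return -EBUSY;
	}

	cb(arg);

	/* Publish that this iteration finished. Load + store, not an RMW:
	 * the counter has one writer, or is written under execute_lock.
	 */
	atomic_store(&s->calls, atomic_load(&s->calls) + 1);

	if (use_lock)
		atomic_store(&s->execute_lock, 0);
	return 0;
}

/* Control path, step 1: request locking, then snapshot the counter. */
static uint64_t
ctl_request_atomic(struct svc_sketch *s)
{
	atomic_store(&s->needs_atomic, 1);
	return atomic_load(&s->calls);
}

/* Control path, step 2: has an iteration completed since the snapshot?
 * If so, any later iteration on that lcore reads the flag as set.
 */
static int
ctl_flag_visible(struct svc_sketch *s, uint64_t snapshot)
{
	return atomic_load(&s->calls) != snapshot;
}
```

The control path would poll `ctl_flag_visible()` before mapping the second lcore, and would need a separate rule for an lcore that is not currently running the service, otherwise it would wait forever. The sketch shows the handshake only: the sequentially consistent stores used here are not free on the datapath, and which orderings can safely be relaxed is exactly the part that needs careful analysis.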
A unit test is required to validate a fix like this. Although the issue
was perhaps found by inspection/review, a real-world test that reproduces
it would give confidence in the fix.
Thoughts on such an approach?
> ---
> lib/librte_eal/common/rte_service.c | 11 +++++------
> 1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/lib/librte_eal/common/rte_service.c b/lib/librte_eal/common/rte_service.c
> index 557b5a9..32a2f8a 100644
> --- a/lib/librte_eal/common/rte_service.c
> +++ b/lib/librte_eal/common/rte_service.c
> @@ -50,6 +50,10 @@ struct rte_service_spec_impl {
> uint8_t internal_flags;
>
> /* per service statistics */
> + /* Indicates how many cores the service is mapped to run on.
> + * It does not indicate the number of cores the service is running
> + * on currently.
> + */
> rte_atomic32_t num_mapped_cores;
> uint64_t calls;
> uint64_t cycles_spent;
> @@ -370,12 +374,7 @@ service_run(uint32_t i, struct core_state *cs, uint64_t service_mask,
>
> cs->service_active_on_lcore[i] = 1;
>
> - /* check do we need cmpset, if MT safe or <= 1 core
> - * mapped, atomic ops are not required.
> - */
> - const int use_atomics = (service_mt_safe(s) == 0) &&
> - (rte_atomic32_read(&s->num_mapped_cores) > 1);
> - if (use_atomics) {
> + if (service_mt_safe(s) == 0) {
> if (!rte_atomic32_cmpset((uint32_t *)&s->execute_lock, 0, 1))
> return -EBUSY;
>
> --
> 2.7.4