[dpdk-dev] [PATCH v1 6/6] distributor: fix handshake deadlock

David Hunt david.hunt at intel.com
Thu Sep 17 15:28:37 CEST 2020


Hi Lukasz,

On 15/9/2020 8:34 PM, Lukasz Wojciechowski wrote:
> Synchronization of data exchange between distributor and worker cores
> is based on 2 handshakes: retptr64 for returning mbufs from workers
> to distributor and bufptr64 for passing mbufs to workers.
>
> Unless those 2 handshakes are checked in the proper order, a deadlock
> may occur. This can happen when a worker core wants to return mbufs
> and waits for the retptr handshake to be cleared while the distributor
> core waits for bufptr in order to send mbufs to that worker.
>
> The cause is that a worker core first returns mbufs to the distributor
> and only then gets new mbufs, while the distributor first releases mbufs
> to the worker and only then handles returning packets.
>
> This patch removes the possibility of the deadlock by always handling
> returning packets first on the distributor side, and by continuing to
> handle them while waiting to release new packets.
>
> Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
> Cc: david.hunt at intel.com
> Cc: stable at dpdk.org
>
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow at partner.samsung.com>
> ---
>   lib/librte_distributor/rte_distributor.c | 9 ++++++---
>   1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
> index 89493c331..12b3db33c 100644
> --- a/lib/librte_distributor/rte_distributor.c
> +++ b/lib/librte_distributor/rte_distributor.c
> @@ -321,12 +321,14 @@ release(struct rte_distributor *d, unsigned int wkr)
>   	struct rte_distributor_buffer *buf = &(d->bufs[wkr]);
>   	unsigned int i;
>   
> +	handle_returns(d, wkr);
> +
>   	/* Sync with worker on GET_BUF flag */
>   	while (!(__atomic_load_n(&(d->bufs[wkr].bufptr64[0]), __ATOMIC_ACQUIRE)
> -		& RTE_DISTRIB_GET_BUF))
> +		& RTE_DISTRIB_GET_BUF)) {
> +		handle_returns(d, wkr);
>   		rte_pause();
> -
> -	handle_returns(d, wkr);
> +	}
>   
>   	buf->count = 0;
>   
> @@ -376,6 +378,7 @@ rte_distributor_process(struct rte_distributor *d,
>   		/* Flush out all non-full cache-lines to workers. */
>   		for (wid = 0 ; wid < d->num_workers; wid++) {
>   			/* Sync with worker on GET_BUF flag. */
> +			handle_returns(d, wid);
>   			if (__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
>   				__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF) {
>   				release(d, wid);

Makes sense. Thanks for the series.  Again, no degradation in 
performance on my systems.

Acked-by: David Hunt <david.hunt at intel.com>
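
For readers following the thread, the ordering problem described in the
commit message can be illustrated with a small single-threaded toy model.
This is only a sketch under stated assumptions: struct toy_link,
toy_worker_step(), toy_handle_returns() and toy_release() are hypothetical
names, two booleans stand in for the retptr64 and bufptr64 handshakes, and
a bounded spin count replaces the real busy-wait so that a stall is
observable instead of hanging. It is not the librte_distributor code.

```c
#include <stdbool.h>

/* Toy model (hypothetical): two flags stand in for the two handshakes. */
struct toy_link {
	bool ret_pending; /* worker posted returns; distributor must clear it */
	bool get_buf;     /* worker has asked for a new burst (GET_BUF) */
};

/*
 * One worker step: the worker returns mbufs first and asks for new ones
 * only after its previous return has been consumed by the distributor.
 */
static void toy_worker_step(struct toy_link *l)
{
	if (!l->ret_pending)
		l->get_buf = true;
}

/* Distributor side: consume whatever the worker has returned. */
static void toy_handle_returns(struct toy_link *l)
{
	l->ret_pending = false;
}

/*
 * Distributor side release. With drain_while_waiting == false the returns
 * are handled only after GET_BUF is seen (the old ordering); with true
 * they are handled before and during the wait (the ordering in the patch).
 * Returns false if GET_BUF was never seen within max_spins iterations.
 */
static bool toy_release(struct toy_link *l, bool drain_while_waiting,
		int max_spins)
{
	if (drain_while_waiting)
		toy_handle_returns(l);

	for (int i = 0; i < max_spins && !l->get_buf; i++) {
		if (drain_while_waiting)
			toy_handle_returns(l);
		toy_worker_step(l); /* stands in for the worker core running */
	}

	if (!l->get_buf)
		return false; /* both sides would wait on each other forever */

	toy_handle_returns(l);
	l->get_buf = false; /* burst handed over; handshake consumed */
	return true;
}
```

Starting from a link with ret_pending set and get_buf clear, calling
toy_release() with drain_while_waiting false never sees GET_BUF in this
model, whereas calling it with true completes on the first spin.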




