[dpdk-dev] [RFC PATCH] test/distributor: fix burst flush on worker quit

David Hunt david.hunt at intel.com
Wed Apr 28 14:50:56 CEST 2021


On 28/4/2021 8:46 AM, Lukasz Wojciechowski wrote:
> Hi Stanislaw,
>
> On 26.04.2021 at 18:33, Stanislaw Kardach wrote:
>> While working on the RISC-V port I encountered a situation where worker
>> threads get stuck in the rte_distributor_return_pkt() function in the
>> burst test.
>> After investigation, I found that some of the threads enter this
>> function with the RTE_DISTRIB_GET_BUF flag set in d->retptr64[0]. At the
>> same time, the main thread has already passed rte_distributor_process(),
>> so nobody will clear this flag and hence the workers can't return.
>>
>> What I've noticed is that adding a flush just after the last _process(),
>> similarly to how the quit_workers() function is written in
>> test_distributor.c, fixes the issue.
>> Additionally, the issue disappears when I remove the rdtsc delay code
>> inside rte_distributor_request_pkt().
>> However, I can't get this to reproduce on x86, even with SIMD forced
>> off and with artificial delays added, which is why I wonder whether I'm
>> actually hiding some other issue.
> I was able to reproduce the issue on x86 using a VM with 32 emulated
> CPU cores.
> I guess it would be enough to just have more than 8 worker threads, so
> that not all of them would be woken up.
>> Looking at the implementation of the distributor, it is based on
>> __atomic_* builtins, and the only platform-related bits in the fast path
>> are rte_rdtsc() and rte_pause(). There may be some issues in the
>> toolchain (so far I've only tried the Ubuntu one, 10.2.0-8ubuntu1).
>> I should add that all unit tests for the distributor are passing, so
>> either there's some coverage corner case or the implementation works on
>> RISC-V.
>> As for RDTSC, I'm using a sleep-stable time counter with a 1 MHz
>> frequency. Switching to a high-resolution cycle counter also removes the
>> issue, but that's the same as removing the rdtsc delay mentioned above.
>>
>> I'd love to hear from you whether this fix makes any sense.
> Yes, your patch fixes the issue and is perfectly fine.
>> While modifying this test, I've also pulled in a fix from
>> test_distributor.c which ensures that each thread gets its own wakeup
>> packet, as it's possible that when sending a burst of packets, they
>> won't be spread over all the workers.
>>
>> Signed-off-by: Stanislaw Kardach <kda at semihalf.com>
>> Fixes: 7c3287a10535 ("test/distributor: add performance test for burst mode")
>> Cc: david.hunt at intel.com
>> Cc: l.wojciechow at partner.samsung.com
>> Cc: David Marchand <david.marchand at redhat.com>
>> ---
>>    app/test/test_distributor_perf.c | 6 ++++--
>>    1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/app/test/test_distributor_perf.c b/app/test/test_distributor_perf.c
>> index b25f79a34..fdbeae6d2 100644
>> --- a/app/test/test_distributor_perf.c
>> +++ b/app/test/test_distributor_perf.c
>> @@ -188,13 +188,15 @@ quit_workers(struct rte_distributor *d, struct rte_mempool *p)
>>    	rte_mempool_get_bulk(p, (void *)bufs, num_workers);
>>    
>>    	quit = 1;
>> -	for (i = 0; i < num_workers; i++)
>> +	for (i = 0; i < num_workers; i++) {
>>    		bufs[i]->hash.usr = i << 1;
>> -	rte_distributor_process(d, bufs, num_workers);
>> +		rte_distributor_process(d, &bufs[i], 1);
>> +	}
>>    
>>    	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
>>    
>>    	rte_distributor_process(d, NULL, 0);
>> +	rte_distributor_flush(d);
>>    	rte_eal_mp_wait_lcore();
>>    	quit = 0;
>>    	worker_idx = 0;
>       Tested-by: Lukasz Wojciechowski <l.wojciechow at partner.samsung.com>
>       Reviewed-by: Lukasz Wojciechowski <l.wojciechow at partner.samsung.com>

Thanks, Stanislaw, Lukasz.

Acked-by: David Hunt <david.hunt at intel.com>
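
For readers following the thread, the hang described in the quoted report
can be pictured with a toy, single-threaded model. This is only a sketch
and is not DPDK code: every toy_* and TOY_* name is hypothetical, and the
model assumes (as the report states) that a worker cannot return while its
handshake flag is set and that only the distributor side clears it. The
idea that only some workers get serviced before quit follows Lukasz's
guess above and is likewise an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout: this is a toy model, not DPDK code. */
#define TOY_GET_BUF     1u   /* stand-in for the handshake flag */
#define TOY_MAX_WORKERS 64u

/* One handshake word per worker, loosely modelling d->retptr64[0]. */
static uint64_t toy_retptr[TOY_MAX_WORKERS];

/* Worker side: raise the flag to request more packets. */
static void toy_request_pkt(unsigned int w)
{
	toy_retptr[w] |= TOY_GET_BUF;
}

/* Distributor side: servicing a worker clears its flag. */
static void toy_service(unsigned int w)
{
	toy_retptr[w] &= ~(uint64_t)TOY_GET_BUF;
}

/* A worker can finish its return only once its flag is clear. */
static bool toy_can_return(unsigned int w)
{
	return (toy_retptr[w] & TOY_GET_BUF) == 0;
}

/*
 * Every worker requests packets, the main thread services only the first
 * n_serviced of them, then optionally does a final flush that services all.
 * Returns how many workers would spin forever in their return path.
 */
static unsigned int toy_stuck_after_quit(unsigned int n_workers,
					 unsigned int n_serviced, bool flush)
{
	unsigned int w, stuck = 0;

	assert(n_workers <= TOY_MAX_WORKERS && n_serviced <= n_workers);

	for (w = 0; w < n_workers; w++)
		toy_request_pkt(w);
	for (w = 0; w < n_serviced; w++)
		toy_service(w);
	if (flush)
		for (w = 0; w < n_workers; w++)
			toy_service(w);
	for (w = 0; w < n_workers; w++)
		if (!toy_can_return(w))
			stuck++;
	return stuck;
}
```

In this model, toy_stuck_after_quit(12, 8, false) leaves 4 workers stuck,
while the same call with flush set to true leaves none, which mirrors the
effect the patch attributes to the added rte_distributor_flush() call.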





