[dpdk-dev] [PATCH v3] hash: added a new API to hash to query key id
Aaron Conole
aconole at redhat.com
Tue Nov 26 16:58:36 CET 2019
"Van Haaren, Harry" <harry.van.haaren at intel.com> writes:
> Hi Aaron,
>
>> -----Original Message-----
>> From: Aaron Conole <aconole at redhat.com>
>> Sent: Monday, November 25, 2019 10:54 PM
>> To: Thomas Monjalon <thomas at monjalon.net>
>> Cc: Van Haaren, Harry <harry.van.haaren at intel.com>; Amber, Kumar
>> <kumar.amber at intel.com>; dev at dpdk.org; Wang, Yipeng1
>> <yipeng1.wang at intel.com>; Yigit, Ferruh <ferruh.yigit at intel.com>; Thakur,
>> Sham Singh <sham.singh.thakur at intel.com>; David Marchand
>> <dmarchan at redhat.com>
>> Subject: Re: [dpdk-dev] [PATCH v3] hash: added a new API to hash to query
>> key id
>>
>> Aaron Conole <aconole at redhat.com> writes:
>>
>> > Thomas Monjalon <thomas at monjalon.net> writes:
>> >
>> >>> From: Aaron Conole <aconole at redhat.com>
>> >>> > - if (!service_valid(id))
>> >>> > + if (id >= RTE_SERVICE_NUM_MAX || !service_valid(id))
>> >>
>> >> Why not adding this check in service_valid()?
>> >
>> > I think the best fix is to use SERVICE_VALID_GET_OR_ERR_RET() in these
>> > places. For this, I at least want to try and show that there aren't any
>> > further errors. And my test loop has been running for a while now
>> > without any more errors or segfaults, so I guess it's okay to build a
>> > proper patch.
>>
>> This popped up:
>>
>> EAL: Test assert service_lcore_en_dis_able line 487 failed: Ex-service core
>> function call had no effect.
>>
>> So I'll spend some time in this area, it seems.
>
>
> The below diff makes it 100% reproducible here, failing every time.
>
> It seems like the main thread returns before the service thread has returned.
>
> The rte_eal_mp_wait_lcore() call seems to not wait on the service-core, which allows
> the main thread to read the "service_remote_launch_flag" value as 0 (before the service-thread writes it to 1).
>
> Adding a delay between the service launch and the service thread's write makes
> this issue much more likely to occur, so I have confidence in the above description.
>
> What I'm not clear on (yet) is why rte_eal_mp_wait_lcore() isn't waiting...

As I wrote in the other thread, it's because rte_eal_mp_wait_lcore() won't
look at lcores with ROLE_SERVICE.

> -H

I've been running something similar to the suggested patch for 24
minutes now with no failure. I've also removed the rte_eal_mp_wait_lcore()
calls in other areas throughout the test and switched to waiting on
individual cores "just in case." I don't think it's the right fix, though.
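
To make the race concrete, here is a minimal sketch that models it with plain
pthreads. It is not DPDK code: the model_* names, the two-value role enum, and
the mutex "gate" (which stands in for the delay in the diff) are all
assumptions made for illustration. It only shows the shape of the problem: a
wait-for-all helper that skips service-role workers returns immediately, so
the flag is read before the worker writes it, while waiting on that one worker
directly observes the write.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; these are not the DPDK definitions. */
enum model_role { MODEL_ROLE_RTE, MODEL_ROLE_SERVICE };

struct model_lcore {
    pthread_t tid;
    enum model_role role;
    bool joined;
    pthread_mutex_t gate;  /* held by main to delay the worker's write */
    atomic_int flag;       /* plays the part of service_remote_launch_flag */
};

/* Worker body: blocks on the gate, then publishes the flag. */
static void *model_service_fn(void *arg)
{
    struct model_lcore *lc = arg;

    pthread_mutex_lock(&lc->gate);
    atomic_store(&lc->flag, 1);
    pthread_mutex_unlock(&lc->gate);
    return NULL;
}

/* Wait for one specific worker, whatever its role. */
static void model_wait_one(struct model_lcore *lc)
{
    if (!lc->joined) {
        pthread_join(lc->tid, NULL);
        lc->joined = true;
    }
}

/* Wait for "all" workers, but skip the service role (the assumed behavior). */
static void model_wait_all(struct model_lcore *lcs, int n)
{
    for (int i = 0; i < n; i++)
        if (lcs[i].role != MODEL_ROLE_SERVICE)
            model_wait_one(&lcs[i]);
}

/* Returns 0 on success; *early is the flag as seen after the skipping wait,
 * *late is the flag as seen after waiting on the worker individually. */
int model_run(int *early, int *late)
{
    struct model_lcore lc = { .role = MODEL_ROLE_SERVICE, .joined = false };

    pthread_mutex_init(&lc.gate, NULL);
    atomic_init(&lc.flag, 0);
    pthread_mutex_lock(&lc.gate);          /* close the gate before launch */
    if (pthread_create(&lc.tid, NULL, model_service_fn, &lc) != 0) {
        pthread_mutex_unlock(&lc.gate);
        pthread_mutex_destroy(&lc.gate);
        return -1;
    }

    model_wait_all(&lc, 1);                /* returns at once: role skipped */
    *early = atomic_load(&lc.flag);        /* worker is still blocked */

    pthread_mutex_unlock(&lc.gate);        /* let the worker proceed */
    model_wait_one(&lc);                   /* join this worker directly */
    *late = atomic_load(&lc.flag);

    pthread_mutex_destroy(&lc.gate);
    return 0;
}
```

In this model the early read sees 0 and the late read sees 1, which mirrors the
failing assert and the effect of switching to individual core waiting. Whether
per-core waiting is the right fix in the real test is a separate question.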