[dpdk-dev] [PATCH v5 00/15] fix distributor synchronization issues

Lukasz Wojciechowski l.wojciechow at partner.samsung.com
Sat Oct 10 01:25:29 CEST 2020


On 09.10.2020 at 23:41, Lukasz Wojciechowski wrote:
>
> Hi David,
>
> On 09.10.2020 at 14:53, David Marchand wrote:
>> Hello Lukasz,
>>
>> On Thu, Oct 8, 2020 at 11:17 PM Lukasz Wojciechowski
>> <l.wojciechow at partner.samsung.com>  wrote:
>>> I'm here if you have any questions or suggestions
>> Unfortunately, I can see a timeout on the distributor autotest in Travis:
>> https://travis-ci.com/github/ovsrobot/dpdk/jobs/396703415#L1151
>>
>> Can you have a look?
> I took a look, but I don't know what causes the test to hang and time out.
> Today I ran more than 200000 iterations of the distributor tests and didn't
> get a single failure or lockup.
> David Hunt also ran the series tests today while checking the impact on
> performance, and I guess he didn't hit the issue.
> @DavidHunt, am I right?
>
> The failure happened in only one configuration, while Travis ran the
> tests with different compilers, architectures, etc.
>
> The test itself did not write anything to stdout or stderr:
> --- stdout ---
> EAL: Probing VFIO support...
> APP: HPET is not enabled, using TSC as default timer
> RTE>>distributor_autotest
> --- stderr ---
> EAL: Detected 2 lcore(s)
> EAL: Detected 1 NUMA nodes
> EAL: Multi-process socket /var/run/dpdk/distributor_autotest/mp_socket
> EAL: Selected IOVA mode 'PA'
> EAL: No available hugepages reported in hugepages-1048576kB
> -------
> That's quite strange, because the first test that is run, sanity_test,
> starts by printing information about its start.
>
> Before that there is only the initialization code of the distributor 
> structure and creation of mempool.
>
> The only modification I made to the initialization of the distributor
> structure was the initialization of the active and activesum fields of
> the distributor:
>
>     memset(d->active, 0, sizeof(d->active));
>     d->activesum = 0;
>
> That doesn't seem to be the reason.
>
> I don't know what it could be.
>
>
> Is there a way to trigger the Travis job manually to see if the timeout
> reproduces?
>
>> Btw, did you receive a notification about this from the robot?
> Yes, I got it.
> But I misinterpreted it. I downloaded the log and started reading it
> from the end, and when I saw:
>
>     Compiler stderr:
>     /usr/bin/ld: cannot find -lvirt
>     collect2: error: ld returned 1 exit status
>
> I thought that was it. Sorry for that.
>
>
> BTW, I'm going to publish v6 with the changes suggested by Honnappa
> Nagarahalli (RELAXED memory mode) and David Hunt (indentation).
>

More bad news: the same issue just appeared on Travis for v6.
The good news is that we can reproduce it.

Is there a way to submit a job to Travis other than by sending a new
patch version?

> Best regards
>
> Lukasz
>
> -- 
> Lukasz Wojciechowski
> Principal Software Engineer
>
> Samsung R&D Institute Poland
> Samsung Electronics
> Office +48 22 377 88 25
> l.wojciechow at partner.samsung.com
>
-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow at partner.samsung.com


