[dpdk-dev] DPDK compilation on arm is failing in Travis

Michael Santana Francisco msantana at redhat.com
Thu Jun 6 00:38:31 CEST 2019


On 6/5/19 5:36 PM, Honnappa Nagarahalli wrote:
>>> Thomas Monjalon <thomas at monjalon.net> writes:
>>>
>>>> 05/06/2019 21:40, Aaron Conole:
>>>>> Thomas Monjalon <thomas at monjalon.net> writes:
>>>>>
>>>>>> The compilation of the master branch is failing for aarch64:
>>>>>> 	https://travis-ci.com/DPDK/dpdk
>>>>>> The log is so verbose that I am not able to understand what
>>>>>> is really wrong.
>>>>>> Please help to diagnose and fix, thanks.
>>>>> A discussion about this:
>>>>>
>>>>> http://mails.dpdk.org/archives/dev/2019-June/134012.html
>>>> I see the error now.
>>>> It is printing the full log after the error, so I missed the error
>>>> at the top.
>>>>
>>>> I've read your comment about a possible error with the patch
>>>> removing weak functions, but neither Bruce nor I were able to
>>>> reproduce it.
>>>> What is the condition to see this compiler warning?
>>> It is only on ARM, and only when the neon intrinsics are in use.
>> I am not able to reproduce it from the tip of master.
>>
>> I am using:
>> gcc (Ubuntu 8.3.0-6ubuntu1~18.04) 8.3.0
>>
>> From the log on Travis, it looks like the compiler is:
>> gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
>>
>> Is this the issue?
>>
>> Why are we seeing the error now?
> I tested with gcc-5 (Ubuntu/Linaro 5.5.0-12ubuntu1) 5.5.0 20171010, and it
> works fine. I cannot get hold of 5.4.0. Not sure if it needs to be supported.
> Are there any issues in upgrading to 7 or 8?
I have tested it on my Ubuntu 16.04 VM at commit
8cb511bb94ad92a76990f175cac76bb13d51daba (the head of master seems to be
failing for other reasons on my VM).
I tested the following gcc versions:

gcc 5.5.0 "cc (Ubuntu 5.5.0-12ubuntu1~16.04) 5.5.0 20171010"
gcc 7.4.0 "cc (Ubuntu 7.4.0-1ubuntu1~16.04~ppa1) 7.4.0"
gcc 8.1.0 "cc (Ubuntu 8.1.0-5ubuntu1~16.04) 8.1.0"

All tested versions failed with the exact same error shown in Travis. I
don't know if the compiler is at fault here. Maybe Aaron's patch is a
viable option?

>
>>> The issue is the vector lane setting code looks like:
>>>
>>>     lval = lane_set(scalar, rval, lane id)
>>>
>>> In this case, 'rval' is being used before it is ever set, but it
>>> really could be just 0 for the first lane setting code.  Thereafter,
>>> we use the old value of input as the rval, but each time a different lane is set.
>>>
>>> It would be nice if there were an intrinsic that formatted correctly
>>> from the start (something we could call like
>>> lval = lane_set_from_array(scalar_array)).
>>> Then 'input' would never appear as an rval before it was set.
>>>
>>> I thought Jerin Jacob (CC'd) would have some opinion on the right fix.
>>> There are three 'fixes' I know exist - one is to squelch the warning
>>> (but I don't like it because it could hide future code that introduces
>>> this), one is to create a static and use assignment, one is to replace
>>> the first call and pass in a 0'd lane for the first one.
>>>
>>> Actually, I think I have a patch that could work to not introduce an
>>> assignment, but squelch the warning.  Something like the following
>>> (not tested).
>>> ---
>>>
>>> diff --git a/lib/librte_acl/acl_run_neon.h b/lib/librte_acl/acl_run_neon.h
>>> index 01b9766d8..37c984fef 100644
>>> --- a/lib/librte_acl/acl_run_neon.h
>>> +++ b/lib/librte_acl/acl_run_neon.h
>>> @@ -165,6 +165,7 @@ search_neon_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
>>>   	uint64_t index_array[8];
>>>   	struct completion cmplt[8];
>>>   	struct parms parms[8];
>>> +	static int32x4_t ZEROVAL;
>>>   	int32x4_t input0, input1;
>>>
>>>   	acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
>>> @@ -181,8 +182,8 @@ search_neon_8(const struct rte_acl_ctx *ctx, const uint8_t **data,
>>>
>>>   	while (flows.started > 0) {
>>>   		/* Gather 4 bytes of input data for each stream. */
>>> -		input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input0, 0);
>>> -		input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 4), input1, 0);
>>> +		input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), ZEROVAL, 0);
>>> +		input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 4), ZEROVAL, 0);
>>>
>>>   		input0 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input0, 1);
>>>   		input1 = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 5), input1, 1);
>>> @@ -227,6 +228,7 @@ search_neon_4(const struct rte_acl_ctx *ctx, const uint8_t **data,
>>>   	uint64_t index_array[4];
>>>   	struct completion cmplt[4];
>>>   	struct parms parms[4];
>>> +	static int32x4_t ZEROVAL;
>>>   	int32x4_t input;
>>>
>>>   	acl_set_flow(&flows, cmplt, RTE_DIM(cmplt), data, results,
>>> @@ -242,7 +244,7 @@ search_neon_4(const struct rte_acl_ctx *ctx, const uint8_t **data,
>>>
>>>   	while (flows.started > 0) {
>>>   		/* Gather 4 bytes of input data for each stream. */
>>> -		input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), input, 0);
>>> +		input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 0), ZEROVAL, 0);
>>>   		input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 1), input, 1);
>>>   		input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 2), input, 2);
>>>   		input = vsetq_lane_s32(GET_NEXT_4BYTES(parms, 3), input, 3);
>>> --
>>> 2.21.0
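
For illustration, the "zeroed first lane" idea in the patch above can be
sketched in plain C. This is not DPDK code and has not been run on ARM:
vec4_i32, lane_set() and gather4() are hypothetical stand-ins for
int32x4_t, vsetq_lane_s32() and the gather step in search_neon_4(), chosen
so the sketch builds without arm_neon.h. It only shows why the first call
can safely take a zeroed vector instead of the not-yet-set 'input'.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for NEON's int32x4_t (four 32-bit lanes). */
typedef struct {
	int32_t lane[4];
} vec4_i32;

/*
 * Hypothetical analogue of vsetq_lane_s32(): return a copy of 'vec'
 * with lane 'lane' replaced by 'scalar'. All other lanes are read
 * from 'vec', which is why 'vec' must hold defined values.
 */
static vec4_i32 lane_set(int32_t scalar, vec4_i32 vec, int lane)
{
	assert(lane >= 0 && lane < 4);
	vec.lane[lane] = scalar;
	return vec;
}

/*
 * Gather four scalars into one vector. The first call starts from a
 * zeroed vector, so 'input' is never read before it has been written.
 * Every lane is overwritten by the end, so the zeros never reach the
 * caller.
 */
static vec4_i32 gather4(const int32_t src[4])
{
	static const vec4_i32 zeroval = { { 0, 0, 0, 0 } };
	vec4_i32 input;

	input = lane_set(src[0], zeroval, 0);
	input = lane_set(src[1], input, 1);
	input = lane_set(src[2], input, 2);
	input = lane_set(src[3], input, 3);
	return input;
}
```

Calling gather4() on an array such as {10, 20, 30, 40} yields a vector whose
lanes hold exactly those four values, the same result the original code
produces once all four lanes have been set.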



