<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<div class="moz-cite-prefix">
<div class="moz-cite-prefix">Considering this problem further, I don't see a way to avoid the CLANG compiler error with a function implementation. We would need a macro implementation similar to CLANGS arm_neon.h. In addition, it may be necessary to provide
separate implementations for CLANG and non-CLANG compilers since the builtins between the toolchains are different. One way to address this would be keep the existing function implementation, and add a new macro implementation for CLANG.
</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">For example, something like:</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix"><br>
<blockquote><font face="monospace">#if !defined(RTE_CC_CLANG)<br>
#if (defined(RTE_ARCH_ARM) && defined(RTE_ARCH_32)) || \<br>
(defined(RTE_ARCH_ARM64) && RTE_CC_IS_GNU && (GCC_VERSION < 70000))<br>
/* NEON intrinsic vcopyq_laneq_u32() is not supported in ARMv7-A(AArch32)<br>
* On AArch64, this intrinsic is supported since GCC version 7.<br>
*/<br>
static inline uint32x4_t<br>
vcopyq_laneq_u32(uint32x4_t a, const int lane_a,<br>
uint32x4_t b, const int lane_b)<br>
{<br>
return vsetq_lane_u32(vgetq_lane_u32(b, lane_b), a, lane_a);<br>
}<br>
#endif<br>
#else<br>
#if defined(RTE_ARCH_ARM) && defined(RTE_ARCH_32)<br>
/* NEON intrinsic vcopyq_laneq_u32() is not supported in ARMv7-A(AArch32)<br>
* On AArch64, this intrinsic is supported<br>
*/<br>
#ifdef LITTLE_ENDIAN<br>
#define vcopyq_laneq_u32(__arg1, __arg2, __arg3, __arg4) __extension__ ({ \<br>
uint32x4_t __ret; \<br>
uint32x4_t __lcl_arg1 = __arg1; \<br>
uint32x4_t __lcl_arg3 = __arg3; \<br>
__ret = vsetq_lane_u32(vgetq_lane_u32(__lcl_arg3, __arg4), __lcl_arg1, __arg2); \<br>
__ret; \<br>
})<br>
#else<br>
#define __noswap_vsetq_lane_u32(__arg1, __arg2, __arg3) __extension__ ({ \<br>
uint32x4_t __ret; \<br>
uint32_t __lcl_arg1 = __arg1; \<br>
uint32x4_t __lcl_arg2 = __arg2; \<br>
__ret = (uint32x4_t) __builtin_neon_vsetq_lane_i32(__lcl_arg1, (int32x4_t)__lcl_arg2, __arg3); \<br>
__ret; \<br>
})<br>
#define __noswap_vgetq_lane_u32(__arg1, __arg2) __extension__ ({ \<br>
uint32_t __ret; \<br>
uint32x4_t __lcl_arg1 = __arg1; \<br>
__ret = (uint32_t) __builtin_neon_vgetq_lane_i32((int32x4_t)__lcl_arg1, __arg2); \<br>
__ret; \<br>
})<br>
#define vcopyq_laneq_u32(__arg1, __arg2, __arg3, __arg4) __extension__ ({ \<br>
uint32x4_t __ret; \<br>
uint32x4_t __lcl_arg1 = __arg1; \<br>
uint32x4_t __lcl_arg3 = __arg3; \<br>
uint32x4_t __rev1; \<br>
uint32x4_t __rev3; \<br>
__rev1 = __builtin_shufflevector(__lcl_arg1, __lcl_arg1, 3, 2, 1, 0); \<br>
__rev3 = __builtin_shufflevector(__lcl_arg3, __lcl_arg3, 3, 2, 1, 0); \<br>
__ret = __noswap_vsetq_lane_u32(__noswap_vgetq_lane_u32(__rev3, __arg4), __rev1, __arg2); \<br>
__ret = __builtin_shufflevector(__ret, __ret, 3, 2, 1, 0); \<br>
__ret; \<br>
})<br>
#endif<br>
#endif<br>
#endif </font><br>
</blockquote>
</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">NOTE1: I saw no reason the CLANG arm_neon.h AARCH64 macros would not work for AARCH32, so the macros in this sample implementation are copies CLANG originals modified for (my) readability. I'm not an attorney, but if used, it
may be necessary to include the banner from the CLANG arm_neon.h.<br>
<br>
NOTE2: While I can build the CLANG ARM implementation, I lack the hardware to test it.<br>
</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">Regards,</div>
<div class="moz-cite-prefix">Roger</div>
<div class="moz-cite-prefix"><br>
</div>
</div>
<div class="moz-cite-prefix">On 12/3/24 7:37 PM, Roger Melton (rmelton) wrote:<br>
</div>
<blockquote type="cite" cite="mid:0da20131-67d8-4012-ba00-d777bf50a1f1@cisco.com">
<div class="moz-cite-prefix">After looking at this a bit closer today, I realize that my assertion that CLANG14 does support vcopyq_laneq_u32() for 32bit ARM was incorrect. It does not. The reason that disabling the implementation in rte_vect.h works for
our clang builds is that we do not build the l3fwd app nor the ixgbe PMD for our application, and they are the only libraries that reference that function.</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">The clang compile errors appear to be related to how clang handles compile time constants, but I'm am again unsure how to resolve them in a way that would work for both GNU and clang.</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">Any suggestions?<br>
</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">Regards,</div>
<div class="moz-cite-prefix">Roger</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">On 12/2/24 8:26 PM, Ruifeng Wang wrote:<br>
</div>
<blockquote type="cite" cite="mid:AS8PR08MB708004C682011D52122CE4D69E362@AS8PR08MB7080.eurprd08.prod.outlook.com">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style>@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}@font-face
{font-family:DengXian;
panose-1:2 1 6 0 3 1 1 1 1 1;}@font-face
{font-family:Aptos;
panose-1:2 11 0 4 2 2 2 2 2 4;}@font-face
{font-family:Consolas;
panose-1:2 11 6 9 2 2 4 3 2 4;}@font-face
{font-family:"\@DengXian";
panose-1:2 1 6 0 3 1 1 1 1 1;}p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
font-size:10.0pt;
font-family:"Aptos",sans-serif;}a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}pre
{mso-style-priority:99;
mso-style-link:"HTML Preformatted Char";
margin:0cm;
font-size:10.0pt;
font-family:"Courier New";}span.cp
{mso-style-name:cp;}span.HTMLPreformattedChar
{mso-style-name:"HTML Preformatted Char";
mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:"Consolas",serif;}span.cm
{mso-style-name:cm;}span.k
{mso-style-name:k;}span.w
{mso-style-name:w;}span.kr
{mso-style-name:kr;}span.n
{mso-style-name:n;}span.nf
{mso-style-name:nf;}span.p
{mso-style-name:p;}span.kt
{mso-style-name:kt;}span.EmailStyle30
{mso-style-type:personal-reply;
font-family:"Aptos",sans-serif;
color:windowtext;}.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;
mso-ligatures:none;}div.WordSection1
{page:WordSection1;}</style>
<div class="WordSection1">
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt">+Arm folks.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<div id="mail-editor-reference-message-container">
<div>
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal" style="margin-bottom:12.0pt"><b><span style="font-size:12.0pt;color:black">From:
</span></b><span style="font-size:12.0pt;color:black">Roger Melton (rmelton) <a class="moz-txt-link-rfc2396E" href="mailto:rmelton@cisco.com" moz-do-not-send="true">
<rmelton@cisco.com></a><br>
<b>Date: </b>Tuesday, December 3, 2024 at 3:39</span><span style="font-size:12.0pt;font-family:"Arial",sans-serif;color:black"> </span><span style="font-size:12.0pt;color:black">AM<br>
<b>To: </b><a class="moz-txt-link-abbreviated moz-txt-link-freetext" href="mailto:dev@dpdk.org" moz-do-not-send="true">dev@dpdk.org</a>
<a class="moz-txt-link-rfc2396E" href="mailto:dev@dpdk.org" moz-do-not-send="true">
<dev@dpdk.org></a>, Ruifeng Wang <a class="moz-txt-link-rfc2396E" href="mailto:Ruifeng.Wang@arm.com" moz-do-not-send="true">
<Ruifeng.Wang@arm.com></a><br>
<b>Subject: </b>lib/eal/arm/include/rte_vect.h fails to compile with clang14 for 32bit ARM<o:p></o:p></span></p>
</div>
<p>Hey folks,<o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:12.0pt">We are building DPDK with clang14 for a 32bit armv8-a based CPU and ran into a compile error with the following from lib/eal/arm/include/rte_vect.h:<o:p></o:p></span></p>
<p><br>
<br>
<o:p></o:p></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<pre><span class="cp">#if (defined(RTE_ARCH_ARM) && defined(RTE_ARCH_32)) || \</span><o:p></o:p></pre>
<pre><span class="cp">(defined(RTE_ARCH_ARM64) && <a href="https://elixir.bootlin.com/dpdk/v24.11/C/ident/RTE_CC_IS_GNU" moz-do-not-send="true">RTE_CC_IS_GNU</a> && (<a href="https://elixir.bootlin.com/dpdk/v24.11/C/ident/GCC_VERSION" moz-do-not-send="true">GCC_VERSION</a> < 70000))</span><o:p></o:p></pre>
<pre><span class="cm">/* NEON intrinsic vcopyq_laneq_u32() is not supported in ARMv7-A(AArch32)</span><o:p></o:p></pre>
<pre><span class="cm"> * On AArch64, this intrinsic is supported since GCC version 7.</span><o:p></o:p></pre>
<pre><span class="cm"> */</span><o:p></o:p></pre>
<pre><span class="k">static</span><span class="w"> </span><span class="kr">inline</span><span class="w"> </span><span class="n">uint32x4_t</span><o:p></o:p></pre>
<pre><span class="nf"><a href="https://elixir.bootlin.com/dpdk/v24.11/C/ident/vcopyq_laneq_u32" moz-do-not-send="true">vcopyq_laneq_u32</a></span><span class="p">(</span><span class="n">uint32x4_t</span><span class="w"> </span><span class="n">a</span><span class="p">,</span><span class="w"> </span><span class="k">const</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">lane_a</span><span class="p">,</span><o:p></o:p></pre>
<pre><span class="w"> </span><span class="n">uint32x4_t</span><span class="w"> </span><span class="n">b</span><span class="p">,</span><span class="w"> </span><span class="k">const</span><span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">lane_b</span><span class="p">)</span><o:p></o:p></pre>
<pre><span class="p">{</span><o:p></o:p></pre>
<pre><span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="n">vsetq_lane_u32</span><span class="p">(</span><span class="n">vgetq_lane_u32</span><span class="p">(</span><span class="n">b</span><span class="p">,</span><span class="w"> </span><span class="n">lane_b</span><span class="p">),</span><span class="w"> </span><span class="n">a</span><span class="p">,</span><span class="w"> </span><span class="n">lane_a</span><span class="p">);</span><o:p></o:p></pre>
<pre><span class="p">}</span><o:p></o:p></pre>
<pre><span class="cp">#endif</span><o:p></o:p></pre>
</blockquote>
<p class="MsoNormal"><span style="font-size:12.0pt"><br>
<span class="cp">clang14 compile fails as follows:</span><br>
<br>
<o:p></o:p></span></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p class="MsoNormal"><span class="cp"><span style="font-size:12.0pt;font-family:"Courier New"">In file included from ../../../../../../cisco-dpdk-upstream-arm-clang-fixes.git/lib/eal/common/eal_common_options.c:36:</span></span><span style="font-size:12.0pt;font-family:"Courier New""><br>
<span class="cp">../../../../../../cisco-dpdk-upstream-arm-clang-fixes.git/lib/eal/arm/include/rte_vect.h:80:24: error: argument to '__builtin_neon_vgetq_lane_i32' must be a constant integer</span><br>
<span class="cp">return vsetq_lane_u32(vgetq_lane_u32(b, lane_b), a, lane_a);</span><br>
<span class="cp">^ ~~~~~~</span><br>
<span class="cp">/auto/binos-tools/llvm14/llvm-14.0-p24/lib/clang/14.0.5/include/arm_neon.h:7697:22: note: expanded from macro 'vgetq_lane_u32'</span><br>
<span class="cp">__ret = (uint32_t) __builtin_neon_vgetq_lane_i32((int32x4_t)__s0, __p1); \</span><br>
<span class="cp">^ ~~~~</span><br>
<span class="cp">/auto/binos-tools/llvm14/llvm-14.0-p24/lib/clang/14.0.5/include/arm_neon.h:24148:19: note: expanded from macro 'vsetq_lane_u32'</span><br>
<span class="cp">uint32_t __s0 = __p0; \</span><br>
<span class="cp">^~~~</span><br>
<span class="cp">In file included from ../../../../../../cisco-dpdk-upstream-arm-clang-fixes.git/lib/eal/common/eal_common_options.c:36:</span><br>
<span class="cp">../../../../../../cisco-dpdk-upstream-arm-clang-fixes.git/lib/eal/arm/include/rte_vect.h:80:9: error: argument to '__builtin_neon_vsetq_lane_i32' must be a constant integer</span><br>
<span class="cp">return vsetq_lane_u32(vgetq_lane_u32(b, lane_b), a, lane_a);</span><br>
<span class="cp">^ ~~~~~~</span><br>
<span class="cp">/auto/binos-tools/llvm14/llvm-14.0-p24/lib/clang/14.0.5/include/arm_neon.h:24150:24: note: expanded from macro 'vsetq_lane_u32'</span><br>
<span class="cp">__ret = (uint32x4_t) __builtin_neon_vsetq_lane_i32(__s0, (int32x4_t)__s1, __p2); \</span><br>
<span class="cp">^ ~~~~</span><br>
<span class="cp">2 errors generated.</span></span><span style="font-size:12.0pt"><o:p></o:p></span></p>
</blockquote>
<p><o:p> </o:p></p>
<p>clang14 does appear to support the vcopyq_laneq_u32() intrinsic, s0 we want to skip the conditional implementation.<o:p></o:p></p>
<p>Two approaches I have tested to resolve the error are:<o:p></o:p></p>
<p>1) skip if building with clang:<o:p></o:p></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p class="MsoNormal"><span style="font-size:12.0pt"><br>
</span><span class="cp"><span style="font-size:12.0pt;font-family:"Courier New"">#if !defined(__clang__) && ((defined(RTE_ARCH_ARM) && defined(RTE_ARCH_32)) || \</span></span><span style="font-size:12.0pt"><br>
</span><span class="cp"><span style="font-size:12.0pt;font-family:"Courier New"">72 (defined(RTE_ARCH_ARM64) && RTE_CC_IS_GNU && (GCC_VERSION < 70000)))</span></span><span style="font-size:12.0pt;font-family:"Courier New""><br>
<br>
<br>
</span><span style="font-size:12.0pt"><o:p></o:p></span></p>
</blockquote>
<p class="MsoNormal"><span class="cp"><span style="font-size:12.0pt">2) skip if not building for ARMv7:</span></span><span style="font-size:12.0pt"><br>
<br>
<o:p></o:p></span></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Courier New""><br>
<span class="cp">#if (defined(RTE_ARCH_ARMv7) && defined(RTE_ARCH_32)) || \</span><br>
<span class="cp">(defined(RTE_ARCH_ARM64) && RTE_CC_IS_GNU && (GCC_VERSION < 70000))</span><br>
<br>
</span><span style="font-size:12.0pt"><o:p></o:p></span></p>
</blockquote>
<p><span class="cp">Both address our immediate problem, but may not be a appropriate for all cases.</span><o:p></o:p></p>
<p>Can anyone suggest the proper way to address this? I'll be submitting an patch once I have a solution that is acceptable to the community.<o:p></o:p></p>
<p class="MsoNormal"><span class="cp"><span style="font-size:12.0pt">Regards,</span></span><span style="font-size:12.0pt"><br>
<span class="cp">Roger</span><br>
<br>
<o:p></o:p></span></p>
<p><br>
<br>
<o:p></o:p></p>
<p><o:p> </o:p></p>
<p><o:p> </o:p></p>
<p class="MsoNormal"><span style="font-size:12.0pt"><br>
<br>
<br>
<o:p></o:p></span></p>
</div>
</div>
</div>
</div>
</blockquote>
<p><br>
</p>
</blockquote>
<p><br>
</p>
</body>
</html>