[dpdk-dev] 18.08 build error on ppc64el - bool as vector type

Adrien Mazarguil adrien.mazarguil at 6wind.com
Tue Aug 28 17:02:35 CEST 2018


On Tue, Aug 28, 2018 at 02:38:35PM +0200, Christian Ehrhardt wrote:
> On Tue, Aug 28, 2018 at 1:44 PM Adrien Mazarguil <adrien.mazarguil at 6wind.com>
> wrote:
> 
> > On Tue, Aug 28, 2018 at 01:30:12PM +0200, Christian Ehrhardt wrote:
> > > On Mon, Aug 27, 2018 at 2:22 PM Adrien Mazarguil <
> > adrien.mazarguil at 6wind.com>
> > > wrote:
> > >
> > > > Hi Christian,
> > > >
> > > > On Wed, Aug 22, 2018 at 05:11:41PM +0200, Christian Ehrhardt wrote:
> > > > > Just FYI the simple change hits similar issues later on.
> > > > >
> > > > > The (not really) proposed patch would have to be extended to be as
> > > > > following.
> > > > > We really need a better solution (or somebody has to convince me
> > that my
> > > > > change is better than a band aid).
> > > >
> > > > Thanks for reporting. I've made a quick investigation on my own and
> > believe
> > > > it's a toolchain issue which may affect more than this PMD;
> > potentially all
> > > > users of stdbool.h (C11) on this platform.
> > > >
> > >
> > > Yeah I assumed as much, which is why I was hoping that some of the arch
> > > experts would jump in and say "yeah this is a common thing and correctly
> > > handled like <FOO>"
> > > I'll continue trying to reach out to people that should know better still
> > > ...
> > >
> > >
> > > > C11's stdbool.h defines a bool macro as _Bool (big B) along with
> > > > true/false. On PPC targets, another file (altivec.h) defines bool as
> > _bool
> > > > (small b) but not true/false:
> > > >
> > > >  #if !defined(__APPLE_ALTIVEC__)
> > > >  /* You are allowed to undef these for C++ compatibility.  */
> > > >  #define vector __vector
> > > >  #define pixel __pixel
> > > >  #define bool __bool
> > > >  #endif
> > > >
> > > > mlx5_nl.c explicitly includes stdbool.h to get the above definitions
> > then
> > > > includes mlx5.h -> rte_ether.h -> ppc_64/rte_memcpy.h -> altivec.h.
> > > >
> > > > For some reason the conflicting bool redefinition doesn't seem to
> > raise any
> > > > warnings, but results in mismatching bool and true/false definitions;
> > an
> > > > integer value cannot be assigned to a bool variable anymore, hence the
> > > > build
> > > > failure.
> > > >
> > > > The inability to assign integer values to bool is, in my opinion, a
> > > > fundamental issue caused by altivec.h. If there is no way to fix this
> > on
> > > > the
> > > > system, there are a couple of workarounds for DPDK, by order of
> > preference:
> > > >
> > > > 1. Always #undef bool after including altivec.h in
> > > >    lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h. I do not
> > think
> > > >    anyone expects this type to be unusable with true/false or integer
> > > > values
> > > >    anyway. The version of altivec.h I have doesn't rely on this macro
> > at
> > > >    all so it's probably not a big loss.
> > > >
> > >
> > > The undef of a definition in header A by hedaer B can lead to most
> > > interesting, still broken effects.
> > > If e.g. one does
> > > #include <stdbool.h>
> > > #include "mlx5.h"
> > >
> > > or similar then it would undefine that of stdbool as well right?
> > > In any case, the undefine not only would be suspicious it also fails
> > right
> > > away:
> > >
> > > In file included from
> > > /home/ubuntu/deb_dpdk/lib/librte_eal/common/malloc_heap.c:27:
> > > /home/ubuntu/deb_dpdk/lib/librte_eal/common/eal_memalloc.h:30:15:
> > > error: unknown
> > > type name ‘bool’; did you mean ‘_Bool’?
> > >   int socket, bool exact);
> > >               ^~~~
> > >               _Bool
> > > [...]
> > >
> > >
> > >
> > > >    Ditto for "pixel" and "vector" keywords. Alternatively you could
> > #define
> > > >    __APPLE_ALTIVEC__ before including altivec.h to prevent them from
> > > > getting
> > > >    defined in the first place.
> > > >
> > >
> > > Interesting I got plenty of these:
> > > In file included from
> > > /home/ubuntu/deb_dpdk/lib/librte_eal/common/eal_common_options.c:25:
> > > /home/ubuntu/deb_dpdk/debian/build/static-root/include/rte_memcpy.h:39:
> > > warning:
> > > "__APPLE_ALTIVEC__" redefined
> > > #define __APPLE_ALTIVEC__
> > >
> > > With a few of it being even errors, but the position of the original
> > define
> > > is interesting.
> > > /home/ubuntu/deb_dpdk/debian/build/static-root/include/rte_memcpy.h:39:
> > error:
> > > "__APPLE_ALTIVEC__" redefined [-Werror]
> > > #define __APPLE_ALTIVEC__
> > > <built-in>: note: this is the location of the previous definition
> > >
> > > So if being a built-in, shouldn't it ALWAYS be defined and never
> > > over-declare the bool type?
> > >
> > > Checking GCC on the platform:
> > > $ gcc -dM -E - < /dev/null | grep ALTI
> > > #define __ALTIVEC__ 1
> > > #define __APPLE_ALTIVEC__ 1
> > >
> > >
> > > I added an #error in the header and dropped all dpdk changes.
> > > if !defined(__APPLE_ALTIVEC__)
> > > /* You are allowed to undef these for C++ compatibility.  */
> > > #error WOULD REDECLARE BOOL
> > > #define vector __vector
> > >
> > > And I get:
> > > gcc -Wp,-MD,./.mlx4.o.d.tmp -Wdate-time -D_FORTIFY_SOURCE=2 -m64 -pthread
> > >   -DRTE_MACHINE_CPUFLAG_PPC64 -DRTE_MACHINE_CPUFLAG_ALTIVEC
> > > -DRTE_MACHINE_CPUFLAG_VSX  -I/home/ubuntu/deb_dpdk/debia
> > > n/build/static-root/include -include
> > > /home/ubuntu/deb_dpdk/debian/build/static-root/include/rte_config.h -O3
> > > -std=c11 -Wall -Wextra -g -I. -D_BSD_SOURCE -D_DEFAULT_SOURCE
> > > -D_XOPEN_SOURCE=600
> > > -W -Wall -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations
> > > -Wold-style-definition -Wpointer-arith -Wcast-align -Wnested-externs
> > > -Wcast-qual -Wformat-nonliteral -Wformat-securi
> > > ty -Wundef -Wwrite-strings -Wdeprecated -Wimplicit-fallthrough=2
> > > -Wno-format-truncation -DALLOW_EXPERIMENTAL_API -Wno-error=cast-qual
> > > -DNDEBUG -UPEDANTIC   -g -g -o mlx4.o -c /home/ubuntu/de
> > > b_dpdk/drivers/net/mlx4/mlx4.c
> > > In file included from
> > > /home/ubuntu/deb_dpdk/debian/build/static-root/include/rte_memcpy.h:39,
> > >                 from
> > > /home/ubuntu/deb_dpdk/debian/build/static-root/include/rte_ether.h:21,
> > >                 from
> > > /home/ubuntu/deb_dpdk/debian/build/static-root/include/rte_ethdev.h:158,
> > >                 from
> > >
> > /home/ubuntu/deb_dpdk/debian/build/static-root/include/rte_ethdev_driver.h:18,
> > >
> > >                 from /home/ubuntu/deb_dpdk/drivers/net/mlx4/mlx4.c:35:
> > > /usr/lib/gcc/powerpc64le-linux-gnu/8/include/altivec.h:44:2: error:
> > #error
> > > WOULD REDECLARE BOOL
> > > #error WOULD REDECLARE BOOL
> > >
> > > But a quick sanity test with a hello world including this special altivec
> > > header did build not reaching the #error.
> > > So something in DPDK undefines __ALTIVEC__ ?!
> > >
> > > After being that close I found which of our usual build does the switch
> > to
> > > trigger this.
> > > It is "-std=c11"
> > >
> > > And in fact
> > > $ gcc -std=c11 -dM -E - < /dev/null | grep ALTI
> > > #define __ALTIVEC__ 1
> > >
> > > But no __APPLE_ALTIVEC__
> > >
> > > The header says
> > > /* You are allowed to undef these for C++ compatibility.  */
> > >
> > > But I thought "wait a minute, didn't we just undefine it above and it
> > > failed?"
> > > Yes we did, and it turns out not all gcc calls have --std=c11 and due to
> > > that it failed for those.
> > >
> > >
> > >
> > > 2. Add Altivec detection to impacted users of stdbool.h, which #undef and
> > > >    redefine bool as _Bool on their own with a short comment about
> > broken
> > > >    toolchains.
> > > >
> > >
> > > I tested a few versions of this after my findings on #1 above.
> > > A few extra loops to jump like to make drivers/net/cxgbe/cxgbe_compat.h
> > > usage of bool happy.
> > > I eventually came up with the following as a fix that seems to work:
> > >
> > > --- a/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h
> > > +++ b/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h
> > > @@ -37,6 +37,19 @@
> > > #include <string.h>
> > > /*To include altivec.h, GCC version must  >= 4.8 */
> > > #include <altivec.h>
> > > +/*
> > > + * if built with std=c11 stdbool and vector bool will conflict.
> > > + * But if std is not c11 (which is true for some of our gcc calls) it
> > will
> > > + * have __APPLE_ALTIVEC__ defined which will make it not define the
> > types
> > > + * at all.
> > > + * Furthermore we need to be careful to only redefine as stdbool would
> > have
> > > + * done if it was included - otherwise we might conflict with other
> > > intended
> > > + * meanings of "bool".
> > > + */
> > > +#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L &&
> > > defined(_STDBOOL_H)
> > > +#undef bool
> > > +#define bool _Bool
> > > +#endif
> > >
> > > #ifdef __cplusplus
> > > extern "C" {
> > >
> > >
> > > In turn I have only checked this modification on ppc64 so far, but
> > anyway I
> > > still have the feeling we are only trying to poke at things with a stick
> > > and someone with subject matter experience would just tell us.
> > > Opinions on the change above as a "v2 RFC"?
> >
> > Thanks for the detailed analysis :)
> >
> > I'm afraid this suggestion can still break since stdbool.h won't be
> > necessarily included before altivec.h. How about this instead?
> >
> >  /* Blurb */
> >  #ifndef __APPLE_ALTIVEC__
> >  #define __APPLE_ALTIVEC__ 1
> >  #endif
> >  #include <altivec.h>
> >
> 
> I was there before in my experiments - even a bit safer with the following
> to only do so in C11 mode to avoid cases where one might have "undefined"
> it in non C11 for a reason so we do not switch it on again unexpectedly.
> 
> --- a/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h
> +++ b/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h
> @@ -36,6 +36,14 @@
> #include <stdint.h>
> #include <string.h>
> /*To include altivec.h, GCC version must  >= 4.8 */
> +/*
> + * If built with std=c11 stdbool and altivec bool will conflict.
> + * The altivec bool type is not needed at the moment, to avoid the conflict
> + * define __APPLE_ALTIVEC__ so that the conflict will not happen.
> + */
> +#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L &&
> !defined(__APPLE_ALTIVEC__)
> +#define __APPLE_ALTIVEC__
> +#endif
> #include <altivec.h>
> 
> #ifdef __cplusplus
> 
> But it turned out we are not allowed to switch of other things as vector
> (and probably some more code than the type) is actually used:
> With your suggestion or mine above it will break on:
> 
> x5.o -c /home/ubuntu/deb_dpdk/drivers/net/mlx5/mlx5.c
> In file included from /home/ubuntu/deb_dpdk/drivers/net/mlx5/mlx5_prm.h:21,
>                 from /home/ubuntu/deb_dpdk/drivers/net/mlx5/mlx5_rxtx.h:37,
>                 from /home/ubuntu/deb_dpdk/drivers/net/mlx5/mlx5.h:36,
>                 from /home/ubuntu/deb_dpdk/drivers/net/mlx5/mlx5.c:42:
> /home/ubuntu/deb_dpdk/debian/build/static-root/include/rte_vect.h:43:15: error:
> expected ‘;’ before ‘signed’
> typedef vector signed int xmm_t;
>               ^~~~~~~
>               ;
> /home/ubuntu/deb_dpdk/debian/build/static-root/include/rte_vect.h:49:2: error:
> expected specifier-qualifier-list before ‘xmm_t’
>  xmm_t    x;
>  ^~~~~
> 
> I have no much better suggestion for the ordering issue that you raised.
> To test what would happen I moved the stdbool include after all other
> includes in drivers/net/mlx5/mlx5_nl.c
> I also moved mlx5.h (which eventually brings in altivec) right at the top.
> This works to build, but such a check is always subtle as one of the other
> includes might have pulled in stdbool before altivec still.
> For a bit of confidence I picked said gcc call and ran it with -E.
> The output suggests altivec really was included before stdbool.

How about making altivec.h users (rte_vect.h and rte_memcpy.h) rely on
"__vector" directly instead of the "vector" macro to make it transparent for
others then?

I think we can assume they have internal knowledge of this file in order to
deal with __APPLE_ALTIVEC__ anyway.

Also I would suggest not to make this workaround C11-only. I suspect the
same issue will be encountered with -std=c99 or -std=c90. Keep in mind DPDK
applications are free to specify their own CFLAGS.

> $ grep -e 'stdbool.h' -e 'altivec.h' mlx5_nl.E
> # 1 "/usr/lib/gcc/powerpc64le-linux-gnu/8/include/altivec.h" 1 3 4
> # 1 "/usr/lib/gcc/powerpc64le-linux-gnu/8/include/stdbool.h" 1 3 4
> 
> Still the build worked with the fix as suggested in my last mail:
> --- a/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h
> +++ b/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h
> @@ -37,6 +37,19 @@
> #include <string.h>
> /*To include altivec.h, GCC version must  >= 4.8 */
> #include <altivec.h>
> +/*
> + * if built with std=c11 stdbool and vector bool will conflict.
> + * But if std is not c11 (which is true for some of our gcc calls) it will
> + * have __APPLE_ALTIVEC__ defined which will make it not define the types
> + * at all.
> + * Furthermore we need to be careful to only redefine as stdbool would have
> + * done if it was included - otherwise we might conflict with other
> intended
> + * meanings of "bool".
> + */
> +#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L &&
> defined(_STDBOOL_H)
> +#undef bool
> +#define bool _Bool
> +#endif
> 
> #ifdef __cplusplus
> extern "C" {
> 
> Which is odd, as I'd have expected the stdbool.h inclusion would then
> trigger a redefinition of the bool type.

Yeah, seems like internal GCC headers in /usr/lib are exempt from warnings
on conflicting definitions or some such.

-- 
Adrien Mazarguil
6WIND


More information about the dev mailing list