[dpdk-dev] AVX512 bug on SkyLake

Stephen Hemminger stephen at networkplumber.org
Fri Nov 9 19:46:54 CET 2018


On Thu, 08 Nov 2018 16:59:22 +0100
Thomas Monjalon <thomas at monjalon.net> wrote:

> Hi,
> 
> We need to gather more information about this bug.
> More below.
> 
> 07/11/2018 10:04, Wiles, Keith:
> > > On Nov 6, 2018, at 9:30 PM, Yongseok Koh <yskoh at mellanox.com> wrote:  
> > >> On Nov 5, 2018, at 6:06 AM, Wiles, Keith <keith.wiles at intel.com> wrote:  
> > >>> On Nov 2, 2018, at 9:04 PM, Yongseok Koh <yskoh at mellanox.com> wrote:
> > >>> 
> > >>> This is a workaround to prevent a crash, which might be caused by
> > >>> optimization of newer gcc (7.3.0) on Intel Skylake.  
> > >> 
> > >> Should the code below not also test for the gcc version and
> > >> the Skylake processor? Maybe I am wrong, but it seems it is
> > >> turning AVX512 off for all GCC builds.
> > > 
> > > > I didn't want to check the gcc version, as 7.3.0 is very new. Only gcc 8 (8.2) has been released since then.
> > > > Also, I wasn't able to test every gcc version and I wanted to be a bit conservative about this crash.
> > > > A performance drop (if any) from disabling a new (experimental) feature would be less risky than an unaccountable crash.
> > > And, it does disable the feature only if CONFIG_RTE_ENABLE_AVX512=n. Please refer to v3.  
> > 
> > > Are you not turning off AVX512 for all of the GCC versions?
> > > You can test for a range, or for anything greater than a given GCC version.
> > > It just seems like we are turning it off for every gcc version, is that true?
> 
> Do we know exactly which GCC versions are affected?
> 
> > >> Also bug 97 seems a bit obscure reference, maybe you know
> > >> the bug report, but more details would be good?  
> > > 
> > > > I sent out the report to the dev list two months ago.
> > > And I created the Bug 97 in order to reference it
> > > in the commit message.
> > > I didn't want to repeat same message here and there,
> > > but it would've been better to have some sort of summary
> > > of the Bug, although v3 has a few more words.
> > > However, v3 has been merged.  
> > 
> > > Still, this is too obscure. If nothing else, give a link to
> > > the specific bug, not just "97".
> 
> The URL is
> 	https://bugs.dpdk.org/show_bug.cgi?id=97
> The bug is also pointing to an email:
> 	https://mails.dpdk.org/archives/dev/2018-September/111522.html
> 
> Summary:
> 	- CPU: Intel Skylake
> 	- Linux environment: Ubuntu 18.04
> 	- Compiler: gcc-7.3 (Ubuntu 7.3.0-16ubuntu3)
> 	- Scenario: testpmd crashes when it starts forwarding
> 	- Behaviour: AVX2 version of rte_memcpy() optimized with 512b instructions
> 	- Fix: disable AVX512 optimization with -mno-avx512f
> 
> It seems to have been reproduced only when using mlx5 PMD so far.
> Any other experience?
> 
> 

I did a little checking: only a few machine types on Azure are
Skylake. These are the latest high-end ones:
 https://azure.microsoft.com/en-us/blog/fv2-vms-are-now-available-the-fastest-vms-on-azure/
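
On the question of testing for a GCC version range instead of turning
AVX512 off for every build, a check could look roughly like the sketch
below. This is only an illustration, not what the merged patch does. The
helper names are made up, and the affected range is an assumption: the
report only confirms gcc 7.3.0, so the upper bound here is a guess.
__GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__ and __AVX512F__ are the
standard compiler-predefined macros.

```c
/* Hypothetical helper: pack a GCC version into one comparable integer. */
#define GCC_VERSION_NUM(maj, min, patch) \
	((maj) * 10000 + (min) * 100 + (patch))

/*
 * Returns 1 if the given GCC version falls in the suspect range.
 * Assumption: only 7.3.0 is confirmed by the bug report; treating
 * everything from 7.3.0 up to (but not including) 8.0.0 as suspect
 * is a guess made for illustration.
 */
int gcc_version_suspect(int maj, int min, int patch)
{
	int v = GCC_VERSION_NUM(maj, min, patch);

	return v >= GCC_VERSION_NUM(7, 3, 0) && v < GCC_VERSION_NUM(8, 0, 0);
}

/* Returns 1 if the compiler building this file is in the suspect range. */
int building_with_suspect_gcc(void)
{
#if defined(__GNUC__) && !defined(__clang__)
	return gcc_version_suspect(__GNUC__, __GNUC_MINOR__,
				   __GNUC_PATCHLEVEL__);
#else
	return 0;
#endif
}

/*
 * Returns 1 if AVX512F code generation is enabled for this build,
 * i.e. the file was not compiled with -mno-avx512f.
 */
int avx512f_enabled(void)
{
#ifdef __AVX512F__
	return 1;
#else
	return 0;
#endif
}
```

A build could then pass -mno-avx512f only when the compiler is in the
suspect range, but that only makes sense once we know which versions
are really affected.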

