The rationale is similar: 256-bit code performs better than 512-bit code in some cases. For Skylake Server, the Optimization Manual states: "Since port 0 and port 1 are 256-bits wide, Intel AVX-512 operations that will be dispatched to port 0 will execute on both port 0 and port 1" (chapter 2.1 in https://software.intel.com/sites/default/files/managed/9e/bc/64-ia-32-architectures-optimization-manual.pdf). This is platform- and case-specific and does not apply to every AVX512-capable platform. The purpose of the option is to make generating such code more convenient. For some platforms it may become the default. This patch is the base one that introduces the option. I will provide further patches for performance tuning on different platforms and cases.

Sergey

-----Original Message-----
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Jakub Jelinek
Sent: Thursday, September 14, 2017 2:36 PM
To: Shalnov, Sergey <sergey.shal...@intel.com>
Cc: gcc-patches@gcc.gnu.org; ubiz...@gmail.com; kirill.yuk...@gmail.com; Koval, Julia <julia.ko...@intel.com>; Senkevich, Andrew <andrew.senkev...@intel.com>
Subject: Re: [PATCH, i386] Enable option -mprefer-avx256 added for Intel AVX512 configuration

On Thu, Sep 14, 2017 at 12:10:50PM +0000, Shalnov, Sergey wrote:
> GCC has the option "mprefer-avx128" to use 128-bit AVX registers
> instead of 256-bit AVX registers in the auto-vectorizer.
> This patch enables the command line option "mprefer-avx256" that
> reduces 512-bit register usage in "march=skylake-avx512" mode. This
> is the initial implementation of the option. Currently, 512-bit
> registers might still appear in some cases. I plan to continue fixing
> the cases where 512-bit registers appear.
> Sergey

What is the rationale for this? -mprefer-avx128 has been added because some (older) AMD CPUs implement AVX by performing 256-bit ops as two 128-bit uops and thus it is faster to emit 128-bit only code. Is that the case for any AVX512 implementations too?

	Jakub
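
For illustration only, here is a minimal sketch of the kind of loop the auto-vectorizer handles and that the proposed option would affect. The file name saxpy.c and the function saxpy are hypothetical and not part of the patch; the flag spelling -mprefer-avx256 is the one proposed in this thread and assumes a GCC build with the patch applied.

```c
#include <stddef.h>

/* Hypothetical example, not taken from the patch: a simple loop that
   GCC's auto-vectorizer can normally vectorize at -O3.  The restrict
   qualifiers tell the compiler that x and y do not overlap. */
void saxpy(float a, const float *restrict x, float *restrict y, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

One way to compare the generated code, under the assumptions above, is to produce assembly with and without the option and check which vector registers appear (zmm for 512-bit, ymm for 256-bit):

    gcc -O3 -march=skylake-avx512 -S saxpy.c -o saxpy-512.s
    gcc -O3 -march=skylake-avx512 -mprefer-avx256 -S saxpy.c -o saxpy-256.s

As noted in the patch description, 512-bit registers might still appear in some cases with this initial implementation, so the actual output depends on the compiler version and the loop.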