RE: [PATCH, i386] Enable option -mprefer-avx256 added for Intel AVX512 configuration

Shalnov, Sergey Fri, 15 Sep 2017 02:09:07 -0700

The rationale is similar: 256-bit code performs better in some cases. For 
Skylake Server, the Optimization Manual states: 
"Since port 0 and port 1 are 256-bits wide, Intel AVX-512 operations that will 
be dispatched to port 0 will execute on both port 0 and port 1" (chapter 2.1 in 
https://software.intel.com/sites/default/files/managed/9e/bc/64-ia-32-architectures-optimization-manual.pdf).
This is platform- and case-specific and does not apply to every AVX512-capable 
platform. The purpose of the option is to make generating such code more 
convenient. For some platforms it may become the default.
This patch is the base patch that introduces the option. I will provide further 
patches for performance tuning on different platforms and cases.
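
As a rough illustration (not taken from the patch), a simple loop like the one 
below is the kind of code the auto-vectorizer handles. The function name and 
the file name saxpy.c are only placeholders for this sketch. Assuming a GCC 
build with this patch applied, compiling with 
"gcc -O3 -march=skylake-avx512 -S saxpy.c" and again with "-mprefer-avx256" 
appended, then comparing the zmm and ymm register usage in the two assembly 
outputs, should show the intended effect. As noted in the original patch 
description, 512-bit registers may still appear in some cases.

```c
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i].
   The restrict qualifiers tell the compiler that x and y do not alias,
   which makes the loop a straightforward auto-vectorization candidate. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```
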
Sergey



-----Original Message-----
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On 
Behalf Of Jakub Jelinek
Sent: Thursday, September 14, 2017 2:36 PM
To: Shalnov, Sergey <sergey.shal...@intel.com>
Cc: gcc-patches@gcc.gnu.org; ubiz...@gmail.com; kirill.yuk...@gmail.com; Koval, 
Julia <julia.ko...@intel.com>; Senkevich, Andrew <andrew.senkev...@intel.com>
Subject: Re: [PATCH, i386] Enable option -mprefer-avx256 added for Intel AVX512 
configuration

On Thu, Sep 14, 2017 at 12:10:50PM +0000, Shalnov, Sergey wrote:
> GCC has the option "mprefer-avx128" to use 128-bit AVX registers 
> instead of 256-bit AVX registers in the auto-vectorizer.

> This patch enables the command line option "mprefer-avx256" that 
> reduces 512-bit register usage in "march=skylake-avx512" mode.  This 
> is the initial implementation of the option.  Currently, 512-bit 
> registers might still appear in some cases.  I plan to continue fixing 
> the cases where 512-bit registers appear.  Sergey

What is the rationale for this?  -mprefer-avx128 has been added because some
(older) AMD CPUs implement AVX by performing 256-bit ops as two 128-bit uops 
and thus it is faster to emit 128-bit only code.
Is that the case for any AVX512 implementations too?

        Jakub
