Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-05-30 Thread H.J. Lu
On Thu, May 22, 2014 at 10:38 AM, H.J. Lu wrote: > On Thu, May 22, 2014 at 2:01 AM, Kirill Yukhin > wrote: >> Hello, >> On 20 May 08:24, H.J. Lu wrote: >>> ABI alignment should be sufficient for correctness. Bigger alignments >>> are supposed to give better performance. Can you try this patch o

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-05-22 Thread H.J. Lu
On Thu, May 22, 2014 at 2:01 AM, Kirill Yukhin wrote: > Hello, > On 20 May 08:24, H.J. Lu wrote: >> ABI alignment should be sufficient for correctness. Bigger alignments >> are supposed to give better performance. Can you try this patch on >> HSW and SLM to see if it has any impact on performance

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-05-22 Thread Kirill Yukhin
Hello, On 20 May 08:24, H.J. Lu wrote: > ABI alignment should be sufficient for correctness. Bigger alignments > are supposed to give better performance. Can you try this patch on > HSW and SLM to see if it has any impact on performance? Here is perf. data of your patch. Only HSW so far HSW, 64

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-05-20 Thread H.J. Lu
On Tue, May 20, 2014 at 5:00 AM, Kirill Yukhin wrote: > Hello, > On 19 May 09:58, H.J. Lu wrote: >> On Mon, May 19, 2014 at 9:45 AM, Uros Bizjak wrote: >> > On Mon, May 19, 2014 at 6:42 PM, H.J. Lu wrote: >> > >> Uros, >> I am looking into libreoffice size and the data alignment seems

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-05-20 Thread Kirill Yukhin
Hello, On 19 May 09:58, H.J. Lu wrote: > On Mon, May 19, 2014 at 9:45 AM, Uros Bizjak wrote: > > On Mon, May 19, 2014 at 6:42 PM, H.J. Lu wrote: > > > Uros, > I am looking into libreoffice size and the data alignment seems to make > huge > difference. Data section has grown f

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-05-19 Thread H.J. Lu
On Mon, May 19, 2014 at 9:45 AM, Uros Bizjak wrote: > On Mon, May 19, 2014 at 6:42 PM, H.J. Lu wrote: > Uros, I am looking into libreoffice size and the data alignment seems to make huge difference. Data section has grown from 5.8MB to 6.3MB in between GCC 4.8 and 4.9,

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-05-19 Thread Uros Bizjak
On Mon, May 19, 2014 at 6:42 PM, H.J. Lu wrote: >>> Uros, >>> I am looking into libreoffice size and the data alignment seems to make huge >>> difference. Data section has grown from 5.8MB to 6.3MB in between GCC 4.8 >>> and 4.9, >>> while clang produces 5.2MB. >>> >>> The two patches I posted t

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-05-19 Thread H.J. Lu
On Mon, May 19, 2014 at 9:14 AM, Uros Bizjak wrote: > On Mon, May 19, 2014 at 6:48 AM, Jan Hubicka wrote: >>> > Thanks for the pointer, there is indeed the recommendation in >>> > optimization manual [1], section 3.6.4, where it is said: >>> > >>> > --quote-- >>> > Misaligned data access can incu

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-05-19 Thread Uros Bizjak
On Mon, May 19, 2014 at 6:48 AM, Jan Hubicka wrote: >> > Thanks for the pointer, there is indeed the recommendation in >> > optimization manual [1], section 3.6.4, where it is said: >> > >> > --quote-- >> > Misaligned data access can incur significant performance penalties. >> > This is particular

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-05-18 Thread Jan Hubicka
> > Thanks for the pointer, there is indeed the recommendation in > > optimization manual [1], section 3.6.4, where it is said: > > > > --quote-- > > Misaligned data access can incur significant performance penalties. > > This is particularly true for cache line > > splits. The size of a cache line

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-17 Thread Uros Bizjak
On Fri, Jan 17, 2014 at 3:15 PM, Jakub Jelinek wrote: > On Tue, Jan 14, 2014 at 08:12:41PM +0100, Jakub Jelinek wrote: >> For 4.9, if what you've added is what you want to do for performance >> reasons, then I'd do something like: > > Ok, here it is in a form of patch, bootstrapped/regtested on x8

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-17 Thread Jakub Jelinek
On Tue, Jan 14, 2014 at 08:12:41PM +0100, Jakub Jelinek wrote: > For 4.9, if what you've added is what you want to do for performance > reasons, then I'd do something like: Ok, here it is in a form of patch, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2014-01-17 Jakub Je

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-14 Thread Jakub Jelinek
On Tue, Jan 14, 2014 at 07:37:33PM +0100, Uros Bizjak wrote: > OK, let's play safe. I'll revert these two changes (modulo size of > nocona prefetch block). Thanks. > > opt we never return a smaller number from ix86_data_alignment than > > we did in 4.8 and earlier, because otherwise if you have 4

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-14 Thread H.J. Lu
On Tue, Jan 14, 2014 at 10:37 AM, Uros Bizjak wrote: > On Tue, Jan 14, 2014 at 6:09 PM, Jakub Jelinek wrote: > >>> On a second thought, the crossing of 16-byte boundaries is mentioned >>> for the data *access* (the instruction itself) if it is not naturally >>> aligned (please see example 3-40 an

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-14 Thread Uros Bizjak
On Tue, Jan 14, 2014 at 6:09 PM, Jakub Jelinek wrote: >> On a second thought, the crossing of 16-byte boundaries is mentioned >> for the data *access* (the instruction itself) if it is not naturally >> aligned (please see example 3-40 and fig 3-2), which is *NOT* in our >> case. >> >> So, we don'

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-14 Thread Jakub Jelinek
On Fri, Jan 03, 2014 at 05:04:39PM +0100, Uros Bizjak wrote: > On a second thought, the crossing of 16-byte boundaries is mentioned > for the data *access* (the instruction itself) if it is not naturally > aligned (please see example 3-40 and fig 3-2), which is *NOT* in our > case. > > So, we don'

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-03 Thread Uros Bizjak
On Fri, Jan 3, 2014 at 3:02 PM, Uros Bizjak wrote: >>> Like in the patch below. Please note, that the block_tune setting for >>> the nocona is wrong, -march=native on my trusted old P4 returns: >>> >>> --param "l1-cache-size=16" --param "l1-cache-line-size=64" --param >>> "l2-cache-size=2048" "-m

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-03 Thread Jakub Jelinek
On Fri, Jan 03, 2014 at 03:35:11PM +0100, Uros Bizjak wrote: > On Fri, Jan 3, 2014 at 3:13 PM, Jakub Jelinek wrote: > > On Fri, Jan 03, 2014 at 03:02:51PM +0100, Uros Bizjak wrote: > >> Please note that previous value was based on earlier (pre P4) > >> recommendation and it was appropriate for old

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-03 Thread Uros Bizjak
On Fri, Jan 3, 2014 at 3:13 PM, Jakub Jelinek wrote: > On Fri, Jan 03, 2014 at 03:02:51PM +0100, Uros Bizjak wrote: >> Please note that previous value was based on earlier (pre P4) >> recommendation and it was appropriate for older chips with 32byte >> cache line. The value should have been updated long

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-03 Thread Jakub Jelinek
On Fri, Jan 03, 2014 at 03:02:51PM +0100, Uros Bizjak wrote: > Please note that previous value was based on earlier (pre P4) > recommendation and it was appropriate for older chips with 32byte > cache line. The value should have been updated long ago, when 64bit cache > lines were introduced, but was prob

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-03 Thread Uros Bizjak
On Fri, Jan 3, 2014 at 2:43 PM, Jakub Jelinek wrote: > On Fri, Jan 03, 2014 at 02:35:36PM +0100, Uros Bizjak wrote: >> Like in the patch below. Please note, that the block_tune setting for >> the nocona is wrong, -march=native on my trusted old P4 returns: >> >> --param "l1-cache-size=16" --param

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-03 Thread Jakub Jelinek
On Fri, Jan 03, 2014 at 02:35:36PM +0100, Uros Bizjak wrote: > Like in the patch below. Please note, that the block_tune setting for > the nocona is wrong, -march=native on my trusted old P4 returns: > > --param "l1-cache-size=16" --param "l1-cache-line-size=64" --param > "l2-cache-size=2048" "-mt

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-03 Thread Uros Bizjak
On Fri, Jan 3, 2014 at 1:27 PM, Uros Bizjak wrote: >>> I am testing a patch that removes "max_align" part from ix86_data_alignment. >> >> That looks like unnecessary pessimization. Note the hunk in question is >> guarded with opt, which means it is an optimization rather than ABI issue, >> it ca

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-03 Thread Uros Bizjak
On Fri, Jan 3, 2014 at 12:59 PM, Jakub Jelinek wrote: > On Fri, Jan 03, 2014 at 12:25:00PM +0100, Uros Bizjak wrote: >> I am testing a patch that removes "max_align" part from ix86_data_alignment. > > That looks like unnecessary pessimization. Note the hunk in question is > guarded with opt, whic

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-03 Thread Jakub Jelinek
On Fri, Jan 03, 2014 at 12:25:00PM +0100, Uros Bizjak wrote: > I am testing a patch that removes "max_align" part from ix86_data_alignment. That looks like unnecessary pessimization. Note the hunk in question is guarded with opt, which means it is an optimization rather than ABI issue, it can inc

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-03 Thread Uros Bizjak
On Fri, Jan 3, 2014 at 12:20 PM, Eric Botcazou wrote: >> When compiled with -m32 -mavx, we get: >> >> .align 32 >> .type a, @object >> .size a, 32 >> a: >> >> so, the alignment was already raised elsewhere. We get .align 16 for >> -msse -m32 when vectorizing. >> >> with

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-03 Thread Eric Botcazou
> When compiled with -m32 -mavx, we get: > > .align 32 > .type a, @object > .size a, 32 > a: > > so, the alignment was already raised elsewhere. We get .align 16 for > -msse -m32 when vectorizing. > > without -msse (and consequently without vectorizing), we get for -m

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-03 Thread Uros Bizjak
On Thu, Jan 2, 2014 at 11:18 PM, Eric Botcazou wrote: >> Note that it has unexpected side-effects: previously, in 32-bit mode, >> 256-bit aggregate objects would have been given 256-bit alignment; now, >> they will fall back to default alignment, for example 32-bit only. > > In case this wasn't cl

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-02 Thread Eric Botcazou
> Note that it has unexpected side-effects: previously, in 32-bit mode, > 256-bit aggregate objects would have been given 256-bit alignment; now, > they will fall back to default alignment, for example 32-bit only. In case this wasn't clear enough, just compile in 32-bit mode: int a[8] = { 1, 2,

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-02 Thread Jan Hubicka
> > x86-64 ABI has clause about aligning static vars to 128bit boundary at a > > given size. This was introduced to aid compiler to generate aligned > > vector store/load even if the object may bind to other object file. > > This is set in stone and cannot be changed for AVX/SSE. > > Yes, but th

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-02 Thread Eric Botcazou
> x86-64 ABI has clause about aligning static vars to 128bit boundary at a > given size. This was introduced to aid compiler to generate aligned > vector store/load even if the object may bind to other object file. > This is set in stone and cannot be changed for AVX/SSE. Yes, but that's irrelev

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-02 Thread Jan Hubicka
> > Frankly speaking, I do not understand, what's wrong here. > > IMHO, this change is pretty mechanical: we just extend maximal alignment > > available. Because of 512-bit data types we now extend maximal alignment to > > 512 bits. > > Nothing wrong per se, but... > > > I suspect that an issue is

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-02 Thread Eric Botcazou
> Frankly speaking, I do not understand, what's wrong here. > IMHO, this change is pretty mechanical: we just extend maximal alignment > available. Because of 512-bit data types we now extend maximal alignment to > 512 bits. Nothing wrong per se, but... > I suspect that an issue is here: > if (op

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-02 Thread Kirill Yukhin
Hello Eric, On 02 Jan 00:07, Eric Botcazou wrote: > The change is actually to ix86_data_alignment, not to ix86_constant_alignment: > > @@ -26219,7 +26433,8 @@ ix86_constant_alignment (tree exp, int align) > int > ix86_data_alignment (tree type, int align, bool opt) > { > - int max_align = opti

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2014-01-01 Thread Eric Botcazou
> gcc/ > 2013-12-30 Alexander Ivchenko > Maxim Kuznetsov > Sergey Lega > Anna Tikhonova > Ilya Tocar > Andrey Turetskiy > Ilya Verbin > Kirill Yukhin > Michael Zolotukhin > > * config/i386/i386

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2013-12-30 Thread Kirill Yukhin
Hello Uroš, Jakub, On 22 Dec 11:47, Uros Bizjak wrote: > The x86 part is OK for mainline. You will also need approval from the > middle-end reviewer for tree-* parts. Thanks, I am testing (in agreed volume, bootstrap passed so far) patch in the bottom. If no more inputs - I'll check it in to main

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2013-12-22 Thread Jakub Jelinek
On Sun, Dec 22, 2013 at 11:47:52AM +0100, Uros Bizjak wrote: > * tree-vect-stmts.c (vectorizable_load): Support AVX512's gathers. > * tree-vectorizer.h (MAX_VECTORIZATION_FACTOR): Extend for 512 > bit vectors. > > I assumed the same testing procedure as described in the original su

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2013-12-22 Thread Uros Bizjak
On Wed, Dec 18, 2013 at 2:07 PM, Kirill Yukhin wrote: > Hello, > > On 02 Dec 16:13, Kirill Yukhin wrote: >> Hello, >> On 19 Nov 12:14, Kirill Yukhin wrote: >> > Hello, >> > On 15 Nov 20:10, Kirill Yukhin wrote: >> > > > Is it ok to commit to main trunk? >> > > Ping. >> > Ping. >> Ping. > Ping. > >

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2013-12-18 Thread Kirill Yukhin
Hello, On 02 Dec 16:13, Kirill Yukhin wrote: > Hello, > On 19 Nov 12:14, Kirill Yukhin wrote: > > Hello, > > On 15 Nov 20:10, Kirill Yukhin wrote: > > > > Is it ok to commit to main trunk? > > > Ping. > > Ping. > Ping. Ping. Updated patch in the bottom. -- Thanks, K --- gcc/config/i386/i386.c

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2013-12-02 Thread Kirill Yukhin
Hello, On 19 Nov 12:14, Kirill Yukhin wrote: > Hello, > On 15 Nov 20:10, Kirill Yukhin wrote: > > > Is it ok to commit to main trunk? > > Ping. > Ping. Ping. -- Thanks, K

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2013-11-19 Thread Kirill Yukhin
Hello, On 15 Nov 20:10, Kirill Yukhin wrote: > > Is it ok to commit to main trunk? > Ping. Ping. -- Thanks, K

Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2013-11-15 Thread Kirill Yukhin
Hello, On 12 Nov 15:36, Kirill Yukhin wrote: > Hello, > Patch in the bottom extends some hooks toward AVX-512 support. > This patch decreases icount for Spec2006 FP suite (ref set): > > Optset was: -static -m64 -fstrict-aliasing -fno-prefetch-loop-arrays > -Ofast -funroll-loops -flto -march=core-av

[PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2013-11-12 Thread Kirill Yukhin
Hello, Patch in the bottom extends some hooks toward AVX-512 support. This patch decreases icount for Spec2006 FP suite (ref set): Optset was: -static -m64 -fstrict-aliasing -fno-prefetch-loop-arrays -Ofast -funroll-loops -flto -march=core-avx2 -mtune=core-avx2 Lower is better. Test\ArchI