[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2020-06-16 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 68627, which changed state.

Bug 68627 Summary: [i386, AVX-512] Illegal insn generated while compiling 
spec2k6/437.leslie3d for KNL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68627

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug target/68627] [i386, AVX-512] Illegal insn generated while compiling spec2k6/437.leslie3d for KNL

2020-06-16 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68627

Kirill Yukhin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Kirill Yukhin  ---
Fixed.

[Bug other/84613] [meta-bug] SPEC compiler performance issues

2020-06-16 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84613
Bug 84613 depends on bug 68627, which changed state.

Bug 68627 Summary: [i386, AVX-512] Illegal insn generated while compiling 
spec2k6/437.leslie3d for KNL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68627

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug target/68633] [i386, AVX-512] Spec2006/434.zeus miscompares when executed on KNL

2020-06-16 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68633

Kirill Yukhin  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #3 from Kirill Yukhin  ---
Fixed.

[Bug other/84613] [meta-bug] SPEC compiler performance issues

2020-06-16 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84613
Bug 84613 depends on bug 68633, which changed state.

Bug 68633 Summary: [i386, AVX-512] Spec2006/434.zeus miscompares when executed 
on KNL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68633

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2020-06-16 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 68633, which changed state.

Bug 68633 Summary: [i386, AVX-512] Spec2006/434.zeus miscompares when executed 
on KNL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68633

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug target/95144] Many AVX-512 functions take an int instead of unsigned int

2020-06-16 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95144

Kirill Yukhin  changed:

   What|Removed |Added

   Last reconfirmed||2020-06-16
 CC||kyukhin at gcc dot gnu.org
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED

--- Comment #2 from Kirill Yukhin  ---
Similar bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65744

[Bug target/95766] Failure to directly use vpbroadcastd for _mm_set1_epi32 when passing unsigned short

2020-08-03 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95766

--- Comment #11 from Kirill Yukhin  ---
(In reply to Jakub Jelinek from comment #10)
> Kirill, any thoughts on that?

I'd prefer your variant, w/o unspecs.

[Bug target/58269] [4.9 Regression] ICE when building libobjc on x86_64-apple-darwin* after revision 201915

2013-09-06 Thread kyukhin at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58269

--- Comment #8 from Kirill Yukhin  ---
Author: kyukhin
Date: Fri Sep  6 10:36:30 2013
New Revision: 202318

URL: http://gcc.gnu.org/viewcvs?rev=202318&root=gcc&view=rev
Log:
PR target/58269
* config/i386/i386.c (ix86_conditional_register_usage):
Proper initialize extended SSE registers.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c


[Bug rtl-optimization/47698] CMOV accessing volatile memory with read side effect

2011-11-07 Thread kyukhin at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47698

--- Comment #6 from Kirill Yukhin  2011-11-07 
08:42:00 UTC ---
Author: kyukhin
Date: Mon Nov  7 08:41:55 2011
New Revision: 181075

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181075
Log:
gcc/
PR rtl-optimization/47698
* ifconv.c (noce_operand_ok): prevent CMOV generation
for volatile mem.

gcc/testsuite/
PR rtl-optimization/47698
* gcc.target/i386/47698.c: New test.


Added:
trunk/gcc/testsuite/gcc.target/i386/47698.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ifcvt.c
trunk/gcc/testsuite/ChangeLog
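
For context, a minimal sketch of the kind of source this PR is about. It is
not the testcase added by the commit; the function and parameter names are
hypothetical. The point is only that a conditional read of a volatile object
must stay conditional, so turning it into an unconditional load plus CMOV
would add a read that the source never performs.

```c
/* Hypothetical names throughout; this is not the testcase from the PR.  */

/* Reads the device register only when `enabled` is non-zero.  For a
   volatile object every read is an observable side effect, so the load
   has to stay under the branch; it must not become an unconditional
   load followed by a conditional move.  */
int
read_if_enabled (int enabled, volatile int *device_reg)
{
  if (enabled)
    return *device_reg;
  return 0;
}
```

With enabled == 0 the pointer is never dereferenced, so even a null pointer
is a valid argument in that case.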


[Bug target/50962] Additional opportunity for AGU stall avoidance optimization for Atom processor

2011-11-07 Thread kyukhin at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50962

--- Comment #4 from Kirill Yukhin  2011-11-07 
08:47:18 UTC ---
Author: kyukhin
Date: Mon Nov  7 08:47:15 2011
New Revision: 181077

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181077
Log:
gcc/
PR target/50962
* config/i386/i386-protos.h (ix86_use_lea_for_mov): New.
* config/i386/i386.c (ix86_use_lea_for_mov): Likewise.
* config/i386/i386.md (movsi_internal): Emit lea if profitable.
(movdi_internal_rex64): Likewise.


Modified:
trunk/gcc/config/i386/i386-protos.h
trunk/gcc/config/i386/i386.c
trunk/gcc/config/i386/i386.md
trunk/gcc/testsuite/ChangeLog


[Bug target/53201] [4.8 Regression] unrecognized command line option '-mno-lzcnt-mno-hle

2012-05-02 Thread kyukhin at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53201

--- Comment #4 from Kirill Yukhin  2012-05-03 
06:50:25 UTC ---
Author: kyukhin
Date: Thu May  3 06:50:16 2012
New Revision: 187075

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=187075
Log:

PR target/53201
* config/i386/driver-i386.c (host_detect_local_cpu): Add space to
"-mno-hle".


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/driver-i386.c
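
A minimal sketch of this class of bug, assuming a helper that builds a
space-separated option string. append_option is a hypothetical name and is
not the code in driver-i386.c. Without the separator, "-mno-lzcnt" and
"-mno-hle" fuse into the single unrecognized option quoted in the bug title.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not the code in driver-i386.c: appends one option
   to a space-separated command line held in `buf` (NUL-terminated, with
   `size` bytes of storage).  */
void
append_option (char *buf, size_t size, const char *opt)
{
  size_t used = strlen (buf);

  /* The separator is the part that matters here: every option after the
     first is preceded by a single space.  */
  snprintf (buf + used, size - used, "%s%s", used ? " " : "", opt);
}
```

Appending "-mno-lzcnt" and then "-mno-hle" to an empty buffer yields two
separate options with one space between them.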


[Bug target/53435] (ix86_expand_vec_perm) and (ix86_expand_vec_perm) do not pass arguments to avx2_permvar8s[f,i] correctly

2012-05-25 Thread kyukhin at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53435

--- Comment #4 from Kirill Yukhin  2012-05-25 
13:03:21 UTC ---
Author: kyukhin
Date: Fri May 25 13:03:18 2012
New Revision: 187881

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=187881
Log:
2012-05-21  Alexander Ivchenko  

   PR target/53435
   * config/i386/i386.c (ix86_expand_vec_perm): Use correct op.
   (ix86_expand_vec_perm): Use int mode instead of float.
   (expand_vec_perm_pshufb): Remove handling of useless type
   conversion.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c


[Bug target/53435] (ix86_expand_vec_perm) and (ix86_expand_vec_perm) do not pass arguments to avx2_permvar8s[f,i] correctly

2012-05-25 Thread kyukhin at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53435

--- Comment #5 from Kirill Yukhin  2012-05-25 
13:34:12 UTC ---
Author: kyukhin
Date: Fri May 25 13:34:07 2012
New Revision: 187882

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=187882
Log:
2012-05-25  Alexander Ivchenko  

PR target/53435
* config/i386/i386.c (ix86_expand_vec_perm): Use correct op.
(ix86_expand_vec_perm): Use int mode instead of float.


Modified:
branches/gcc-4_7-branch/gcc/ChangeLog
branches/gcc-4_7-branch/gcc/config/i386/i386.c


[Bug target/53877] __lzcnt_u16/__lzcnt_u32/__lzcnt_u64 aren't implemented

2012-07-20 Thread kyukhin at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53877

--- Comment #1 from Kirill Yukhin  2012-07-20 
08:24:35 UTC ---
Author: kyukhin
Date: Fri Jul 20 08:24:24 2012
New Revision: 189703

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189703
Log:
2012-07-20  Kirill Yukhin  

PR target/53877
* config/i386/lzcntintrin.h (_lzcnt_u32): New.
(_lzcnt_u64): Ditto.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/lzcntintrin.h
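
For readers unfamiliar with the operation, a portable reference sketch of
what a 32-bit leading-zero count returns, assuming the new intrinsics follow
the LZCNT instruction (which yields the operand width for a zero input).
lzcnt32_reference is a hypothetical name and is not taken from
lzcntintrin.h.

```c
#include <stdint.h>

/* Portable reference for a 32-bit leading-zero count with LZCNT-style
   behaviour: the number of zero bits above the highest set bit, and 32
   when the operand is zero.  Hypothetical name, not the intrinsic.  */
unsigned int
lzcnt32_reference (uint32_t x)
{
  unsigned int n = 0;

  if (x == 0)
    return 32;

  /* Shift left until the top bit is set, counting the shifts.  */
  while ((x & UINT32_C (0x80000000)) == 0)
    {
      x <<= 1;
      n++;
    }
  return n;
}
```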


[Bug target/53877] __lzcnt_u16/__lzcnt_u32/__lzcnt_u64 aren't implemented

2012-07-20 Thread kyukhin at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53877

--- Comment #2 from Kirill Yukhin  2012-07-20 
08:57:09 UTC ---
Author: kyukhin
Date: Fri Jul 20 08:57:04 2012
New Revision: 189706

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189706
Log:
2012-07-20  Kirill Yukhin  

PR target/53877
* config/i386/lzcntintrin.h (_lzcnt_u32): New.
(_lzcnt_u64): Ditto.


Modified:
branches/gcc-4_7-branch/gcc/ChangeLog
branches/gcc-4_7-branch/gcc/config/i386/lzcntintrin.h


[Bug target/57491] [ia64] internal compiler error: in ia64_split_tmode -O2, quadmath

2013-11-14 Thread kyukhin at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57491

--- Comment #1 from Kirill Yukhin  ---
Author: kyukhin
Date: Thu Nov 14 08:33:21 2013
New Revision: 204777

URL: http://gcc.gnu.org/viewcvs?rev=204777&root=gcc&view=rev
Log:
PR target/57491
* config/ia64/ia64.c (ia64_split_tmode_move): Relax `dead'
flag setting.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/ia64/ia64.c


[Bug target/57756] Function target attribute is retaining state of previously seen function

2013-11-20 Thread kyukhin at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57756

--- Comment #10 from Kirill Yukhin  ---
Author: kyukhin
Date: Wed Nov 20 11:59:05 2013
New Revision: 205104

URL: http://gcc.gnu.org/viewcvs?rev=205104&root=gcc&view=rev
Log:
PR target/57756
* config/i386/i386.c (ix86_option_override_internal): Add missed
argument prefix for 'ix86_fpmath'.
* config/i386/ssemath.h: Add missed definition of
TARGET_FPMATH_DEFAULT_P macros.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/config/i386/ssemath.h


[Bug target/51287] [4.7 regression] 252.eon compfail with -march=atom

2011-11-25 Thread kyukhin at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51287

--- Comment #1 from Kirill Yukhin  2011-11-25 
09:46:31 UTC ---
Author: kyukhin
Date: Fri Nov 25 09:46:27 2011
New Revision: 181713

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181713
Log:
   PR target/51287
   * i386.c (distance_non_agu_define): Fix insn attr check.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c


[Bug target/51287] [4.7 regression] 252.eon compfail with -march=atom

2011-11-25 Thread kyukhin at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51287

--- Comment #2 from Kirill Yukhin  2011-11-25 
10:29:46 UTC ---
Author: kyukhin
Date: Fri Nov 25 10:29:42 2011
New Revision: 181714

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181714
Log:
2011-11-24  Enkovich Ilya  

   PR target/51287
   * i386.c (distance_non_agu_define): Fix insn attr check.


Modified:
branches/gcc-4_6-branch/gcc/ChangeLog
branches/gcc-4_6-branch/gcc/config/i386/i386.c


[Bug target/51524] New: [BMI2] New regression on 182266 vs 182257

2011-12-12 Thread kyukhin at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51524

 Bug #: 51524
   Summary: [BMI2] New regression on 182266 vs 182257
Classification: Unclassified
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: kyuk...@gcc.gnu.org


Hi,
It seems we've got a new regression on trunk:
FAIL: gcc.target/i386/bmi2-mulx32-1a.c scan-assembler-times bmi2_umulsidi3_1 1
FAIL: gcc.target/i386/bmi2-mulx32-2a.c scan-assembler-times mulx[ \\t]+[^\n]* 1


[Bug target/50038] redundant zero extensions

2011-12-21 Thread kyukhin at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50038

--- Comment #8 from Kirill Yukhin  2011-12-21 
11:52:32 UTC ---
Author: kyukhin
Date: Wed Dec 21 11:52:27 2011
New Revision: 182574

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=182574
Log:
gcc/

2011-12-21  Enkovich Ilya  

PR target/50038
* implicit-zee.c: Delete.
* ree.c: New file.
* Makefile.in: Replace implicit-zee.c with ree.c.
* config/i386/i386.c (ix86_option_override_internal): Rename
flag_zee to flag_ree.
* common.opt (fzee): Ignored.
(free): New.
* passes.c (init_optimization_passes): Replace pass_implicit_zee
with pass_ree.
* tree-pass.h (pass_implicit_zee): Delete.
(pass_ree): New.
* timevar.def (TV_ZEE): Delete.
(TV_REE): New.
* doc/invoke.texi: Add -free description.

gcc/testsuite/

2011-12-21  Enkovich Ilya  

PR target/50038


Added:
trunk/gcc/ree.c
trunk/gcc/testsuite/gcc.dg/pr50038.c
Removed:
trunk/gcc/implicit-zee.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/Makefile.in
trunk/gcc/common.opt
trunk/gcc/config/i386/i386.c
trunk/gcc/doc/invoke.texi
trunk/gcc/passes.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/timevar.def
trunk/gcc/tree-pass.h


[Bug target/83828] FAIL: gcc.target/i386/avx512vpopcntdqvl-vpopcntq-1.c execution test

2018-01-29 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83828

Kirill Yukhin  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||kyukhin at gcc dot gnu.org

--- Comment #6 from Kirill Yukhin  ---
It looks like the avx512bw requirement in avx512bitalgintrin.h is excessive.

[Bug target/83828] FAIL: gcc.target/i386/avx512vpopcntdqvl-vpopcntq-1.c execution test

2018-01-29 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83828

--- Comment #7 from Kirill Yukhin  ---
On the other hand, if a masked variant of vpopcnt[w,q] is issued, there is
no way for reload to put a 32/64-bit mask into a mask register, since
kmov[d,q] are only available under the -mavx512bw switch.

We can require the user to pass -mavx512bw along with -mavx512bitalg if she is
going to use the masked variants of the corresponding intrinsics. Then only the
tests need to be fixed.

[Bug target/83828] FAIL: gcc.target/i386/avx512vpopcntdqvl-vpopcntq-1.c execution test

2018-01-30 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83828

--- Comment #8 from Kirill Yukhin  ---
Author: kyukhin
Date: Tue Jan 30 08:21:22 2018
New Revision: 257173

URL: https://gcc.gnu.org/viewcvs?rev=257173&root=gcc&view=rev
Log:
Fix AVX-512BITALG test failures

gcc/testsuite
PR target/83828
* gcc.target/i386/avx512bitalg-vpopcntb-1.c: Fix test.
* gcc.target/i386/avx512bitalg-vpopcntw-1.c: Ditto.
* gcc.target/i386/avx512bitalgvl-vpopcntb-1.c: Ditto.
* gcc.target/i386/avx512bitalgvl-vpopcntw-1.c: Ditto.


Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntb-1.c
trunk/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntw-1.c
trunk/gcc/testsuite/gcc.target/i386/avx512bitalgvl-vpopcntb-1.c
trunk/gcc/testsuite/gcc.target/i386/avx512bitalgvl-vpopcntw-1.c

[Bug target/83828] FAIL: gcc.target/i386/avx512vpopcntdqvl-vpopcntq-1.c execution test

2018-02-05 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83828

--- Comment #10 from Kirill Yukhin  ---
HJ, I cannot reproduce this failure on a recent SDE.

Here's what I have in gcc.log:

spawn -ignore SIGHUP /export/kyukhin/gcc/bld-svn/build-x86_64-linux/gcc/xgcc
-B/export/kyukhin/gcc/bld-svn/build-x86_64-linux/gcc/
/export/kyukhin/gcc/svn/trunk/gcc/testsuite/gcc.target/i386/avx512bitalgvl-vpopcntb-1.c
-fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -mavx512vl
-mavx512bitalg -mavx512bw -lm -o ./avx512bitalgvl-vpopcntb-1.exe^M
PASS: gcc.target/i386/avx512bitalgvl-vpopcntb-1.c (test for excess errors)
Setting LD_LIBRARY_PATH to
:/export/kyukhin/gcc/bld-svn/build-x86_64-linux/gcc:/export/kyukhin/gcc/bld-svn/build-x86_64-linux/gcc/32:
spawn /home/kyukhin/bin/dejagnu/sde-sim ./avx512bitalgvl-vpopcntb-1.exe^M
PASS: gcc.target/i386/avx512bitalgvl-vpopcntb-1.c execution test

I've also verified manually that the test PASSes and is not SKIPPED.

Could you please send some more info on the failure?

[Bug target/83828] FAIL: gcc.target/i386/avx512vpopcntdqvl-vpopcntq-1.c execution test

2018-02-11 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83828

--- Comment #12 from Kirill Yukhin  ---
Author: kyukhin
Date: Mon Feb 12 06:14:15 2018
New Revision: 257579

URL: https://gcc.gnu.org/viewcvs?rev=257579&root=gcc&view=rev
Log:
Fix AVX-512 popcnt and bitalg tests.

gcc/testsuite/
PR target/83828
* gcc.target/i386/avx512bitalg-vpopcntb-1.c: Fix test.
* gcc.target/i386/avx512bitalg-vpopcntw-1.c: Ditto.
* gcc.target/i386/avx512bitalg-vpshufbitqmb-1.c: Ditto.
* gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c: Ditto.
* gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c: Ditto.

Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntb-1.c
trunk/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntw-1.c
trunk/gcc/testsuite/gcc.target/i386/avx512bitalg-vpshufbitqmb-1.c
trunk/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c
trunk/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c

[Bug fortran/69524] New: [ICE] [F2008] Compiler segfaults on simple testcase @ -O0

2016-01-27 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69524

Bug ID: 69524
   Summary: [ICE] [F2008] Compiler segfaults on simple testcase @
-O0
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kyukhin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 37501
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37501&action=edit
Reproducer

The attached testcase produces an ICE when compiled with recent trunk:

$ ./build-x86_64-linux/gcc/gfortran -B./build-x86_64-linux/gcc -S 2.f08
f951: internal compiler error: in build_function_decl, at
fortran/trans-decl.c:2065
0x88c1df build_function_decl
/export/users/kyukhin/gcc/git/gcc2/gcc/fortran/trans-decl.c:2065
0x88ec53 gfc_create_function_decl(gfc_namespace*, bool)
/export/users/kyukhin/gcc/git/gcc2/gcc/fortran/trans-decl.c:2758
0x86361d gfc_generate_module_code(gfc_namespace*)
/export/users/kyukhin/gcc/git/gcc2/gcc/fortran/trans.c:2043
0x7f9d19 translate_all_program_units
/export/users/kyukhin/gcc/git/gcc2/gcc/fortran/parse.c:5599
0x7fa3f7 gfc_parse_file()
/export/users/kyukhin/gcc/git/gcc2/gcc/fortran/parse.c:5818
0x84c839 gfc_be_parse_file
/export/users/kyukhin/gcc/git/gcc2/gcc/fortran/f95-lang.c:201
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.

[Bug target/69118] Wrong condition in avx512f_maskcmp3

2016-02-03 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69118

Kirill Yukhin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2016-02-03
 CC||kyukhin at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Kirill Yukhin  ---
Will fix.

[Bug target/69120] sse2_shufpd_v2df_mask has wrong name

2016-02-03 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69120

--- Comment #1 from Kirill Yukhin  ---
Will fix.

[Bug libfortran/69651] New: Usage of unitialized pointer io/list_read.c

2016-02-03 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69651

Bug ID: 69651
   Summary: Usage of unitialized pointer io/list_read.c
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libfortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kyukhin at gcc dot gnu.org
  Target Milestone: ---

Unfortunately I have no testcase.

But the code itself looks awful to me:
/* Worker function to save a KIND=4 character to a string buffer,
   enlarging the buffer as necessary.  */

static void
push_char4 (st_parameter_dt *dtp, int c)
{
  gfc_char4_t *new, *p = (gfc_char4_t *) dtp->u.p.saved_string;

  if (p == NULL)
    {
      dtp->u.p.saved_string = xcalloc (SCRATCH_SIZE, sizeof (gfc_char4_t));
      dtp->u.p.saved_length = SCRATCH_SIZE;
      dtp->u.p.saved_used = 0;
      p = (gfc_char4_t *) dtp->u.p.saved_string;
    }

  if (dtp->u.p.saved_used >= dtp->u.p.saved_length)
    {
      dtp->u.p.saved_length = 2 * dtp->u.p.saved_length;
      p = xrealloc (p, dtp->u.p.saved_length * sizeof (gfc_char4_t));

      memset4 (new + dtp->u.p.saved_used, 0, // <-- ??? new==junk ???
               dtp->u.p.saved_length - dtp->u.p.saved_used);
    }

  p[dtp->u.p.saved_used++] = c;
}

It was introduced w/ r210948
(https://gcc.gnu.org/ml/fortran/2014-05/msg00149.html). Before that new was [at
least] initialized.
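
A sketch of how the growth step could avoid the uninitialized pointer,
written against stand-in types so that it compiles on its own. struct
scratch, char4_t and push_char4_sketch are hypothetical names, plain
calloc/realloc/memset replace the libgfortran helpers, and this is not the
fix that was committed. It also stores the reallocated pointer back into the
structure, which is an assumption about what the surrounding code needs.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in types and names; hypothetical, not the libgfortran definitions.  */
typedef uint32_t char4_t;

struct scratch
{
  char4_t *data;
  size_t length;   /* allocated elements */
  size_t used;     /* elements in use */
};

#define SCRATCH_SIZE 4

/* Appends C, doubling the buffer when it is full.  The newly added tail is
   cleared through the pointer returned by realloc, never through an
   uninitialized variable.  Returns 0 on success, -1 on allocation failure.  */
int
push_char4_sketch (struct scratch *s, char4_t c)
{
  if (s->data == NULL)
    {
      s->data = calloc (SCRATCH_SIZE, sizeof (char4_t));
      if (s->data == NULL)
        return -1;
      s->length = SCRATCH_SIZE;
      s->used = 0;
    }

  if (s->used >= s->length)
    {
      size_t new_length = 2 * s->length;
      char4_t *grown = realloc (s->data, new_length * sizeof (char4_t));
      if (grown == NULL)
        return -1;
      /* Zero only the part that realloc added.  */
      memset (grown + s->used, 0, (new_length - s->used) * sizeof (char4_t));
      s->data = grown;
      s->length = new_length;
    }

  s->data[s->used++] = c;
  return 0;
}
```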

[Bug libfortran/69651] Usage of unitialized pointer io/list_read.c

2016-02-03 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69651

--- Comment #1 from Kirill Yukhin  ---
File is: libgfortran/io/list_read.c

[Bug target/69120] sse2_shufpd_v2df_mask has wrong name

2016-02-03 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69120

--- Comment #2 from Kirill Yukhin  ---
I looked at this closely.
The name was chosen intentionally to simplify the "sse2_shufpd"
expand. If we want to fix this name, a new subst attribute needs to be
introduced and
if () 
  emit_insn (avx512vl_...
else
  emit_insn (sse2_...
inserted into the expand.

Apart from the expand, this template is never called by name.

So I'd rather leave the name unchanged and keep things simple.

[Bug tree-optimization/69652] New: [6 Regression] [ICE] verify_ssa fail w/ -O2 -ffast-math -ftree-vectorize

2016-02-03 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69652

Bug ID: 69652
   Summary: [6 Regression] [ICE] verify_ssa fail w/ -O2
-ffast-math -ftree-vectorize
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kyukhin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 37569
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37569&action=edit
Reproducer

While building a huge workload I encountered an ICE.
creduce produced the attached reproducer.

Reproduce:
$ ./gcc -O2 -ffast-math -ftree-vectorize -march=sandybridge repro.i
repro.i:1:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
 fn1() {
 ^~~
repro.i: In function ‘fn1’:
repro.i:1:1: error: definition in block 8 does not dominate use in block 7
for SSA_NAME: .MEM_134 in statement:
# VUSE <.MEM_134>
_24 = *_13;
repro.i:1:1: internal compiler error: verify_ssa failed
0xd12f09 verify_ssa(bool, bool)
/export/users/kyukhin/gcc/git/gcc/gcc/tree-ssa.c:1039
0xa52dad execute_function_todo
/export/users/kyukhin/gcc/git/gcc/gcc/passes.c:1965
0xa5363b execute_todo
/export/users/kyukhin/gcc/git/gcc/gcc/passes.c:2010
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.

Recent gcc-5-branch works fine with this case.

[Bug target/69120] sse2_shufpd_v2df_mask has wrong name

2016-02-03 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69120

Kirill Yukhin  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |WONTFIX

--- Comment #4 from Kirill Yukhin  ---
This makes the corresponding expand much simpler. So I think this little naming
inconsistency is preferable to additional checks in define_expand.

[Bug target/69118] Wrong condition in avx512f_maskcmp3

2016-02-03 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69118

--- Comment #2 from Kirill Yukhin  ---
Author: kyukhin
Date: Wed Feb  3 13:44:50 2016
New Revision: 233103

URL: https://gcc.gnu.org/viewcvs?rev=233103&root=gcc&view=rev
Log:
PR target/69118

gcc/
* config/i386/sse.md (define_insn "avx512f_maskcmp3"):
Fix target.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/sse.md

[Bug target/69118] Wrong condition in avx512f_maskcmp3

2016-02-03 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69118

--- Comment #3 from Kirill Yukhin  ---
Author: kyukhin
Date: Wed Feb  3 13:48:27 2016
New Revision: 233104

URL: https://gcc.gnu.org/viewcvs?rev=233104&root=gcc&view=rev
Log:
PR target/69118.

gcc/
* config/i386/sse.md (define_insn "avx512f_maskcmp3"):
Fix target.

Modified:
branches/gcc-5-branch/gcc/ChangeLog
branches/gcc-5-branch/gcc/config/i386/sse.md

[Bug target/69118] Wrong condition in avx512f_maskcmp3

2016-02-03 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69118

Kirill Yukhin  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Kirill Yukhin  ---
Fixed.

[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?

2016-02-05 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671

--- Comment #5 from Kirill Yukhin  ---
(In reply to ktkachov from comment #3)
> CC'ing Kirill for AVX512 opinion

I suppose there's something wrong with the MD patterns.
E.g. for the example provided, the pattern is:
;; /export/users/kyukhin/gcc/git/gcc/gcc/config/i386/sse.md: 9199
(define_insn ("avx512vl_truncatev4siv4qi2_mask")
 [
(set (match_operand:V16QI 0 ("register_operand") ("=v"))
(vec_concat:V16QI (vec_merge:V4QI (truncate:V4QI
(match_operand:V4SI 1 ("register_operand") ("v")))
(vec_select:V4QI (match_operand:V16QI 2
("vector_move_operand") ("0C"))
(parallel [
(const_int 0 [0])
(const_int 1 [0x1])
(const_int 2 [0x2])
(const_int 3 [0x3])
]))
(match_operand:QI 3 ("register_operand") ("Yk")))
(const_vector:V12QI [
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
Right now I think that the predicate of the 2nd operand is not correct.
It should be const0_rtx (of the corresponding mode) or a duplicate of operand 0
(the result, actually).
This is what the constraint says.

However, the predicate simply says that the operand is either const0_rtx or
nonimmediate: no connection with operand 0.

[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?

2016-02-05 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671

--- Comment #6 from Kirill Yukhin  ---
This bug seems to be mine.

[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?

2016-02-05 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671

Kirill Yukhin  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |kyukhin at gcc dot gnu.org

[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?

2016-02-05 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671

--- Comment #8 from Kirill Yukhin  ---
(In reply to Jakub Jelinek from comment #7)
> So do you want to use reg_or_0_operand?  I don't think we usually tie output
> with input already in the predicates, except when match_dup is used.

That is the issue. reg_or_0_operand won't work (although it is better than
"vector_move_operand" since it prohibits memory).

We want the 2nd operand to be either:
1. const0_rtx
2. match_dup 0

I cannot see in gcc/genpreds.c whether the predicate for one operand can refer
to another operand.

We might invent some complicated subst. But the patterns look too complicated
for that.

Maybe extend genpreds.c and friends, introducing a new version of predicate
which takes (op, mode, operands) instead of (op, mode).
I'm not sure about the amount of effort, though.
I really hope there's some simpler solution.

[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?

2016-02-05 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671

--- Comment #10 from Kirill Yukhin  ---
(In reply to Jakub Jelinek from comment #9)
> But something like that might remove the flexibility from the register
> allocator.
> 
> Wonder why the RA in this case doesn't see that the value loaded into that
> pseudo register is CONST0_RTX which satisfies the C constraint and doesn't
> undo CSE (rematerialize) in that case if it doesn't have that value already
> loaded in the matching register to the output one.

Then I see two options:
1. Split all patterns into match_dup and 0_operand by hand.
2. Implement a dedicated subst for such patterns which will do item 1 while
processing the MD. Not sure it'll be easy.

[Bug libfortran/69651] Usage of unitialized pointer io/list_read.c

2016-02-07 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69651

--- Comment #4 from Kirill Yukhin  ---
Created attachment 37628
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37628&action=edit
Reproducer input

[Bug libfortran/69651] Usage of unitialized pointer io/list_read.c

2016-02-07 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69651

--- Comment #3 from Kirill Yukhin  ---
Created attachment 37627
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37627&action=edit
Reproducer src

Reproducer

[Bug libfortran/69651] Usage of unitialized pointer io/list_read.c

2016-02-07 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69651

--- Comment #5 from Kirill Yukhin  ---
A bug in Fortran's IO RT emerged on 21 Apr 2016,
between r54 and r92;
it looks like it's caused by the same revision, r71
(libgfortran/io/list_read.c), which probably just triggers
another hidden bug.

Trying two builds (as of 21 and 22 Apr):
$  gfortran-20160421 -O0 T.f90 -static  
$ ./a.out 
 res, (1) ==1 !
### Ok 

$  gfortran-20160422 -O0 T.f90 -static  
$ ./a.out  
 res, (1) ==   80  @p¼B
### FAIL – garbage is read in 

[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?

2016-02-12 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671

--- Comment #14 from Kirill Yukhin  ---
Okay,
I've tried:
1. Ran AVX-512 testing on Spec2006 and saw no impact from the one-liner:
Geomeans:
INT : 5.11  5.11  +0.05%
FP  : 2.73  2.73  -0.08%
ALL : 3.54  3.54  -0.02%

2. Tried Uroš's proposal, adding a condition like this to the guilty pattern:
  "TARGET_AVX512VL
   && ((REG_P (operands[2]) && REG_P (operands[0]) && REGNO (operands[0]) ==
REGNO (operands[2]))
   || (operands[2] == CONST0_RTX (mode)))"

  No success either. The problem is that a zero-masked built-in has a register
as its second source at expand, which is later rematerialized to zero. So adding
this condition leads to an ICE in recog at expand.

So, for v6 it looks like we need to remove the one-liner.

For v7 we need to extend define_subst a bit to allow multiple output patterns.
E.g. currently:
(define_subst "mask"
  [(set (match_operand:SUBST_V 0)
(match_operand:SUBST_V 1))]
  "TARGET_AVX512F"
  [(set (match_dup 0)
(vec_merge:SUBST_V
  (match_dup 1)
  (match_operand:SUBST_V 2 "vector_move_operand" "0C")
  (match_operand: 3 "register_operand" "Yk")))])

It would solve the problem if we had this instead:
(define_subst "mask"
  [(set (match_operand:SUBST_V 0)
(match_operand:SUBST_V 1))]
  "TARGET_AVX512F"
  [(set (match_dup 0)
(vec_merge:SUBST_V
  (match_dup 1)
  (match_dup 0)
  (match_operand: 3 "register_operand" "Yk")))])
  [(set (match_dup 0)
(vec_merge:SUBST_V
  (match_dup 1)
  (match_operand:SUBST_V 2 "const0_operand" "C")
  (match_operand: 3 "register_operand" "Yk")))])

Opinions?

[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?

2016-02-17 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671

--- Comment #21 from Kirill Yukhin  ---
I am going to fix the issue in v7 for sure.
But as things stand, this is going to be a major pattern refactoring,
and hence the patch will be thousands of lines.
If this might be ported, I can put an XFAIL on the tests.

[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?

2016-02-18 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671

--- Comment #24 from Kirill Yukhin  ---
(In reply to rguent...@suse.de from comment #23)
> On Wed, 17 Feb 2016, jakub at gcc dot gnu.org wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671
> > 
> > --- Comment #22 from Jakub Jelinek  ---
> > Created attachment 37722 [details]
> >   --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37722&action=edit
> > gcc6-pr69671.patch
> > 
> > Actually, on a closer look, I believe the only problem are the patterns that
> > use a vector_move_operand "0C" inside of vec_select with only constants as 
> > the
> > parallel's operands.  Because fwprop is able to propagate constants into
> > instructions (thus undo the CSE effect), but doesn't do anything on these,
> > because it also simplifies them, so instead of the expected say
> > (vec_select:V4QI (const_vector:V16QI [
> > (const_int 0 [0])
> > (const_int 0 [0])
> > (const_int 0 [0])
> > (const_int 0 [0])
> > (const_int 0 [0])
> > (const_int 0 [0])
> > (const_int 0 [0])
> > (const_int 0 [0])
> > (const_int 0 [0])
> > (const_int 0 [0])
> > (const_int 0 [0])
> > (const_int 0 [0])
> > (const_int 0 [0])
> > (const_int 0 [0])
> > (const_int 0 [0])
> > (const_int 0 [0])
> > ])
> > (parallel [
> > (const_int 0 [0])
> > (const_int 1 [0x1])
> > (const_int 2 [0x2])
> > (const_int 3 [0x3])
> > ]))
> > we get in there simplified:
> > (const_vector:V4QI [
> > (const_int 0 [0])
> > (const_int 0 [0])
> > (const_int 0 [0])
> > (const_int 0 [0])
> > ])
> > So, by adding extra patterns for that simplification fwprop is able to do 
> > its
> > job even if CSE did a better job.
> 
> Of course then I wonder why we didn't simplify this in the first place
> when generating RTL and need to wait for forwprop ...
> 
> But yes, sounds like the easiest way to go forward.

Agree.
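
A rough illustration of the simplification discussed above, as a toy model in C rather than GCC code (vec_select_model and is_zero_vector are made-up names): selecting constant lanes out of an all-zero V16QI constant yields an all-zero V4QI constant, so a pattern that only matches the vec_select form will not match the folded form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of RTL vec_select: out[i] = src[sel[i]].  */
static void
vec_select_model (const signed char *src, size_t nsrc,
                  const size_t *sel, size_t nsel, signed char *out)
{
  for (size_t i = 0; i < nsel; i++)
    {
      assert (sel[i] < nsrc);   /* lane index must be in range */
      out[i] = src[sel[i]];
    }
}

/* True if every lane of V is zero, i.e. V equals the zero constant
   of its (smaller) mode.  */
static bool
is_zero_vector (const signed char *v, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (v[i] != 0)
      return false;
  return true;
}
```

Calling vec_select_model on a 16-lane zero source with lanes 0..3 selected gives a 4-lane result for which is_zero_vector holds, which is the folded constant fwprop ends up with.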

[Bug tree-optimization/69882] New: [6 regression] Excessive reduction statements generated by SLP

2016-02-20 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69882

Bug ID: 69882
   Summary: [6 regression] Excessive reduction statements
generated by SLP
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kyukhin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 37743
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37743&action=edit
Reproducer

Hello,
Attached test case emits wrong reduction statements.

Compile:
$ trunk/64/20160220/bin/gfortran -o repro -static -m64 -Ofast -mavx repro.f90

Execution ABORTs

Works fine when compiled w/ -O0

Extract from vectorizer dump:
  :
  # k_239 = PHI 
  # c_I_lsm.10_241 = PHI 
  # c_I_lsm.11_242 = PHI 
  # vectp_a.47_406 = PHI 
  # vect_M.50_410 = PHI 
  # ivtmp_420 = PHI <0(48), ivtmp_421(56)>
  _245 = (integer(kind=8)) k_239;
  _246 = _245 * 4;
  _247 = _246 + -4;
  _248 = *a_22(D)[_247];
  M.0_249 = MAX_EXPR <_248, c_I_lsm.10_241>;
  _250 = _246 + -3;
  vect__248.49_408 = MEM[(real(kind=8) *)vectp_a.47_406]; <-- SLP
  vectp_a.47_409 = vectp_a.47_406 + 32;
  _251 = *a_22(D)[_250];
  vect_M.50_411 = MAX_EXPR ; <-- SLP
  M.0_252 = MAX_EXPR <_251, c_I_lsm.11_242>;
  k_266 = k_239 + 1;
  vectp_a.47_407 = vectp_a.47_409 + 32; <-- SLP
  ivtmp_421 = ivtmp_420 + 1;
  if (ivtmp_421 >= bnd.44_361)
goto ;
  else
goto ;

  :
  ... # REMAINDER
  k_377 = k_365 + 1;
  if (k_365 == 26)
goto ;
  else
goto ;

  :
  goto ;

  :
  # k_381 = PHI 
  # c_I_lsm.10_384 = PHI 
  # c_I_lsm.11_386 = PHI 
  # c_I_lsm.13_389 = PHI 
  # c_I_lsm.12_392 = PHI 
  # vect_M.50_413 = PHI 
  stmp_M.51_414 = BIT_FIELD_REF ;
  stmp_M.51_415 = BIT_FIELD_REF ;
  stmp_M.51_416 = BIT_FIELD_REF ;
  stmp_M.51_417 = BIT_FIELD_REF ;
  stmp_M.51_418 = MAX_EXPR ;  # <-- WHOT??
  stmp_M.51_419 = MAX_EXPR ;  # <-- DITTO.
  _401 = (integer(kind=4)) ratio_mult_vf.45_364;
  tmp.46_400 = k.4_11 + _401;
  if (niters.42_358 == ratio_mult_vf.45_364)
goto ;
  else
goto ;

Those 2 SSA names are then stored to 1st and 2nd array elements

[Bug tree-optimization/69956] New: [ICE] Wrong vector type @ fold-const

2016-02-25 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69956

Bug ID: 69956
   Summary: [ICE] Wrong vector type @ fold-const
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kyukhin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 37789
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37789&action=edit
Reproducer

Hello,
The attached testcase produces an ICE when compiled as follows:
gcc -S -O2 -march=skylake-avx512 repro.i -ftree-vectorize

I observe the ICE since 02.02.2016

/nfs/ims/home/kyukhin/repro.i:2:1: internal compiler error: tree check:
expected vector_type, have integer_type in const_unop, at fold-const.c:1665
 fn1() {
 ^~~
0xda1f9c tree_check_failed(tree_node const*, char const*, int, char const*,
...)
/export/users/gnutester/stability/svn/trunk/gcc/tree.c:9637
0x860742 tree_check(tree_node*, char const*, int, char const*, tree_code)
/export/users/gnutester/stability/svn/trunk/gcc/tree.h:3006
0x860742 const_unop(tree_code, tree_node*, tree_node*)
/export/users/gnutester/stability/svn/trunk/gcc/fold-const.c:1665
0xe7f639 gimple_resimplify1(gimple**, code_helper*, tree_node*, tree_node**,
tree_node* (*)(tree_node*))
/export/users/gnutester/stability/svn/trunk/gcc/gimple-match-head.c:85
0xee84b3 gimple_simplify(gimple*, code_helper*, tree_node**, gimple**,
tree_node* (*)(tree_node*), tree_node* (*)(tree_node*))
/export/users/gnutester/stability/svn/trunk/gcc/gimple-match-head.c:622
0x8a0933 gimple_fold_stmt_to_constant_1(gimple*, tree_node* (*)(tree_node*),
tree_node* (*)(tree_node*))
/export/users/gnutester/stability/svn/trunk/gcc/gimple-fold.c:4981
0xc409d2 back_propagate_equivalences
/export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-dom.c:881
0xc409d2 record_temporary_equivalences(edge_def*, const_and_copies*,
avail_exprs_stack*)
/export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-dom.c:963
0xd0663a thread_through_normal_block
/export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-threadedge.c:858
0xd07a22 thread_across_edge(gcond*, edge_def*, bool, const_and_copies*,
avail_exprs_stack*, tree_node* (*)(gimple*, gimple*, avail_exprs_stack*))
/export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-threadedge.c:1005
0xc404c0 dom_opt_dom_walker::thread_across_edge(edge_def*)
/export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-dom.c:989
0xc406eb dom_opt_dom_walker::after_dom_children(basic_block_def*)
/export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-dom.c:1423
0x11a47a7 dom_walker::walk(basic_block_def*)
/export/users/gnutester/stability/svn/trunk/gcc/domwalk.c:307
0xc432a0 execute
/export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-dom.c:614
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.

I suspect scalar masks.

[Bug tree-optimization/69980] New: [6 regression] Supposedly wrong SLP code emitted

2016-02-26 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69980

Bug ID: 69980
   Summary: [6 regression] Supposedly wrong SLP code emitted
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kyukhin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 37806
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37806&action=edit
Reproducer

Hello,
The attached test fails at run time when compiled as follows:
$ gfortran -m64 -Ofast repro.f90 -msse

When compiled with -O2 it works fine.

The second loop nest is just for verification.

Issue lives here:
  mumax = 0;
  do k=1,26
     do i=1,3
        mumax(i) = max(mumax(i), mu(i,k)+mu(i,k))
     end do
  end do

Looks like SLP emits some wrong permutations here.
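
For comparing against the vectorized result, here is a scalar reference of the loop nest above written in C. It is only a sketch: the 3x26 shape of mu and the column-major layout are assumptions taken from the loop bounds, and mumax_reference is a made-up name, not part of the reproducer.

```c
#include <assert.h>
#include <stddef.h>

enum { NI = 3, NK = 26 };   /* loop bounds from the Fortran snippet */

/* Scalar reference: mumax(i) = max over k of mu(i,k)+mu(i,k), starting
   from 0.  MU is assumed column-major like a Fortran mu(3,26), so
   element (i,k) with 0-based indices lives at mu[k * NI + i].  */
static void
mumax_reference (const double *mu, double *mumax)
{
  assert (mu != NULL && mumax != NULL);

  for (int i = 0; i < NI; i++)
    mumax[i] = 0.0;

  for (int k = 0; k < NK; k++)
    for (int i = 0; i < NI; i++)
      {
        double v = mu[k * NI + i] + mu[k * NI + i];
        if (v > mumax[i])
          mumax[i] = v;
      }
}
```

Each of the three lanes is an independent max reduction, so any permutation the vectorizer introduces must keep lane i of the accumulator paired with row i of mu.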

[Bug target/70028] Error: operand size mismatch for `kmovw' (wrong assembly generated) with -mavx512bw -masm=intel

2016-03-01 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70028

Kirill Yukhin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-03-01
 CC||kyukhin at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #2 from Kirill Yukhin  ---
Confirmed.
The issue is that the operand modifier passed in the .md file is %k1,
which stands for SI mode.
It should be a 32-bit register or 16-bit memory, i.e. %ebx and WORD.

[Bug target/70028] Error: operand size mismatch for `kmovw' (wrong assembly generated) with -mavx512bw -masm=intel

2016-03-02 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70028

--- Comment #4 from Kirill Yukhin  ---
(In reply to Jakub Jelinek from comment #3)
> Created attachment 37835 [details]
> gcc6-pr70028.patch
> 
> So what about this patch then?  I don't see kmov* used with %k in other
> patterns, where "m" could appear.

Hi Jakub, patch is fine to me.

[Bug target/70293] [ICE, AVX-512] Wrong reg constraints in vec_dup

2016-03-18 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293

--- Comment #3 from Kirill Yukhin  ---
Regtest is in progress

[Bug target/70293] [ICE, AVX-512] Wrong reg constraints in vec_dup

2016-03-18 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293

--- Comment #2 from Kirill Yukhin  ---
Created attachment 38020
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38020&action=edit
Proposed patch

The attached patch solves the issue by blocking the AVX2 broadcast pattern
alternative $r->Yi, which is the subject of split2.

[Bug target/70293] [ICE, AVX-512] Wrong reg constraints in vec_dup

2016-03-19 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293

--- Comment #1 from Kirill Yukhin  ---
We've got duplication of patterns (make mddump):
;; /export/users/kyukhin/gcc/git/gcc2/gcc/config/i386/sse.md: 17107
(define_insn ("avx2_pbroadcastv8hi")
 [
(set (match_operand:V8HI 0 ("register_operand") ("=x"))
(vec_duplicate:V8HI (vec_select:HI (match_operand:V8HI 1
("nonimmediate_operand") ("xm"))
(parallel [
(const_int 0 [0])
]
] ("TARGET_AVX2") ("vpbroadcastw\t{%1, %0|%0, %w1}")
 [
(set_attr ("type") ("ssemov"))
(set_attr ("prefix_extra") ("1"))
(set_attr ("prefix") ("vex"))
(set_attr ("mode") ("TI"))
])
...
(define_insn ("avx512vl_vec_dupv8hi")
 [
(set (match_operand:V8HI 0 ("register_operand") ("=v"))
(vec_duplicate:V8HI (vec_select:HI (match_operand:V8HI 1
("nonimmediate_operand") ("vm"))
(parallel [
(const_int 0 [0])
]
] ("(TARGET_AVX512BW) && (TARGET_AVX512VL)") ("vpbroadcastw\t{%1, %0|%0, %1}")
 [
(set_attr ("type") ("ssemov"))
(set_attr ("prefix") ("evex"))
(set_attr ("mode") ("TI"))
])

That's why we've got unsatisfied constraints on xmmN, N>15.
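
To spell out the constraint mismatch: on x86 the "x" constraint accepts only xmm0-xmm15, while "v" accepts any EVEX-encodable register, xmm0-xmm31. The following is a toy model in C (satisfies_x and satisfies_v are made-up helpers, not GCC functions) of why an insn recognized as the "x" pattern fails once an operand is allocated to xmm16 or above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: XMM is the index N of register xmmN, 0..31.  */

/* "x": legacy/VEX-encodable SSE registers only, xmm0..xmm15.  */
static bool
satisfies_x (int xmm)
{
  assert (xmm >= 0 && xmm < 32);
  return xmm < 16;
}

/* "v": any EVEX-encodable SSE register, xmm0..xmm31.  */
static bool
satisfies_v (int xmm)
{
  assert (xmm >= 0 && xmm < 32);
  return true;
}
```

With the duplicated patterns above, an operand in xmm19 satisfies the "v" alternative of avx512vl_vec_dupv8hi but not the "x" alternative of avx2_pbroadcastv8hi, and the AVX2 pattern is the one recog picks.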

[Bug target/70293] New: [ICE, AVX-512] Wrong reg constraints in vec_dup

2016-03-20 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293

Bug ID: 70293
   Summary: [ICE, AVX-512] Wrong reg constraints in vec_dup
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kyukhin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 38018
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38018&action=edit
Reproducer

Attached testcase ICEs when compiled as:
./xgcc -B. -mtune=broadwell -mavx512vl -O2 -S ~/pixman-sse.i

1_0.32.6-r0/pixman-0.32.6/pixman/pixman-sse2.c: In function
‘fast_composite_scaled_bilinear_sse2__8__none_OVER’:
/home/donn/c/8.x/wrl-projects/intel-skylake-standard-glibc_std/bitbake_build/tmp/work/skylake-avx512-64-wrs-linux/pixman/1_0.32.6-r0/pixman-0.32.6/pixman/pixman-sse2.c:6059:1: error: insn does not
satisfy its constraints:
(insn 5050 5049 1065 58 (set (reg/v:V8HI 56 xmm19 [orig:670 D.27517 ] [670])
(vec_duplicate:V8HI (vec_select:HI (reg/v:V8HI 56 xmm19 [orig:670
D.27517 ] [670])
(parallel [
(const_int 0 [0])
]
/home/donn/c/8.x/wrl-projects/intel-skylake-standard-glibc_std/bitbake_build/tmp/sysroots/x86_64-linux/usr/lib/x86_64-wrs-linux/gcc/x86_64-wrs-linux/5.2.0/include/emmintrin.h:606
4153 {avx2_pbroadcastv8hi}
 (nil))
/home/donn/c/8.x/wrl-projects/intel-skylake-standard-glibc_std/bitbake_build/tmp/work/skylake-avx512-64-wrs-linux/pixman/1_0.32.6-r0/pixman-0.32.6/pixman/pixman-sse2.c:6059:1: internal compiler
error: in extract_constrain_insn, at recog.c:2190
0xdaccab _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
/export/users/kyukhin/gcc/git/gcc2/gcc/rtl-error.c:108
0xdacd0b _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
/export/users/kyukhin/gcc/git/gcc2/gcc/rtl-error.c:119
0xd50b31 extract_constrain_insn(rtx_insn*)
/export/users/kyukhin/gcc/git/gcc2/gcc/recog.c:2190
0xd5f1d3 copyprop_hardreg_forward_1
/export/users/kyukhin/gcc/git/gcc2/gcc/regcprop.c:774
0xd60afe execute
/export/users/kyukhin/gcc/git/gcc2/gcc/regcprop.c:1280
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.

[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask

2016-03-21 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325

Kirill Yukhin  changed:

   What|Removed |Added

 CC||kyukhin at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |kyukhin at gcc dot gnu.org

--- Comment #1 from Kirill Yukhin  ---
Reproducible with:
 ./xg++ -B. -O2 -S 1.c

[Bug target/70293] [ICE, AVX-512] Wrong reg constraints in vec_dup

2016-03-21 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293

--- Comment #4 from Kirill Yukhin  ---
Author: kyukhin
Date: Mon Mar 21 10:51:04 2016
New Revision: 234363

URL: https://gcc.gnu.org/viewcvs?rev=234363&root=gcc&view=rev
Log:
PR target/70293

gcc/
* config/i386 (define_insn "*vec_dup"/AVX2): Block
third alternative for AVX-512VL target,

gcc/testsuite/
* gcc.target/i386/pr70293.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/i386/pr70293.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/sse.md
trunk/gcc/testsuite/ChangeLog

[Bug target/70293] [ICE, AVX-512] Wrong reg constraints in vec_dup

2016-03-21 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293

--- Comment #5 from Kirill Yukhin  ---
Author: kyukhin
Date: Mon Mar 21 10:53:50 2016
New Revision: 234364

URL: https://gcc.gnu.org/viewcvs?rev=234364&root=gcc&view=rev
Log:
PR target/70293.

gcc/
* config/i386 (define_insn "*vec_dup"/AVX2): Block
third alternative for AVX-512VL target,
gcc/testsuite/
* gcc.target/i386/pr70293.c: New test.

Added:
branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr70293.c
Modified:
branches/gcc-5-branch/gcc/ChangeLog
branches/gcc-5-branch/gcc/config/i386/sse.md
branches/gcc-5-branch/gcc/testsuite/ChangeLog

[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask

2016-03-21 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325

--- Comment #2 from Kirill Yukhin  ---
I am testing this patch:
commit e88ceeabc50634012fa21f47625934d9a2c2e160
Author: Kirill Yukhin 
Date:   Mon Mar 21 14:28:58 2016 +0300

AVX-512. Fix PR70325.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 3d8dbc4..2c56ee7 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -32431,7 +32431,7 @@ def_builtin (HOST_WIDE_INT mask, const char *name,

   mask &= ~OPTION_MASK_ISA_64BIT;
   if (mask == 0
- || (mask & ix86_isa_flags) != 0
+ || (mask & ix86_isa_flags) == mask
  || (lang_hooks.builtin_function
  == lang_hooks.builtin_function_ext_scope))

diff --git a/gcc/testsuite/gcc.target/i386/pr70325.c
b/gcc/testsuite/gcc.target/i386/pr70325.c
new file mode 100644
index 000..e2b9342
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr70325.c
@@ -0,0 +1,12 @@
+/* PR target/70325 */
+/* { dg-do compile } */
+/* { dg-options "-mavx512vl -O2" } */
+
+typedef char C __attribute((__vector_size__(32)));
+typedef int I __attribute((__vector_size__(32)));
+
+void
+f(int a,I b)
+{
+  __builtin_ia32_storedquqi256_mask((C*)f,(C)b,a); /* { dg-warning "implicit declaration of function" } */
+}
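
The i386.c hunk above changes an "any requested ISA bit is enabled" test into an "all requested ISA bits are enabled" test. Below is a small standalone sketch of the difference. The bit values and helper names are hypothetical (the real OPTION_MASK_ISA_* values differ), and it assumes the builtin's mask combines AVX512VL with one other ISA bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only.  */
#define ISA_AVX512BW (UINT64_C (1) << 0)
#define ISA_AVX512VL (UINT64_C (1) << 1)

/* Old test: true as soon as one bit of MASK is enabled.  */
static bool
any_bit_enabled (uint64_t mask, uint64_t isa_flags)
{
  return (mask & isa_flags) != 0;
}

/* New test: true only if every bit of MASK is enabled.  */
static bool
all_bits_enabled (uint64_t mask, uint64_t isa_flags)
{
  return (mask & isa_flags) == mask;
}
```

With only the VL bit enabled, as with -mavx512vl in the new test, any_bit_enabled accepts a builtin that needs both bits, while all_bits_enabled rejects it. That matches the "implicit declaration of function" warning the testcase expects.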

[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask

2016-03-22 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325

Kirill Yukhin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Kirill Yukhin  ---
Done.

[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask

2016-03-22 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325

Kirill Yukhin  changed:

   What|Removed |Added

 Status|RESOLVED|ASSIGNED
   Last reconfirmed||2016-03-22
 Resolution|FIXED   |---
 Ever confirmed|0   |1

--- Comment #4 from Kirill Yukhin  ---
Sorry, closed by mistake

[Bug target/70293] [ICE, AVX-512] Wrong reg constraints in vec_dup

2016-03-22 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293

Kirill Yukhin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Kirill Yukhin  ---
Done

[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask

2016-03-22 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325

--- Comment #5 from Kirill Yukhin  ---
Author: kyukhin
Date: Tue Mar 22 11:09:03 2016
New Revision: 234395

URL: https://gcc.gnu.org/viewcvs?rev=234395&root=gcc&view=rev
Log:

PR target/70325
gcc/
* config/i386/i386.c (def_builtin): Handle
OPTION_MASK_ISA_AVX512VL to be and-ed with other
bits.
(const struct builtin_description bdesc_special_args[]):
Remove duplicate ISA bits.
gcc/testsuite/
* gcc.target/i386/pr70325.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/i386/pr70325.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/testsuite/ChangeLog

[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask

2016-03-22 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325

--- Comment #6 from Kirill Yukhin  ---
Author: kyukhin
Date: Tue Mar 22 11:13:44 2016
New Revision: 234396

URL: https://gcc.gnu.org/viewcvs?rev=234396&root=gcc&view=rev
Log:
PR target/70325.

gcc/
* config/i386/i386.c (def_builtin): Handle
OPTION_MASK_ISA_AVX512VL to be and-ed with other
bits.
(const struct builtin_description bdesc_special_args[]):
Remove duplicate ISA bits.
gcc/testsuite/
* gcc.target/i386/pr70325.c: New test.

Added:
branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr70325.c
Modified:
branches/gcc-5-branch/gcc/ChangeLog
branches/gcc-5-branch/gcc/config/i386/i386.c
branches/gcc-5-branch/gcc/testsuite/ChangeLog

[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask

2016-03-23 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325

Kirill Yukhin  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from Kirill Yukhin  ---
Done.

[Bug target/70406] ICE: in extract_insn, at recog.c:2287 (unrecognizable insn) with -mtune=pentium2 -mavx512f

2016-03-25 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70406

Kirill Yukhin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2016-03-25
 Ever confirmed|0   |1

--- Comment #2 from Kirill Yukhin  ---
Reproduced.

[Bug target/70406] ICE: in extract_insn, at recog.c:2287 (unrecognizable insn) with -mtune=pentium2 -mavx512f

2016-03-25 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70406

--- Comment #3 from Kirill Yukhin  ---
Created attachment 38095
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38095&action=edit
Bootstrapped/regtested patch

Will submit to gcc-patches shortly

[Bug target/70406] ICE: in extract_insn, at recog.c:2287 (unrecognizable insn) with -mtune=pentium2 -mavx512f

2016-03-28 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70406

--- Comment #4 from Kirill Yukhin  ---
Author: kyukhin
Date: Mon Mar 28 07:59:44 2016
New Revision: 234500

URL: https://gcc.gnu.org/viewcvs?rev=234500&root=gcc&view=rev
Log:
PR target/70406

gcc/
 * config/i386/i386.md (define_split, andn): Fix modes.

gcc/testsuite/
 * gcc.target/i386/pr70406.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/i386/pr70406.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.md
trunk/gcc/testsuite/ChangeLog

[Bug target/70406] ICE: in extract_insn, at recog.c:2287 (unrecognizable insn) with -mtune=pentium2 -mavx512f

2016-03-28 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70406

--- Comment #5 from Kirill Yukhin  ---
Author: kyukhin
Date: Mon Mar 28 08:01:56 2016
New Revision: 234501

URL: https://gcc.gnu.org/viewcvs?rev=234501&root=gcc&view=rev
Log:
PR target/70406.

gcc/
 * config/i386/i386.md (define_split, andn): Fix modes.

gcc/testsuite/
 * gcc.target/i386/pr70406.c: New test.

Added:
branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr70406.c
Modified:
branches/gcc-5-branch/gcc/ChangeLog
branches/gcc-5-branch/gcc/config/i386/i386.md
branches/gcc-5-branch/gcc/testsuite/ChangeLog

[Bug target/70406] ICE: in extract_insn, at recog.c:2287 (unrecognizable insn) with -mtune=pentium2 -mavx512f

2016-03-28 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70406

Kirill Yukhin  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Kirill Yukhin  ---
Done.

[Bug target/70429] Wrong code with -O1.

2016-03-28 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70429

Kirill Yukhin  changed:

   What|Removed |Added

 CC||kyukhin at gcc dot gnu.org

--- Comment #4 from Kirill Yukhin  ---
It seems the combiner performs an invalid reassociation. This trivial addition to
Jakub's PR70222 fix makes the test work:

--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -10526,7 +10526,7 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
{
  /* For ((unsigned) (cstULL >> count)) >> cst2 we have to make
 sure the result will be masked.  See PR70222.  */
- if (code == LSHIFTRT
+ if ((code == LSHIFTRT || code == ASHIFTRT)
  && mode != result_mode
  && !merge_outer_ops (&outer_op, &outer_const, AND,
   GET_MODE_MASK (result_mode)
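
Why the mask matters when two shifts in different modes are merged can be shown in plain C for the logical-shift case named in the quoted comment. This is only a sketch with made-up function names and a 64-bit to 32-bit narrowing chosen for illustration. It does not model combine.c itself or the ASHIFTRT case the patch adds.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: shift in 64 bits, truncate to 32 bits, shift again.  */
static uint32_t
two_step (uint64_t x, unsigned c1, unsigned c2)
{
  assert (c1 < 64 && c2 < 32);
  return (uint32_t) (x >> c1) >> c2;
}

/* Naive merge into one 64-bit shift: bits above the truncation point
   leak into the top C2 bits of the result.  */
static uint32_t
merged_unmasked (uint64_t x, unsigned c1, unsigned c2)
{
  assert (c2 < 32 && c1 + c2 < 64);
  return (uint32_t) (x >> (c1 + c2));
}

/* Merge plus an AND with the narrow mode's mask shifted by C2, which
   clears the leaked bits again.  */
static uint32_t
merged_masked (uint64_t x, unsigned c1, unsigned c2)
{
  assert (c2 < 32 && c1 + c2 < 64);
  return (uint32_t) (x >> (c1 + c2)) & (UINT32_MAX >> c2);
}
```

For x with all 64 bits set, c1 = 4 and c2 = 8, two_step and merged_masked both give 0x00ffffff, while merged_unmasked gives 0xffffffff.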

[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)

2016-03-30 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453

Kirill Yukhin  changed:

   What|Removed |Added

 CC||kyukhin at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |kyukhin at gcc dot gnu.org

--- Comment #1 from Kirill Yukhin  ---
Will look

[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)

2016-03-30 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453

Kirill Yukhin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-03-30
 Ever confirmed|0   |1

--- Comment #2 from Kirill Yukhin  ---
Confirmed

[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)

2016-03-30 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453

--- Comment #3 from Kirill Yukhin  ---
Created attachment 38133
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38133&action=edit
Proposed patch

I am reg-testing trivial patch

[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)

2016-03-30 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453

Kirill Yukhin  changed:

   What|Removed |Added

  Attachment #38133|0   |1
is obsolete||

--- Comment #5 from Kirill Yukhin  ---
Created attachment 38135
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38135&action=edit
The patch

Woops, this one.

[Bug tree-optimization/70479] New: FMA is not reassociated causing x2 slowdown vs. ICC

2016-03-31 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70479

Bug ID: 70479
   Summary: FMA is not reassociated causing x2 slowdown vs. ICC
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kyukhin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 38146
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38146&action=edit
Reproducer

Attached example demonstrates the issue.
GCC is recent trunk. ICC is v16.

Compile:
  GCC: g++ -march=haswell -Ofast -flto -fopenmp-simd -fpermissive m.cpp -o m.gcc
  ICC: icpc -O3 -ipo -fpermissive -xAVX2 -qopenmp m.cpp -o m.icc

Run:
  GCC: time ./m.gcc 2 2
  ICC: time ./m.icc 2 2

Hot spot generated by GCC (annotated w/ perf hit counts):
157 │8d0:┌─→vbroad 0x4(%r13),%zmm0
193 ││  lea0x1(%rdx),%edx
173 ││  vmulps (%r14,%rax,1),%zmm0,%zmm0
2943││  vbroad 0x60(%r13),%zmm1
166 ││  vbroad 0x5c(%r13),%zmm2
151 ││  vbroad 0x58(%r13),%zmm3
144 ││  vbroad 0x54(%r13),%zmm4
164 ││  vbroad 0x50(%r13),%zmm5
170 ││  vbroad 0x4c(%r13),%zmm6
162 ││  vbroad 0x48(%r13),%zmm7
162 ││  vbroad 0x44(%r13),%zmm8
154 ││  vbroad 0x40(%r13),%zmm9
172 ││  vbroad 0x3c(%r13),%zmm10
167 ││  vbroad 0x38(%r13),%zmm11
172 ││  vbroad 0x34(%r13),%zmm12
171 ││  vbroad 0x30(%r13),%zmm13
161 ││  vbroad 0x2c(%r13),%zmm14
176 ││  vbroad 0x28(%r13),%zmm15
139 ││  vbroad 0x24(%r13),%zmm16
180 ││  vbroad 0x20(%r13),%zmm17
158 ││  vbroad 0x1c(%r13),%zmm18
165 ││  vbroad 0x18(%r13),%zmm19
140 ││  vbroad 0x10(%r13),%zmm21
179 ││  vbroad 0xc(%r13),%zmm22
146 ││  vbroad 0x8(%r13),%zmm23
170 ││  vbroad 0x0(%r13),%zmm24
170 ││  vbroad 0x14(%r13),%zmm20
168 ││  vfmadd (%r15,%rax,1),%zmm24,%zmm0
2732││  mov0xb8(%rsp),%rcx
172 ││  vfmadd (%r11,%rax,1),%zmm23,%zmm0
1649││  vfmadd (%rsi,%rax,1),%zmm22,%zmm0
3413││  vfmadd (%rcx,%rax,1),%zmm21,%zmm0
3653││  mov0xc0(%rsp),%rcx
182 ││  vfmadd (%rcx,%rax,1),%zmm20,%zmm0
2806││  mov0xc8(%rsp),%rcx
176 ││  vfmadd (%rcx,%rax,1),%zmm19,%zmm0
2439││  mov0xd0(%rsp),%rcx
179 ││  vfmadd (%rcx,%rax,1),%zmm18,%zmm0
2562││  mov0xd8(%rsp),%rcx
197 ││  vfmadd (%rcx,%rax,1),%zmm17,%zmm0
2867││  mov0xe0(%rsp),%rcx
141 ││  vfmadd (%rcx,%rax,1),%zmm16,%zmm0
3200││  mov0xe8(%rsp),%rcx
156 ││  vfmadd (%rcx,%rax,1),%zmm15,%zmm0
3557││  mov0xf0(%rsp),%rcx
158 ││  vfmadd (%rcx,%rax,1),%zmm14,%zmm0
││  mov0xf8(%rsp),%rcx
143 ││  vfmadd (%rcx,%rax,1),%zmm13,%zmm0
3004││  mov0x100(%rsp),%rcx
177 ││  vfmadd (%rcx,%rax,1),%zmm12,%zmm0
2876││  mov0x108(%rsp),%rcx
144 ││  vfmadd (%rcx,%rax,1),%zmm11,%zmm0
2838││  mov0x110(%rsp),%rcx
168 ││  vfmadd (%rcx,%rax,1),%zmm10,%zmm0
2503││  mov0x118(%rsp),%rcx
203 ││  vfmadd (%rcx,%rax,1),%zmm9,%zmm0
2471││  mov0x120(%rsp),%rcx
185 ││  vfmadd (%rcx,%rax,1),%zmm8,%zmm0
2153││  mov0x128(%rsp),%rcx
152 ││  vfmadd (%r12,%rax,1),%zmm7,%zmm0
2091││  vfmadd (%rbx,%rax,1),%zmm6,%zmm0
3049││  vfmadd (%r10,%rax,1),%zmm5,%zmm0
3737││  vfmadd (%r9,%rax,1),%zmm4,%zmm0
3665││  vfmadd (%r8,%rax,1),%zmm3,%zmm0
3627││  vfmadd (%rdi,%rax,1),%zmm2,%zmm0
3804││  vfmadd (%rcx,%rax,1),%zmm1,%zmm0
4052││  mov0x130(%rsp),%rcx
160 ││  cmp0x138(%rsp),%edx
534 ││  vmovup %zmm0,(%rcx,%rax,1)
3235││  lea0x40(%rax),%rax
161 │└──jb 8d0

Hot spot generated by ICC (annotated w/ perf hit counts):
   344 │47a:┌─→vmulps 0x204c(%r11,%r14,4),%zmm27,%zmm2
   821 ││  vmulps 0x2050(%r11,%r14,4),%zmm4,%zmm1
   318 ││  vmulps 0x2040(%r11,%r14,4),%zmm6,%zmm29
   818 ││  vmulps 0x1840(%r11,%r14,4),%zmm7,%zmm31
   275 ││  vfmadd 0x1838(%r11,%r14,4),%zmm9,%zmm31
  1234 ││  vfmadd 0x183c(%r11,%r14,4),%zmm8,%zmm29
   442 ││  vfmadd 0x2044(%r11,%r14,4),%zmm5,%zmm2
  1110 ││  vfmadd 0x2048(%r11,%r14,4),%zmm28,%zmm1
   337 ││  vaddps %zmm29,%zmm31,%zmm0
  1047 ││  vaddps %zmm1,%zmm2,%zmm3
   655 ││  vmulps 0x1830(%r11,%r14,4),%zmm11,%zmm30
   956 ││  vmulps 0x1834(%r11,%r14,4),%zmm10,%zmm2
   296 ││  vmulps 0x1024(%r11,%r14,4),%zmm15,%zmm1
  1050 ││  vmulps 0x1028(%r11,%r14,4),%zmm14,%zmm31
   294 ││  vaddps %zmm0,%zmm3,%zmm3
  1057 ││  vfmadd 0x102c(%r11,%r14,4),%zmm13,%zmm30
   344 ││  vfmadd 0x1030(%r11,%r14,4),%zmm12,%zmm2
   911 ││  vfmadd 0x1020(%r11,%r14,4),%zmm16,%zmm31
   332 ││  vfmadd 0x820(%r11,%r14,4),%zmm17,%zmm1

[Bug tree-optimization/70479] FMA is not reassociated causing x2 slowdown vs. ICC

2016-03-31 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70479

--- Comment #1 from Kirill Yukhin  ---
(In reply to Kirill Yukhin from comment #0)
> Compile:
>   GCC: g++ -march=haswell -Ofast -flto -fopenmp-simd -fpermissive m.cpp -o m.gcc
>   ICC: icpc -O3 -ipo -fpermissive -xAVX2 -qopenmp m.cpp -o m.icc
Correct compile commands (the original ones are for Haswell):
   GCC: g++ -march=knl -Ofast -flto -fopenmp-simd -fpermissive m.cpp -o m.gcc
   ICC: icpc -O3 -ipo -fpermissive -xMIC-AVX512 -qopenmp m.cpp -o m.icc

[Bug tree-optimization/70479] FMA is not reassociated causing x2 slowdown vs. ICC

2016-03-31 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70479

--- Comment #3 from Kirill Yukhin  ---
(In reply to Richard Biener from comment #2)
> You mean we fail to handle ternary associative tree codes in GIMPLE reassoc?
> Yes, that's true.  It's not going to be easy to retro-fit there
> implementation-wise.  With rebalancing you mean handling reassoc-width > 1?

Hi Richard, yes to both.
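
A scalar C sketch of the difference between the two hot loops: one serial multiply-add chain, as in the GCC listing where every vfmadd feeds zmm0, versus several independent chains that are summed at the end, as in the ICC listing. The function names and the width of 4 are assumptions for illustration. The multiply-add is written as plain arithmetic that a compiler may contract into an FMA, and the split form is only equivalent under reassociation-permitting flags such as -Ofast.

```c
#include <stddef.h>

/* Serial chain: each multiply-add depends on the previous result.  */
static float
dot_serial (const float *a, const float *b, size_t n)
{
  float acc = 0.0f;
  for (size_t i = 0; i < n; i++)
    acc += a[i] * b[i];
  return acc;
}

/* Four independent chains (reassociation width 4), combined by a
   balanced tree of adds at the end.  */
static float
dot_4way (const float *a, const float *b, size_t n)
{
  float acc[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
  size_t i = 0;

  for (; i + 4 <= n; i += 4)
    for (size_t j = 0; j < 4; j++)
      acc[j] += a[i + j] * b[i + j];

  /* Leftover elements go into the first chain.  */
  for (; i < n; i++)
    acc[0] += a[i] * b[i];

  return (acc[0] + acc[1]) + (acc[2] + acc[3]);
}
```

In dot_serial the loop cannot run faster than one multiply-add latency per element. In dot_4way four multiply-adds can be in flight at once, which is the effect a reassoc-width greater than 1 is meant to give.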

[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)

2016-03-31 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453

--- Comment #6 from Kirill Yukhin  ---
Author: kyukhin
Date: Thu Mar 31 15:23:29 2016
New Revision: 234634

URL: https://gcc.gnu.org/viewcvs?rev=234634&root=gcc&view=rev
Log:
Fix PR target/70453.

gcc/
* config/i386/sse.md (define_mode_attr shuffletype): Fix typo.

gcc/testsuite/
* gcc.target/i386/pr70453.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/i386/pr70453.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/sse.md
trunk/gcc/testsuite/ChangeLog

[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)

2016-03-31 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453

--- Comment #7 from Kirill Yukhin  ---
Author: kyukhin
Date: Thu Mar 31 15:25:33 2016
New Revision: 234635

URL: https://gcc.gnu.org/viewcvs?rev=234635&root=gcc&view=rev
Log:
Fix PR target/70453.

gcc/
* config/i386/sse.md (define_mode_attr shuffletype): Fix typo.

gcc/testsuite/
* gcc.target/i386/pr70453.c: New test.

Added:
branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr70453.c
Modified:
branches/gcc-5-branch/gcc/ChangeLog
branches/gcc-5-branch/gcc/config/i386/sse.md
branches/gcc-5-branch/gcc/testsuite/ChangeLog

[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)

2016-04-01 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453

Kirill Yukhin  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Kirill Yukhin  ---
Done.

[Bug target/70510] ICE: output_operand: invalid %-code with -mavx512bw -masm=intel when emitting vpbroatcast

2016-04-03 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70510

Kirill Yukhin  changed:

   What|Removed |Added

 CC||kyukhin at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |kyukhin at gcc dot gnu.org

--- Comment #1 from Kirill Yukhin  ---
will take a look.

[Bug target/64393] ICE: in extract_insn, at recog.c:2327 (unrecognizable insn) with -mavx512vbmi

2016-04-03 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64393

Kirill Yukhin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Kirill Yukhin  ---
Done

[Bug target/64387] ICE: in extract_insn, at recog.c:2327 (unrecognizable insn) with -ffloat-store -mavx512er

2016-04-04 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64387

Kirill Yukhin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||kyukhin at gcc dot gnu.org
 Resolution|--- |FIXED

--- Comment #6 from Kirill Yukhin  ---
Done

[Bug target/70510] ICE: output_operand: invalid %-code with -mavx512bw -masm=intel when emitting vpbroatcast

2016-04-05 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70510

Kirill Yukhin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2016-04-05
 Ever confirmed|0   |1

[Bug target/70510] ICE: output_operand: invalid %-code with -mavx512bw -masm=intel when emitting vpbroatcast

2016-04-05 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70510

--- Comment #3 from Kirill Yukhin  ---
(In reply to Uroš Bizjak from comment #2)
> (In reply to Kirill Yukhin from comment #1)
> > will take a look.
> 
> I have patch in testing:
> 
Oh, great! Thanks!

[Bug target/64386] ICE: in extract_insn, at recog.c:2327 (unrecognizable insn) with -mavx512bw

2016-04-07 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64386

Kirill Yukhin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Kirill Yukhin  ---
Done.

[Bug target/59683] ICE: in classify_argument, at config/i386/i386.c:6637 with #pragma GCC target("avx512f")

2016-04-07 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59683

--- Comment #3 from Kirill Yukhin  ---
This hunk from Jakub's fix for PR61925 makes the test work:
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index a41efa4..6aebaed 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -4962,6 +4962,15 @@ static GTY(()) tree ix86_previous_fndecl;
 void
 ix86_reset_previous_fndecl (void)
 {
+  tree new_tree = target_option_current_node;
+  cl_target_option_restore (&global_options, TREE_TARGET_OPTION (new_tree));
+  if (TREE_TARGET_GLOBALS (new_tree))
+    restore_target_globals (TREE_TARGET_GLOBALS (new_tree));
+  else if (new_tree == target_option_default_node)
+    restore_target_globals (&default_target_globals);
+  else
+    TREE_TARGET_GLOBALS (new_tree) = save_target_globals_default_opts ();
+
   ix86_previous_fndecl = NULL_TREE;
 }

[Bug tree-optimization/70577] [6 regression] tree-ssa/prefetch-5.c scan-tree-dump-times aprefetch failures

2016-04-11 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70577

Kirill Yukhin  changed:

   What|Removed |Added

 CC||kyukhin at gcc dot gnu.org

--- Comment #8 from Kirill Yukhin  ---
This commit caused a miscompare of spec2000/178.galgel with -march=skylake-avx512
(-Ofast -flto -funroll-loops):
   Newton iteration #  0Maximal derivative = 0.1526E-07
   Newton iteration #  0Maximal derivative = 0.3901E-07

[Bug target/70662] vpbroadcastq assemble failure with -masm=intel -mavx512vbmi

2016-04-14 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70662

Kirill Yukhin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2016-04-14
   Assignee|unassigned at gcc dot gnu.org  |kyukhin at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Kirill Yukhin  ---
I'll take a look.

[Bug target/70662] vpbroadcastq assemble failure with -masm=intel -mavx512vbmi

2016-04-15 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70662

--- Comment #2 from Kirill Yukhin  ---
Author: kyukhin
Date: Fri Apr 15 08:25:49 2016
New Revision: 235008

URL: https://gcc.gnu.org/viewcvs?rev=235008&root=gcc&view=rev
Log:
AVX-512. Fix mem operand modifier for Intel syntax.

PR target/70662
gcc/
* config/i386/sse.md: Use proper memory operand
modifiers.
gcc/testsuite/
* gcc.target/i386/pr70662.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/i386/pr70662.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/sse.md
trunk/gcc/testsuite/ChangeLog

[Bug target/70662] vpbroadcastq assemble failure with -masm=intel -mavx512vbmi

2016-04-15 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70662

--- Comment #3 from Kirill Yukhin  ---
Author: kyukhin
Date: Fri Apr 15 09:36:31 2016
New Revision: 235013

URL: https://gcc.gnu.org/viewcvs?rev=235013&root=gcc&view=rev
Log:
AVX-512. Use proper mem ops modifier for Intel syntax in broadcast pattern.

PR target/70662
gcc/
* config/i386/sse.md: Use proper memory operand
modifiers.
gcc/testsuite/
* gcc.target/i386/pr70662.c: New test.

Added:
branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr70662.c
Modified:
branches/gcc-5-branch/gcc/ChangeLog
branches/gcc-5-branch/gcc/config/i386/sse.md
branches/gcc-5-branch/gcc/testsuite/ChangeLog

[Bug target/70662] vpbroadcastq assemble failure with -masm=intel -mavx512vbmi

2016-04-15 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70662

--- Comment #5 from Kirill Yukhin  ---
Author: kyukhin
Date: Fri Apr 15 15:13:42 2016
New Revision: 235037

URL: https://gcc.gnu.org/viewcvs?rev=235037&root=gcc&view=rev
Log:
AVX-512, Fix mode size check.

PR target/70662
gcc/
* config/i386/sse.md (define_insn "_vec_dup"):
Fix mode size check.

Modified:
branches/gcc-5-branch/gcc/ChangeLog
branches/gcc-5-branch/gcc/config/i386/sse.md

[Bug target/70662] vpbroadcastq assemble failure with -masm=intel -mavx512vbmi

2016-04-15 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70662

--- Comment #6 from Kirill Yukhin  ---
Author: kyukhin
Date: Fri Apr 15 15:17:31 2016
New Revision: 235038

URL: https://gcc.gnu.org/viewcvs?rev=235038&root=gcc&view=rev
Log:
AVX-512. Fix mode size check.

PR target/70662
gcc/
* config/i386/sse.md (define_insn "_vec_dup"):
Fix mode size check.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/sse.md

[Bug target/70662] vpbroadcastq assemble failure with -masm=intel -mavx512vbmi

2016-04-19 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70662

Kirill Yukhin  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Kirill Yukhin  ---
Done

[Bug target/70728] GCC trunk emits invalid assembly for knl target

2016-04-19 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70728

Kirill Yukhin  changed:

   What|Removed |Added

 Target||i?86/x86_64
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2016-04-19
 CC||kyukhin at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Kirill Yukhin  ---
I'll take a look.

[Bug target/70728] GCC trunk emits invalid assembly for knl target

2016-04-21 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70728

--- Comment #2 from Kirill Yukhin  ---
This is a 5/6 regression
