[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 68627, which changed state.

Bug 68627 Summary: [i386, AVX-512] Illegal insn generated while compiling spec2k6/437.leslie3d for KNL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68627

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED
[Bug target/68627] [i386, AVX-512] Illegal insn generated while compiling spec2k6/437.leslie3d for KNL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68627

Kirill Yukhin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from Kirill Yukhin ---
Fixed.
[Bug other/84613] [meta-bug] SPEC compiler performance issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84613
Bug 84613 depends on bug 68627, which changed state.

Bug 68627 Summary: [i386, AVX-512] Illegal insn generated while compiling spec2k6/437.leslie3d for KNL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68627

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED
[Bug target/68633] [i386, AVX-512] Spec2006/434.zeus miscompares when executed on KNL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68633

Kirill Yukhin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #3 from Kirill Yukhin ---
Fixed.
[Bug other/84613] [meta-bug] SPEC compiler performance issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84613
Bug 84613 depends on bug 68633, which changed state.

Bug 68633 Summary: [i386, AVX-512] Spec2006/434.zeus miscompares when executed on KNL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68633

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED
[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 68633, which changed state.

Bug 68633 Summary: [i386, AVX-512] Spec2006/434.zeus miscompares when executed on KNL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68633

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED
[Bug target/95144] Many AVX-512 functions take an int instead of unsigned int
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95144

Kirill Yukhin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2020-06-16
                 CC|                            |kyukhin at gcc dot gnu.org
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |ASSIGNED

--- Comment #2 from Kirill Yukhin ---
Similar bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65744
[Bug target/95766] Failure to directly use vpbroadcastd for _mm_set1_epi32 when passing unsigned short
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95766

--- Comment #11 from Kirill Yukhin ---
(In reply to Jakub Jelinek from comment #10)
> Kirill, any thoughts on that?

I'd prefer your variant, w/o unspecs.
[Bug target/58269] [4.9 Regression] ICE when building libobjc on x86_64-apple-darwin* after revision 201915
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58269 --- Comment #8 from Kirill Yukhin --- Author: kyukhin Date: Fri Sep 6 10:36:30 2013 New Revision: 202318 URL: http://gcc.gnu.org/viewcvs?rev=202318&root=gcc&view=rev Log: PR target/58269 * config/i386/i386.c (ix86_conditional_register_usage): Proper initialize extended SSE registers. Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.c
[Bug rtl-optimization/47698] CMOV accessing volatile memory with read side effect
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47698 --- Comment #6 from Kirill Yukhin 2011-11-07 08:42:00 UTC --- Author: kyukhin Date: Mon Nov 7 08:41:55 2011 New Revision: 181075 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181075 Log: gcc/ PR rtl-optimization/47698 * ifconv.c (noce_operand_ok): prevent CMOV generation for volatile mem. gcc/testsuite/ PR rtl-optimization/47698 * gcc.target/i386/47698.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/47698.c Modified: trunk/gcc/ChangeLog trunk/gcc/ifcvt.c trunk/gcc/testsuite/ChangeLog
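To illustrate the kind of source this PR concerns, here is a minimal sketch. The function name and signature are hypothetical and are not taken from the new test. On x86, a CMOVcc with a memory source operand performs the load whether or not the condition holds, so if-converting the branch below into such a CMOV would read the volatile object even when `enabled` is zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the volatile object must be read only when
   `enabled` is nonzero, because a read may have side effects
   (for example on a memory-mapped device register). */
int read_if_enabled(int enabled, volatile int *reg, int fallback)
{
    assert(reg != NULL);
    if (enabled)
        return *reg;    /* the only point where the volatile object is read */
    return fallback;    /* no access to *reg on this path */
}
```

A compiler that keeps the branch, or loads into a register only on the taken path, preserves the "no read when disabled" behaviour that the source expresses.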
[Bug target/50962] Additional opportunity for AGU stall avoidance optimization for Atom processor
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50962 --- Comment #4 from Kirill Yukhin 2011-11-07 08:47:18 UTC --- Author: kyukhin Date: Mon Nov 7 08:47:15 2011 New Revision: 181077 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181077 Log: gcc/ PR target/50962 * config/i386/i386-protos.h (ix86_use_lea_for_mov): New. * config/i386/i386.c (ix86_use_lea_for_mov): Likewise. * config/i386/i386.md (movsi_internal): Emit lea if profitable. (movdi_internal_rex64): Likewise. Modified: trunk/gcc/config/i386/i386-protos.h trunk/gcc/config/i386/i386.c trunk/gcc/config/i386/i386.md trunk/gcc/testsuite/ChangeLog
[Bug target/53201] [4.8 Regression] unrecognized command line option '-mno-lzcnt-mno-hle
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53201 --- Comment #4 from Kirill Yukhin 2012-05-03 06:50:25 UTC --- Author: kyukhin Date: Thu May 3 06:50:16 2012 New Revision: 187075 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=187075 Log: PR target/53201 * config/i386/driver-i386.c (host_detect_local_cpu): Add space to "-mno-hle". Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/driver-i386.c
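The failure mode in the summary (two options fused into '-mno-lzcnt-mno-hle') is what happens when option strings are concatenated without a separator. The sketch below is only an illustration of that idea; `join_options` is a hypothetical helper and is not the code in driver-i386.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: joins n option strings into `out`, putting exactly
   one space between consecutive options.  Returns the resulting length,
   or 0 if the result does not fit into `cap` bytes. */
size_t join_options(char *out, size_t cap, const char *const *opts, size_t n)
{
    size_t len = 0;

    assert(out != NULL && cap > 0);
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        /* The separator is emitted before every option except the first. */
        int w = snprintf(out + len, cap - len, "%s%s", i ? " " : "", opts[i]);
        if (w < 0 || (size_t)w >= cap - len)
            return 0;
        len += (size_t)w;
    }
    return len;
}
```

Dropping the separator for any single element reproduces the fused, unrecognized option from the bug title.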
[Bug target/53435] (ix86_expand_vec_perm) and (ix86_expand_vec_perm) do not pass arguments to avx2_permvar8s[f,i] correctly
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53435 --- Comment #4 from Kirill Yukhin 2012-05-25 13:03:21 UTC --- Author: kyukhin Date: Fri May 25 13:03:18 2012 New Revision: 187881 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=187881 Log: 2012-05-21 Alexander Ivchenko PR target/53435 * config/i386/i386.c (ix86_expand_vec_perm): Use correct op. (ix86_expand_vec_perm): Use int mode instead of float. (expand_vec_perm_pshufb): Remove handling of useseless type conversion. Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.c
[Bug target/53435] (ix86_expand_vec_perm) and (ix86_expand_vec_perm) do not pass arguments to avx2_permvar8s[f,i] correctly
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53435 --- Comment #5 from Kirill Yukhin 2012-05-25 13:34:12 UTC --- Author: kyukhin Date: Fri May 25 13:34:07 2012 New Revision: 187882 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=187882 Log: 2012-05-25 Alexander Ivchenko PR target/53435 * config/i386/i386.c (ix86_expand_vec_perm): Use correct op. (ix86_expand_vec_perm): Use int mode instead of float. Modified: branches/gcc-4_7-branch/gcc/ChangeLog branches/gcc-4_7-branch/gcc/config/i386/i386.c
[Bug target/53877] __lzcnt_u16/__lzcnt_u32/__lzcnt_u64 aren't implemented
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53877 --- Comment #1 from Kirill Yukhin 2012-07-20 08:24:35 UTC --- Author: kyukhin Date: Fri Jul 20 08:24:24 2012 New Revision: 189703 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189703 Log: 2012-07-20 Kirill Yukhin PR target/53877 * config/i386/lzcntintrin.h (_lzcnt_u32): New. (_lzcnt_u64): Ditto. Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/lzcntintrin.h
[Bug target/53877] __lzcnt_u16/__lzcnt_u32/__lzcnt_u64 aren't implemented
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53877 --- Comment #2 from Kirill Yukhin 2012-07-20 08:57:09 UTC --- Author: kyukhin Date: Fri Jul 20 08:57:04 2012 New Revision: 189706 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189706 Log: 2012-07-20 Kirill Yukhin PR target/53877 * config/i386/lzcntintrin.h (_lzcnt_u32): New. (_lzcnt_u64): Ditto. Modified: branches/gcc-4_7-branch/gcc/ChangeLog branches/gcc-4_7-branch/gcc/config/i386/lzcntintrin.h
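For reference, a portable C model of what a 32-bit leading-zero count computes. This is a sketch, not the contents of lzcntintrin.h: `lzcnt32_model` is a hypothetical name, and the real intrinsics map to the LZCNT instruction and need the corresponding target option. The model follows the LZCNT convention that a zero input yields the operand size (32).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of a 32-bit leading-zero count. */
unsigned lzcnt32_model(uint32_t x)
{
    unsigned n = 0;

    if (x == 0)
        return 32;                    /* LZCNT convention for zero input */
    while ((x & 0x80000000u) == 0) {  /* shift until the top bit is set */
        x <<= 1;
        n++;
    }
    assert(n < 32);
    return n;
}
```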
[Bug target/57491] [ia64] internal compiler error: in ia64_split_tmode -O2, quadmath
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57491 --- Comment #1 from Kirill Yukhin --- Author: kyukhin Date: Thu Nov 14 08:33:21 2013 New Revision: 204777 URL: http://gcc.gnu.org/viewcvs?rev=204777&root=gcc&view=rev Log: PR target/57491 * config/ia64/ia64.c (ia64_split_tmode_move): Relax `dead' flag setting. Modified: trunk/gcc/ChangeLog trunk/gcc/config/ia64/ia64.c
[Bug target/57756] Function target attribute is retaining state of previously seen function
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57756 --- Comment #10 from Kirill Yukhin --- Author: kyukhin Date: Wed Nov 20 11:59:05 2013 New Revision: 205104 URL: http://gcc.gnu.org/viewcvs?rev=205104&root=gcc&view=rev Log: PR target/57756 * config/i386/i386.c (ix86_option_override_internal): Add missed argument prefix for 'ix86_fpmath'. * config/i386/ssemath.h: Add missed definition of TARGET_FPMATH_DEFAULT_P macros. Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.c trunk/gcc/config/i386/ssemath.h
[Bug target/51287] [4.7 regression] 252.eon compfail with -march=atom
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51287 --- Comment #1 from Kirill Yukhin 2011-11-25 09:46:31 UTC --- Author: kyukhin Date: Fri Nov 25 09:46:27 2011 New Revision: 181713 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181713 Log: PR target/51287 * i386.c (distance_non_agu_define): Fix insn attr check. Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.c
[Bug target/51287] [4.7 regression] 252.eon compfail with -march=atom
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51287 --- Comment #2 from Kirill Yukhin 2011-11-25 10:29:46 UTC --- Author: kyukhin Date: Fri Nov 25 10:29:42 2011 New Revision: 181714 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181714 Log: 2011-11-24 Enkovich Ilya PR target/51287 * i386.c (distance_non_agu_define): Fix insn attr check. Modified: branches/gcc-4_6-branch/gcc/ChangeLog branches/gcc-4_6-branch/gcc/config/i386/i386.c
[Bug target/51524] New: [BMI2] New regression on 182266 vs 182257
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51524

             Bug #: 51524
           Summary: [BMI2] New regression on 182266 vs 182257
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: kyuk...@gcc.gnu.org

Hi,
It seems we have a new regression on trunk:
FAIL: gcc.target/i386/bmi2-mulx32-1a.c scan-assembler-times bmi2_umulsidi3_1 1
FAIL: gcc.target/i386/bmi2-mulx32-2a.c scan-assembler-times mulx[ \\t]+[^\n]* 1
[Bug target/50038] redundant zero extensions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50038 --- Comment #8 from Kirill Yukhin 2011-12-21 11:52:32 UTC --- Author: kyukhin Date: Wed Dec 21 11:52:27 2011 New Revision: 182574 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=182574 Log: gcc/ 2011-12-21 Enkovich Ilya PR target/50038 * implicit-zee.c: Delete. * ree.c: New file. * Makefile.in: Replace implicit-zee.c with ree.c. * config/i386/i386.c (ix86_option_override_internal): Rename flag_zee to flag_ree. * common.opt (fzee): Ignored. (free): New. * passes.c (init_optimization_passes): Replace pass_implicit_zee with pass_ree. * tree-pass.h (pass_implicit_zee): Delete. (pass_ree): New. * timevar.def (TV_ZEE): Delete. (TV_REE): New. * doc/invoke.texi: Add -free description. gcc/testsuite/ 2011-12-21 Enkovich Ilya PR target/50038 Added: trunk/gcc/ree.c trunk/gcc/testsuite/gcc.dg/pr50038.c Removed: trunk/gcc/implicit-zee.c Modified: trunk/gcc/ChangeLog trunk/gcc/Makefile.in trunk/gcc/common.opt trunk/gcc/config/i386/i386.c trunk/gcc/doc/invoke.texi trunk/gcc/passes.c trunk/gcc/testsuite/ChangeLog trunk/gcc/timevar.def trunk/gcc/tree-pass.h
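As a rough illustration of what a "redundant zero extension" looks like at the source level, consider the sketch below. The function is hypothetical and is not the pr50038.c test. On x86-64, an instruction that writes a 32-bit register already clears the upper 32 bits of the full register, so a separate zero-extending move after the 32-bit add carries no new information and is a candidate for removal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: a 32-bit index is computed and then widened to
   64 bits for addressing. */
int load_next(const int *table, uint32_t i)
{
    assert(table != NULL);
    uint32_t next = i + 1;          /* 32-bit add, wraps modulo 2^32 */
    return table[(uint64_t)next];   /* zero extension to 64 bits */
}
```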
[Bug target/83828] FAIL: gcc.target/i386/avx512vpopcntdqvl-vpopcntq-1.c execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83828

Kirill Yukhin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
                 CC|                            |kyukhin at gcc dot gnu.org

--- Comment #6 from Kirill Yukhin ---
It looks like the avx512bw requirement in avx512bitalgintrin.h is excessive.
[Bug target/83828] FAIL: gcc.target/i386/avx512vpopcntdqvl-vpopcntq-1.c execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83828

--- Comment #7 from Kirill Yukhin ---
On the other hand, if a masked variant of vpopcnt[w,q] is issued, there is no way for reload to put a 32/64-bit mask into a mask register, since kmov[d,q] are only available under the -mavx512bw switch.
We can require the user to pass -mavx512bw along with -mavx512bitalg if she is going to use the masked variants of the corresponding intrinsics.
Then only the tests need to be fixed.
[Bug target/83828] FAIL: gcc.target/i386/avx512vpopcntdqvl-vpopcntq-1.c execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83828 --- Comment #8 from Kirill Yukhin --- Author: kyukhin Date: Tue Jan 30 08:21:22 2018 New Revision: 257173 URL: https://gcc.gnu.org/viewcvs?rev=257173&root=gcc&view=rev Log: Fix AVX-512BITALG test failures gcc/testsuite PR target/83828 * gcc.target/i386/avx512bitalg-vpopcntb-1.c: Fix test. * gcc.target/i386/avx512bitalg-vpopcntw-1.c: Ditto. * gcc.target/i386/avx512bitalgvl-vpopcntb-1.c: Ditto. * gcc.target/i386/avx512bitalgvl-vpopcntw-1.c: Ditto. Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntb-1.c trunk/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntw-1.c trunk/gcc/testsuite/gcc.target/i386/avx512bitalgvl-vpopcntb-1.c trunk/gcc/testsuite/gcc.target/i386/avx512bitalgvl-vpopcntw-1.c
[Bug target/83828] FAIL: gcc.target/i386/avx512vpopcntdqvl-vpopcntq-1.c execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83828

--- Comment #10 from Kirill Yukhin ---
HJ, I cannot reproduce this failure on a recent SDE. Here is what I have in gcc.log:

spawn -ignore SIGHUP /export/kyukhin/gcc/bld-svn/build-x86_64-linux/gcc/xgcc -B/export/kyukhin/gcc/bld-svn/build-x86_64-linux/gcc/ /export/kyukhin/gcc/svn/trunk/gcc/testsuite/gcc.target/i386/avx512bitalgvl-vpopcntb-1.c -fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -mavx512vl -mavx512bitalg -mavx512bw -lm -o ./avx512bitalgvl-vpopcntb-1.exe
PASS: gcc.target/i386/avx512bitalgvl-vpopcntb-1.c (test for excess errors)
Setting LD_LIBRARY_PATH to :/export/kyukhin/gcc/bld-svn/build-x86_64-linux/gcc:/export/kyukhin/gcc/bld-svn/build-x86_64-linux/gcc/32:
spawn /home/kyukhin/bin/dejagnu/sde-sim ./avx512bitalgvl-vpopcntb-1.exe
PASS: gcc.target/i386/avx512bitalgvl-vpopcntb-1.c execution test

I have also verified manually that the test PASSes and is not SKIPPED.
Could you please send some more info on the failure?
[Bug target/83828] FAIL: gcc.target/i386/avx512vpopcntdqvl-vpopcntq-1.c execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83828 --- Comment #12 from Kirill Yukhin --- Author: kyukhin Date: Mon Feb 12 06:14:15 2018 New Revision: 257579 URL: https://gcc.gnu.org/viewcvs?rev=257579&root=gcc&view=rev Log: Fix AVX-512 popcnt and bitalg tests. gcc/testsuite/ PR target/83828 * gcc.target/i386/avx512bitalg-vpopcntb-1.c: Fix test. * gcc.target/i386/avx512bitalg-vpopcntw-1.c: Ditto. * gcc.target/i386/avx512bitalg-vpshufbitqmb-1.c: Ditto. * gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c: Ditto. * gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c: Ditto. Modified: trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntb-1.c trunk/gcc/testsuite/gcc.target/i386/avx512bitalg-vpopcntw-1.c trunk/gcc/testsuite/gcc.target/i386/avx512bitalg-vpshufbitqmb-1.c trunk/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c trunk/gcc/testsuite/gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c
[Bug fortran/69524] New: [ICE] [F2008] Compiler segfaults on simple testcase @ -O0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69524

            Bug ID: 69524
           Summary: [ICE] [F2008] Compiler segfaults on simple testcase @ -O0
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kyukhin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 37501
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37501&action=edit
Reproducer

Attached testcase produces ICE while compiling w/ recent trunk:

$ ./build-x86_64-linux/gcc/gfortran -B./build-x86_64-linux/gcc -S 2.f08
f951: internal compiler error: in build_function_decl, at fortran/trans-decl.c:2065
0x88c1df build_function_decl
        /export/users/kyukhin/gcc/git/gcc2/gcc/fortran/trans-decl.c:2065
0x88ec53 gfc_create_function_decl(gfc_namespace*, bool)
        /export/users/kyukhin/gcc/git/gcc2/gcc/fortran/trans-decl.c:2758
0x86361d gfc_generate_module_code(gfc_namespace*)
        /export/users/kyukhin/gcc/git/gcc2/gcc/fortran/trans.c:2043
0x7f9d19 translate_all_program_units
        /export/users/kyukhin/gcc/git/gcc2/gcc/fortran/parse.c:5599
0x7fa3f7 gfc_parse_file()
        /export/users/kyukhin/gcc/git/gcc2/gcc/fortran/parse.c:5818
0x84c839 gfc_be_parse_file
        /export/users/kyukhin/gcc/git/gcc2/gcc/fortran/f95-lang.c:201
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
[Bug target/69118] Wrong condition in avx512f_maskcmp3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69118

Kirill Yukhin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2016-02-03
                 CC|                            |kyukhin at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from Kirill Yukhin ---
Will fix.
[Bug target/69120] sse2_shufpd_v2df_mask has wrong name
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69120 --- Comment #1 from Kirill Yukhin --- Will fix.
[Bug libfortran/69651] New: Usage of unitialized pointer io/list_read.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69651

            Bug ID: 69651
           Summary: Usage of unitialized pointer io/list_read.c
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libfortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kyukhin at gcc dot gnu.org
  Target Milestone: ---

Unfortunately I have no testcase.
But code itself looks awful to me:

/* Worker function to save a KIND=4 character to a string buffer,
   enlarging the buffer as necessary.  */

static void
push_char4 (st_parameter_dt *dtp, int c)
{
  gfc_char4_t *new, *p = (gfc_char4_t *) dtp->u.p.saved_string;

  if (p == NULL)
    {
      dtp->u.p.saved_string = xcalloc (SCRATCH_SIZE, sizeof (gfc_char4_t));
      dtp->u.p.saved_length = SCRATCH_SIZE;
      dtp->u.p.saved_used = 0;
      p = (gfc_char4_t *) dtp->u.p.saved_string;
    }

  if (dtp->u.p.saved_used >= dtp->u.p.saved_length)
    {
      dtp->u.p.saved_length = 2 * dtp->u.p.saved_length;
      p = xrealloc (p, dtp->u.p.saved_length * sizeof (gfc_char4_t));

      memset4 (new + dtp->u.p.saved_used, 0, // <-- ??? new==junk ???
               dtp->u.p.saved_length - dtp->u.p.saved_used);
    }

  p[dtp->u.p.saved_used++] = c;
}

It was introduced w/ r210948 (https://gcc.gnu.org/ml/fortran/2014-05/msg00149.html).
Before that new was [at least] initialized.
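For comparison, here is a minimal standalone sketch of the same grow-and-zero idea in which the new tail is cleared through the pointer that realloc actually returned. It is not the libgfortran fix: `struct char4_buf`, `char4_buf_push` and `INITIAL_SIZE` are hypothetical stand-ins for the saved_string/saved_length/saved_used fields and SCRATCH_SIZE, and plain calloc/realloc/memset replace xcalloc/xrealloc/memset4.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the buffer state kept in st_parameter_dt. */
struct char4_buf {
    uint32_t *data;    /* buffer of 4-byte characters */
    size_t    length;  /* allocated elements */
    size_t    used;    /* elements in use */
};

enum { INITIAL_SIZE = 4 };  /* assumed value; stands in for SCRATCH_SIZE */

/* Appends c, doubling the buffer when it is full.  Returns 0 on success
   and -1 if allocation fails (the buffer is left unchanged then). */
int char4_buf_push(struct char4_buf *b, uint32_t c)
{
    assert(b != NULL);

    if (b->data == NULL) {
        b->data = calloc(INITIAL_SIZE, sizeof *b->data);
        if (b->data == NULL)
            return -1;
        b->length = INITIAL_SIZE;
        b->used = 0;
    }

    if (b->used >= b->length) {
        size_t new_length = 2 * b->length;
        uint32_t *grown = realloc(b->data, new_length * sizeof *grown);
        if (grown == NULL)
            return -1;
        /* Zero the newly added tail via the reallocated pointer itself. */
        memset(grown + b->used, 0, (new_length - b->used) * sizeof *grown);
        b->data = grown;
        b->length = new_length;
    }

    b->data[b->used++] = c;
    return 0;
}
```

The point of the sketch is only that a single pointer is used for reallocation, clearing, and the final store, so there is no second variable that can be left uninitialized.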
[Bug libfortran/69651] Usage of unitialized pointer io/list_read.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69651 --- Comment #1 from Kirill Yukhin --- File is: libgfortran/io/list_read.c
[Bug target/69120] sse2_shufpd_v2df_mask has wrong name
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69120

--- Comment #2 from Kirill Yukhin ---
I looked at this closely. The name was chosen intentionally to simplify the "sse2_shufpd" expand.
If we want to fix this name, a new subst attribute needs to be introduced and
  if () emit_insn (avx512vl_... else emit_insn (sse2_...
inserted into the expand.

Apart from the expand, this template is never called by name.

So I would rather leave the name unchanged and keep things simple.
[Bug tree-optimization/69652] New: [6 Regression] [ICE] verify_ssa fail w/ -O2 -ffast-math -ftree-vectorize
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69652

            Bug ID: 69652
           Summary: [6 Regression] [ICE] verify_ssa fail w/ -O2 -ffast-math -ftree-vectorize
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kyukhin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 37569
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37569&action=edit
Reproducer

While building huge workload I've encountered ICE.
creduce finished w/ attached reproducer.

Reproduce:
$ ./gcc -O2 -ffast-math -ftree-vectorize -march=sandybridge repro.i
repro.i:1:1: warning: return type defaults to ‘int’ [-Wimplicit-int]
 fn1() {
 ^~~
repro.i: In function ‘fn1’:
repro.i:1:1: error: definition in block 8 does not dominate use in block 7
for SSA_NAME: .MEM_134 in statement:
# VUSE <.MEM_134>
_24 = *_13;
repro.i:1:1: internal compiler error: verify_ssa failed
0xd12f09 verify_ssa(bool, bool)
        /export/users/kyukhin/gcc/git/gcc/gcc/tree-ssa.c:1039
0xa52dad execute_function_todo
        /export/users/kyukhin/gcc/git/gcc/gcc/passes.c:1965
0xa5363b execute_todo
        /export/users/kyukhin/gcc/git/gcc/gcc/passes.c:2010
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.

Recent gcc-5-branch works fine w/ the case
[Bug target/69120] sse2_shufpd_v2df_mask has wrong name
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69120

Kirill Yukhin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |WONTFIX

--- Comment #4 from Kirill Yukhin ---
This makes the corresponding expand much simpler.
So I think this little naming inconsistency is preferable to additional checks in the define_expand.
[Bug target/69118] Wrong condition in avx512f_maskcmp3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69118 --- Comment #2 from Kirill Yukhin --- Author: kyukhin Date: Wed Feb 3 13:44:50 2016 New Revision: 233103 URL: https://gcc.gnu.org/viewcvs?rev=233103&root=gcc&view=rev Log: PR target/69118 gcc/ * config/i386/sse.md (define_insn "avx512f_maskcmp3"): Fix target. Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/sse.md
[Bug target/69118] Wrong condition in avx512f_maskcmp3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69118 --- Comment #3 from Kirill Yukhin --- Author: kyukhin Date: Wed Feb 3 13:48:27 2016 New Revision: 233104 URL: https://gcc.gnu.org/viewcvs?rev=233104&root=gcc&view=rev Log: PR target/69118. gcc/ * config/i386/sse.md (define_insn "avx512f_maskcmp3"): Fix target. Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/config/i386/sse.md
[Bug target/69118] Wrong condition in avx512f_maskcmp3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69118

Kirill Yukhin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from Kirill Yukhin ---
Fixed.
[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671

--- Comment #5 from Kirill Yukhin ---
(In reply to ktkachov from comment #3)
> CC'ing Kirill for AVX512 opinion

I suppose there is something wrong with the MD patterns.
For the provided example, the pattern is:
;; /export/users/kyukhin/gcc/git/gcc/gcc/config/i386/sse.md: 9199
(define_insn ("avx512vl_truncatev4siv4qi2_mask")
     [
        (set (match_operand:V16QI 0 ("register_operand") ("=v"))
            (vec_concat:V16QI (vec_merge:V4QI (truncate:V4QI (match_operand:V4SI 1 ("register_operand") ("v")))
                    (vec_select:V4QI (match_operand:V16QI 2 ("vector_move_operand") ("0C"))
                        (parallel [
                                (const_int 0 [0])
                                (const_int 1 [0x1])
                                (const_int 2 [0x2])
                                (const_int 3 [0x3])
                            ]))
                    (match_operand:QI 3 ("register_operand") ("Yk")))
                (const_vector:V12QI [
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])
                        (const_int 0 [0])

Right now I think that the predicate of the 2nd operand is not correct.
The operand should be either const0_rtx (of the corresponding mode) or a duplicate of operand 0 (the result, actually).
This is what the constraint says. However, the predicate simply says that the operand is either const0_rtx or nonimmediate: there is no connection with operand 0.
[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671 --- Comment #6 from Kirill Yukhin --- This bug seems to be mine.
[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671

Kirill Yukhin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |kyukhin at gcc dot gnu.org
[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671

--- Comment #8 from Kirill Yukhin ---
(In reply to Jakub Jelinek from comment #7)
> So do you want to use reg_or_0_operand? I don't think we usually tie output
> with input already in the predicates, except when match_dup is used.

That is the issue. reg_or_0_operand won't work (although it is better than "vector_move_operand", since it prohibits memory).

We want the 2nd operand to be either:
  1. const0_rtx
  2. match_dup 0

I cannot see in gcc/genpreds.c whether the predicate of one operand can refer to another operand.

We might invent some complicated subst, but the patterns look too complicated for that.

Maybe we could extend genpreds.c and friends with a new kind of predicate that takes (op, mode, operands) instead of (op, mode).
I am not sure about the amount of effort, though.

I really hope there is a simpler solution.
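The two acceptable forms of the 2nd operand listed above can be read as the two AVX-512 masking modes. The scalar model below is only a sketch of that reading: the function names are hypothetical, lanes are modelled as array elements, and nothing here is GCC code. With "match_dup 0" the unselected lanes keep the old destination value (merge-masking); with const0_rtx they are cleared (zero-masking).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Merge-masking model: lanes whose mask bit is clear keep the value the
   destination had before the operation (operand 2 is the destination). */
void masked_merge(int32_t *dst, const int32_t *src, uint8_t mask, size_t lanes)
{
    assert(dst != NULL && src != NULL && lanes <= 8);
    for (size_t i = 0; i < lanes; i++)
        if ((mask >> i) & 1u)
            dst[i] = src[i];
}

/* Zero-masking model: lanes whose mask bit is clear are set to zero
   (operand 2 is the zero constant). */
void masked_zero(int32_t *dst, const int32_t *src, uint8_t mask, size_t lanes)
{
    assert(dst != NULL && src != NULL && lanes <= 8);
    for (size_t i = 0; i < lanes; i++)
        dst[i] = ((mask >> i) & 1u) ? src[i] : 0;
}
```

In this model there is no third possibility: an arbitrary unrelated register as the fallback value is exactly what a predicate without a tie to operand 0 would wrongly admit.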
[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671

--- Comment #10 from Kirill Yukhin ---
(In reply to Jakub Jelinek from comment #9)
> But something like that might remove the flexibility from the register
> allocator.
>
> Wonder why the RA in this case doesn't see that the value loaded into that
> pseudo register is CONST0_RTX which satisfies the C constraint and doesn't
> undo CSE (rematerialize) in that case if it doesn't have that value already
> loaded in the matching register to the output one.

Then I see two options:
  1. Split all patterns into match_dup and 0_operand variants by hand.
  2. Implement a dedicated subst for such patterns, which will do item 1 while processing the MD.
I am not sure that it will be easy.
[Bug libfortran/69651] Usage of unitialized pointer io/list_read.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69651 --- Comment #4 from Kirill Yukhin --- Created attachment 37628 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37628&action=edit Reproducer input
[Bug libfortran/69651] Usage of unitialized pointer io/list_read.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69651 --- Comment #3 from Kirill Yukhin --- Created attachment 37627 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37627&action=edit Reproducer src Reproducer
[Bug libfortran/69651] Usage of unitialized pointer io/list_read.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69651

--- Comment #5 from Kirill Yukhin ---
A bug in the Fortran I/O runtime emerged on 21 Apr 2016, between r54 and r92; it looks like it is caused by the same revision, r71 (libgfortran/io/list_read.c), which probably just triggers another hidden bug.

Trying two builds (as of 21 and 22 Apr):

$ gfortran-20160421 -O0 T.f90 -static
$ ./a.out
 res, (1) ==1 ! ### Ok

$ gfortran-20160422 -O0 T.f90 -static
$ ./a.out
 res, (1) == 80 @p¼B ### FAIL - garbage is read in
[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671

--- Comment #14 from Kirill Yukhin ---
Okay, I've tried:

1. Ran AVX-512 testing on Spec2006 and saw no impact from the one-liner:
  Geomeans:
  INT : 5.11 5.11 +0.05%
  FP  : 2.73 2.73 -0.08%
  ALL : 3.54 3.54 -0.02%

2. Tried Uroš's proposal, adding a condition like this to the guilty pattern:
  "TARGET_AVX512VL
   && ((REG_P (operands[2])
        && REG_P (operands[0])
        && REGNO (operands[0]) == REGNO (operands[2]))
       || (operands[2] == CONST0_RTX (mode)))"
No success either. The problem is that a zero-masked built-in has a register as its second source at expand, which is only later rematerialized to zero. So setting this condition leads to an ICE in recog at expand.

So, for v6 it looks like we need to remove the one-liner.

For v7 we need to extend define_subst a bit to allow multiple output patterns.
E.g. currently:
(define_subst "mask"
  [(set (match_operand:SUBST_V 0)
        (match_operand:SUBST_V 1))]
  "TARGET_AVX512F"
  [(set (match_dup 0)
        (vec_merge:SUBST_V
          (match_dup 1)
          (match_operand:SUBST_V 2 "vector_move_operand" "0C")
          (match_operand: 3 "register_operand" "Yk")))])

It would solve the problem if we had this instead:
(define_subst "mask"
  [(set (match_operand:SUBST_V 0)
        (match_operand:SUBST_V 1))]
  "TARGET_AVX512F"
  [(set (match_dup 0)
        (vec_merge:SUBST_V
          (match_dup 1)
          (match_dup 0)
          (match_operand: 3 "register_operand" "Yk")))])
  [(set (match_dup 0)
        (vec_merge:SUBST_V
          (match_dup 1)
          (match_operand:SUBST_V 2 "const0_operand" "C")
          (match_operand: 3 "register_operand" "Yk")))])

Opinions?
[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671

--- Comment #21 from Kirill Yukhin ---
I am going to fix the issue in v7 for sure. But from the current point of view this is going to be a large pattern refactoring, and hence the patch will be thousands of lines.
If this might be ported, I can put an XFAIL on the tests.
[Bug target/69671] [6 Regression] FAIL: gcc.target/i386/avx512vl-vpmovqb-1.c scan-assembler-times vpmovqb[ \\t]+[^{\n]*%ymm[0-9]+[^\n]*%xmm[0-9]+{%k[1-7]}{z}(?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671

--- Comment #24 from Kirill Yukhin ---
(In reply to rguent...@suse.de from comment #23)
> On Wed, 17 Feb 2016, jakub at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69671
> >
> > --- Comment #22 from Jakub Jelinek ---
> > Created attachment 37722 [details]
> >   --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37722&action=edit
> > gcc6-pr69671.patch
> >
> > Actually, on a closer look, I believe the only problem are the patterns that
> > use a vector_move_operand "0C" inside of vec_select with only constants as the
> > parallel's operands.  Because fwprop is able to propagate constants into
> > instructions (thus undo the CSE effect), but doesn't do anything on these,
> > because it also simplifies them, so instead of the expected say
> >                     (vec_select:V4QI (const_vector:V16QI [
> >                                 (const_int 0 [0])
> >                                 (const_int 0 [0])
> >                                 (const_int 0 [0])
> >                                 (const_int 0 [0])
> >                                 (const_int 0 [0])
> >                                 (const_int 0 [0])
> >                                 (const_int 0 [0])
> >                                 (const_int 0 [0])
> >                                 (const_int 0 [0])
> >                                 (const_int 0 [0])
> >                                 (const_int 0 [0])
> >                                 (const_int 0 [0])
> >                                 (const_int 0 [0])
> >                                 (const_int 0 [0])
> >                                 (const_int 0 [0])
> >                                 (const_int 0 [0])
> >                             ])
> >                         (parallel [
> >                                 (const_int 0 [0])
> >                                 (const_int 1 [0x1])
> >                                 (const_int 2 [0x2])
> >                                 (const_int 3 [0x3])
> >                             ]))
> > we get in there simplified:
> >                     (const_vector:V4QI [
> >                             (const_int 0 [0])
> >                             (const_int 0 [0])
> >                             (const_int 0 [0])
> >                             (const_int 0 [0])
> >                         ])
> > So, by adding extra patterns for that simplification fwprop is able to do its
> > job even if CSE did a better job.
>
> Of course then I wonder why we didn't simplify this in the first place
> when generating RTL and need to wait for forwprop ...
>
> But yes, sounds like the easiest way to go forward.

Agree.
[Bug tree-optimization/69882] New: [6 regression] Excessive reduction statements generated by SLP
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69882

            Bug ID: 69882
           Summary: [6 regression] Excessive reduction statements generated by SLP
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kyukhin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 37743
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37743&action=edit
Reproducer

Hello,
Attached test case emits wrong reduction statements.

Compile:
$ trunk/64/20160220/bin/gfortran -o repro -static -m64 -Ofast -mavx repro.f90

Execution ABORTs
Works fine when compiled w/ -O0

Extract from vectorizer dump:
  :
  # k_239 = PHI
  # c_I_lsm.10_241 = PHI
  # c_I_lsm.11_242 = PHI
  # vectp_a.47_406 = PHI
  # vect_M.50_410 = PHI
  # ivtmp_420 = PHI <0(48), ivtmp_421(56)>
  _245 = (integer(kind=8)) k_239;
  _246 = _245 * 4;
  _247 = _246 + -4;
  _248 = *a_22(D)[_247];
  M.0_249 = MAX_EXPR <_248, c_I_lsm.10_241>;
  _250 = _246 + -3;
  vect__248.49_408 = MEM[(real(kind=8) *)vectp_a.47_406];  <-- SLP
  vectp_a.47_409 = vectp_a.47_406 + 32;
  _251 = *a_22(D)[_250];
  vect_M.50_411 = MAX_EXPR ;  <-- SLP
  M.0_252 = MAX_EXPR <_251, c_I_lsm.11_242>;
  k_266 = k_239 + 1;
  vectp_a.47_407 = vectp_a.47_409 + 32;  <-- SLP
  ivtmp_421 = ivtmp_420 + 1;
  if (ivtmp_421 >= bnd.44_361)
    goto ;
  else
    goto ;

  :
  ... # REMAINDER
  k_377 = k_365 + 1;
  if (k_365 == 26)
    goto ;
  else
    goto ;

  :
  goto ;

  :
  # k_381 = PHI
  # c_I_lsm.10_384 = PHI
  # c_I_lsm.11_386 = PHI
  # c_I_lsm.13_389 = PHI
  # c_I_lsm.12_392 = PHI
  # vect_M.50_413 = PHI
  stmp_M.51_414 = BIT_FIELD_REF ;
  stmp_M.51_415 = BIT_FIELD_REF ;
  stmp_M.51_416 = BIT_FIELD_REF ;
  stmp_M.51_417 = BIT_FIELD_REF ;
  stmp_M.51_418 = MAX_EXPR ; # <-- WHOT??
  stmp_M.51_419 = MAX_EXPR ; # <-- DITTO.
  _401 = (integer(kind=4)) ratio_mult_vf.45_364;
  tmp.46_400 = k.4_11 + _401;
  if (niters.42_358 == ratio_mult_vf.45_364)
    goto ;
  else
    goto ;

Those 2 SSA names are then stored to 1st and 2nd array elements
[Bug tree-optimization/69956] New: [ICE] Wrong vector type @ fold-const
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69956

Bug ID: 69956
Summary: [ICE] Wrong vector type @ fold-const
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: kyukhin at gcc dot gnu.org
Target Milestone: ---

Created attachment 37789
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37789&action=edit
Reproducer

Hello,
The attached testcase produces an ICE when compiled as follows:
gcc -S -O2 -march=skylake-avx512 repro.i -ftree-vectorize

I have observed the ICE since 02.02.2016.

/nfs/ims/home/kyukhin/repro.i:2:1: internal compiler error: tree check: expected vector_type, have integer_type in const_unop, at fold-const.c:1665
 fn1() {
 ^~~
0xda1f9c tree_check_failed(tree_node const*, char const*, int, char const*, ...)
        /export/users/gnutester/stability/svn/trunk/gcc/tree.c:9637
0x860742 tree_check(tree_node*, char const*, int, char const*, tree_code)
        /export/users/gnutester/stability/svn/trunk/gcc/tree.h:3006
0x860742 const_unop(tree_code, tree_node*, tree_node*)
        /export/users/gnutester/stability/svn/trunk/gcc/fold-const.c:1665
0xe7f639 gimple_resimplify1(gimple**, code_helper*, tree_node*, tree_node**, tree_node* (*)(tree_node*))
        /export/users/gnutester/stability/svn/trunk/gcc/gimple-match-head.c:85
0xee84b3 gimple_simplify(gimple*, code_helper*, tree_node**, gimple**, tree_node* (*)(tree_node*), tree_node* (*)(tree_node*))
        /export/users/gnutester/stability/svn/trunk/gcc/gimple-match-head.c:622
0x8a0933 gimple_fold_stmt_to_constant_1(gimple*, tree_node* (*)(tree_node*), tree_node* (*)(tree_node*))
        /export/users/gnutester/stability/svn/trunk/gcc/gimple-fold.c:4981
0xc409d2 back_propagate_equivalences
        /export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-dom.c:881
0xc409d2 record_temporary_equivalences(edge_def*, const_and_copies*, avail_exprs_stack*)
        /export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-dom.c:963
0xd0663a thread_through_normal_block
        /export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-threadedge.c:858
0xd07a22 thread_across_edge(gcond*, edge_def*, bool, const_and_copies*, avail_exprs_stack*, tree_node* (*)(gimple*, gimple*, avail_exprs_stack*))
        /export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-threadedge.c:1005
0xc404c0 dom_opt_dom_walker::thread_across_edge(edge_def*)
        /export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-dom.c:989
0xc406eb dom_opt_dom_walker::after_dom_children(basic_block_def*)
        /export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-dom.c:1423
0x11a47a7 dom_walker::walk(basic_block_def*)
        /export/users/gnutester/stability/svn/trunk/gcc/domwalk.c:307
0xc432a0 execute
        /export/users/gnutester/stability/svn/trunk/gcc/tree-ssa-dom.c:614
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.

I suspect scalar masks.
[Bug tree-optimization/69980] New: [6 regression] Supposedly wrong SLP code emitted
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69980

Bug ID: 69980
Summary: [6 regression] Supposedly wrong SLP code emitted
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: kyukhin at gcc dot gnu.org
Target Milestone: ---

Created attachment 37806
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37806&action=edit
Reproducer

Hello,
The attached test fails at run time when compiled as follows:
$ gfortran -m64 -Ofast repro.f90 -msse

When compiled with -O2, it works fine.
The second loop nest is just for verification.

The issue lives here:
  mumax = 0;
  do k=1,26
    do i=1,3
      mumax(i) = max(mumax(i), mu(i,k)+mu(i,k))
    end do
  end do

It looks like SLP emits some wrong permutations here.
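For reference, the loop nest quoted in the report is a per-lane max reduction. The following is only a scalar C sketch of the same computation, useful as a baseline when checking vectorized output. The function name `mumax_reference`, the `float` element type, and the flat column-major layout are assumptions made for illustration; none of them are taken from the attached reproducer.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar reference for the reduction
     mumax(i) = max(mumax(i), mu(i,k) + mu(i,k))
   mu is stored column-major as in Fortran: element (i,k) lives at
   mu[k * 3 + i] with zero-based i and k.  The names and the float
   element type are assumptions made for this sketch.  */
static void
mumax_reference (const float *mu, size_t ncols, float mumax[3])
{
  assert (mu != NULL && mumax != NULL);

  /* mumax = 0  */
  for (size_t i = 0; i < 3; i++)
    mumax[i] = 0.0f;

  for (size_t k = 0; k < ncols; k++)
    for (size_t i = 0; i < 3; i++)
      {
        float v = mu[k * 3 + i] + mu[k * 3 + i];
        if (v > mumax[i])
          mumax[i] = v;
      }
}
```

Each of the three lanes carries its own running maximum across `k`, so a correct SLP transformation must keep the lanes separate until the final store.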
[Bug target/70028] Error: operand size mismatch for `kmovw' (wrong assembly generated) with -mavx512bw -masm=intel
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70028 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2016-03-01 CC||kyukhin at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #2 from Kirill Yukhin --- Confirmed. The issue is that the operand modifier used in the .md file is %k1, which stands for SI mode. The operand should be either a 32-bit register or 16-bit memory, i.e. %ebx or a WORD memory operand.
[Bug target/70028] Error: operand size mismatch for `kmovw' (wrong assembly generated) with -mavx512bw -masm=intel
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70028

--- Comment #4 from Kirill Yukhin ---
(In reply to Jakub Jelinek from comment #3)
> Created attachment 37835 [details]
> gcc6-pr70028.patch
>
> So what about this patch then? I don't see kmov* used with %k in other
> patterns, where "m" could appear.

Hi Jakub,
the patch looks fine to me.
[Bug target/70293] [ICE, AVX-512] Wrong reg constraints in vec_dup
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293 --- Comment #3 from Kirill Yukhin --- Regtest is in progress
[Bug target/70293] [ICE, AVX-512] Wrong reg constraints in vec_dup
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293 --- Comment #2 from Kirill Yukhin --- Created attachment 38020 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38020&action=edit Proposed patch The attached patch solves the issue by blocking the AVX2 broadcast pattern alternative $r->Yi, which is the subject of split2.
[Bug target/70293] [ICE, AVX-512] Wrong reg constraints in vec_dup
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293

--- Comment #1 from Kirill Yukhin ---
We've got duplicated patterns (make mddump):

;; /export/users/kyukhin/gcc/git/gcc2/gcc/config/i386/sse.md: 17107
(define_insn ("avx2_pbroadcastv8hi")
     [
        (set (match_operand:V8HI 0 ("register_operand") ("=x"))
            (vec_duplicate:V8HI (vec_select:HI (match_operand:V8HI 1 ("nonimmediate_operand") ("xm"))
                    (parallel [
                            (const_int 0 [0])
                        ]))))
    ] ("TARGET_AVX2") ("vpbroadcastw\t{%1, %0|%0, %w1}")
     [
        (set_attr ("type") ("ssemov"))
        (set_attr ("prefix_extra") ("1"))
        (set_attr ("prefix") ("vex"))
        (set_attr ("mode") ("TI"))
    ])
...
(define_insn ("avx512vl_vec_dupv8hi")
     [
        (set (match_operand:V8HI 0 ("register_operand") ("=v"))
            (vec_duplicate:V8HI (vec_select:HI (match_operand:V8HI 1 ("nonimmediate_operand") ("vm"))
                    (parallel [
                            (const_int 0 [0])
                        ]))))
    ] ("(TARGET_AVX512BW) && (TARGET_AVX512VL)") ("vpbroadcastw\t{%1, %0|%0, %1}")
     [
        (set_attr ("type") ("ssemov"))
        (set_attr ("prefix") ("evex"))
        (set_attr ("mode") ("TI"))
    ])

That's why we get unsatisfied constraints on xmmN with N > 15.
[Bug target/70293] New: [ICE, AVX-512] Wrong reg constraints in vec_dup
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293

Bug ID: 70293
Summary: [ICE, AVX-512] Wrong reg constraints in vec_dup
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: kyukhin at gcc dot gnu.org
Target Milestone: ---

Created attachment 38018
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38018&action=edit
Reproducer

The attached testcase ICEs when compiled as:
./xgcc -B. -mtune=broadwell -mavx512vl -O2 -S ~/pixman-sse.i

1_0.32.6-r0/pixman-0.32.6/pixman/pixman-sse2.c: In function ‘fast_composite_scaled_bilinear_sse2__8__none_OVER’:
/home/donn/c/8.x/wrl-projects/intel-skylake-standard-glibc_std/bitbake_build/tmp/work/skylake-avx512-64-wrs-linux/pixman/1_0.32.6-r0/pixman-0.32.6/pixman/pixman-sse2.c:6059:1: error: insn does not satisfy its constraints:
(insn 5050 5049 1065 58 (set (reg/v:V8HI 56 xmm19 [orig:670 D.27517 ] [670])
        (vec_duplicate:V8HI (vec_select:HI (reg/v:V8HI 56 xmm19 [orig:670 D.27517 ] [670])
                (parallel [
                        (const_int 0 [0])
                    ])))) /home/donn/c/8.x/wrl-projects/intel-skylake-standard-glibc_std/bitbake_build/tmp/sysroots/x86_64-linux/usr/lib/x86_64-wrs-linux/gcc/x86_64-wrs-linux/5.2.0/include/emmintrin.h:606 4153 {avx2_pbroadcastv8hi}
     (nil))
/home/donn/c/8.x/wrl-projects/intel-skylake-standard-glibc_std/bitbake_build/tmp/work/skylake-avx512-64-wrs-linux/pixman/1_0.32.6-r0/pixman-0.32.6/pixman/pixman-sse2.c:6059:1: internal compiler error: in extract_constrain_insn, at recog.c:2190
0xdaccab _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
        /export/users/kyukhin/gcc/git/gcc2/gcc/rtl-error.c:108
0xdacd0b _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
        /export/users/kyukhin/gcc/git/gcc2/gcc/rtl-error.c:119
0xd50b31 extract_constrain_insn(rtx_insn*)
        /export/users/kyukhin/gcc/git/gcc2/gcc/recog.c:2190
0xd5f1d3 copyprop_hardreg_forward_1
        /export/users/kyukhin/gcc/git/gcc2/gcc/regcprop.c:774
0xd60afe execute
        /export/users/kyukhin/gcc/git/gcc2/gcc/regcprop.c:1280
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325 Kirill Yukhin changed: What|Removed |Added CC||kyukhin at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |kyukhin at gcc dot gnu.org --- Comment #1 from Kirill Yukhin --- Reproducible with: ./xg++ -B. -O2 -S 1.c
[Bug target/70293] [ICE, AVX-512] Wrong reg constraints in vec_dup
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293 --- Comment #4 from Kirill Yukhin --- Author: kyukhin Date: Mon Mar 21 10:51:04 2016 New Revision: 234363 URL: https://gcc.gnu.org/viewcvs?rev=234363&root=gcc&view=rev Log: PR target/70293 gcc/ * config/i386 (define_insn "*vec_dup"/AVX2): Block third alternative for AVX-512VL target, gcc/testsuite/ * gcc.target/i386/pr70293.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr70293.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/sse.md trunk/gcc/testsuite/ChangeLog
[Bug target/70293] [ICE, AVX-512] Wrong reg constraints in vec_dup
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293 --- Comment #5 from Kirill Yukhin --- Author: kyukhin Date: Mon Mar 21 10:53:50 2016 New Revision: 234364 URL: https://gcc.gnu.org/viewcvs?rev=234364&root=gcc&view=rev Log: PR target/70293. gcc/ * config/i386 (define_insn "*vec_dup"/AVX2): Block third alternative for AVX-512VL target, gcc/testsuite/ * gcc.target/i386/pr70293.c: New test. Added: branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr70293.c Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/config/i386/sse.md branches/gcc-5-branch/gcc/testsuite/ChangeLog
[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325

--- Comment #2 from Kirill Yukhin ---
I am testing this patch:

commit e88ceeabc50634012fa21f47625934d9a2c2e160
Author: Kirill Yukhin
Date: Mon Mar 21 14:28:58 2016 +0300

    AVX-512. Fix PR70325.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 3d8dbc4..2c56ee7 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -32431,7 +32431,7 @@ def_builtin (HOST_WIDE_INT mask, const char *name,
       mask &= ~OPTION_MASK_ISA_64BIT;
       if (mask == 0
-          || (mask & ix86_isa_flags) != 0
+          || (mask & ix86_isa_flags) == mask
           || (lang_hooks.builtin_function
               == lang_hooks.builtin_function_ext_scope))
diff --git a/gcc/testsuite/gcc.target/i386/pr70325.c b/gcc/testsuite/gcc.target/i386/pr70325.c
new file mode 100644
index 000..e2b9342
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr70325.c
@@ -0,0 +1,12 @@
+/* PR target/70325 */
+/* { dg-do compile } */
+/* { dg-options "-mavx512vl -O2" } */
+
+typedef char C __attribute((__vector_size__(32)));
+typedef int I __attribute((__vector_size__(32)));
+
+void
+f(int a,I b)
+{
+  __builtin_ia32_storedquqi256_mask((C*)f,(C)b,a); /* { dg-warning "implicit declaration of function" } */
+}
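The one-line change in `def_builtin` replaces an "any required bit is enabled" test with an "all required bits are enabled" test. Below is a small standalone sketch of that difference. The flag constants `ISA_AVX512BW` and `ISA_AVX512VL` are hypothetical values chosen for illustration, not GCC's `OPTION_MASK_ISA_*` definitions, and the example assumes a builtin that needs both extensions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ISA flag bits, for illustration only.  */
#define ISA_AVX512BW (UINT64_C (1) << 0)
#define ISA_AVX512VL (UINT64_C (1) << 1)

/* Old test: accept the builtin if ANY required ISA bit is enabled.  */
static bool
any_isa_enabled (uint64_t required, uint64_t enabled)
{
  return (required & enabled) != 0;
}

/* New test: accept the builtin only if ALL required ISA bits are enabled.  */
static bool
all_isa_enabled (uint64_t required, uint64_t enabled)
{
  return (required & enabled) == required;
}

/* A builtin assumed to need both extensions, with only one enabled.  */
static void
check_example (void)
{
  uint64_t required = ISA_AVX512BW | ISA_AVX512VL;
  uint64_t enabled = ISA_AVX512VL;

  assert (any_isa_enabled (required, enabled));   /* old test: accepted */
  assert (!all_isa_enabled (required, enabled));  /* new test: rejected */
}
```

With the stricter test the builtin is not declared when only part of its ISA requirement is enabled, which is consistent with the "implicit declaration of function" warning the new testcase expects under `-mavx512vl` alone.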
[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #3 from Kirill Yukhin --- Done.
[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325 Kirill Yukhin changed: What|Removed |Added Status|RESOLVED|ASSIGNED Last reconfirmed||2016-03-22 Resolution|FIXED |--- Ever confirmed|0 |1 --- Comment #4 from Kirill Yukhin --- Sorry, closed by mistake
[Bug target/70293] [ICE, AVX-512] Wrong reg constraints in vec_dup
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70293 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #6 from Kirill Yukhin --- Done
[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325 --- Comment #5 from Kirill Yukhin --- Author: kyukhin Date: Tue Mar 22 11:09:03 2016 New Revision: 234395 URL: https://gcc.gnu.org/viewcvs?rev=234395&root=gcc&view=rev Log: PR target/70325 gcc/ * config/i386/i386.c (def_builtin): Handle OPTION_MASK_ISA_AVX512VL to be and-ed with other bits. (const struct builtin_description bdesc_special_args[]): Remove duplicate ISA bits. gcc/testsuite/ * gcc.target/i386/pr70325.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr70325.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.c trunk/gcc/testsuite/ChangeLog
[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325 --- Comment #6 from Kirill Yukhin --- Author: kyukhin Date: Tue Mar 22 11:13:44 2016 New Revision: 234396 URL: https://gcc.gnu.org/viewcvs?rev=234396&root=gcc&view=rev Log: PR target/70325. gcc/ * config/i386/i386.c (def_builtin): Handle OPTION_MASK_ISA_AVX512VL to be and-ed with other bits. (const struct builtin_description bdesc_special_args[]): Remove duplicate ISA bits. gcc/testsuite/ * gcc.target/i386/pr70325.c: New test. Added: branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr70325.c Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/config/i386/i386.c branches/gcc-5-branch/gcc/testsuite/ChangeLog
[Bug target/70325] ICE on __builtin_ia32_storedquqi256_mask
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70325 Kirill Yukhin changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #7 from Kirill Yukhin --- Done.
[Bug target/70406] ICE: in extract_insn, at recog.c:2287 (unrecognizable insn) with -mtune=pentium2 -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70406 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2016-03-25 Ever confirmed|0 |1 --- Comment #2 from Kirill Yukhin --- Reproduced.
[Bug target/70406] ICE: in extract_insn, at recog.c:2287 (unrecognizable insn) with -mtune=pentium2 -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70406 --- Comment #3 from Kirill Yukhin --- Created attachment 38095 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38095&action=edit Bootstrapped/regtested patch Will submit to gcc-patches shortly
[Bug target/70406] ICE: in extract_insn, at recog.c:2287 (unrecognizable insn) with -mtune=pentium2 -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70406 --- Comment #4 from Kirill Yukhin --- Author: kyukhin Date: Mon Mar 28 07:59:44 2016 New Revision: 234500 URL: https://gcc.gnu.org/viewcvs?rev=234500&root=gcc&view=rev Log: PR target/70406 gcc/ * config/i386/i386.md (define_split, andn): Fix modes. gcc/testsuite/ * gcc.target/i386/pr70406.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr70406.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.md trunk/gcc/testsuite/ChangeLog
[Bug target/70406] ICE: in extract_insn, at recog.c:2287 (unrecognizable insn) with -mtune=pentium2 -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70406 --- Comment #5 from Kirill Yukhin --- Author: kyukhin Date: Mon Mar 28 08:01:56 2016 New Revision: 234501 URL: https://gcc.gnu.org/viewcvs?rev=234501&root=gcc&view=rev Log: PR target/70406. gcc/ * config/i386/i386.md (define_split, andn): Fix modes. gcc/testsuite/ * gcc.target/i386/pr70406.c: New test. Added: branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr70406.c Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/config/i386/i386.md branches/gcc-5-branch/gcc/testsuite/ChangeLog
[Bug target/70406] ICE: in extract_insn, at recog.c:2287 (unrecognizable insn) with -mtune=pentium2 -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70406 Kirill Yukhin changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #6 from Kirill Yukhin --- Done.
[Bug target/70429] Wrong code with -O1.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70429

Kirill Yukhin changed:
What|Removed |Added
CC||kyukhin at gcc dot gnu.org

--- Comment #4 from Kirill Yukhin ---
It seems the combiner performs an invalid reassociation.
This trivial addition to Jakub's PR70222 fix makes the test work:

--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -10526,7 +10526,7 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
           {
             /* For ((unsigned) (cstULL >> count)) >> cst2 we have to make
                sure the result will be masked. See PR70222. */
-            if (code == LSHIFTRT
+            if ((code == LSHIFTRT || code == ASHIFTRT)
                 && mode != result_mode
                 && !merge_outer_ops (&outer_op, &outer_const, AND,
                                      GET_MODE_MASK (result_mode)
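The comment in the hunk refers to expressions of the form `((unsigned) (cstULL >> count)) >> cst2`. The sketch below shows, for the logical-shift case only, why folding the two shifts into one needs a mask. The helper names and the example constant are invented for this illustration and do not correspond to anything in combine.c.

```c
#include <assert.h>
#include <stdint.h>

/* Shift in the wide mode, narrow to 32 bits, then shift again.  */
static uint32_t
shift_narrow_shift (uint64_t c, unsigned count, unsigned cst2)
{
  assert (count < 64 && cst2 < 32);
  return (uint32_t) (c >> count) >> cst2;
}

/* Naive merge: one wide shift by count + cst2, then narrow.  Bits that
   the intermediate narrowing would have discarded leak into the top of
   the result.  */
static uint32_t
merged_unmasked (uint64_t c, unsigned count, unsigned cst2)
{
  assert (count + cst2 < 64 && cst2 < 32);
  return (uint32_t) (c >> (count + cst2));
}

/* Merge plus mask: clearing the top cst2 bits restores the value of
   the original two-step computation.  */
static uint32_t
merged_masked (uint64_t c, unsigned count, unsigned cst2)
{
  return merged_unmasked (c, count, cst2) & (UINT32_MAX >> cst2);
}
```

For an arithmetic shift the narrowed intermediate value is sign-extended rather than zero-extended, so a merged shift presumably needs a comparable correction there as well, which is the case the proposed change adds.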
[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453 Kirill Yukhin changed: What|Removed |Added CC||kyukhin at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |kyukhin at gcc dot gnu.org --- Comment #1 from Kirill Yukhin --- Will look
[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2016-03-30 Ever confirmed|0 |1 --- Comment #2 from Kirill Yukhin --- Confirmed
[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453 --- Comment #3 from Kirill Yukhin --- Created attachment 38133 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38133&action=edit Proposed patch I am reg-testing trivial patch
[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453 Kirill Yukhin changed: What|Removed |Added Attachment #38133|0 |1 is obsolete|| --- Comment #5 from Kirill Yukhin --- Created attachment 38135 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38135&action=edit The patch Woops, this one.
[Bug tree-optimization/70479] New: FMA is not reassociated causing x2 slowdown vs. ICC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70479

Bug ID: 70479
Summary: FMA is not reassociated causing x2 slowdown vs. ICC
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: kyukhin at gcc dot gnu.org
Target Milestone: ---

Created attachment 38146
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38146&action=edit
Reproducer

The attached example demonstrates the issue. GCC is recent trunk. ICC is v16.

Compile:
GCC: g++ -march=haswell -Ofast -flto -fopenmp-simd -fpermissive m.cpp -o m.gcc
ICC: icpc -O3 -ipo -fpermissive -xAVX2 -qopenmp m.cpp -o m.icc

Run:
GCC: time ./m.gcc 2 2
ICC: time ./m.icc 2 2

Hot spot generated by GCC (annotated with perf hit counts):
  157 │8d0:┌─→vbroad 0x4(%r13),%zmm0
  193 ││ lea 0x1(%rdx),%edx
  173 ││ vmulps (%r14,%rax,1),%zmm0,%zmm0
 2943 ││ vbroad 0x60(%r13),%zmm1
  166 ││ vbroad 0x5c(%r13),%zmm2
  151 ││ vbroad 0x58(%r13),%zmm3
  144 ││ vbroad 0x54(%r13),%zmm4
  164 ││ vbroad 0x50(%r13),%zmm5
  170 ││ vbroad 0x4c(%r13),%zmm6
  162 ││ vbroad 0x48(%r13),%zmm7
  162 ││ vbroad 0x44(%r13),%zmm8
  154 ││ vbroad 0x40(%r13),%zmm9
  172 ││ vbroad 0x3c(%r13),%zmm10
  167 ││ vbroad 0x38(%r13),%zmm11
  172 ││ vbroad 0x34(%r13),%zmm12
  171 ││ vbroad 0x30(%r13),%zmm13
  161 ││ vbroad 0x2c(%r13),%zmm14
  176 ││ vbroad 0x28(%r13),%zmm15
  139 ││ vbroad 0x24(%r13),%zmm16
  180 ││ vbroad 0x20(%r13),%zmm17
  158 ││ vbroad 0x1c(%r13),%zmm18
  165 ││ vbroad 0x18(%r13),%zmm19
  140 ││ vbroad 0x10(%r13),%zmm21
  179 ││ vbroad 0xc(%r13),%zmm22
  146 ││ vbroad 0x8(%r13),%zmm23
  170 ││ vbroad 0x0(%r13),%zmm24
  170 ││ vbroad 0x14(%r13),%zmm20
  168 ││ vfmadd (%r15,%rax,1),%zmm24,%zmm0
 2732 ││ mov 0xb8(%rsp),%rcx
  172 ││ vfmadd (%r11,%rax,1),%zmm23,%zmm0
 1649 ││ vfmadd (%rsi,%rax,1),%zmm22,%zmm0
 3413 ││ vfmadd (%rcx,%rax,1),%zmm21,%zmm0
 3653 ││ mov 0xc0(%rsp),%rcx
  182 ││ vfmadd (%rcx,%rax,1),%zmm20,%zmm0
 2806 ││ mov 0xc8(%rsp),%rcx
  176 ││ vfmadd (%rcx,%rax,1),%zmm19,%zmm0
 2439 ││ mov 0xd0(%rsp),%rcx
  179 ││ vfmadd (%rcx,%rax,1),%zmm18,%zmm0
 2562 ││ mov 0xd8(%rsp),%rcx
  197 ││ vfmadd (%rcx,%rax,1),%zmm17,%zmm0
 2867 ││ mov 0xe0(%rsp),%rcx
  141 ││ vfmadd (%rcx,%rax,1),%zmm16,%zmm0
 3200 ││ mov 0xe8(%rsp),%rcx
  156 ││ vfmadd (%rcx,%rax,1),%zmm15,%zmm0
 3557 ││ mov 0xf0(%rsp),%rcx
  158 ││ vfmadd (%rcx,%rax,1),%zmm14,%zmm0
      ││ mov 0xf8(%rsp),%rcx
  143 ││ vfmadd (%rcx,%rax,1),%zmm13,%zmm0
 3004 ││ mov 0x100(%rsp),%rcx
  177 ││ vfmadd (%rcx,%rax,1),%zmm12,%zmm0
 2876 ││ mov 0x108(%rsp),%rcx
  144 ││ vfmadd (%rcx,%rax,1),%zmm11,%zmm0
 2838 ││ mov 0x110(%rsp),%rcx
  168 ││ vfmadd (%rcx,%rax,1),%zmm10,%zmm0
 2503 ││ mov 0x118(%rsp),%rcx
  203 ││ vfmadd (%rcx,%rax,1),%zmm9,%zmm0
 2471 ││ mov 0x120(%rsp),%rcx
  185 ││ vfmadd (%rcx,%rax,1),%zmm8,%zmm0
 2153 ││ mov 0x128(%rsp),%rcx
  152 ││ vfmadd (%r12,%rax,1),%zmm7,%zmm0
 2091 ││ vfmadd (%rbx,%rax,1),%zmm6,%zmm0
 3049 ││ vfmadd (%r10,%rax,1),%zmm5,%zmm0
 3737 ││ vfmadd (%r9,%rax,1),%zmm4,%zmm0
 3665 ││ vfmadd (%r8,%rax,1),%zmm3,%zmm0
 3627 ││ vfmadd (%rdi,%rax,1),%zmm2,%zmm0
 3804 ││ vfmadd (%rcx,%rax,1),%zmm1,%zmm0
 4052 ││ mov 0x130(%rsp),%rcx
  160 ││ cmp 0x138(%rsp),%edx
  534 ││ vmovup %zmm0,(%rcx,%rax,1)
 3235 ││ lea 0x40(%rax),%rax
  161 │└──jb 8d0

Hot spot generated by ICC (annotated with perf hit counts):
  344 │47a:┌─→vmulps 0x204c(%r11,%r14,4),%zmm27,%zmm2
  821 ││ vmulps 0x2050(%r11,%r14,4),%zmm4,%zmm1
  318 ││ vmulps 0x2040(%r11,%r14,4),%zmm6,%zmm29
  818 ││ vmulps 0x1840(%r11,%r14,4),%zmm7,%zmm31
  275 ││ vfmadd 0x1838(%r11,%r14,4),%zmm9,%zmm31
 1234 ││ vfmadd 0x183c(%r11,%r14,4),%zmm8,%zmm29
  442 ││ vfmadd 0x2044(%r11,%r14,4),%zmm5,%zmm2
 1110 ││ vfmadd 0x2048(%r11,%r14,4),%zmm28,%zmm1
  337 ││ vaddps %zmm29,%zmm31,%zmm0
 1047 ││ vaddps %zmm1,%zmm2,%zmm3
  655 ││ vmulps 0x1830(%r11,%r14,4),%zmm11,%zmm30
  956 ││ vmulps 0x1834(%r11,%r14,4),%zmm10,%zmm2
  296 ││ vmulps 0x1024(%r11,%r14,4),%zmm15,%zmm1
 1050 ││ vmulps 0x1028(%r11,%r14,4),%zmm14,%zmm31
  294 ││ vaddps %zmm0,%zmm3,%zmm3
 1057 ││ vfmadd 0x102c(%r11,%r14,4),%zmm13,%zmm30
  344 ││ vfmadd 0x1030(%r11,%r14,4),%zmm12,%zmm2
  911 ││ vfmadd 0x1020(%r11,%r14,4),%zmm16,%zmm31
  332 ││ vfmadd 0x820(%r11,%r14,4),%zmm17,%zmm1
[Bug tree-optimization/70479] FMA is not reassociated causing x2 slowdown vs. ICC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70479 --- Comment #1 from Kirill Yukhin --- (In reply to Kirill Yukhin from comment #0) > Compile: > GCC: g++ -march=haswell -Ofast -flto -fopenmp-simd -fpermissive m.cpp -o > m.gcc > ICC: icpc -O3 -ipo -fpermissive -xAVX2 -qopenmp m.cpp -o m.icc Correct compile commands (original are for Haswell) GCC: g++ -march=knl -Ofast -flto -fopenmp-simd -fpermissive m.cpp -o m.gcc ICC: icpc -O3 -ipo -fpermissive -xMIC-AVX512 -qopenmp m.cpp -o m.icc
[Bug tree-optimization/70479] FMA is not reassociated causing x2 slowdown vs. ICC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70479 --- Comment #3 from Kirill Yukhin --- (In reply to Richard Biener from comment #2) > You mean we fail to handle ternary associative tree codes in GIMPLE reassoc? > Yes, that's true. It's not going to be easy to retro-fit there > implementation-wise. With rebalancing you mean handling reassoc-width > 1? Hi Richard, yes to both.
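To illustrate what handling a reassociation width greater than 1 means for a multiply-add chain: in a serial chain every operation waits for the previous result, while splitting the sum over independent accumulators lets several multiply-adds be in flight at once. The following is a minimal sketch in plain C, not the reproducer's code. The function names are invented, and whether a compiler turns each `acc += a[i] * b[i]` into an FMA instruction depends on the target and on options such as `-ffp-contract`.

```c
#include <assert.h>
#include <stddef.h>

/* Serial chain: each step depends on the previous value of acc.  */
static float
dot_serial (const float *a, const float *b, size_t n)
{
  float acc = 0.0f;
  for (size_t i = 0; i < n; i++)
    acc += a[i] * b[i];
  return acc;
}

/* Reassociated into two independent chains that are combined at the
   end.  This changes the rounding order, so a compiler may only do it
   under fast-math style semantics.  */
static float
dot_two_chains (const float *a, const float *b, size_t n)
{
  assert (a != NULL && b != NULL);

  float acc0 = 0.0f, acc1 = 0.0f;
  size_t i = 0;
  for (; i + 1 < n; i += 2)
    {
      acc0 += a[i] * b[i];
      acc1 += a[i + 1] * b[i + 1];
    }
  if (i < n)  /* odd element count: fold the last product into acc0 */
    acc0 += a[i] * b[i];
  return acc0 + acc1;
}
```

The GCC hot spot in the report has the shape of `dot_serial` (every `vfmadd` accumulates into `%zmm0`), whereas the ICC listing mixes `vmulps`, `vfmadd`, and `vaddps` over several registers, which resembles the multi-chain form.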
[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453 --- Comment #6 from Kirill Yukhin --- Author: kyukhin Date: Thu Mar 31 15:23:29 2016 New Revision: 234634 URL: https://gcc.gnu.org/viewcvs?rev=234634&root=gcc&view=rev Log: Fix PR target/70453. gcc/ * config/i386/sse.md (define_mode_attr shuffletype): Fix typo. gcc/testsuite/ * gcc.target/i386/pr70453.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr70453.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/sse.md trunk/gcc/testsuite/ChangeLog
[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453 --- Comment #7 from Kirill Yukhin --- Author: kyukhin Date: Thu Mar 31 15:25:33 2016 New Revision: 234635 URL: https://gcc.gnu.org/viewcvs?rev=234635&root=gcc&view=rev Log: Fix PR target/70453. gcc/ * config/i386/sse.md (define_mode_attr shuffletype): Fix typo. gcc/testsuite/ * gcc.target/i386/pr70453.c: New test. Added: branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr70453.c Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/config/i386/sse.md branches/gcc-5-branch/gcc/testsuite/ChangeLog
[Bug target/70453] gcc generates invalid instruction vextractu64x4 (should be: vextracti64x4)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70453 Kirill Yukhin changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #8 from Kirill Yukhin --- Done.
[Bug target/70510] ICE: output_operand: invalid %-code with -mavx512bw -masm=intel when emitting vpbroatcast
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70510 Kirill Yukhin changed: What|Removed |Added CC||kyukhin at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |kyukhin at gcc dot gnu.org --- Comment #1 from Kirill Yukhin --- will take a look.
[Bug target/64393] ICE: in extract_insn, at recog.c:2327 (unrecognizable insn) with -mavx512vbmi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64393 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #4 from Kirill Yukhin --- Done
[Bug target/64387] ICE: in extract_insn, at recog.c:2327 (unrecognizable insn) with -ffloat-store -mavx512er
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64387 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED CC||kyukhin at gcc dot gnu.org Resolution|--- |FIXED --- Comment #6 from Kirill Yukhin --- Done
[Bug target/70510] ICE: output_operand: invalid %-code with -mavx512bw -masm=intel when emitting vpbroatcast
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70510 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2016-04-05 Ever confirmed|0 |1
[Bug target/70510] ICE: output_operand: invalid %-code with -mavx512bw -masm=intel when emitting vpbroatcast
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70510 --- Comment #3 from Kirill Yukhin --- (In reply to Uroš Bizjak from comment #2) > (In reply to Kirill Yukhin from comment #1) > > will take a look. > > I have patch in testing: > Oh, great! Thanks!
[Bug target/64386] ICE: in extract_insn, at recog.c:2327 (unrecognizable insn) with -mavx512bw
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64386 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #4 from Kirill Yukhin --- Done.
[Bug target/59683] ICE: in classify_argument, at config/i386/i386.c:6637 with #pragma GCC target("avx512f")
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59683

--- Comment #3 from Kirill Yukhin ---
This hunk from Jakub's fix for PR61925 makes the test work:

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index a41efa4..6aebaed 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -4962,6 +4962,15 @@ static GTY(()) tree ix86_previous_fndecl;
 void
 ix86_reset_previous_fndecl (void)
 {
+  tree new_tree = target_option_current_node;
+  cl_target_option_restore (&global_options, TREE_TARGET_OPTION (new_tree));
+  if (TREE_TARGET_GLOBALS (new_tree))
+    restore_target_globals (TREE_TARGET_GLOBALS (new_tree));
+  else if (new_tree == target_option_default_node)
+    restore_target_globals (&default_target_globals);
+  else
+    TREE_TARGET_GLOBALS (new_tree) = save_target_globals_default_opts ();
+
   ix86_previous_fndecl = NULL_TREE;
 }
[Bug tree-optimization/70577] [6 regression] tree-ssa/prefetch-5.c scan-tree-dump-times aprefetch failures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70577 Kirill Yukhin changed: What|Removed |Added CC||kyukhin at gcc dot gnu.org --- Comment #8 from Kirill Yukhin --- This commit caused a miscompare of spec2000/178.galgel with -march=skylake-avx512 (-Ofast -flto -funroll-loops):
Newton iteration # 0Maximal derivative = 0.1526E-07
Newton iteration # 0Maximal derivative = 0.3901E-07
[Bug target/70662] vpbroadcastq assemble failure with -masm=intel -mavx512vbmi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70662 Kirill Yukhin changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2016-04-14 Assignee|unassigned at gcc dot gnu.org |kyukhin at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Kirill Yukhin --- I'll take a look.
[Bug target/70662] vpbroadcastq assemble failure with -masm=intel -mavx512vbmi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70662 --- Comment #2 from Kirill Yukhin --- Author: kyukhin Date: Fri Apr 15 08:25:49 2016 New Revision: 235008 URL: https://gcc.gnu.org/viewcvs?rev=235008&root=gcc&view=rev Log: AVX-512. Fix mem operand modifier for Intel syntax. PR target/70662 gcc/ * config/i386/sse.md: Use proper memory operand modifiers. testsuite/gcc/ * gcc.target/i386/pr70662.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr70662.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/sse.md trunk/gcc/testsuite/ChangeLog
[Bug target/70662] vpbroadcastq assemble failure with -masm=intel -mavx512vbmi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70662 --- Comment #3 from Kirill Yukhin --- Author: kyukhin Date: Fri Apr 15 09:36:31 2016 New Revision: 235013 URL: https://gcc.gnu.org/viewcvs?rev=235013&root=gcc&view=rev Log: AVX-512. Use proper mem ops modifier for Intel syntax in broadcast patter. PR target/70662 gcc/ * config/i386/sse.md: Use proper memory operand modifiers. gcc/testsuite. * gcc.target/i386/pr70662.c: New test. Added: branches/gcc-5-branch/gcc/testsuite/gcc.target/i386/pr70662.c Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/config/i386/sse.md branches/gcc-5-branch/gcc/testsuite/ChangeLog
[Bug target/70662] vpbroadcastq assemble failure with -masm=intel -mavx512vbmi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70662 --- Comment #5 from Kirill Yukhin --- Author: kyukhin Date: Fri Apr 15 15:13:42 2016 New Revision: 235037 URL: https://gcc.gnu.org/viewcvs?rev=235037&root=gcc&view=rev Log: AVX-512, Fix mode size check. PR target/70662 gcc/ * config/i386/sse.md(define_insn "_vec_dup"): Fix mode size check. Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/config/i386/sse.md
[Bug target/70662] vpbroadcastq assemble failure with -masm=intel -mavx512vbmi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70662 --- Comment #6 from Kirill Yukhin --- Author: kyukhin Date: Fri Apr 15 15:17:31 2016 New Revision: 235038 URL: https://gcc.gnu.org/viewcvs?rev=235038&root=gcc&view=rev Log: AVX-512. Fix mode size check. PR target/70662 gcc/ * config/i386/sse.md(define_insn "_vec_dup"): Fix mode size check. Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/sse.md
[Bug target/70662] vpbroadcastq assemble failure with -masm=intel -mavx512vbmi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70662 Kirill Yukhin changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #8 from Kirill Yukhin --- Done
[Bug target/70728] GCC trunk emits invalid assembly for knl target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70728 Kirill Yukhin changed: What|Removed |Added Target||i?86/x86_64 Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2016-04-19 CC||kyukhin at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Kirill Yukhin --- I'll take a look.
[Bug target/70728] GCC trunk emits invalid assembly for knl target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70728 --- Comment #2 from Kirill Yukhin --- This is a 5/6 regression