[Bug target/70421] New: [5/6 Regression] wrong code with v16si vector and useless cast at -O -mavx512f

2016-03-27 Thread zsojka at seznam dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70421

Bug ID: 70421
   Summary: [5/6 Regression] wrong code with v16si vector and
useless cast at -O -mavx512f
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
Target: x86_64-pc-linux-gnu

Created attachment 38106
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38106&action=edit
reduced testcase

Output: (using emulation)
$ x86_64-pc-linux-gnu-gcc -O -mavx512f testcase.c
$ sde64 -- ./a.out 
1010
Aborted

$ x86_64-pc-linux-gnu-gcc -v 
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-234469-checking-yes-rtl-df-nographite/bin/../libexec/gcc/x86_64-pc-linux-gnu/6.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-checking=yes,rtl,df --without-cloog --without-ppl --without-isl
--disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-234469-checking-yes-rtl-df-nographite
Thread model: posix
gcc version 6.0.0 20160324 (experimental) (GCC) 

Tested revisions:
trunk r234469 - FAIL
5-branch r234412 - FAIL
4_9-branch r234243 - OK

[Bug target/70359] [6 Regression] Code size increase for ARM compared to gcc-5.3.0

2016-03-27 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70359

--- Comment #11 from kugan at gcc dot gnu.org ---
Optimized gimple diff between 5.3 and trunk is :

-;; Function inttostr (inttostr, funcdef_no=0, decl_uid=5268, cgraph_uid=0,
symbol_order=0)
+;; Function inttostr (inttostr, funcdef_no=0, decl_uid=4222, cgraph_uid=0,
symbol_order=0)

 Removing basic block 7
 Removing basic block 8
@@ -43,7 +43,7 @@
 goto ;

   :
-  p_22 = p_2 + 4294967294;
+  p_22 = p_16 + 4294967295;
   MEM[(char *)p_16 + 4294967295B] = 45;

   :
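
A note on the constants in the dump above: 4294967294 and 4294967295 are most likely the offsets -2 and -1 printed as unsigned 32-bit values, since pointer offsets are unsigned on this 32-bit target. A minimal C sketch of that reinterpretation follows; the helper name is hypothetical and is not taken from GCC.

```c
#include <stdint.h>

/* Reinterpret an unsigned 32-bit offset, as printed in the dump,
   as the signed displacement it encodes (two's complement). */
static int64_t signed_offset(uint32_t printed)
{
    if (printed >= UINT32_C(0x80000000))
        return (int64_t)printed - INT64_C(0x100000000);
    return (int64_t)printed;
}
```

Read this way, the trunk dump computes p_22 as p_16 - 1 where 5.3 computed p_2 - 2.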

[Bug target/70359] [6 Regression] Code size increase for ARM compared to gcc-5.3.0

2016-03-27 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70359

--- Comment #12 from kugan at gcc dot gnu.org ---
However, the cfgexpand dumps differ significantly:
 ;; Full RTL generated for this function:
 ;;
32: NOTE_INSN_DELETED
-   38: NOTE_INSN_BASIC_BLOCK 2
+   39: NOTE_INSN_BASIC_BLOCK 2
33: r151:SI=r0:SI
34: r152:SI=r1:SI
35: r153:SI=r2:SI
36: NOTE_INSN_FUNCTION_BEG
-   40: {r141:SI=abs(r151:SI);clobber cc:CC;}
-   41: r154:SI=r153:SI-0x1
-   42: r142:SI=r152:SI+r154:SI
-   43: r155:SI=0
-   44: r156:QI=r155:SI#0
-   45: [r142:SI]=r156:QI
-   61: L61:
-   46: NOTE_INSN_BASIC_BLOCK 4
-   47: r142:SI=r142:SI-0x1
-   48: r1:SI=0xa
-   49: r0:SI=r141:SI
-   50: r0:DI=call [`__aeabi_uidivmod'] argc:0
+   41: {r141:SI=abs(r151:SI);clobber cc:CC;}
+   42: r154:SI=r153:SI-0x1
+   43: r142:SI=r152:SI+r154:SI
+   44: r155:SI=0
+   45: r156:QI=r155:SI#0
+   46: [r142:SI]=r156:QI
+   81: pc=L62
+   82: barrier
+   84: L84:
+   83: NOTE_INSN_BASIC_BLOCK 4
+   37: r142:SI=r150:SI
+   62: L62:
+   47: NOTE_INSN_BASIC_BLOCK 5
+   48: r150:SI=r142:SI-0x1
+   49: r1:SI=0xa
+   50: r0:SI=r141:SI
+   51: r0:DI=call [`__aeabi_uidivmod'] argc:0
   REG_CALL_DECL `__aeabi_uidivmod'
   REG_EH_REGION 0x8000
-   51: r162:SI=r1:SI
+   52: r162:SI=r1:SI
   REG_EQUAL umod(r141:SI,0xa)
-   52: r163:QI=r162:SI#0
-   53: r164:SI=r163:QI#0+0x30
-   54: r165:QI=r164:SI#0
-   55: [r142:SI]=r165:QI
-   56: r1:SI=0xa
-   57: r0:SI=r141:SI
-   58: r0:SI=call [`__aeabi_uidiv'] argc:0
+   53: r163:QI=r162:SI#0
+   54: r164:SI=r163:QI#0+0x30
+   55: r165:QI=r164:SI#0
+   56: [r150:SI]=r165:QI
+   57: r1:SI=0xa
+   58: r0:SI=r141:SI
+   59: r0:SI=call [`__aeabi_uidiv'] argc:0
   REG_CALL_DECL `__aeabi_uidiv'
   REG_EH_REGION 0x8000
-   59: r169:SI=r0:SI
+   60: r169:SI=r0:SI
   REG_EQUAL udiv(r141:SI,0xa)
-   60: r141:SI=r169:SI
-   62: cc:CC=cmp(r141:SI,0)
-   63: pc={(cc:CC!=0)?L61:pc}
+   61: r141:SI=r169:SI
+   63: cc:CC=cmp(r141:SI,0)
+   64: pc={(cc:CC!=0)?L84:pc}
   REG_BR_PROB 9100
-   64: NOTE_INSN_BASIC_BLOCK 5
-   65: cc:CC=cmp(r151:SI,0)
-   66: pc={(cc:CC>=0)?L72:pc}
+   65: NOTE_INSN_BASIC_BLOCK 6
+   66: cc:CC=cmp(r151:SI,0)
+   67: pc={(cc:CC>=0)?L77:pc}
   REG_BR_PROB 6335
-   67: NOTE_INSN_BASIC_BLOCK 6
-   68: r149:SI=r142:SI-0x1
-   69: r170:SI=0x2d
-   70: r171:QI=r170:SI#0
-   71: [r142:SI-0x1]=r171:QI
-   37: r142:SI=r149:SI
-   72: L72:
-   73: NOTE_INSN_BASIC_BLOCK 7
-   74: r150:SI=r142:SI
+   68: NOTE_INSN_BASIC_BLOCK 7
+   69: r149:SI=r142:SI-0x2
+   70: r170:SI=0x2d
+   71: r171:QI=r170:SI#0
+   72: [r150:SI-0x1]=r171:QI
+   38: r150:SI=r149:SI
+   77: L77:
+   80: NOTE_INSN_BASIC_BLOCK 9
78: r0:SI=r150:SI
79: use r0:SI

[Bug bootstrap/67728] Build fails when cross-compiling with in-tree GMP and ISL

2016-03-27 Thread bernd.edlinger at hotmail dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67728

--- Comment #26 from Bernd Edlinger  ---
with unpatched trunk and mpfr-3.1.4 and mpc-1.0.3 in-tree

I've got this in mpc/src/libmpc.la:
dependency_libs=' -lmpfr /home/ed/gnu/gcc-build1/./gmp/.libs/libgmp.la -lm'

and check-mpc fails to build this:
libtool: link: /home/ed/gnu/gcc-build1/./prev-gcc/xgcc
-B/home/ed/gnu/gcc-build1/./prev-gcc/
-B/home/ed/gnu/install1/x86_64-pc-linux-gnu/bin/
-B/home/ed/gnu/install1/x86_64-pc-linux-gnu/bin/
-B/home/ed/gnu/install1/x86_64-pc-linux-gnu/lib/ -isystem
/home/ed/gnu/install1/x86_64-pc-linux-gnu/include -isystem
/home/ed/gnu/install1/x86_64-pc-linux-gnu/sys-include -g -O2 -static-libstdc++
-static-libgcc -o tabs tabs.o  ./.libs/libmpc-tests.a ../src/.libs/libmpc.a
-lmpfr /home/ed/gnu/gcc-build1/./gmp/.libs/libgmp.a -lm
/usr/bin/ld: cannot find -lmpfr

and with the patch this line in mpc/src/libmpc.la changed to:
dependency_libs=' /home/ed/gnu/gcc-build/./mpfr/.libs/libmpfr.la
/home/ed/gnu/gcc-build/./gmp/.libs/libgmp.la'

and the check-mpc succeeds on a plain x86_64-ubuntu14.04 with definitely no
gmp or mpfr libs installed

[Bug fortran/70235] [4.9/5/6 Regression] Incorrect output with PF format

2016-03-27 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70235

--- Comment #22 from Dominique d'Humieres  ---
Created attachment 38107
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38107&action=edit
New patch with test.

With the patch we now get for y=6431.25

ru,-8pf18.2 y=  0.01

IMO this is the correct rounding. Does someone disagree with that?

What tests should be removed/added from gfortran.dg/fmt_pf.f90?
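
The arithmetic behind the result quoted above can be sketched in C. This is a simplified model and not libgfortran code: it assumes a non-negative value, F editing with two decimals, a kP scale factor that multiplies the value by 10^k, and RU meaning rounding toward plus infinity. The function name is hypothetical.

```c
/* Number of hundredths printed for value y under scale factor k
   (the "kP" prefix) with two decimals and round-up (RU) mode.
   Simplified model: non-negative y only. */
static long hundredths_round_up(double y, int k)
{
    double scaled = y;
    for (int i = 0; i < k; i++)
        scaled *= 10.0;            /* positive scale factor */
    for (int i = 0; i > k; i--)
        scaled /= 10.0;            /* negative scale factor */
    double h = scaled * 100.0;     /* value expressed in hundredths */
    long t = (long)h;              /* truncate toward zero */
    if ((double)t < h)
        t++;                       /* RU: any remainder rounds up */
    return t;
}
```

For y=6431.25 and k=-8 the scaled value is about 0.0000643, which is a nonzero fraction of one hundredth, so rounding up gives one hundredth, i.e. 0.01.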

[Bug bootstrap/70422] New: [6 regression] Bootstrap comparison failure

2016-03-27 Thread sch...@linux-m68k.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70422

Bug ID: 70422
   Summary: [6 regression] Bootstrap comparison failure
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sch...@linux-m68k.org
CC: jason at gcc dot gnu.org
Blocks: 64266, 70353
  Target Milestone: ---
Target: aarch64-*-*, ia64-*-*

Both aarch64 and ia64 fail to bootstrap due to comparison failure.

a478a028f1e445c05b162236d708de6935d4b5e2 is the first bad commit
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@234484
138bc75d-0d04-0410-961f-82ee72b054a4


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64266
[Bug 64266] Can GCC produce local mergeable symbols for *.__FUNCTION__ and
*.__PRETTY_FUNCTION__ functions?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70353
[Bug 70353] [5/6 regression] ICE on __PRETTY_FUNCTION__ in a constexpr function

[Bug target/70416] [SH]: error: 'asm' operand requires impossible reload when building ruby2.3

2016-03-27 Thread olegendo at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70416

Oleg Endo  changed:

   What|Removed |Added

  Attachment #38105|0   |1
is obsolete||

--- Comment #13 from Oleg Endo  ---
Created attachment 38108
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38108&action=edit
reduced test case for -O2 -fpic

It seems it can be reduced even a bit further.

[Bug bootstrap/70422] [6 regression] Bootstrap comparison failure

2016-03-27 Thread sch...@linux-m68k.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70422

--- Comment #1 from Andreas Schwab  ---
@@ -1,5 +1,5 @@

-stage2-gcc/bitmap.o: file format elf64-littleaarch64
+stage3-gcc/bitmap.o: file format elf64-littleaarch64


 Disassembly of section .text:
@@ -4788,11 +4788,11 @@
  22c:  aa0003f8mov x24, x0
  230:  b5fff200cbnzx0, 70
<_ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv+0x70>
  234:  9002adrpx2, 0
<_ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv>
-   234: R_AARCH64_ADR_PREL_PG_HI21
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv.str1.8
+   234: R_AARCH64_ADR_PREL_PG_HI21
.rodata._ZN21mem_alloc_descriptionI12bitmap_usageEC2Ev.str1.8
  238:  9000adrpx0, 0
<_ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv>
238: R_AARCH64_ADR_PREL_PG_HI21
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE26find_empty_slot_for_expandEj.str1.8+0x20
  23c:  9142add x2, x2, #0x0
-   23c: R_AARCH64_ADD_ABS_LO12_NC 
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv.str1.8
+   23c: R_AARCH64_ADD_ABS_LO12_NC 
.rodata._ZN21mem_alloc_descriptionI12bitmap_usageEC2Ev.str1.8
  240:  9100add x0, x0, #0x0
240: R_AARCH64_ADD_ABS_LO12_NC 
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE26find_empty_slot_for_expandEj.str1.8+0x20
  244:  528051a1mov w1, #0x28d  // #653
@@ -4827,13 +4827,13 @@
  2a4:  f920str x0, [x1]
  2a8:  177ab   90
<_ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv+0x90>
  2ac:  9002adrpx2, 0
<_ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv>
-   2ac: R_AARCH64_ADR_PREL_PG_HI21
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv.str1.8+0x10
+   2ac: R_AARCH64_ADR_PREL_PG_HI21
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv.str1.8
  2b0:  9000adrpx0, 0
<_ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv>
-   2b0: R_AARCH64_ADR_PREL_PG_HI21
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv.str1.8+0x28
+   2b0: R_AARCH64_ADR_PREL_PG_HI21
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv.str1.8+0x18
  2b4:  9142add x2, x2, #0x0
-   2b4: R_AARCH64_ADD_ABS_LO12_NC 
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv.str1.8+0x10
+   2b4: R_AARCH64_ADD_ABS_LO12_NC 
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_hashmap_traitsI19default_hash_traitsIS4_ES5_EE10hash_entryE11xcallocatorE6expandEv.str1.8
  2b8:  9100add x0, x0, #0x0
-   2b8: R_AARCH64_ADD_ABS_LO12_NC 
.rodata._ZN10hash_tableIN8hash_mapIN21mem_alloc_descriptionI12bitmap_usageE17mem_location_hashEPS2_21simple_ha

[Bug target/70421] [5/6 Regression] wrong code with v16si vector and useless cast at -O -mavx512f

2016-03-27 Thread zsojka at seznam dot cz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70421

--- Comment #1 from Zdenek Sojka  ---
The operation done by the vmovdqa32 instruction is inverted; this fixes the
assembly (-O3, intel syntax):

@@ -72,7 +72,7 @@
        and     rsp, -64        #,
        push    QWORD PTR [r10-8]       #
        push    rbp     #
-       mov     eax, 2  # tmp108,
+       mov     eax, 0xfd       # tmp108,
        kmovw   k1, eax # tmp108, tmp108
        xor     edx, ecx        # tmp106, tmp100
        .cfi_escape 0x10,0x6,0x2,0x76,0

[Bug tree-optimization/59124] [4.9/5/6 Regression] Wrong warnings "array subscript is above array bounds"

2016-03-27 Thread ppalka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59124

Patrick Palka  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||ppalka at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |ppalka at gcc dot gnu.org

--- Comment #35 from Patrick Palka  ---
I have a rather simple patch that teaches VRP to insert the relevant
ASSERT_EXPRs so that it knows to remove the unreachable code inserted by the
loop unrolling.

[Bug c++/70275] -w disables all -Werror flags

2016-03-27 Thread manu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70275

--- Comment #4 from Manuel López-Ibáñez  ---
(In reply to Kevin Tucker from comment #3)
> I'm new to this.  How is it determined if this is a desired change or not?

Suggestion #10 applies also to non-patches: https://gcc.gnu.org/wiki/Community

In short, I would recommend writing to g...@gcc.gnu.org, CCing the relevant
MAINTAINERS, choosing an appropriate subject, and keeping the email concise
but clear-cut so they won't overlook it for lack of time or lack of clarity.

[Bug driver/70423] New: -shared option description isn't clear about exactly when -fpic/-fPIC is required

2016-03-27 Thread britton.kerin at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70423

Bug ID: 70423
   Summary: -shared option description isn't clear about exactly
when -fpic/-fPIC is required
   Product: gcc
   Version: 5.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: driver
  Assignee: unassigned at gcc dot gnu.org
  Reporter: britton.kerin at gmail dot com
  Target Milestone: ---

Section 3.13 Options for Linking includes this:

-shared
Produce a shared object which can then be linked with other objects to form
an executable. Not all systems support this option. For predictable results,
you must also specify the same set of options used for compilation (-fpic,
-fPIC, or model suboptions) when you specify this linker option.

This makes it sound like -fpic/-fPIC would be required when performing a
link-only gcc invocation (i.e. with arguments consisting only of .o files).
Most people don't include -fpic/-fPIC in this situation and in fact it
apparently isn't required:

Cary Coutant wrote in a bug report elsewhere:

 The -fpic and model suboptions are not linker options. They're only
 required when you pass -shared to gcc if you're also compiling source
 files at the same time. If you're just running gcc to link a bunch of .o
 files, compiler options like -fpic are unnecessary.

This problem could be fixed by changing the -shared description to read like
this:

-shared
Produce a shared object which can then be linked with other objects to form
an executable. Not all systems support this option. For predictable results,
you must also specify the same set of options used for compilation (-fpic,
-fPIC, or model suboptions) when you specify this linker option in a gcc
invocation that will perform both compilation and linking.

Actually this is still a little imperfect since if I understand correctly the
purpose is to get the same set of options in the compile/link invocation as
those used for any previous compilations that produced object files to be
included in the link.  If the invocation compiles everything that gets linked
there's no possibility to go wrong.  But I think the above gets the point
across.

This is worth fixing because -shared, -fPIC, -fpic, etc. interact in a somewhat
complicated way, so the documentation needs to be precise.  I believe most
existing build systems keep compilation and linking separate and therefore do
not pass -fpic/-fPIC at link time.  They thus violate the most likely (but
wrong) reading of the current -shared option description as to when
-fpic/-fPIC is required.
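
To make the two situations concrete, here is a minimal sketch. The file name, library name and function are hypothetical; the build commands in the comment follow the statement quoted above that -fpic/-fPIC is a compile-time option and is not needed in a link-only invocation.

```c
/* greet.c: hypothetical one-function library source.
 *
 * Separate compile and link steps, as most build systems do:
 *   gcc -fPIC -c greet.c -o greet.o       (compile: -fPIC matters here)
 *   gcc -shared greet.o -o libgreet.so    (link only: no -fPIC given)
 *
 * Combined compile-and-link, where the compile option must be present:
 *   gcc -shared -fPIC greet.c -o libgreet.so
 */
int greet_add_one(int x)
{
    return x + 1;
}
```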

[Bug target/70421] [5/6 Regression] wrong code with v16si vector and useless cast at -O -mavx512f

2016-03-27 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70421

Jakub Jelinek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2016-03-27
 CC||jakub at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #2 from Jakub Jelinek  ---
Untested fix:
--- gcc/config/i386/i386.c  (revision 234449)
+++ gcc/config/i386/i386.c  (working copy)
@@ -46930,7 +46930,7 @@ half:
 {
   tmp = gen_reg_rtx (mode);
   emit_insn (gen_rtx_SET (tmp, gen_rtx_VEC_DUPLICATE (mode, val)));
-  emit_insn (gen_blendm (target, tmp, target,
+  emit_insn (gen_blendm (target, target, tmp,
 force_reg (mmode,
gen_int_mode (1 << elt, mmode))));
 }

Both the
(define_insn "_blendm"
  [(set (match_operand:V48_AVX512VL 0 "register_operand" "=v")
(vec_merge:V48_AVX512VL
  (match_operand:V48_AVX512VL 2 "nonimmediate_operand" "vm")
  (match_operand:V48_AVX512VL 1 "register_operand" "v")
  (match_operand: 3 "register_operand" "Yk")))]
  "TARGET_AVX512F"
  "vblendm\t{%2, %1, %0%{%3%}|%0%{%3%}, %1, %2}"
  [(set_attr "type" "ssemov")
   (set_attr "prefix" "evex")
   (set_attr "mode" "")])

(define_insn "_blendm"
  [(set (match_operand:VI12_AVX512VL 0 "register_operand" "=v")
(vec_merge:VI12_AVX512VL
  (match_operand:VI12_AVX512VL 2 "nonimmediate_operand" "vm")
  (match_operand:VI12_AVX512VL 1 "register_operand" "v")
  (match_operand: 3 "register_operand" "Yk")))]
  "TARGET_AVX512BW"
  "vpblendm\t{%2, %1, %0%{%3%}|%0%{%3%}, %1, %2}"
  [(set_attr "type" "ssemov")
   (set_attr "prefix" "evex")
   (set_attr "mode" "")])
patterns have the order of operands swapped vs. VEC_MERGE, and for VEC_MERGE
we use the
  tmp = gen_rtx_VEC_MERGE (mode, tmp, target, GEN_INT (1 << elt));
order, so I believe the above patch is right.  Will test it on Tuesday.
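
The operand-order issue can be illustrated with a scalar C model of vec_merge, in which a set mask bit selects the element from the first operand and a clear bit selects it from the second. This is only a sketch of the semantics, not GCC code; the function names and the fixed width of 16 elements are assumptions made for the example.

```c
#include <stdint.h>

#define NELT 16  /* v16si: sixteen 32-bit elements */

/* Scalar model of RTL (vec_merge a b mask): element i comes from
   `a` when bit i of mask is set, otherwise from `b`. */
static void vec_merge_model(const int32_t *a, const int32_t *b,
                            uint16_t mask, int32_t *out)
{
    for (int i = 0; i < NELT; i++)
        out[i] = ((mask >> i) & 1u) ? a[i] : b[i];
}

/* Store val into element elt of target, leaving the rest unchanged:
   merge a broadcast of val (first operand) with target (second). */
static void set_element_model(int32_t *target, int32_t val, int elt)
{
    int32_t dup[NELT], out[NELT];
    for (int i = 0; i < NELT; i++)
        dup[i] = val;
    vec_merge_model(dup, target, (uint16_t)(1u << elt), out);
    for (int i = 0; i < NELT; i++)
        target[i] = out[i];
}
```

Passing the two vectors to vec_merge_model in the opposite order keeps the old value only in element elt and overwrites every other element with val, which is the inverted behaviour described in comment #1.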

[Bug middle-end/70424] New: [4.9/5/6 Regression] Pointer derived from integer gets reduced alignment

2016-03-27 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70424

Bug ID: 70424
   Summary: [4.9/5/6 Regression] Pointer derived from integer gets
reduced alignment
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: amonakov at gcc dot gnu.org
  Target Milestone: ---

int f(long a)
{
  int *p=(int*)(a<<1);
  //asm("" : "+r"(p));
  return *p;
}

Starting from 4.9, in the above example GCC assumes that *p is aligned to 16
bits (on 4.8 and earlier, to 32 bits, like normal int*). This causes the load
to be torn in two on strict-alignment targets; using -O0 or uncommenting the
asm restores old behavior (one 32-bit load). This change seems unintended.
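
One plausible reading of where the 16-bit figure comes from (an assumption, not something confirmed in this report) is that the compiler keeps only what the shift itself proves: a value of the form a<<1 is a multiple of 2 bytes and nothing more. The C sketch below, with hypothetical helper names, shows that reasoning.

```c
#include <stdint.h>

/* Alignment in bits that follows from a left shift alone:
   (a << shift) is a multiple of 2^shift bytes, nothing more. */
static unsigned align_bits_from_shift(unsigned shift)
{
    return 8u << shift;
}

/* Returns 1 if (a << 1) is a multiple of `bytes` for every a in [0, n). */
static int shifted_always_multiple_of(uint64_t n, uint64_t bytes)
{
    for (uint64_t a = 0; a < n; a++)
        if (((a << 1) % bytes) != 0)
            return 0;
    return 1;
}
```

A shift by 1 gives 16 bits, matching the A16 annotation, whereas the pointed-to type int would normally imply 32 bits.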

On x86_64 it's visible on RTL level (note A32->A16 change):

gcc-4.8.0 -S t.c -Os -o- -dP

#(insn:TI 7 3 15 2 (set (reg:SI 0 ax [orig:66 *p_3 ] [66])
#(mem:SI (plus:DI (reg/v:DI 5 di [orig:63 a ] [63])
#(reg/v:DI 5 di [orig:63 a ] [63])) [2 *p_3+0 S4 A32]))
align.c:5 89 {*movsi_internal}
# (expr_list:REG_DEAD (reg/v:DI 5 di [orig:63 a ] [63])
#(nil)))
        movl    (%rdi,%rdi), %eax   # 7 *movsi_internal/1   [length = 3]

gcc-4.9.2 -S t.c -Os -o- -dP

#(insn:TI 7 3 13 2 (set (reg:SI 0 ax [orig:90 *p_3 ] [90])
#(mem:SI (plus:DI (reg/v:DI 5 di [orig:87 a ] [87])
#(reg/v:DI 5 di [orig:87 a ] [87])) [2 *p_3+0 S4 A16]))
align.c:5 90 {*movsi_internal}
# (expr_list:REG_DEAD (reg/v:DI 5 di [orig:87 a ] [87])
#(nil)))
        movl    (%rdi,%rdi), %eax   # 7 *movsi_internal/1   [length = 3]

[Bug middle-end/70424] [4.9/5/6 Regression] Pointer derived from integer gets reduced alignment

2016-03-27 Thread bugdal at aerifal dot cx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70424

Rich Felker  changed:

   What|Removed |Added

 CC||bugdal at aerifal dot cx

--- Comment #1 from Rich Felker  ---
If correct, this can likely break MMIO access in bare-metal applications or
kernel drivers that derive the MMIO addresses via certain types of arithmetic
expressions. Accessing a 32-bit MMIO register as multiple 16-bit or 8-bit
loads/stores is likely to do the wrong thing or not work at all.

I see no reason why GCC should even try to account for the possibility that the
resulting pointer might be misaligned. Unless the pointed-to type has
__attribute__((__aligned__(1))) applied to it, misaligned access is simply UB.

[Bug bootstrap/70422] [6 regression] Bootstrap comparison failure

2016-03-27 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70422

Segher Boessenkool  changed:

   What|Removed |Added

 Target|aarch64-*-*, ia64-*-*   |aarch64-*-*, ia64-*-*,
   ||powerpc64-*-*
   Priority|P3  |P1
 CC||segher at gcc dot gnu.org

--- Comment #2 from Segher Boessenkool  ---
Also on powerpc64-linux.

[Bug bootstrap/70422] [6 regression] Bootstrap comparison failure

2016-03-27 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70422

Segher Boessenkool  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-03-27
 Ever confirmed|0   |1

[Bug other/70425] New: decl_expr contains too little information

2016-03-27 Thread JamesMikeDuPont at googlemail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70425

Bug ID: 70425
   Summary: decl_expr contains too little information
   Product: gcc
   Version: 4.9.2
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: JamesMikeDuPont at googlemail dot com
  Target Milestone: ---

using gcc (Debian 4.9.2-10) 4.9.2
In the 001t.tu file, the decl_expr contains no real information. 

here is the context of relevant statements :

@9529   function_decl    name: @9547    type: @5191    scpe: @155
                         srcp: eval.c:199              chain: @9548
                         link: static   body: @9549

@9549   bind_expr        type: @129     vars: @9568    body: @9569

@9569   statement_list   0   : @9587    1   : @9588    2   : @9589
                         3   : @9590    4   : @9591    5   : @9592
                         6   : @9593
@9585   identifier_node  strg: pwd      lngt: 3
@9568   var_decl         name: @9585    type: @144     scpe: @9529
                         srcp: eval.c:201              chain: @9586
                         size: @22      algn: 64       used: 1

@9587   decl_expr        type: @129
@9588   decl_expr        type: @129
@9589   modify_expr      type: @144     op 0: @9586    op 1: @9615

@129    void_type        name: @126     algn: 8
@126    type_decl        name: @128     type: @129     chain: @130
@128    identifier_node  strg: void     lngt: 4



The source code around 199 is :
  198 static void
  199 send_pwd_to_eterm ()
  200 {
  201   char *pwd, *f;
  202 
  203   f = 0;
  204   pwd = get_string_value ("PWD");
  205   if (pwd == 0)
  206     f = pwd = get_working_directory ("eterm");
  207   fprintf (stderr, "\032/%s\n", pwd);
  208   free (f);
  209 }

So can I infer that @9587 refers to line 201 for the pwd variable?

See https://archive.org/details/bash.compilation for a full snapshot of the
compile. build/eval.c.001t.tu is the file.


So please tell me if this is correct or are we missing important fields in the
decl_expr.

[Bug other/70426] New: decl_expr contains too little information

2016-03-27 Thread JamesMikeDuPont at googlemail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70426

Bug ID: 70426
   Summary: decl_expr contains too little information
   Product: gcc
   Version: 4.9.2
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: JamesMikeDuPont at googlemail dot com
  Target Milestone: ---

using gcc (Debian 4.9.2-10) 4.9.2
In the 001t.tu file, the decl_expr contains no real information. 

here is the context of relevant statements :

@9529   function_decl    name: @9547    type: @5191    scpe: @155
                         srcp: eval.c:199              chain: @9548
                         link: static   body: @9549

@9549   bind_expr        type: @129     vars: @9568    body: @9569

@9569   statement_list   0   : @9587    1   : @9588    2   : @9589
                         3   : @9590    4   : @9591    5   : @9592
                         6   : @9593
@9585   identifier_node  strg: pwd      lngt: 3
@9568   var_decl         name: @9585    type: @144     scpe: @9529
                         srcp: eval.c:201              chain: @9586
                         size: @22      algn: 64       used: 1

@9587   decl_expr        type: @129
@9588   decl_expr        type: @129
@9589   modify_expr      type: @144     op 0: @9586    op 1: @9615

@129    void_type        name: @126     algn: 8
@126    type_decl        name: @128     type: @129     chain: @130
@128    identifier_node  strg: void     lngt: 4



The source code around 199 is :
  198 static void
  199 send_pwd_to_eterm ()
  200 {
  201   char *pwd, *f;
  202 
  203   f = 0;
  204   pwd = get_string_value ("PWD");
  205   if (pwd == 0)
  206     f = pwd = get_working_directory ("eterm");
  207   fprintf (stderr, "\032/%s\n", pwd);
  208   free (f);
  209 }

So can I infer that @9587 refers to line 201 for the pwd variable?

See https://archive.org/details/bash.compilation for a full snapshot of the
compile. build/eval.c.001t.tu is the file.


So please tell me if this is correct or are we missing important fields in the
decl_expr.

[Bug target/70416] [SH]: error: 'asm' operand requires impossible reload when building ruby2.3

2016-03-27 Thread olegendo at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70416

--- Comment #14 from Oleg Endo  ---
(In reply to Kazumoto Kojima from comment #12)
> 
> (insn 516 508 510 18 (set (reg:SI 0 r0)
> (plus:SI (reg:SI 2 r2)
> (const_int 4 [0x4]))) xxx.i:100 67 {*addsi3}
>  (nil))
> 
> which is invalid.

I haven't checked the details... but we've added those "special" addsi patterns
and the above seems to be covered by at least one of them.

Maybe at that stage in the reload code it will end up using the last *addsi3
pattern and not try to look for a new pattern in the .md when it wants to
change it.  In other words, maybe it'll help if the *addsi3 patterns are merged
into a single pattern somehow.  I'll give it a try...

[Bug tree-optimization/70427] New: autofdo bootstrap generates wrong code

2016-03-27 Thread andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70427

Bug ID: 70427
   Summary: autofdo bootstrap generates wrong code
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: andi-gcc at firstfloor dot org
  Target Milestone: ---

I've been working on building gcc with an autofdo bootstrap.

Currently I always run into a crash while rebuilding tree.c with the stage2
compiler and the autofdo information.

Looking at the code it is clearly miscompiled in ipa_profile_generate_summary:

struct cgraph_edge * e = node->get_edge (stmt);
if (e && !e->indirect_unknown_callee)
  continue;


   0x0093bb16 <+326>:   callq  0x7be530 <_ZN11cgraph_node8get_edgeEP6gimple>
   0x0093bb1b <+331>:   test   %rax,%rax   # check for NULL
   0x0093bb1e <+334>:   mov    %rax,%r8
   0x0093bb21 <+337>:   je     0x93bb2d <_ZL28ipa_profile_generate_summaryv+349>
   0x0093bb23 <+339>:   testb  $0x2,0x60(%rax)
   0x0093bb27 <+343>:   je     0x93baa7 <_ZL28ipa_profile_generate_summaryv+215>
   0x0093bb2d <+349>:   mov    0x10(%r13),%rax # go here because of NULL
=> 0x0093bb31 <+353>:   mov    0x40(%r8),%rsi  # but we still reference!

(gdb) p $r8
$4 = 0

The crash is on bb31 because r8 is NULL. The code checked the return value of
the call, but then references it afterwards before doing the continue.

Command line option:

cc1plus -fauto-profile=cc1plus.fda  -g -O2 tree.i

cc1plus.fda is at http://halobates.de/cc1plus.fda (too big to attach)

[Bug tree-optimization/70427] autofdo bootstrap generates wrong code

2016-03-27 Thread andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70427

--- Comment #1 from Andi Kleen  ---
Created attachment 38109
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38109&action=edit
ipa-profile input

Here's the source of the miscompiled file from the compiler

cc1plus -O2 ipa-profile.i  -S

Unfortunately, you have to inspect the assembler output to see the
miscompilation:

look for ipa_profile_generate_summary

then look for get_edge

        call    _ZN11cgraph_node8get_edgeEP6gimple
        testq   %rax, %rax
        movq    %rax, %r15
        je      .L836              <-- jump if rax/r15 is 0
        testb   $2, 96(%rax)
        je      .L837
.L836:                             <--- it can be here
        movq    16(%r12), %rax
        movq    64(%r15), %rsi     <-- BAD

same miscompilation here (just with another register). r15 is referenced after
being tested for NULL.

[Bug other/70428] New: -fdebug-prefix-map did not support to remap sources with relative path

2016-03-27 Thread hongxu.jia at windriver dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70428

Bug ID: 70428
   Summary: -fdebug-prefix-map did not support to remap sources
with relative path
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hongxu.jia at windriver dot com
  Target Milestone: ---

1. Prepare sources and build dir

$ pwd
/folk/hjia

$ mkdir dir1/dir2 test1/test2/ -p

$ cat > test1/test2/test.c << ENDOF
#include "test.h"

int main(int argc, char *argv[])
{
  func();
  return 0;
}

ENDOF

$ cat > test1/test2/test.h << ENDOF
void func()
{
  return;
}
ENDOF

$ cd dir1/dir2

2. Enter build dir to compile with relative path sources

$ gcc ../../test1/test2/test.c -g  -o test.o
$ objdump -g test.o | less

 <0>: Abbrev Number: 1 (DW_TAG_compile_unit)
   DW_AT_producer: (indirect string, offset: 0x47): GNU C 4.8.4
-mtune=generic -march=x86-64 -g -fstack-protector
<10>   DW_AT_language: 1(ANSI C)
<11>   DW_AT_name: (indirect string, offset: 0x5):
../../test1/test2/test.c 
<15>   DW_AT_comp_dir: (indirect string, offset: 0x1e):
/folk/hjia/dir1/dir2

Contents of the .debug_str section:


3. Compile with option -fdebug-prefix-map; it does not remap sources that are
given by relative path

$ gcc ../../test1/test2/test.c \
    -fdebug-prefix-map=/folk/hjia/test1/test2=/usr/src -g  -o test.o
$ objdump -g test.o | less

 <0>: Abbrev Number: 1 (DW_TAG_compile_unit)
   DW_AT_producer: (indirect string, offset: 0x47): GNU C 4.8.4
-mtune=generic -march=x86-64 -g
-fdebug-prefix-map=/folk/hjia/test1/test2=/usr/src -fstack-protector 
<10>   DW_AT_language: 1(ANSI C)
<11>   DW_AT_name: (indirect string, offset: 0x5):
../../test1/test2/test.c 
<15>   DW_AT_comp_dir: (indirect string, offset: 0x1e):
/folk/hjia/dir1/dir2 


What we expected is:

<11>   DW_AT_name: (indirect string, offset: 0x5): /usr/src/test.c  
<15>   DW_AT_comp_dir: (indirect string, offset: 0x15):
/folk/hjia/dir1/dir2
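
The observed behaviour is consistent with the mapping being a plain textual prefix match on the file name as it was given on the command line. The C sketch below models that assumption; it is not GCC's implementation, and the function name and signature are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Simplified model of one prefix-map entry OLD=NEW: if path starts
   with old_prefix, replace that prefix with new_prefix; otherwise
   copy path unchanged.  Returns 1 if a remap happened, 0 if not. */
static int remap_prefix(const char *path, const char *old_prefix,
                        const char *new_prefix, char *out, size_t outlen)
{
    size_t n = strlen(old_prefix);
    if (strncmp(path, old_prefix, n) == 0) {
        snprintf(out, outlen, "%s%s", new_prefix, path + n);
        return 1;
    }
    snprintf(out, outlen, "%s", path);
    return 0;
}
```

Under this model the absolute name /folk/hjia/test1/test2/test.c would become /usr/src/test.c, while ../../test1/test2/test.c never matches the old prefix and is left as it is.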


[Bug tree-optimization/70427] autofdo bootstrap generates wrong code

2016-03-27 Thread andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70427

--- Comment #2 from Andi Kleen  ---
Created attachment 38110
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38110&action=edit
somewhat reduced input file, only single function

[Bug tree-optimization/70427] autofdo bootstrap generates wrong code

2016-03-27 Thread andi-gcc at firstfloor dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70427

--- Comment #3 from Andi Kleen  ---

After analyzing the code further, it looks like the compiler generates it
correctly; the edge returned should not be 0 here.
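
The control flow under discussion can be modeled in a few lines of C. The struct and enum below are hypothetical stand-ins, not GCC types; the point is only that the source condition skips the rest of the loop body solely for a non-NULL edge, so a NULL edge falls through to code that uses it.

```c
#include <stddef.h>

/* Hypothetical stand-in for the edge object; only the flag matters here. */
struct edge_model {
    int indirect_unknown_callee;
};

enum outcome { SKIPPED, USES_EDGE, USES_NULL_EDGE };

/* Models: if (e && !e->indirect_unknown_callee) continue; ...use e... */
static enum outcome classify(const struct edge_model *e)
{
    if (e && !e->indirect_unknown_callee)
        return SKIPPED;            /* the `continue` path */
    if (e == NULL)
        return USES_NULL_EDGE;     /* falls through with e == NULL */
    return USES_EDGE;              /* falls through with a valid edge */
}
```

This matches the generated code shown earlier in the thread: the NULL test and the later dereference both follow from the source as written, so the open question is why the edge lookup returns 0.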

[Bug tree-optimization/59124] [4.9/5/6 Regression] Wrong warnings "array subscript is above array bounds"

2016-03-27 Thread ppalka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59124

--- Comment #36 from Patrick Palka  ---
Patch posted at https://gcc.gnu.org/ml/gcc-patches/2016-03/msg01439.html

[Bug bootstrap/70422] [6 regression] Bootstrap comparison failure

2016-03-27 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70422

Jim Wilson  changed:

   What|Removed |Added

 CC||wilson at gcc dot gnu.org

--- Comment #3 from Jim Wilson  ---
I can reproduce on armhf and aarch64, but not on x86_64.

stage2 is built with -g -gtoggle.  stage3 is built with -g.  Debug info is
stripped before the compare, so in theory that shouldn't matter.

I am looking at statistics.c, as it is a conveniently small file.  On aarch64,
in stage2 statistics.s, I see
        .section        .rodata._ZN10hash_tableI20stats_counter_hasher11xcallocatorE6expandEv.str1.8,"aMS",@progbits,1
        .align  3
.LC17:
        .string "alloc_entries"

In stage3 statistics.s I see
        .section        .rodata.str1.8,"aMS",@progbits,1
        .align  3
...
.LC17:
        .string "alloc_entries"
        .zero   2
...
        .section        .debug_str,"MS",@progbits,1
...
.LASF1861:
        .string "alloc_entries"
So something about debug info caused the string to move from the function
specific rodata section to the general rodata section, and that causes the
comparison failure.  On x86_64, the string is in the function specific rodata
section in both cases, so no comparison failure.

[Bug bootstrap/70422] [6 regression] Bootstrap comparison failure

2016-03-27 Thread wilson at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70422

--- Comment #4 from Jim Wilson  ---
The broken targets all define flag_section_anchors at -O1 and up.  x86_64 does
not.  I don't know why this makes a difference yet.