[Bug c++/84076] [6/7/8 Regression] Warning about objects through POD mistakenly claims the object is a pointer

2018-01-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84076

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org
Summary|[5/6/7/8 Regression]|[6/7/8 Regression] Warning
   |Warning about objects   |about objects through POD
   |through POD mistakenly  |mistakenly claims the
   |claims the object is a  |object is a pointer
   |pointer |

--- Comment #1 from Jakub Jelinek  ---
With -Wconditionally-supported you'd get an additional warning:
pr84076.C: In function ‘int main()’:
pr84076.C:7:27: warning: passing objects of non-trivially-copyable type
‘std::__cxx11::string’ {aka ‘class std::__cxx11::basic_string’} through
‘...’ is conditionally supported [-Wconditionally-supported]
 printf("%s\n", str);
   ^
The reason for the std::string * in the diagnostics is that this is how we
actually implement passing non-trivially-copyable objects through ...: we pass
them by invisible reference, just as they are passed to named arguments, and the
-Wformat code can no longer tell whether the user actually passed the
std::string object or its address.
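
For illustration, a minimal sketch of the portable alternative: hand %s the
character data via c_str(), so the variadic argument really is a const char *.
The helper name format_line is hypothetical and does not come from the bug
report.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats "<text>\n" into a std::string with snprintf.
// Passing s.c_str() gives %s the const char * it expects; passing s itself
// through '...' is only conditionally supported and draws the warnings above.
std::string format_line(const std::string &s)
{
    // The first call only computes the length (excluding the terminating NUL).
    int n = std::snprintf(nullptr, 0, "%s\n", s.c_str());
    if (n < 0)
        return std::string();
    std::string out(static_cast<std::size_t>(n) + 1, '\0');
    std::snprintf(&out[0], out.size(), "%s\n", s.c_str());
    out.resize(static_cast<std::size_t>(n));  // drop the terminating NUL
    return out;
}
```

Calling format_line with a std::string holding "abc" yields "abc" followed by a
newline, with no -Wformat or -Wconditionally-supported diagnostics expected.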

[Bug c++/84076] [6/7/8 Regression] Warning about objects through POD mistakenly claims the object is a pointer

2018-01-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84076

--- Comment #2 from Jakub Jelinek  ---
The convert_arg_to_ellipsis call that converts in this case the non-POD class
to its address is done very shortly before calling the -Wformat check, but most
of the stuff the function is doing is needed for the following
check_format_arguments.  So, in order to hide this in diagnostics, we'd either
need to somehow mark the argument's tree so that check_format_arguments can
determine that the address was added implicitly, or keep a separate side array
with the original argument types for -Wformat* etc.
The question, though, is whether we want to warn for
std::string str;
printf ("%p\n", str);
where str is passed as reference and thus printing address of it would work
just fine.

[Bug libgomp/84088] [nvptx] libgomp.oacc-fortran/declare-*.f90 execution fails

2018-01-29 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84088

--- Comment #2 from Tom de Vries  ---
Minimal version:
...
! { dg-do run } 

module vars
  implicit none
  integer z
  !$acc declare create (z)  
end module vars

program main
  use vars
  use openacc
  implicit none

  if (acc_is_present (z) .neqv. .true.) call abort

end program
...

[Bug libgomp/84096] New: Wrong prototype for omp_init_nest_lock_with_hint() in "omp.h.in"

2018-01-29 Thread cspiel at freenet dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84096

Bug ID: 84096
   Summary: Wrong prototype for omp_init_nest_lock_with_hint() in
"omp.h.in"
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: cspiel at freenet dot de
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

Created attachment 43268
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43268&action=edit
Patch that corrects function prototype

As of revision 28bd6e12dc17b749e21d5e6127fee13bc12e9294 of the Git
repository, the prototype of function omp_init_nest_lock_with_hint() in
file "omp.h.in" is wrong: the first parameter must refer to an
`omp_nest_lock_t'.  See `OpenMP Application Programming Interface',
Version 4.5 (November 2015), section 3.3.2, page 273.

The attached patch corrects the problem.

[Bug c++/84080] [6/7/8 Regression] the compiler crashes when compiling the following sample file

2018-01-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84080

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P3  |P2
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-01-29
 CC||jakub at gcc dot gnu.org,
   ||jason at gcc dot gnu.org,
   ||nathan at gcc dot gnu.org
   Target Milestone|--- |6.5
Summary|the compiler crashes when   |[6/7/8 Regression] the
   |compiling the following |compiler crashes when
   |sample file |compiling the following
   ||sample file
 Ever confirmed|0   |1

--- Comment #1 from Jakub Jelinek  ---
Seems to ICE starting with r185768, when return type deduction was added
for -std=c++1y.
The ICE is in cgraph code, where we rely on being able to compute
DECL_ASSEMBLER_NAME, but the mangling code refuses to give that, as it considers
it dependent.
Even the full name:
"T foo() [with cc n = (cc)0; T = auto]"
suggests that after deducing the return type we haven't updated the template
parameter to what we've really deduced.

[Bug target/83496] MIPS BE: wrong code generates under "-Os -mbranch-cost=1"

2018-01-29 Thread laurent at guerby dot net
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83496

Laurent GUERBY  changed:

   What|Removed |Added

 CC||law at redhat dot com

--- Comment #14 from Laurent GUERBY  ---
Adding Jeff to CC as maintainer of reorg (this bug is holding OpenWrt at GCC 5).

[Bug c/84052] Using Randomizing structure layout plugin in linux kernel compilation doesn't generate proper debuginfo

2018-01-29 Thread pageexec at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84052

--- Comment #5 from PaX Team  ---
(In reply to Andrew Pinski from comment #4)
> Because debug information happens early on and has many interactions with
> the front end.

FINISH_TYPE happens early on too and the API promise gcc makes is that it's
invoked "After finishing parsing a type" (in practice that's right after
c_parser_struct_or_union_specifier for this case). clearly there's a sequencing
problem between this and the emission of debug information which means it's
either undocumented (gcc bug) or unintended (gcc bug). i don't know which it is
but clearly something is not right.

[Bug c++/84082] [7/8 Regression] ICE with broken template function definition

2018-01-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84082

Jakub Jelinek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-01-29
 CC||jakub at gcc dot gnu.org,
   ||jason at gcc dot gnu.org
   Target Milestone|--- |7.4
 Ever confirmed|0   |1

--- Comment #1 from Jakub Jelinek  ---
Started with r245223.

[Bug tree-optimization/82819] [8 Regression] [graphite] ICE in set_codegen_error, at graphite-isl-ast-to-gimple.c:206

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82819

Richard Biener  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Richard Biener  ---
Fixed.

[Bug tree-optimization/83176] [8 Regression] [graphite] ICE in set_codegen_error, at graphite-isl-ast-to-gimple.c:206

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83176

Richard Biener  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Richard Biener  ---
Fixed.

[Bug c++/84082] [7/8 Regression] ICE with broken template function definition

2018-01-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84082

--- Comment #2 from Jakub Jelinek  ---
build_functional_cast here creates CAST_EXPR with NULL TREE_TYPE as well as
TREE_OPERAND (, 0) and cp_parser_constant_expression ->
potential_rvalue_constant_expression -> potential_constant_expression_1 is
called on it and doesn't handle that case:
  return (RECUR (TREE_OPERAND (t, 0),
 TREE_CODE (TREE_TYPE (t)) != REFERENCE_TYPE));

[Bug rtl-optimization/84068] [8 Regression] ICE: qsort checking failed: qsort comparator non-negative on sorted output: 1 with -fno-sched-critical-path-heuristic --param=max-sched-extend-regions-iters

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84068

Richard Biener  changed:

   What|Removed |Added

   Keywords||ice-checking
   Target Milestone|--- |8.0
Summary|ICE: qsort checking failed: |[8 Regression] ICE: qsort
   |qsort comparator|checking failed: qsort
   |non-negative on sorted  |comparator non-negative on
   |output: 1 with  |sorted output: 1 with
   |-fno-sched-critical-path-he |-fno-sched-critical-path-he
   |uristic |uristic
   |--param=max-sched-extend-re |--param=max-sched-extend-re
   |gions-iters=5 @ aarch64 |gions-iters=5 @ aarch64

[Bug middle-end/84067] [8 regression] gcc.dg/wmul-1.c regression on aarch64 after r257077

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84067

Richard Biener  changed:

   What|Removed |Added

 CC||rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener  ---
So any hint on whether the code after r257077 is better or worse than before?

[Bug middle-end/84071] [7/8 regression] nonzero_bits1 of subreg incorrect

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84071

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2

[Bug tree-optimization/84057] [8 Regression] ICE: Segmentation fault (in can_remove_branch_p)

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84057

Richard Biener  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Richard Biener  ---
Fixed.

[Bug tree-optimization/84057] [8 Regression] ICE: Segmentation fault (in can_remove_branch_p)

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84057

--- Comment #4 from Richard Biener  ---
Author: rguenth
Date: Mon Jan 29 09:16:09 2018
New Revision: 257139

URL: https://gcc.gnu.org/viewcvs?rev=257139&root=gcc&view=rev
Log:
2018-01-29  Richard Biener  

PR tree-optimization/84057
* tree-ssa-loop-ivcanon.c (unloop_loops): Deal with already
removed paths when removing edges.

* gcc.dg/graphite/pr84057.c: New testcase.

Added:
trunk/gcc/testsuite/gcc.dg/graphite/pr84057.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-loop-ivcanon.c

[Bug c++/84076] [6/7/8 Regression] Warning about objects through POD mistakenly claims the object is a pointer

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84076

Richard Biener  changed:

   What|Removed |Added

   Keywords||diagnostic
   Target Milestone|--- |6.5

[Bug target/84077] [7/8 Regression] Likely wrong code with `__builtin_expect()` on i686-linux-gnu

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84077

Richard Biener  changed:

   What|Removed |Added

   Keywords||wrong-code
 Target||x86_64-*-*, i?86-*-*
  Known to work||6.4.0
   Target Milestone|--- |7.4
Summary|Likely wrong code with  |[7/8 Regression] Likely
   |`__builtin_expect()` on |wrong code with
   |i686-linux-gnu  |`__builtin_expect()` on
   ||i686-linux-gnu
  Known to fail||7.2.0

--- Comment #2 from Richard Biener  ---
The use of __builtin_expect likely triggers a problem elsewhere, not
necessarily in the backend.

[Bug c++/84082] [7/8 Regression] ICE with broken template function definition

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84082

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2

[Bug middle-end/84083] [missed optimization] loop-invariant strlen() not hoisted out of loop

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84083

Richard Biener  changed:

   What|Removed |Added

   Keywords||missed-optimization
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-01-29
  Component|rtl-optimization|middle-end
Version|unknown |8.0
 Ever confirmed|0   |1

--- Comment #3 from Richard Biener  ---
I think there's a dup for this.  Yes, we don't currently implement restrict
disambiguation for calls.

[Bug tree-optimization/84084] [Regression 7/8][-O2] Early VRP pass wrongly removes "ret" exit basic block causing wrong behavior

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84084

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #1 from Richard Biener  ---
You are invoking undefined behavior, as you correctly noted.  So the compiler is
free to optimize the code by removing the exit test: i < 2 must always be true
for a[i] not to be undefined.

I suggest writing (i < 2 && (val = arr[i], true)) instead.
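
A minimal sketch of the suggested ordering, assuming a two-element array and a
loop that stops at the first zero element.  The reporter's testcase is not
quoted in this message, so the function and its names are hypothetical.
Because && evaluates left to right and short-circuits, the bounds test runs
before the element is read.

```cpp
#include <cstddef>

// Hypothetical example: count the leading non-zero elements of a 2-element
// array.  The index is checked before arr[i] is read, so the loop never
// performs an out-of-bounds access.
int count_leading_nonzero(const int (&arr)[2])
{
    int count = 0;
    int val = 0;
    for (std::size_t i = 0; i < 2 && (val = arr[i], true); ++i) {
        if (val == 0)
            break;
        ++count;
    }
    return count;
}
```

With the test written the other way round, (val = arr[i]) && i < 2, the read of
arr[2] happens before the bound is checked, which is the undefined access the
comment refers to.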

[Bug bootstrap/84017] [6/7/8 regression] Bootstrap failure on Solaris 10/x86 with gas/ld

2018-01-29 Thread ro at CeBiTec dot Uni-Bielefeld.DE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84017

--- Comment #4 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
> --- Comment #2 from Jakub Jelinek  ---
> I can't think of how the self-test failure could be related, unless it just
> results in miscompiled stage2 or stage3 compiler.

It seems that's exactly what's happening, both for cc1 and go1.

> Anyway, .gnu.linkonce* sections are used when the assembler/linker don't
> support comdat, when they do, we instead use comdat.  If the gas/ld combo on
> Solaris 10 doesn't support either comdat, or properly .gnu.linkonce, then it
> probably can't claim it SUPPORTS_ONE_ONLY and then pretty much doesn't support
> C++ at a usable level.  Or what exact part of .gnu.linkonce this combo doesn't
> support?
> In the i?86-*-solaris* target hook it could override any of the generic hooks
> that rely on it.
> Or just declare gas/ld combo unsupported on Solaris 10.

I've done some digging over the weekend.  Initially, prompted by the
linker warnings during the build

ld: warning: relocation warning: R_386_GOTOFF: file
../src/c++11/.libs/libc++11convenience.a(locale-inst.o): section
[217].rel.gnu.linkonce.t._ZNSt17moneypunct_bynameIcLb0EEC2EPKcj: symbol .LC0:
relocation against discarded COMDAT section
[138].gnu.linkonce.r._ZSt16__convert_from_vRKPiPciPKcz.str1.1: symbol not
found, relocation ignored

I was reminded of ld's -z relaxreloc option (more on that separately).
While it doesn't help in this case, it probably provides an option to
enable comdat on some versions of Solaris 10 (though certainly not for GCC 8).

However, looking more closely at the warnings, they only affect two files

../src/c++11/.libs/libc++11convenience.a(locale-inst.o):
../src/c++11/.libs/libc++11convenience.a(wlocale-inst.o):

and refer only to a single comdat section:

.gnu.linkonce.r._ZSt16__convert_from_vRKPiPciPKcz.str1.1

This is only present in the gas builds, since Solaris as seems to have no
syntax for the SHF_MERGE flag, thus explaining why only gas/ld is failing.

I've prepared a patch to disable HAVE_GAS_SHF_MERGE which allowed me to
successfully bootstrap both the gas/ld and as/ld configurations.

Rainer

[Bug c++/84082] [7/8 Regression] ICE with broken template function definition

2018-01-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84082

--- Comment #3 from Jakub Jelinek  ---
To be precise, the CAST_EXPR doesn't have NULL TREE_TYPE initially, it has a
type, just a NULL operand.
But then r245223 comes with:
   if (processing_template_decl)
 {
   dependent_p = true;
   scope = TREE_TYPE (postfix_expression) = NULL_TREE;
 }
and clears the TREE_TYPE on the CAST_EXPR.

Slightly tweaked testcase that still ICEs:
struct A;

template <typename> void foo()
{
  static int a[A().operator=(A())];
}

which is accepted fine with extra struct A { operator int (); }; before the
template.
So, either we shouldn't clear the type and use something different to mark such
CAST_EXPRs as dependent, or potential_constant_expression_p and similar code
needs to handle CAST_EXPR with NULL TREE_TYPE.

[Bug middle-end/84067] [8 regression] gcc.dg/wmul-1.c regression on aarch64 after r257077

2018-01-29 Thread ktkachov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84067

--- Comment #3 from ktkachov at gcc dot gnu.org ---
(In reply to Richard Biener from comment #2)
> So any hint on whether the code after r257077 is better or worse than before?

Looks worse unfortunately:
For aarch64 at -O2 it generates:
foo:
mov w3, 44
mov w2, 40
mov w5, 1
mov w4, 2
smull   x3, w1, w3
smull   x2, w1, w2
str w5, [x0, x3]
add x2, x2, 400
add x1, x2, x1, sxtw 2
str w4, [x0, x1]
ret

whereas with r257077 it generates the shorter:
foo:
mov w3, 40
sxtw    x2, w1
mov w4, 1
smaddl  x0, w1, w3, x0
mov w3, 2
add x1, x0, x2, lsl 2
str w4, [x0, x2, lsl 2]
str w3, [x1, 400]
ret

[Bug c/84085] Array element is unnecessary loaded twice

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84085

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #1 from Richard Biener  ---
This was indeed fixed in GCC 8.  Note that the fundamental issue is that

*(&s1->a1[0][0] + n)

performs an access via 'int *' while s1->a1[N-1][N-1] performs an access
via struct S1 *.  This allows the test1 case to disambiguate the load
via s1 against that via s2 by using type-based aliasing.  This is not
possible for test2 or test3.  We now optimize this solely by the fact
that

*((&s2->a2[0][0] + n)) = *(&s1->a1[0][0] + n);

stores the same value it later loads into *((&s2->a2[0][0] + n)).  If you
change this by doing say

*((&s2->a2[0][0] + n)) = *(&s1->a1[0][0] + n) + 1;

it will no longer be optimized.

This may be too conservative a reading of the C standard by GCC, but we also
have to accommodate other languages and tricks used by programmers, like viewing
a data array via different multidimensional array shapes by means of
casting.

So, this particular case is fixed in GCC 8.
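
To make the two access forms concrete, here is a sketch under assumed
definitions.  S1, S2, N and both function names are stand-ins, since the
reporter's test1/test2/test3 are not quoted in this comment.

```cpp
constexpr int N = 4;  // assumed dimension, not taken from the bug report

struct S1 { int a1[N][N]; };
struct S2 { int a2[N][N]; };

// Member-array form: both accesses name the enclosing struct types, so
// type-based alias analysis may conclude the store to *s2 leaves *s1 alone.
int via_members(S1 *s1, S2 *s2)
{
    s2->a2[N - 1][N - 1] = s1->a1[N - 1][N - 1];
    return s1->a1[N - 1][N - 1];
}

// Flat-pointer form: both accesses are plain 'int' accesses through 'int *',
// so the types alone cannot rule out that the store changed the loaded value.
int via_flat_pointers(S1 *s1, S2 *s2, int n)
{
    *(&s2->a2[0][0] + n) = *(&s1->a1[0][0] + n);
    return *(&s1->a1[0][0] + n);
}
```

In this sketch n is meant to stay within the first row (n < N); whether larger
offsets are valid is exactly the question of how strictly the standard is read.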

[Bug middle-end/84067] [8 regression] gcc.dg/wmul-1.c regression on aarch64 after r257077

2018-01-29 Thread ktkachov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84067

--- Comment #4 from ktkachov at gcc dot gnu.org ---
(In reply to ktkachov from comment #3)
> (In reply to Richard Biener from comment #2)
> > So any hint on whether the code after r257077 is better or worse than 
> > before?
> 
> Looks worse unfortunately:
> For aarch64 at -O2 it generates:
> foo:
>   mov w3, 44
>   mov w2, 40
>   mov w5, 1
>   mov w4, 2
>   smull   x3, w1, w3
>   smull   x2, w1, w2
>   str w5, [x0, x3]
>   add x2, x2, 400
>   add x1, x2, x1, sxtw 2
>   str w4, [x0, x1]
>   ret
> 
> whereas with r257077 it generates the shorter:

Sorry, I meant to write "with r257077 reverted..."

> foo:
>   mov w3, 40
>   sxtw    x2, w1
>   mov w4, 1
>   smaddl  x0, w1, w3, x0
>   mov w3, 2
>   add x1, x0, x2, lsl 2
>   str w4, [x0, x2, lsl 2]
>   str w3, [x1, 400]
>   ret

[Bug libgomp/84086] [8 Regresssion] segfault in instantiate_scev_r for libgomp.fortran/examples-4/simd-2.f90 -O1

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84086

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P1
 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org
   Target Milestone|--- |8.0

--- Comment #2 from Richard Biener  ---
Confirmed.

#8  find_givs_in_stmt (stmt=0x76735460, data=0x7fffd8a0)
at /space/rguenther/src/svn/trunk/gcc/tree-ssa-loop-ivopts.c:1446
1446  if (!find_givs_in_stmt_scev (data, stmt, &iv))
(gdb) p debug_gimple_stmt (stmt)
_14 = 0;

#3  0x00df2f50 in resolve_mixers (loop=0x76a74550, 
chrec=0x767347a0, folded_casts=0x7fffd69e)
at /space/rguenther/src/svn/trunk/gcc/tree-scalar-evolution.c:2840
2840  tree ret = instantiate_scev_r (loop_preheader_edge (loop), loop,
NULL,
(gdb) l
2835{
2836  global_cache = new instantiate_cache_type;
2837  destr = true;
2838}
2839
2840  tree ret = instantiate_scev_r (loop_preheader_edge (loop), loop,
NULL,
2841 chrec, &fold_conversions, 0);
2842
2843  if (folded_casts && !*folded_casts)
2844*folded_casts = fold_conversions;
(gdb) p chrec
$5 = (tree) 0x767347a0
(gdb) p debug_generic_expr ($5)
(sizetype) _21
$6 = void
(gdb) p debug_tree ($5)
 
unit-size 
align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x768a6000 precision:64 min  max >

arg:0 >
$7 = 10


So this is another case of a stale SCEV cache.

[Bug middle-end/84089] [8 Regression] FAIL: g++.dg/cpp1y/lambda-generic-x.C -std=gnu++14 (internal compiler error)

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84089

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |8.0

[Bug middle-end/84095] [8 Regression] false-positive -Wrestrict warnings for memcpy within array

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84095

Richard Biener  changed:

   What|Removed |Added

   Keywords||diagnostic
 CC||msebor at gcc dot gnu.org
   Target Milestone|--- |8.0
Summary|false-positive -Wrestrict   |[8 Regression]
   |warnings for memcpy within  |false-positive -Wrestrict
   |array   |warnings for memcpy within
   ||array

[Bug c++/84082] [7/8 Regression] ICE with broken template function definition

2018-01-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84082

--- Comment #4 from Jakub Jelinek  ---
As that PR was a workaround for buggy code and the intent was to not reject
code that has been accepted before, perhaps we could only do the pedwarn rather
than error and clearing of TREE_TYPE (postfix_expression) if it is valid to
clear the type for it, and otherwise error and not clear the type?
Which expression types require always non-NULL TREE_TYPE, besides
{,REINTERPRET_,CONST_,STATIC_,DYNAMIC_}CAST_EXPR and IMPLICIT_CONV_EXPR?

[Bug c++/84092] [8 Regression] ICE on C++14 code with variable template: in build_qualified_name, at cp/tree.c:2043

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84092

Richard Biener  changed:

   What|Removed |Added

   Keywords||ice-on-valid-code
Version|unknown |8.0
   Target Milestone|--- |8.0
Summary|ICE on C++14 code with  |[8 Regression] ICE on C++14
   |variable template: in   |code with variable
   |build_qualified_name, at|template: in
   |cp/tree.c:2043  |build_qualified_name, at
   ||cp/tree.c:2043

[Bug tree-optimization/84090] [8 Regression] ICE in gimple_redirect_edge_and_branch, at tree-cfg.c:6151

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84090

Richard Biener  changed:

   What|Removed |Added

   Keywords||needs-bisection
   Target Milestone|--- |8.0

[Bug c++/84091] [8 Regression] ICE on valid C++ code: Segmentation fault

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84091

Richard Biener  changed:

   What|Removed |Added

   Keywords||ice-on-valid-code
Version|unknown |8.0
   Target Milestone|--- |8.0
Summary|ICE on valid C++ code:  |[8 Regression] ICE on valid
   |Segmentation fault  |C++ code: Segmentation
   ||fault

[Bug libgomp/84086] [8 Regresssion] segfault in instantiate_scev_r for libgomp.fortran/examples-4/simd-2.f90 -O1

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84086

--- Comment #3 from Richard Biener  ---
Released by

#0  release_ssa_name_fn (fn=0x76a5f160, var=)
at /space/rguenther/src/svn/early-lto-debug/gcc/tree-ssanames.c:536
#1  0x013735fe in release_ssa_name (name=)
at /space/rguenther/src/svn/early-lto-debug/gcc/tree-ssanames.h:141
#2  0x01376b5c in release_defs (stmt=)
at /space/rguenther/src/svn/early-lto-debug/gcc/tree-ssanames.c:826
#3  0x013f6081 in adjust_simduid_builtins (htab=0x0)
at /space/rguenther/src/svn/early-lto-debug/gcc/tree-vectorizer.c:237
#4  0x013f7a49 in vectorize_loops ()
at /space/rguenther/src/svn/early-lto-debug/gcc/tree-vectorizer.c:839

but that doesn't clear the SCEV cache.  [I still think maintaining this
cache over pass boundaries is a ticking bomb...  we eventually should add
a verifier that walks the hashtable looking for released SSA names - but
that's not the only issue with the cache obviously!]

So here we're coming via

1416  if (!simple_iv (loop, loop_containing_stmt (stmt), lhs, iv, true))
1417return false;

of said stmt and get at the initial SCEV via the cache which was built when
the stmt was still _14 = _21;

So a slightly more generic fix than putting a scev_reset into the vectorizer
would be

Index: gcc/tree-ssanames.c
===
--- gcc/tree-ssanames.c (revision 257139)
+++ gcc/tree-ssanames.c (working copy)
@@ -29,6 +29,8 @@ along with GCC; see the file COPYING3.
 #include "stor-layout.h"
 #include "tree-into-ssa.h"
 #include "tree-ssa.h"
+#include "cfgloop.h"
+#include "tree-scalar-evolution.h"

 /* Rewriting a function into SSA form can create a huge number of SSA_NAMEs,
many of which may be thrown away shortly after their creation if jumps
@@ -241,6 +243,9 @@ verify_ssaname_freelists (struct functio
 void
 flush_ssaname_freelist (void)
 {
+  /* If there were any SSA names released reset the SCEV cache.  */
+  if (! vec_safe_is_empty (FREE_SSANAMES_QUEUE (cfun)))
+scev_reset_htab ();
   vec_safe_splice (FREE_SSANAMES (cfun), FREE_SSANAMES_QUEUE (cfun));
   vec_safe_truncate (FREE_SSANAMES_QUEUE (cfun), 0);
 }
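
The idea behind the patch can be sketched with a toy model.  This is not GCC
code: NamePool and its members are hypothetical, and the cache stands in for
the SCEV hash table.  The point is only that a flush which recycles released
names first drops any cached facts that might still mention them.

```cpp
#include <unordered_map>
#include <vector>

// Toy model (not GCC code): a cache keyed by name id that must be dropped
// whenever names it may still refer to are released.
struct NamePool {
    std::vector<int> free_names;         // names ready for reuse
    std::vector<int> free_queue;         // released, but not yet reusable
    std::unordered_map<int, int> cache;  // derived facts keyed by name id

    void release(int name) { free_queue.push_back(name); }

    // Mirrors the shape of the patch: if anything was released since the
    // last flush, conservatively reset the whole cache, then recycle names.
    void flush()
    {
        if (!free_queue.empty())
            cache.clear();
        free_names.insert(free_names.end(),
                          free_queue.begin(), free_queue.end());
        free_queue.clear();
    }
};
```

Clearing the whole cache is coarse, but it avoids having to find which entries
refer to a released name, which is the part the comment above calls fragile.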

[Bug rtl-optimization/81443] [8 regression] build/genrecog.o: virtual memory exhausted: Cannot allocate memory

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81443

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2
 Status|ASSIGNED|NEW
  Known to work||7.3.0
   Target Milestone|7.4 |7.3

--- Comment #22 from Richard Biener  ---
Fixed on the GCC 7 branch (but not trunk?).  Adjusting target
milestone/known-to-work for the moment.

[Bug target/83008] [performance] Is it better to avoid extra instructions in data passing between loops?

2018-01-29 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008

--- Comment #32 from rguenther at suse dot de  ---
On Fri, 26 Jan 2018, sergey.shalnov at intel dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008
> 
> --- Comment #31 from sergey.shalnov at intel dot com ---
> Richard,
> Thank you for your latest patch. This patch is exactly that 
> I’ve discussed in this issue request.
> I tested it with SPEC20[06|17] and see no performance/stability degradation.
> 
> Could you please merge your latest patch into main trunk?

Is this bug a regression?  If not I think the change has to wait for
GCC 9.

> I will provide Intel specific changes (in gcc/config/i386) that 
> will leverage your patch to get the performance better 
> for the provided testcase (step N2)

Can you attach those?

[Bug middle-end/84067] [8 regression] gcc.dg/wmul-1.c regression on aarch64 after r257077

2018-01-29 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84067

--- Comment #5 from rguenther at suse dot de  ---
On Mon, 29 Jan 2018, ktkachov at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84067
> 
> --- Comment #3 from ktkachov at gcc dot gnu.org ---
> (In reply to Richard Biener from comment #2)
> > So any hint on whether the code after r257077 is better or worse than 
> > before?
> 
> Looks worse unfortunately:
> For aarch64 at -O2 it generates:
> foo:
> mov w3, 44
> mov w2, 40
> mov w5, 1
> mov w4, 2
> smull   x3, w1, w3
> smull   x2, w1, w2
> str w5, [x0, x3]
> add x2, x2, 400
> add x1, x2, x1, sxtw 2
> str w4, [x0, x1]
> ret
> 
> whereas with r257077 it generates the shorter:
> foo:
> mov w3, 40
> sxtw    x2, w1
> mov w4, 1
> smaddl  x0, w1, w3, x0
> mov w3, 2
> add x1, x0, x2, lsl 2
> str w4, [x0, x2, lsl 2]
> str w3, [x1, 400]
> ret

So shorter is worse?  Might be because I don't understand the
difference between the 'lsl 2' and the 'sxtw 2' or the cost
of the [x1, 400] addressing.

[Bug middle-end/84067] [8 regression] gcc.dg/wmul-1.c regression on aarch64 after r257077

2018-01-29 Thread ktkachov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84067

--- Comment #6 from ktkachov at gcc dot gnu.org ---
(In reply to rguent...@suse.de from comment #5)
> So shorter is worse?  Might be because I don't understand the
> difference between the 'lsl 2' and the 'sxtw 2' or the cost
> of the [x1, 400] addressing.

Sorry, I messed up the writeup. Let me try again.
The shorter sequence (with the smaddl) is the good one and is produced
*without* r257077. After r257077 we generate the longer and worse sequence with
two smull.
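
Both sequences compute the same two store addresses and differ only in how
they get there.  A sketch of the arithmetic, assuming rows of ten 4-byte ints
(inferred from the constants 40, 44 and 400 in the assembly, not from the
test's source); all names below are hypothetical.

```cpp
#include <cstddef>

// Assumption: a row is 10 ints of 4 bytes each, i.e. 40 bytes.
constexpr std::ptrdiff_t kIntSize = 4;
constexpr std::ptrdiff_t kRowBytes = 10 * kIntSize;

// Byte offset of element [row][col] from the array base (x0).
constexpr std::ptrdiff_t byte_offset(std::ptrdiff_t row, std::ptrdiff_t col)
{
    return row * kRowBytes + col * kIntSize;
}

// First store: element [idx][idx], i.e. idx * 44 bytes from the base.
constexpr std::ptrdiff_t first_store(std::ptrdiff_t idx)
{
    return byte_offset(idx, idx);
}

// Second store: element [idx + 10][idx], i.e. idx * 44 + 400 bytes.
constexpr std::ptrdiff_t second_store(std::ptrdiff_t idx)
{
    return byte_offset(idx + 10, idx);
}
```

The smaddl sequence forms base + idx * 40 once and reuses it for both stores,
while the longer sequence performs two separate widening multiplies (by 44 and
by 40) to reach the same offsets.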

[Bug fortran/84093] Invalid nested derived type constructor not rejected

2018-01-29 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84093

Dominique d'Humieres  changed:

   What|Removed |Added

   Priority|P3  |P4
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-01-29
 Ever confirmed|0   |1

--- Comment #1 from Dominique d'Humieres  ---
Confirmed from 4.8 up to trunk (8.0).

[Bug fortran/84060] Wrong assignment from a class(*) variable which is a function result.

2018-01-29 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84060

Dominique d'Humieres  changed:

   What|Removed |Added

   Priority|P3  |P4
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-01-29
 Ever confirmed|0   |1

--- Comment #1 from Dominique d'Humieres  ---
Confirmed for 7.3.0 and trunk (8.0). Compiling the first test with 6.4.0 gives
an ICE and

 res = 5
1
Error: Assignment to an allocatable polymorphic variable at (1) is not yet
supported

for the second test.

[Bug tree-optimization/84037] [8 Regression] Speed regression of polyhedron benchmark since r256644

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84037

--- Comment #9 from Richard Biener  ---
(In reply to Martin Liška from comment #7)
> (In reply to Jakub Jelinek from comment #6)
> > Is it really r256643 and not r256644 that is causing this though?
> 
> Yes, I can verify that it's r256644 that's causing the regression.

This means that those newly vectorized loops make capacita slower.  551 is

  do m=1,n/4-1
h = A(j0+m) + A(j2+m)*E(m*inc)
A(j2+m) = A(j0+m) - A(j2+m)*E(m*inc)
A(j0+m) = h
eh = conjg(E(ntot/4-m*inc))
h = A(j1+m) - A(j3+m)*eh
A(j3+m) = A(j1+m) + A(j3+m)*eh
A(j1+m) = h
  end do

but it's actually the loops from the array expressions I guess (receiving
"interesting" locations).  105 is

do i=1,Ng1 !   .. and multiply charge with x
->do j=1,Ng2
X(i,j) = X(i,j) * D1 * (i-(Ng1+1)/2.0_dp)
  end do
end do

The variable strides are there because those arrays are accessed via array
descriptors.  This means that the stride will very likely be one, so
any "strided XY" vectorization will have quite a big overhead.

Of course in the end it looks like a cost model issue (which might be
just not enough factoring in of the alias runtime check).  Like in
the 105 case I expect the non-vectorized loop to be a quite "nice"
optimized nest with IVO doing a good job, etc..  If the inner loop
is vectorized conditionally this can wreck code generation and
runtime quite a bit.  Ng1/Ng2 are 1024 (at runtime, read from capacita.in).

For 105 we have

capacita.f90:105:0: note: need run-time check that (ssizetype) ((sizetype)
prephitmp_341 * 4) is nonzero
capacita.f90:105:0: note: versioning for alias required: can't determine
dependence between d1 and *_150[_65]

so two checks are needed.

capacita.f90:105:0: note: Cost model analysis:
  Vector inside of loop cost: 172
  Vector prologue cost: 60
  Vector epilogue cost: 136
  Scalar iteration cost: 60
  Scalar outside cost: 8
  Vector outside cost: 196
  prologue iterations: 0
  epilogue iterations: 2
  Calculated minimum iters for profitability: 7

I think the bug is that we're somehow thinking the vectorized arithmetic
(two multiplications) offset the use of strided loads and stores...
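
As a rough cross-check of the "Calculated minimum iters for profitability: 7" line above, the break-even arithmetic can be sketched as follows. This is a sketch only: the formula is reconstructed from memory of the vectorizer's profitability estimate, the vectorization factor of 4 is an assumption (it is not printed in the dump), and `LoopCosts` and `min_profitable_iters` are hypothetical names, not GCC identifiers.

```cpp
// Hypothetical container for the numbers printed in the cost model dump.
struct LoopCosts {
    int vec_inside;      // "Vector inside of loop cost"
    int vec_outside;     // "Vector outside cost"
    int scalar_iter;     // "Scalar iteration cost"
    int scalar_outside;  // "Scalar outside cost"
    int prologue_iters;  // "prologue iterations"
    int epilogue_iters;  // "epilogue iterations"
};

// Sketch of the break-even computation (assumed formula, see lead-in).
// Returns -1 when one vector iteration is not cheaper than vf scalar ones.
constexpr int min_profitable_iters(const LoopCosts& c, int vf) {
    const int gain = c.scalar_iter * vf - c.vec_inside;
    if (gain <= 0)
        return -1;
    if (c.vec_outside <= 0)
        return 0;
    const int extra_outside = (c.vec_outside - c.scalar_outside) * vf;
    int n = (extra_outside
             - c.vec_inside * c.prologue_iters
             - c.vec_inside * c.epilogue_iters) / gain;
    // Round up when the scalar loop is still at least as cheap at n iterations.
    if (c.scalar_iter * vf * n <= c.vec_inside * n + extra_outside)
        ++n;
    return n;
}

// Numbers from the capacita.f90:105 dump, with an assumed vf of 4.
static_assert(min_profitable_iters(LoopCosts{172, 196, 60, 8, 0, 2}, 4) == 7,
              "matches the dump under the stated assumptions");
```

Under these assumptions the gain per vector iteration is only 60*4 - 172 = 68, so the low threshold comes almost entirely from the four scalar iterations being costed at 240 against a vector body at 172.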


Testcase for this loop:

module solv_cap
  implicit none

  public  :: solveP

  integer, parameter, public :: dp = selected_real_kind(5)

  real(kind=dp), private :: Pi, eps0
  real(kind=dp), private :: D1, D2
  integer,   private, save :: Ng1=0, Ng2=0
  integer,   private, pointer, dimension(:,:)  :: Grid

contains

  subroutine solveP(P)
    real(kind=dp), intent(out) :: P

    real(kind=dp), allocatable, dimension(:,:)  :: Y0, X
    integer :: i,j

    allocate( Y0(Ng1,Ng2), X(Ng1,Ng2) )

    do i=1,Ng1
      do j=1,Ng2
        Y0(i,j) = D1 * (i-(Ng1+1)/2.0_dp) * Grid(i,j)
      end do
    end do ! RHS for in-field E_x=1.  V = -V_in = x, where metal on grid

    call solve( X, Y0 )

    X = X - sum(X)/size(X) ! get rid of monopole term ..
    do i=1,Ng1 !   .. and multiply charge with x
      do j=1,Ng2
        X(i,j) = X(i,j) * D1 * (i-(Ng1+1)/2.0_dp)
      end do
    end do
    P = sum(X)*D1*D2 * 4*Pi*eps0   ! E-dipole moment in 1 V/m field

    deallocate( X, Y0 )
    return
  end subroutine solveP

end module solv_cap

[Bug middle-end/84095] [8 Regression] false-positive -Wrestrict warnings for memcpy within array

2018-01-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84095

Jakub Jelinek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-01-29
 CC||jakub at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Jakub Jelinek  ---
This warning at least in its current shape doesn't belong into either -Wall or
-W IMNSHO.

  if (TREE_CODE (expr) == ADDR_EXPR)
    {
      poly_int64 off;
      tree op = TREE_OPERAND (expr, 0);

      /* Determine the base object or pointer of the reference
         and its constant offset from the beginning of the base.  */
      base = get_addr_base_and_unit_offset (op, &off);

      HOST_WIDE_INT const_off;
      if (base && off.is_constant (&const_off))
        {
          offrange[0] += const_off;
          offrange[1] += const_off;

          /* Stash the reference for offset validation.  */
          ref = op;

          /* Also stash the constant offset for offset validation.  */
          if (TREE_CODE (op) == COMPONENT_REF)
            refoff = const_off;
        }
      else
        {
          size = NULL_TREE;
          base = get_base_address (TREE_OPERAND (expr, 0));
        }
    }

is plain wrong.  get_addr_base_and_unit_offset is solely for computation of a
constant offset (or these days poly_int64 offset).  Here you need to do
get_inner_reference instead, and treat the poly_int64 bit offset (rather than
byte offset) as the constant part and for the variable part try to use
value ranges and handle casts/extensions in the expression too properly.
I don't see how this warning can be enabled at all at -O0/-O1/-Og, unless you
warn solely about cases where you can prove overlap, rather than warning just
in case.  And the general case of the warning should be, if I don't understand
something (such as the base = get_base_address (TREE_OPERAND (expr, 0)); above,
I punt on the warning, rather than just giving false positives.  That is only a
sure way to add the warning to the kill-list of projects that -Wno-* broken
warnings, and for some users to stop using GCC.

[Bug c++/84091] [8 Regression] ICE on valid C++ code: Segmentation fault

2018-01-29 Thread paolo.carlini at oracle dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84091

Paolo Carlini  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-01-29
 Ever confirmed|0   |1

[Bug c++/84092] [8 Regression] ICE on C++14 code with variable template: in build_qualified_name, at cp/tree.c:2043

2018-01-29 Thread paolo.carlini at oracle dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84092

Paolo Carlini  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-01-29
 Ever confirmed|0   |1

[Bug c++/84092] [8 Regression] ICE on C++14 code with variable template: in build_qualified_name, at cp/tree.c:2043

2018-01-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84092

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P3  |P1
   Last reconfirmed|2018-01-29 00:00:00 |
 CC||jakub at gcc dot gnu.org,
   ||jason at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
Started with r251438.

[Bug c++/84097] New: [8 regression] Incorrect -Wunused-but-set-variable warning

2018-01-29 Thread sylvestre at debian dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84097

Bug ID: 84097
   Summary: [8 regression] Incorrect -Wunused-but-set-variable
warning
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Keywords: diagnostic
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sylvestre at debian dot org
CC: jakub at gcc dot gnu.org, jason at gcc dot gnu.org,
sgunderson at bigfoot dot com, trippels at gcc dot gnu.org,
unassigned at gcc dot gnu.org, webrown.cpp at gmail dot com
  Target Milestone: ---

+++ This bug was initially created as a clone of Bug #82728 +++

Seen in Firefox code:
/root/firefox-gcc-last/dom/animation/KeyframeUtils.cpp: In member function
'uint32_t
mozilla::PropertyPriorityComparator::SubpropertyCount(nsCSSPropertyID) const':
/root/firefox-gcc-last/layout/style/nsCSSProps.h:689:29: error: variable 'es_'
set but not used [-Werror=unused-but-set-variable]
 es_ = (nsCSSPropertyID)((enabledstate_) |   \
 ^~~


#define CSSPROPS_FOR_SHORTHAND_SUBPROPERTIES(it_, prop_, enabledstate_)   \
  for (const nsCSSPropertyID *it_ = nsCSSProps::SubpropertyEntryFor(prop_), \
es_ = (nsCSSPropertyID)((enabledstate_) |   \
  CSSEnabledState(0));\
   *it_ != eCSSProperty_UNKNOWN; ++it_)   \
if (nsCSSProps::IsEnabled(*it_, (mozilla::CSSEnabledState) es_))


We can see that es_ is used at the last line.

This issue doesn't occur with gcc 7.

Unreduced test case
http://sylvestre.ledru.info/bordel/gcc-unused-but-set-variable.tar.gz 1.1mb
(sorry)

$ g++-8 -c -Werror=unused-but-set-variable foo.cpp

[Bug libgomp/84088] [nvptx] libgomp.oacc-fortran/declare-*.f90 execution fails

2018-01-29 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84088

Tom de Vries  changed:

   What|Removed |Added

 CC||pault at gcc dot gnu.org

--- Comment #3 from Tom de Vries  ---
Bisect points to commit r257065:
...
commit d9c7c3e3f6eb4621f929474e0ba44e7d61584431 (HEAD)
Author: pault 
Date:   Thu Jan 25 19:09:40 2018 +

2018-25-01  Paul Thomas  

PR fortran/37577
* array.c (gfc_match_array_ref): If standard earlier than F2008
it is an error if the reference dimension is greater than 7.
libgfortran.h : Increase GFC_MAX_DIMENSIONS to 15. Change the
dtype masks and shifts accordingly.
* trans-array.c (gfc_conv_descriptor_dtype): Use the dtype
type node to check the field.
(gfc_conv_descriptor_dtype): Access the rank field of dtype.
(duplicate_allocatable_coarray): Access the rank field of the
dtype descriptor rather than the dtype itself.
* trans-expr.c (get_scalar_to_descriptor_type): Store the type
of 'scalar' on entry and use its TREE_TYPE if it is ARRAY_TYPE
(ie. a character).
(gfc_conv_procedure_call): Pass TREE_OPERAND (tmp,0) to
get_scalar_to_descriptor_type if the actual expression is a
constant.
(gfc_trans_structure_assign): Assign the rank directly to the
dtype rank field.
* trans-intrinsic.c (gfc_conv_intrinsic_rank): Cast the result
to default integer kind.
(gfc_conv_intrinsic_sizeof): Obtain the element size from the
'elem_len' field of the dtype.
* trans-io.c (gfc_build_io_library_fndecls): Replace
gfc_int4_type_node with dtype_type_node where necessary.
(transfer_namelist_element): Use gfc_get_dtype_rank_type for
scalars.
* trans-types.c : Provide 'get_dtype_type_node' to acces the
dtype_type_node and, if necessary, build it.
The maximum size of an array element is now determined by the
maximum value of size_t.
Update the description of the array descriptor, including the
type def for the dtype_type.
(gfc_get_dtype_rank_type): Build a constructor for the dtype.
Distinguish RECORD_TYPEs that are BT_DERIVED or BT_CLASS.
(gfc_get_array_descriptor_base): Change the type of the dtype
field to dtype_type_node.
(gfc_get_array_descr_info): Get the offset to the rank field of
the dtype.
* trans-types.h : Add a prototype for 'get_dtype_type_node ()'.
* trans.h : Define the indices of the dtype fields.

2018-25-01  Paul Thomas  

PR fortran/37577
* gfortran.dg/coarray_18.f90: Allow dimension 15 for F2008.
* gfortran.dg/coarray_lib_this_image_2.f90: Change 'array1' to
'array01' in the tree dump comparison.
* gfortran.dg/coarray_lib_token_4.f90: Likewise.
* gfortran.dg/inline_sum_1.f90: Similar - allow two digits.
* gfortran.dg/rank_1.f90: Allow dimension 15 for F2008.

2018-25-01  Paul Thomas  

PR fortran/37577
* caf/single.c (_gfortran_caf_failed_images): Access the 'type'
and 'elem_len' fields of the dtype instead of the shifts.
(_gfortran_caf_stopped_images): Likewise.
* intrinsics/associated.c (associated): Compare the 'type' and
'elem_len' fields instead of the dtype.
* caf/date_and_time.c : Access the dtype fields rather using
shifts and masks.
* io/transfer.c (transfer_array ): Comment on item count.
(set_nml_var,st_set_nml_var): Change dtype type and use fields.
(st_set_nml_dtio_var): Likewise.
* libgfortran.h : Change definition of GFC_ARRAY_DESCRIPTOR and
add a typedef for the dtype_type. Change the GFC_DTYPE_* macros
to access the dtype fields.



git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@257065
138bc75d-0d04-0410-961f-82ee72b054a4
...

Will reconfirm with full build.

[Bug libstdc++/84087] string::assign problem with two arguments

2018-01-29 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84087

Jonathan Wakely  changed:

   What|Removed |Added

   Keywords||rejects-valid
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-01-29
 Ever confirmed|0   |1

--- Comment #1 from Jonathan Wakely  ---
(In reply to Berni from comment #0)
> From C++14, string::assign can be called with two arguments: the string (1)
> and the position (2). The length (3) is optional.

Yes this was changed by https://wg21.link/lwg2268

> This worked in gcc 7.2.0.

No it didn't, libstdc++ has never implemented the change.
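
Until the library implements the LWG 2268 change, the two-argument call can be spelled portably by passing the length explicitly. A minimal sketch, where `tail_from` is a hypothetical helper name used only for illustration:

```cpp
#include <string>

// Hypothetical helper: copy everything from position pos of src onwards.
// Passing npos explicitly selects the three-argument
// assign(const string&, size_type, size_type) overload, which exists in
// every standard revision, so it does not depend on the defaulted
// third argument added by LWG 2268.
inline std::string tail_from(const std::string& src,
                             std::string::size_type pos) {
    std::string out;
    out.assign(src, pos, std::string::npos);
    return out;
}
```

As with the two-argument form, an out-of-range `pos` still throws `std::out_of_range`.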

[Bug c++/84098] New: [8 Regression] ICE when using a lambda in a in-class static member initialization

2018-01-29 Thread benni.buch at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84098

Bug ID: 84098
   Summary: [8 Regression] ICE when using a lambda in a in-class
static member initialization
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: benni.buch at gmail dot com
  Target Milestone: ---

struct A{};

template < typename >
struct Test{
    static constexpr auto var = []{};
};

int main(){
    (void)Test< A >::var;
}


$ g++ -std=c++11 main.cpp 
main.cpp: In instantiation of 'struct Test':
main.cpp:9:20:   required from here
main.cpp:5:34: internal compiler error: in lookup_template_class_1, at
cp/pt.c:8957
 static constexpr auto var = []{};
  ^
0x612d1c lookup_template_class_1
../../gcc/gcc/cp/pt.c:8957
0x612d1c lookup_template_class(tree_node*, tree_node*, tree_node*, tree_node*,
int, int)
../../gcc/gcc/cp/pt.c:9167
0x94017a tsubst_aggr_type
../../gcc/gcc/cp/pt.c:12035
0x93a69e tsubst(tree_node*, tree_node*, int, tree_node*)
../../gcc/gcc/cp/pt.c:13668
0x9412b2 tsubst_decl
../../gcc/gcc/cp/pt.c:12965
0x93a7af tsubst(tree_node*, tree_node*, int, tree_node*)
../../gcc/gcc/cp/pt.c:13586
0x958cea instantiate_class_template_1
../../gcc/gcc/cp/pt.c:10635
0x958cea instantiate_class_template(tree_node*)
../../gcc/gcc/cp/pt.c:10920
0x999afd complete_type(tree_node*)
../../gcc/gcc/cp/typeck.c:136
0x8f833a cp_parser_nested_name_specifier_opt
../../gcc/gcc/cp/parser.c:6441
0x8f9e45 cp_parser_simple_type_specifier
../../gcc/gcc/cp/parser.c:17111
0x8faf57 cp_parser_postfix_expression
../../gcc/gcc/cp/parser.c:6945
0x8fbbc0 cp_parser_unary_expression
../../gcc/gcc/cp/parser.c:8281
0x8dc73f cp_parser_cast_expression
../../gcc/gcc/cp/parser.c:9049
0x8dc901 cp_parser_cast_expression
../../gcc/gcc/cp/parser.c:9001
0x8dcf4a cp_parser_binary_expression
../../gcc/gcc/cp/parser.c:9150
0x8de714 cp_parser_assignment_expression
../../gcc/gcc/cp/parser.c:9437
0x8dee28 cp_parser_expression
../../gcc/gcc/cp/parser.c:9606
0x8e0ad8 cp_parser_expression_statement
../../gcc/gcc/cp/parser.c:11075
0x8e683d cp_parser_statement
../../gcc/gcc/cp/parser.c:10879
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ g++ --version
g++ (GCC) 8.0.1 20180129 (experimental)
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Known to work with g++-5 and g++-6, both give an error message but don't run
into an ICE, clang accepts the code.

$ g++-6 -std=c++11 main.cpp 
main.cpp:5:27: error: 'constexpr const Test:: Test::var',
declared using local type 'const Test::', is used but never
defined [-fpermissive]
 static constexpr auto var = []{};
   ^~~

$ g++-6 --version
g++-6 (Ubuntu/Linaro 6.3.0-18ubuntu2~16.04) 6.3.0 20170519
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ g++-5 --version
g++-5 (Ubuntu 5.4.1-2ubuntu1~16.04) 5.4.1 20160904
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

[Bug c++/84098] [8 Regression] ICE when using a lambda in a in-class static member initialization

2018-01-29 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84098

Marek Polacek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-01-29
 CC||mpolacek at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Marek Polacek  ---
Confirmed.

[Bug c++/84098] [8 Regression] ICE when using a lambda in a in-class static member initialization

2018-01-29 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84098

Marek Polacek  changed:

   What|Removed |Added

   Keywords||ice-on-valid-code
   Target Milestone|--- |8.0

[Bug libstdc++/81122] [DR 2381] parsing f stopped after '0' when reading std::hexfloat >> f;

2018-01-29 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81122

--- Comment #14 from Jonathan Wakely  ---
Yes that's expected. It's the same issue. Libstdc++ still follows the C++98
spec which means there is no such thing as a hex float, and so "0x" cannot be
the start of a floating point value, it's just "0".

There are two parts to support for parsing hex floats: firstly the standard
needs to be updated to allow 'p' and 'P' to be accumulated and passed to
strtod. 

Secondly, implementations need to be updated to recognize hex floats (which
means parsing numbers beginning with "0x" and also supporting exponents
following 'p' or 'P').

As I said in comment 9:

"I don't want to change our implementation yet, until the intention of the
committee is clear."

"We know you can't currently read hex floats using istreams with libstdc++, we
know it's a defect in the standard, we're working on it."
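
In the meantime, one possible workaround is to read the token as text and hand it to `strtod` directly, which accepts hexadecimal floating-point input as of C99. A minimal sketch, assuming the default "C" locale; `parse_hexfloat` is a hypothetical helper name, not part of libstdc++:

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: parse text (e.g. "0x1.8p1") with strtod instead of
// operator>>. Returns false if no characters could be converted.
inline bool parse_hexfloat(const std::string& text, double& out) {
    const char* begin = text.c_str();
    char* end = nullptr;
    const double value = std::strtod(begin, &end);
    if (end == begin)
        return false;  // nothing was consumed
    out = value;
    return true;
}
```

A caller would typically read the token with `in >> token` into a `std::string` first and then call `parse_hexfloat(token, f)`; trailing characters after the number are ignored in this sketch.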

[Bug c++/84098] [8 Regression] ICE when using a lambda in a in-class static member initialization

2018-01-29 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84098

--- Comment #2 from Marek Polacek  ---
Started with r257093.

[Bug c++/84098] [8 Regression] ICE when using a lambda in a in-class static member initialization

2018-01-29 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84098

--- Comment #3 from Marek Polacek  ---
The new assert says
 8955   /* Lambda closures are regenerated in tsubst_lambda_expr, not
 8956  instantiated here.  */
 8957   gcc_assert (!LAMBDA_TYPE_P (template_type));
but here we haven't gotten around to calling tsubst_lambda_expr.

[Bug libgomp/84088] [nvptx] libgomp.oacc-fortran/declare-*.f90 execution fails

2018-01-29 Thread vries at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84088

--- Comment #4 from Tom de Vries  ---
(In reply to Tom de Vries from comment #3)
> Will reconfirm with full build.

Did clean builds of r257064 and r257065. Minimal test passes at r257064, fails
at r257065. Confirmed.

[Bug tree-optimization/84037] [8 Regression] Speed regression of polyhedron benchmark since r256644

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84037

--- Comment #10 from Richard Biener  ---
So strided stores are costed as

  /* Costs of the stores.  */
  if (memory_access_type == VMAT_ELEMENTWISE
      || memory_access_type == VMAT_GATHER_SCATTER)
    {
      /* N scalar stores plus extracting the elements.  */
      unsigned int assumed_nunits = vect_nunits_for_cost (vectype);
      inside_cost += record_stmt_cost (body_cost_vec,
                                       ncopies * assumed_nunits,
                                       scalar_store, stmt_info, 0, vect_body);
    }
...
  if (memory_access_type == VMAT_ELEMENTWISE
      || memory_access_type == VMAT_STRIDED_SLP)
    {
      /* N scalar stores plus extracting the elements.  */
      unsigned int assumed_nunits = vect_nunits_for_cost (vectype);
      inside_cost += record_stmt_cost (body_cost_vec,
                                       ncopies * assumed_nunits,
                                       vec_to_scalar, stmt_info, 0, vect_body);
    }

there's the issue of "overloading" vec_to_scalar with extraction.  It's costed
as generic sse_op which IMHO is reasonable here (vextract*).

The scalar cost is 12 for each of the following stmts

  _66 = *_150[_65];
  d1.76_67 = d1;
  _160 = d1.76_67 * _73;
  _74 = _66 * _160;
  *_150[_65] = _74;

the vector variant is adding the construction/extraction cost compared
to the scalar variant and wins with the two multiplications being costed
once instead of four times.  We don't actually factor in the "win" by
hoisting the vectorized load of 'd1' only in the vector case.

With AVX2 things become even more "cheap" vectorized.  And we of course
peel the epilogue completely.

Ideally we'd interchange this specific loop but interchange doesn't do
anything here because we get niters that might be zero.  Later dependences
would probably wreck things but here this also is a missed optimization.
We have two paths running into the loop loading ng1 and checking it
against zero properly but the PHI result doesn't have this range info
merged (well, VRP sets the info but it needs LIM / PRE to see the
opportunity so it's only set by late VRP).

[Bug libstdc++/83658] any::emplace deletes invalid memory when an overloaded operator new() throws

2018-01-29 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83658

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2018-01-29
   Assignee|unassigned at gcc dot gnu.org  |redi at gcc dot gnu.org
   Target Milestone|--- |7.4
 Ever confirmed|0   |1

--- Comment #1 from Jonathan Wakely  ---
As requested by https://gcc.gnu.org/bugs/ please don't post links to code
hosted elsewhere, provide the code here in bugzilla:

#include <any>
#include <cstddef>
#include <iostream>
#include <new>

struct AllocThrows {
    AllocThrows() {}
    AllocThrows(const AllocThrows&) {}

    // Not marked noexcept so std::any doesn't inline AllocThrows instances.
    AllocThrows(AllocThrows&&) {}
    template <typename... Args>
    static void* operator new(size_t, Args&&... args) {
        std::cout << "throwing bad alloc\n";
        throw std::bad_alloc();
    }

    static void operator delete(void*) noexcept {
        std::cout << "deleting\n";
    }
};

int main() {
    std::any a;
    try {
        a.emplace<AllocThrows>();  // double-deletes!
    } catch(...) {}
}


(In reply to Jon Cohen from comment #0)
> __do_emplace sets the manager pointer before attempting to create a new
> object.  When new() throws after calling reset(), _M_ptr doesn't point to
> valid memory.  When, later, the destructor is called on the any, the manager
> pointer still is non-null, so the destructor, via reset(), will trigger the
> call to _S_manage(_Op_destroy, ...), calling delete on the invalid _M_ptr.  

It's not invalid, it's guaranteed to be a null pointer.

This is definitely a bug (because a.has_value() is true after the exception,
and user-provided deallocation functions aren't required to handle null
pointers) but there's no double-delete.
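
The ordering problem can be illustrated outside of libstdc++ with a small type-erased holder that only records its "manager" after construction has succeeded. This is a sketch of the general technique, not the libstdc++ code; `Holder` and `Throwing` are hypothetical names.

```cpp
#include <stdexcept>

// Hypothetical type whose construction always fails.
struct Throwing {
    Throwing() { throw std::runtime_error("ctor failed"); }
};

// Hypothetical type-erased holder. deleter_ plays the role of the manager
// pointer: it is non-null exactly when an object is held.
class Holder {
    using Deleter = void (*)(void*);
    void* ptr_ = nullptr;
    Deleter deleter_ = nullptr;

public:
    Holder() = default;
    Holder(const Holder&) = delete;
    Holder& operator=(const Holder&) = delete;
    ~Holder() { reset(); }

    bool has_value() const { return deleter_ != nullptr; }

    void reset() {
        if (deleter_) {
            deleter_(ptr_);
            deleter_ = nullptr;
            ptr_ = nullptr;
        }
    }

    template <typename T>
    void emplace() {
        reset();
        // Construct first. If allocation or the constructor throws, the
        // holder is left empty and no deleter will ever see a null pointer.
        ptr_ = new T();
        // Commit the deleter only once the object exists.
        deleter_ = [](void* p) { delete static_cast<T*>(p); };
    }
};
```

With this ordering, a failed `emplace` leaves `has_value()` false, which is the state the reported bug fails to preserve.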

[Bug middle-end/78809] Inline strcmp with small constant strings

2018-01-29 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809

--- Comment #31 from Wilco  ---
(In reply to Qing Zhao from comment #30)
> (in reply to Wilco from comment #29)
> > 
> > The new test is better, however it uses i % 15 which means an expensive
> > division by constant every loop iteration. It's best to change to i & 15. 
> > Also
> > using an array of string pointers means you get something like:
> > 
> > result += strcmp (p[i & 15], "abc");
> > 
> > Using this I get ~80% speedup for n=3 on AArch64, similar to your set 2.
> I will try with these modification. 
> > 
> > As for benchmarking, I'm not so sure that SPEC2006 or SPEC2017 call strcmp 
> > with
> > constant strings.
> do you have any suggestion on other real applications?

Not really - I haven't seen strcmp with a constant string in a benchmark other
than Dhrystone (and that has a long string). What I typically do is get traces
from running various benchmarks and create a microbenchmark that mimics the
behaviour, but that's probably overkill for this optimization.
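
The loop shape suggested in the quoted comment #29 could look roughly like the sketch below. `strcmp_sign_sum` is a hypothetical name, and the sketch accumulates only the sign of each comparison so that the total is the same on every C library; a real benchmark would add timing and keep the table contents opaque to the optimizer.

```cpp
#include <cstring>

// Hypothetical kernel: compare 16 rotating strings against a short constant.
// Indexing with i & 15 avoids the division that i % 15 would require.
inline int strcmp_sign_sum(const char* const (&table)[16], unsigned n) {
    int result = 0;
    for (unsigned i = 0; i < n; ++i) {
        const int r = std::strcmp(table[i & 15], "abc");
        result += (r > 0) - (r < 0);  // -1, 0 or +1
    }
    return result;
}
```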

[Bug libstdc++/83658] any::emplace deletes invalid memory when an overloaded operator new() throws

2018-01-29 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83658

--- Comment #2 from Jonathan Wakely  ---
Author: redi
Date: Mon Jan 29 12:33:32 2018
New Revision: 257141

URL: https://gcc.gnu.org/viewcvs?rev=257141&root=gcc&view=rev
Log:
PR libstdc++/83658 fix exception-safety in std::any::emplace

PR libstdc++/83658
* include/std/any (any::__do_emplace): Only set _M_manager after
constructing the contained object.
* testsuite/20_util/any/misc/any_cast_neg.cc: Adjust dg-error line.
* testsuite/20_util/any/modifiers/83658.cc: New test.

Added:
trunk/libstdc++-v3/testsuite/20_util/any/modifiers/83658.cc
Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/include/std/any
trunk/libstdc++-v3/testsuite/20_util/any/misc/any_cast_neg.cc

[Bug c++/84099] New: Dynamic initialization is performed in case when constant initialization is permitted

2018-01-29 Thread antoshkka at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84099

Bug ID: 84099
   Summary: Dynamic initialization is performed in case when
constant initialization is permitted
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: antoshkka at gmail dot com
  Target Milestone: ---

The following code 

struct foo {
    const char* data_;
    unsigned size_;

    foo(const char* data, unsigned size) noexcept
        : data_(data)
        , size_(size)
    {}
};

foo test() {
    static const foo v{"Hello", 5};
    return v;
}


Produces disassembly with dynamic initialization of the `v` variable. However
in this case C++ Standard permits constant initialization:

"An implementation is permitted to perform the initialization of a variable
with static or thread storage duration as a static initialization even if such
initialization is not required to be done statically, provided that

— the dynamic version of the initialization does not change the value of any
other object of static or thread storage duration prior to its initialization,
and
— the static version of the initialization produces the same value in the
initialized variable as would be produced by the dynamic initialization if all
variables not required to be initialized statically were initialized
dynamically.
"

Optimal assembly would look like

.LC0:
  .string "Hello"
test():
  mov eax, OFFSET FLAT:.LC0
  mov edx, 5
  ret
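
For comparison, a source-level variant where constant initialization is required rather than merely permitted: marking the constructor and the variable `constexpr`. This is a sketch of a workaround on the user side, not of the requested optimization; `foo_ce` and `test_ce` are hypothetical names chosen to avoid clashing with the report's `foo` and `test`.

```cpp
// Hypothetical variant of the reported struct with a constexpr constructor.
struct foo_ce {
    const char* data_;
    unsigned size_;

    constexpr foo_ce(const char* data, unsigned size) noexcept
        : data_(data)
        , size_(size)
    {}
};

inline foo_ce test_ce() {
    // constexpr forces the initializer to be a constant expression, so no
    // guard variable or dynamic initialization is needed for v.
    static constexpr foo_ce v{"Hello", 5};
    return v;
}

// Compile-time check that the constructor is usable in constant expressions.
static_assert(foo_ce{"Hello", 5}.size_ == 5, "constant-initializable");
```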

[Bug c/84100] New: Function __attribute__((optimize(align-loops=32))) gives spurious warning

2018-01-29 Thread gcc at gmch dot uk
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84100

Bug ID: 84100
   Summary: Function __attribute__((optimize(align-loops=32)))
gives spurious warning
   Product: gcc
   Version: 7.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gcc at gmch dot uk
  Target Milestone: ---

With v7.2.1, compiling for recent x86_64, the command line option
"-falign-loops=32" is accepted and has the desired effect.

However I find that both:

_Pragma("GCC optimize(\"align-loops=32\")")

and:

__attribute__((optimize("align-loops=32")))

are rejected with a "warning: bad option '-falign-loops=32'" -- reported for
each and every affected function.

I have tried optimize("align-loops=x") for x=0, 1, 8 and 16 -- all are rejected
the same way.

FWIW, brief checks with Compiler Explorer suggest:

  1) this appears to be a new bug in v7, or at least
 since v6.3.

  2) v6.3 accepts the _Pragma(), but does *not* do the requested
 alignment.

  3) v5.4 accepts the _Pragma(), and does the requested alignment.

Chris

[Bug tree-optimization/84037] [8 Regression] Speed regression of polyhedron benchmark since r256644

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84037

--- Comment #11 from Richard Biener  ---
So probably the big slowdown is because the vectorized loop body is so much
larger.  Unvectorized:

.L61:
        vmulss  __solv_cap_MOD_d1(%rip), %xmm4, %xmm0
        incl    %ecx
        vmulss  (%rdx), %xmm0, %xmm0
        vmovss  %xmm0, (%rdx)
        addq    %rax, %rdx
        cmpl    %r12d, %ecx
        jne     .L61

vectorized (see how we hoist the load), with -march=haswell:

        vmulss  __solv_cap_MOD_d1(%rip), %xmm4, %xmm5
        movq    144(%rsp), %rdi
        leaq    (%rbx,%r13), %rdx
        xorl    %r10d, %r10d
        movq    %rdx, %rsi
        leaq    0(%r13,%rdi), %rcx
        movq    %rcx, %rdi
        vbroadcastss    %xmm5, %ymm5
        .p2align 4,,10
        .p2align 3
.L58:
        vmovss  (%rcx,%rax,2), %xmm1
        vmovss  (%rsi,%rax,2), %xmm0
        incl    %r10d
        vinsertps       $0x10, (%rcx,%r8), %xmm1, %xmm3
        vinsertps       $0x10, (%rsi,%r8), %xmm0, %xmm7
        vmovss  (%rcx), %xmm1
        vmovss  (%rsi), %xmm0
        vinsertps       $0x10, (%rcx,%rax), %xmm1, %xmm1
        vinsertps       $0x10, (%rsi,%rax), %xmm0, %xmm0
        addq    %r9, %rcx
        addq    %r9, %rsi
        vmovlhps        %xmm7, %xmm0, %xmm0
        vmovlhps        %xmm3, %xmm1, %xmm1

^^ not sure why we construct in such strange way - ICC simply
does a single vmovss and then 7 vinsertps

        vinsertf128     $0x1, %xmm1, %ymm0, %ymm0
        vmulps  %ymm5, %ymm0, %ymm0
        vmovss  %xmm0, (%rdx)
        vextractps      $1, %xmm0, (%rdx,%rax)
        vextractps      $2, %xmm0, (%rdx,%rax,2)
        vextractps      $3, %xmm0, (%rdx,%r8)
        vextractf128    $0x1, %ymm0, %xmm0
        addq    %r9, %rdx
        vmovss  %xmm0, (%rdi)
        vextractps      $1, %xmm0, (%rdi,%rax)
        vextractps      $2, %xmm0, (%rdi,%rax,2)
        vextractps      $3, %xmm0, (%rdi,%r8)

Similar here.  But fixing this would only reduce this by a few stmts.

        addq    %r9, %rdi
        cmpl    %r10d, %r14d
        jne     .L58

Anyway, this size (140 bytes, 9 cache lines) probably blows any loop
stream cache limits (IIRC that was around 3 cache lines).  Compared
to 26 bytes for the scalar version (2 cache lines).

Any such considerations would be best placed in the targets finish_cost
hook where the target knows all stmts that are going to be emitted and
can in theory also cost against the scalar variant (not easily available).

The SSE variant is smaller so measuring the slowdown with SSE only would
be interesting.  Hmm, SSE variant is slower (for all of capacita) but
-fno-tree-vectorize is fastest.

[Bug c/84085] Array element is unnecessary loaded twice

2018-01-29 Thread hjl.tools at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84085

H.J. Lu  changed:

   What|Removed |Added

   Target Milestone|--- |8.0

[Bug middle-end/84067] [8 regression] gcc.dg/wmul-1.c regression on aarch64 after r257077

2018-01-29 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84067

--- Comment #7 from rguenther at suse dot de  ---
On Mon, 29 Jan 2018, ktkachov at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84067
> 
> --- Comment #6 from ktkachov at gcc dot gnu.org ---
> (In reply to rguent...@suse.de from comment #5)
> > On Mon, 29 Jan 2018, ktkachov at gcc dot gnu.org wrote:
> > 
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84067
> > > 
> > > --- Comment #3 from ktkachov at gcc dot gnu.org ---
> > > (In reply to Richard Biener from comment #2)
> > > > So any hint on whether the code after r257077 is better or worse than 
> > > > before?
> > > 
> > > Looks worse unfortunately:
> > > For aarch64 at -O2 it generates:
> > > foo:
> > > mov w3, 44
> > > mov w2, 40
> > > mov w5, 1
> > > mov w4, 2
> > > smull   x3, w1, w3
> > > smull   x2, w1, w2
> > > str w5, [x0, x3]
> > > add x2, x2, 400
> > > add x1, x2, x1, sxtw 2
> > > str w4, [x0, x1]
> > > ret
> > > 
> > > whereas with r257077 it generates the shorter:
> > > foo:
> > > mov w3, 40
> > > sxtw    x2, w1
> > > mov w4, 1
> > > smaddl  x0, w1, w3, x0
> > > mov w3, 2
> > > add x1, x0, x2, lsl 2
> > > str w4, [x0, x2, lsl 2]
> > > str w3, [x1, 400]
> > > ret
> > 
> > So shorter is worse?  Might be because I don't understand the
> > difference between the 'lsl 2' and the 'sxtw 2' or the cost
> > of the [x1, 400] addressing.
> 
> Sorry, I messed up the writeup. Let me try again.
> The shorter sequence (with the smaddl) is the good one and is produced
> *without* r257077. After r257077 we generate the longer and worse sequence 
> with
> two smull.

I see the shorter sequence with TOT, r257077 included.  The testcase
explicitly checks for no widen-mult-plus but we now have two:

   [local count: 1073741825]:
  _17 = Idx_6(D) w* 44;
  _13 = Arr_7(D) + _17;
  MEM[(int[10] *)_13] = 1;
  _4 = WIDEN_MULT_PLUS_EXPR ;
  _18 = WIDEN_MULT_PLUS_EXPR ;
  _16 = Arr_7(D) + _18;
  MEM[(int[10] *)_16] = 2;
  return;

note the "shorter" sequence I see is

foo:
mov x4, 400
mov w3, 40
mov w2, 44
mov w5, 1
smaddl  x3, w1, w3, x4
mov w4, 2
smull   x2, w1, w2
add x1, x3, x1, sxtw 2
str w5, [x0, x2]
str w4, [x0, x1]
ret

which doesn't 1:1 match either of yours.

[Bug c/84101] New: -O3 and -ftree-vectorize trying too hard for function returning trivial pair-of-uint64_t-structure

2018-01-29 Thread gcc at gmch dot uk
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84101

Bug ID: 84101
   Summary: -O3 and -ftree-vectorize trying too hard for function
returning trivial pair-of-uint64_t-structure
   Product: gcc
   Version: 7.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gcc at gmch dot uk
  Target Milestone: ---

The following:

  typedef struct uint64_pair uint64_pair_t ;
  struct uint64_pair
  {
    uint64_t  w0 ;
    uint64_t  w1 ;
  } ;

  uint64_pair_t pair(int num)
  {
    uint64_pair_t p ;

    p.w0 = num << 1 ;
    p.w1 = num >> 1 ;

    return p ;
  }

for recent x86_64, under v7.1.0, using "-O3", compiles to:

  pair:
    lea    (%rdi,%rdi,1),%eax
    sar    %edi
    movslq %edi,%rdi
    cltq
    mov    %rax,-0x18(%rsp)
    movq   -0x18(%rsp),%xmm0
    mov    %rdi,-0x18(%rsp)
    movhps -0x18(%rsp),%xmm0
    movaps %xmm0,-0x18(%rsp)
    mov    -0x18(%rsp),%rax
    mov    -0x10(%rsp),%rdx
    retq

using "-O3 -fno-tree-vectorize", compiles to:

  pair:
    lea    (%rdi,%rdi,1),%eax
    sar    %edi
    movslq %edi,%rdx
    cltq
    retq

I note that v6.3 produces the shorter code without the "-fno-tree-vectorize".

[Bug middle-end/84083] [missed optimization] loop-invariant strlen() not hoisted out of loop

2018-01-29 Thread eyalroz at technion dot ac.il
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84083

--- Comment #4 from Eyal Rozenberg  ---
(In reply to Richard Biener from comment #3)
> Yes, we don't currently implement restrict disambiguation for calls.

So, would that account for the different compilation result for test1() and
test2() in the following code:

#include <string.h>

inline size_t my_strlen(const char* __restrict__ s)
{
    const char* p = s;
    while(*p != '\0') { p++; }
    return p - s;
}

size_t test1()
{
    static const char* hw = "Hello, world!";
    return my_strlen(hw);
}

size_t test2()
{
    static const char* hw = "Hello, world!";
    return strlen(hw);
}

where test2() compiles to just returning a fixed value while test1() executes a
loop (See https://godbolt.org/g/CvVxru) ?

[Bug c++/83835] [7/8 Regression] constexpr constructor rejected in c++17 mode (regression WRT c++14)

2018-01-29 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83835

Marek Polacek  changed:

   What|Removed |Added

 CC||mpolacek at gcc dot gnu.org

--- Comment #2 from Marek Polacek  ---
Very similar to PR82461.

[Bug middle-end/84071] [7/8 regression] nonzero_bits1 of subreg incorrect

2018-01-29 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84071

--- Comment #5 from Wilco  ---
(In reply to Eric Botcazou from comment #3)
> > PR59461 changed nonzero_bits1 incorrectly for subregs:
> > 
> >   /* On many CISC machines, accessing an object in a wider mode
> >  causes the high-order bits to become undefined.  So they are
> >  not known to be zero.  */
> >   rtx_code extend_op;
> >   if ((!WORD_REGISTER_OPERATIONS
> >/* If this is a typical RISC machine, we only have to worry
> >   about the way loads are extended.  */
> >|| ((extend_op = load_extend_op (inner_mode)) == SIGN_EXTEND
> >? val_signbit_known_set_p (inner_mode, nonzero)
> >: extend_op != ZERO_EXTEND)
> > >|| (!MEM_P (SUBREG_REG (x)) && !REG_P (SUBREG_REG (x))))
> >   && xmode_width > inner_width)
> > nonzero
> >   |= (GET_MODE_MASK (GET_MODE (x)) & ~GET_MODE_MASK
> > (inner_mode));
> > 
> > If WORD_REGISTER_OPERATIONS is set and load_extend_op is ZERO_EXTEND, rtl
> > like
> > 
> > (subreg:SI (reg:HI 125) 0)
> > 
> > is assumed to be always zero-extended.
> 
> That's not what the code is supposed to do.  As explained in the comment,
> the code is intended to compute the nonzero bits of the subreg from the
> nonzero_bits of the inner reg:
> 
> nonzero &= cached_nonzero_bits (SUBREG_REG (x), mode,
> known_x, known_mode, known_ret);

That's based on the inner type alone and not correct for
WORD_REGISTER_OPERATIONS. The

nonzero |= (GET_MODE_MASK (GET_MODE (x)) & ~GET_MODE_MASK (inner_mode));

adds in the unknown bits for the wider type. And that's the bit that is no
longer triggering.

> > This is incorrect since modes that are smaller than WORD_MODE may contain
> > random top bits. This is equally true for RISC and CISC ISAs and 
> > independent 
> > of WORD_REGISTER_OPERATIONS, so it's unclear why the !REG_P check was added.
> 
> No, that's wrong, WORD_REGISTER_OPERATIONS precisely means that the bits up
> to the word are defined when operations operate in mode smaller than a word.

They are always written but have an undefined value. Adding 2 8-bit values
results in a 9-bit value with WORD_REGISTER_OPERATIONS.
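
The 9-bit point can be illustrated with a small C++ model.  This is only a
sketch and not GCC code: the "register" is a plain 32-bit integer, and
add_in_word_register, conservative_nonzero, qi_mask and si_mask are names
invented here.  It assumes the add is performed on the full word, as described
above, and mirrors the quoted mask expression that marks the bits above the
inner mode as possibly nonzero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model, not GCC code: masks for an 8-bit inner mode and a
// 32-bit outer mode.
constexpr std::uint32_t qi_mask = 0xffu;
constexpr std::uint32_t si_mask = 0xffffffffu;

// Both inputs are zero-extended, but the addition is done on the full word.
std::uint32_t add_in_word_register(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(b);
}

// Inner-mode nonzero bits plus every outer-mode bit above the inner mode,
// i.e. nonzero |= outer_mask & ~inner_mask.
std::uint32_t conservative_nonzero(std::uint32_t inner_nonzero) {
    return inner_nonzero | (si_mask & ~qi_mask);
}

void self_check() {
    const std::uint32_t r = add_in_word_register(0xff, 0xff);
    assert(r == 0x1fe);           // nine significant bits
    assert((r & ~qi_mask) != 0);  // bit 8 is set, so the word is not a
                                  // zero-extended 8-bit value
    assert(conservative_nonzero(qi_mask) == si_mask);
}
```

In this model the word left behind by the add has bit 8 set, so treating the
wider view of the narrow value as zero-extended would be wrong.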

[Bug tree-optimization/84102] New: Fails to disambiguate Fortran (non-addressable?) global with array descriptor data

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84102

Bug ID: 84102
   Summary: Fails to disambiguate Fortran (non-addressable?)
global with array descriptor data
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

Created attachment 43269
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43269&action=edit
reduced capacita

This is split out from PR84037 with a reduced testcase.  We fail to hoist the
load of D1 from

do i=1,Ng1 !   .. and multiply charge with x
  do j=1,Ng2
    X(i,j) = X(i,j) * D1 * (i-(Ng1+1)/2.0_dp)
  end do
end do

which ultimately causes a runtime alias check for vectorization to
disambiguate D1 and X.

Note X is allocate()d and properly gets a malloc DECL from PTA.  But the
function calls solve(X,Y0) and GCC believes this call may alter X.data
(and other fields of the descriptor).  X is

real(kind=dp), intent(out), dimension(:,:)  :: X

inside solve so that might be very well possible.  But of course X may
not end up pointing to D1 for whatever details.  My guess is that it's
the usual 'you can't take the address of random stuff without annotating
it in fortran'.  If true this would deserve sth stronger than
!TREE_ADDRESSABLE which we only trust for TU-local variables.  -fwhole-program
helps here to properly constrain D1.
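
For readers less familiar with the transformation, here is a rough C++ analogue
of the missed hoist.  It is a sketch only: the global D1 stands in for the
Fortran variable, and scale_reload, scale_hoisted and results_match are names
made up for this example.  It is not the reduced testcase from the attachment.

```cpp
#include <cassert>
#include <vector>

double D1 = 2.0;  // hypothetical global standing in for the Fortran D1

// As written: unless the compiler can prove that x never points at D1,
// the load of D1 has to stay inside the loop (or be guarded by a runtime
// alias check).
void scale_reload(double* x, int n) {
    for (int i = 0; i < n; ++i)
        x[i] = x[i] * D1;
}

// Hand-hoisted version: only equivalent when x does not alias D1.
void scale_hoisted(double* x, int n) {
    const double d1 = D1;
    for (int i = 0; i < n; ++i)
        x[i] = x[i] * d1;
}

// With non-aliasing data both versions produce the same result.
bool results_match() {
    std::vector<double> a{1.0, 2.0, 3.0};
    std::vector<double> b = a;
    scale_reload(a.data(), static_cast<int>(a.size()));
    scale_hoisted(b.data(), static_cast<int>(b.size()));
    return a == b;
}
```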

[Bug libstdc++/84087] string::assign problem with two arguments

2018-01-29 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84087

Jonathan Wakely  changed:

   What|Removed |Added

   Keywords||patch
 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |redi at gcc dot gnu.org

--- Comment #2 from Jonathan Wakely  ---
Patch: https://gcc.gnu.org/ml/gcc-patches/2018-01/msg02249.html

[Bug fortran/82086] namelist read with repeat count fails when item is member of array of structures

2018-01-29 Thread jsberg at bnl dot gov
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82086

--- Comment #7 from jsberg at bnl dot gov ---
As to why I think this is a bug (and why I think Intel's compiler is doing the
right thing), referencing the 2008 standard (N1830):

10.11.2, paragraph 2:

Each object designator shall begin with a name from the
namelist-group-object-list (5.6) and shall follow the syntax of designator
(R601).

10.11.3.2, paragraph 2:

When the designator in the input record represents an array variable or a
variable of derived type, the effect is as if the variable represented were
expanded into a sequence of scalar list items, in the same way that formatted
input/output list items are expanded (9.6.3).

10.11.3.3, paragraph 1:

... The r*c form is equivalent to r successive appearances of the constant c

R601 designator is object-name
                or array-element
                or array-section
                or coindexed-named-object
                or complex-part-designator
                or structure-component
                or substring

R611 data-ref is part-ref [ % part-ref ] ...

R612 part-ref is part-name [ ( section-subscript-list ) ] [ image-selector ]

R618 array-section is data-ref [ ( substring-range ) ]
                   or complex-part-designator
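
As an illustration of the r*c rule quoted from 10.11.3.3, the expansion can be
sketched in a few lines of C++.  expand_repeat is a hypothetical helper written
for this note; it handles only a plain unsigned repeat count and ignores
quoted strings, null values and everything else a real namelist reader must
deal with.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Expand one value token: "3*7" becomes {"7", "7", "7"}; a token without a
// leading "<digits>*" prefix is returned unchanged.
std::vector<std::string> expand_repeat(const std::string& token) {
    const std::size_t star = token.find('*');
    if (star == std::string::npos || star == 0)
        return {token};
    for (std::size_t i = 0; i < star; ++i)
        if (token[i] < '0' || token[i] > '9')
            return {token};
    const unsigned long r = std::stoul(token.substr(0, star));
    return std::vector<std::string>(r, token.substr(star + 1));
}
```

Under the argument above, the expanded constants would then be assigned to the
scalar list items that the designator expands to, in order.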

[Bug bootstrap/84017] [6/7/8 regression] Bootstrap failure on Solaris 10/x86 with gas/ld

2018-01-29 Thread ro at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84017

Rainer Orth  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
URL||https://gcc.gnu.org/ml/gcc-
   ||patches/2018-01/msg02251.ht
   ||ml
   Last reconfirmed||2018-01-29
   Assignee|unassigned at gcc dot gnu.org  |ro at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #5 from Rainer Orth  ---
Mine.  Patch posted.

[Bug c++/84103] New: Dynamic initialization is performed for non-local variables in case when constant initialization is permitted

2018-01-29 Thread antoshkka at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84103

Bug ID: 84103
   Summary: Dynamic initialization is performed for non-local
variables in case when constant initialization is
permitted
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: antoshkka at gmail dot com
  Target Milestone: ---

The following code

struct foo {
    const char* data_;
    unsigned size_;

    foo(const char* data, unsigned size) noexcept
        : data_(data)
        , size_(size)
    {}
};

extern const foo v{"Hello", 5};


Produces assembly with dynamic initialization:

.LC0:
  .string "Hello"
_GLOBAL__sub_I_v:
  mov QWORD PTR v[rip], OFFSET FLAT:.LC0
  mov DWORD PTR v[rip+8], 5
  ret
v:
  .zero 16


However, in this case the C++ Standard permits constant initialization:

"An implementation is permitted to perform the initialization of a variable
with static or thread storage duration as a static initialization even if such
initialization is not required to be done statically, provided that

— the dynamic version of the initialization does not change the value of any
other object of static or thread storage duration prior to its initialization,
and
— the static version of the initialization produces the same value in the
initialized variable as would be produced by the dynamic initialization if all
variables not required to be initialized statically were initialized
dynamically.
"

Optimal assembly would look like the following

v:
  .quad .L.str
  .long 5 # 0x5
  .zero 4

.L.str:
  .asciz "Hello"

(clang produces the code from above)

Bug 84099 may be related to this one. That bug is about local variables
initialization, this bug is about non-local variables.
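
For comparison, and only as a sketch rather than part of the report: if the
constructor is declared constexpr, the initializer becomes a constant
expression, and constant initialization is then required by the language
instead of merely permitted.  foo_cx, v_cx and probe are names chosen for this
example.

```cpp
#include <cassert>

// Variant of the struct from the report with a constexpr constructor.
struct foo_cx {
    const char* data_;
    unsigned size_;

    constexpr foo_cx(const char* data, unsigned size) noexcept
        : data_(data)
        , size_(size)
    {}
};

// The initializer is a constant expression, so this variable must be
// initialized statically; no dynamic initializer is needed for it.
extern const foo_cx v_cx{"Hello", 5};

// Compile-time check that the same constructor call is a constant expression.
constexpr foo_cx probe{"Hello", 5};
static_assert(probe.size_ == 5, "constructor usable in constant expressions");
```

The report itself is about the case without constexpr, where the static
version is an optimization the implementation is allowed to make.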

[Bug fortran/84104] New: Minval gives incorrect results for certain compiler options

2018-01-29 Thread yosef at astro dot rit.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84104

Bug ID: 84104
   Summary: Minval gives incorrect results for certain compiler
options
   Product: gcc
   Version: 4.8.5
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: yosef at astro dot rit.edu
  Target Milestone: ---

If st_z is a complex (2d) array with no element equal to (1,1), then under
certain compiler options minval(abs(st_z - dcmplx(1,1))) returns 0, which is
not correct.

This happens with the compiler options 
gfortran -O2  -ffast-math -fno-finite-math-only test.f90
Removing any one option from the command line removes the bug. Here's the test
code
program test
  implicit none
  integer, parameter :: nx = 5
  integer, parameter :: ny = 5
  integer, parameter :: wp = kind(1.0d0)

  complex(kind=wp), dimension(nx,ny):: st_z

  integer:: i,j

  do j=1, ny
    do i=1, nx
      st_z(i,j) = dcmplx(-i,-j)
    end do
  end do

  if (minval(abs(st_z - dcmplx(1,1))) < 1.0d-10) then
    write(*,*) "Bug triggered"
  endif
end program test

The bug is triggered for options
gfortran -OX  -ffast-math -fno-finite-math-only test.f90 

where X is 1, 2, 3 or 4, but is not triggered if X is 0.
Also, removing "-fno-finite-math-only", or removing "-ffast-math"
suppresses the bug regardless of X.
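
As a cross-check of the expected value, the same computation can be written in
C++.  This is a sketch only; min_distance_from_one_one is a name invented
here, and the function takes the array extent as a parameter instead of the
fixed nx = ny = 5 of the test program.

```cpp
#include <cassert>
#include <algorithm>
#include <cmath>
#include <complex>
#include <limits>

// C++ rendering of the Fortran test: st_z(i,j) = (-i,-j) for i,j = 1..n,
// followed by the minimum of |st_z - (1,1)| over all elements.
double min_distance_from_one_one(int n) {
    const std::complex<double> target(1.0, 1.0);
    double best = std::numeric_limits<double>::infinity();
    for (int j = 1; j <= n; ++j)
        for (int i = 1; i <= n; ++i) {
            const std::complex<double> z(-i, -j);
            best = std::min(best, std::abs(z - target));
        }
    return best;
}
```

The closest element is (-1,-1), at distance |(-2,-2)| = 2*sqrt(2), so a result
below 1.0d-10 cannot be right.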


Details of fortran version:

gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla
--enable-bootstrap --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-gnu-unique-object
--enable-linker-build-id --with-linker-hash-style=gnu
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin
--enable-initfini-array --disable-libgcj
--with-isl=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/isl-install
--with-cloog=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/cloog-install
--enable-gnu-indirect-function --with-tune=generic --with-arch_32=x86-64
--build=x86_64-redhat-linux
Thread model: posix
gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC)

[Bug tree-optimization/84037] [8 Regression] Speed regression of polyhedron benchmark since r256644

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84037

--- Comment #12 from Richard Biener  ---
I have opened PR84102 for the missed optimizations in this particular loop.  I
believe now the interesting one is the other.

  30.25%  a.out  a.out      [.] __solv_cap_MOD_fourir2d
  24.83%  a.out  a.out      [.] __solv_cap_MOD_fourir
  24.83%  a.out  a.out      [.] __solv_cap_MOD_fourir2dx
  18.78%  a.out  [unknown]  [k] 0x813366e7

and the loops at line 551 are in the innermost nest of fourir(), which is
called/inlined into fourir*.  It's actually not array expressions but the
different inline copies that we vectorize.  Cost model for that one:

capacita2.f90:551:0: note: Cost model analysis:
  Vector inside of loop cost: 3756
  Vector prologue cost: 64
  Vector epilogue cost: 1712
  Scalar iteration cost: 516
  Scalar outside cost: 4
  Vector outside cost: 1776
  prologue iterations: 0
  epilogue iterations: 4
  Calculated minimum iters for profitability: 0
capacita2.f90:551:0: note:   Runtime profitability threshold = 8
capacita2.f90:551:0: note:   Static estimate profitability threshold = 11545608

the loop is hybrid SLP, VF is 8

capacita2.f90:551:0: note: improved number of alias checks from 36 to 6

so here it's also dependence analysis breaking down because the step is
unknown:

(compute_affine_dependence
  stmt_a: _170 = REALPART_EXPR <*a.0_107[_54]>;
  stmt_b: _196 = REALPART_EXPR <*a.0_107[_73]>;
(analyze_overlapping_iterations
  (chrec_a = 0)
  (chrec_b = 0)
  (overlap_iterations_a = [0])
  (overlap_iterations_b = [0]))
(analyze_overlapping_iterations
  (chrec_a = {((integer(kind=8)) j0_119 + 1) * iftmp.476_91, +,
iftmp.476_91}_3)
  (chrec_b = {((integer(kind=8)) j1_124 + 1) * iftmp.476_91, +,
iftmp.476_91}_3)
(analyze_siv_subscript
  siv test failed: unimplemented)
  (overlap_iterations_a = not known)
  (overlap_iterations_b = not known))
) -> dependence analysis failed

the only thing we know about iftmp.476_91 is that it isn't zero (but I'm sure
dependence analysis doesn't use that fact).  I think this specific dependence
should be computable...?  [by just dividing both chrecs by iftmp.476_91?]
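
The suggestion in brackets can be checked by brute force on small numbers.
The sketch below is not GCC's dependence analysis: chrec_at, reduced_chrec_at
and division_preserves_overlap are names invented here, j0, j1 and step play
the roles of j0_119, j1_124 and iftmp.476_91, and overflow is ignored.

```cpp
#include <cassert>
#include <cstdint>

// Value of the chrec {(j + 1) * step, +, step} at iteration k.
std::int64_t chrec_at(std::int64_t j, std::int64_t step, std::int64_t k) {
    return (j + 1) * step + k * step;
}

// The same chrec with the common factor `step` divided out: {j + 1, +, 1}.
std::int64_t reduced_chrec_at(std::int64_t j, std::int64_t k) {
    return (j + 1) + k;
}

// Over a small iteration space, report whether the original accesses collide
// exactly when the reduced ones do.
bool division_preserves_overlap(std::int64_t j0, std::int64_t j1,
                                std::int64_t step, std::int64_t niters) {
    for (std::int64_t ka = 0; ka < niters; ++ka)
        for (std::int64_t kb = 0; kb < niters; ++kb) {
            const bool full =
                chrec_at(j0, step, ka) == chrec_at(j1, step, kb);
            const bool reduced =
                reduced_chrec_at(j0, ka) == reduced_chrec_at(j1, kb);
            if (full != reduced)
                return false;
        }
    return true;
}
```

For a nonzero step the two formulations agree; for step == 0 they do not,
which is why the analysis would have to use the fact that iftmp.476_91 is
known to be nonzero.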

[Bug c++/84091] [8 Regression] ICE on valid C++ code: Segmentation fault

2018-01-29 Thread matthias.hochsteger at tuwien dot ac.at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84091

Matthias Hochsteger  changed:

   What|Removed |Added

 CC||matthias.hochsteger@tuwien.
   ||ac.at

--- Comment #1 from Matthias Hochsteger  ---
Introduced with

commit fa01d4a50ef3115a509c67af897c854001597ea7 (HEAD)
Author: jason 
Date:   Fri Jan 26 15:25:23 2018 +

PR c++/82514 - ICE with local class in generic lambda.

* pt.c (regenerated_lambda_fn_p): Remove.
(enclosing_instantiation_of): Don't use it.
(tsubst_function_decl): Call enclosing_instantiation_of.

* pt.c (lookup_template_class_1): Add sanity check.
* name-lookup.c (do_pushtag): Don't add closures to local_classes.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@257093
138bc75d-0d04-0410-961f-82ee72b054a4

[Bug c++/84092] [8 Regression] ICE on C++14 code with variable template: in build_qualified_name, at cp/tree.c:2043

2018-01-29 Thread paolo.carlini at oracle dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84092

Paolo Carlini  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |paolo.carlini at oracle 
dot com

--- Comment #2 from Paolo Carlini  ---
Seems doable.

[Bug tree-optimization/84090] [8 Regression] ICE in gimple_redirect_edge_and_branch, at tree-cfg.c:6151

2018-01-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84090

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
Can't reproduce.

[Bug libstdc++/83833] chi_squared_distribution::param() forgot to change the member gamma_distribution

2018-01-29 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83833

--- Comment #8 from Jonathan Wakely  ---
Author: redi
Date: Mon Jan 29 13:58:49 2018
New Revision: 257144

URL: https://gcc.gnu.org/viewcvs?rev=257144&root=gcc&view=rev
Log:
PR libstdc++/83833 fix chi_squared_distribution::param(const param&)

Backport from mainline
2018-01-15  Jonathan Wakely  

PR libstdc++/83833
* include/bits/random.h (chi_squared_distribution::param): Update
gamma distribution parameter.
* testsuite/26_numerics/random/chi_squared_distribution/83833.cc: New
test.

Added:
   
branches/gcc-7-branch/libstdc++-v3/testsuite/26_numerics/random/chi_squared_distribution/83833.cc
Modified:
branches/gcc-7-branch/libstdc++-v3/ChangeLog
branches/gcc-7-branch/libstdc++-v3/include/bits/random.h

[Bug libstdc++/83658] any::emplace deletes invalid memory when an overloaded operator new() throws

2018-01-29 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83658

Jonathan Wakely  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Jonathan Wakely  ---
Fixed for 7.4 and 8.1

[Bug libstdc++/83658] any::emplace deletes invalid memory when an overloaded operator new() throws

2018-01-29 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83658

--- Comment #3 from Jonathan Wakely  ---
Author: redi
Date: Mon Jan 29 13:58:54 2018
New Revision: 257145

URL: https://gcc.gnu.org/viewcvs?rev=257145&root=gcc&view=rev
Log:
PR libstdc++/83658 fix exception-safety in std::any::emplace

PR libstdc++/83658
* include/std/any (any::__do_emplace): Only set _M_manager after
constructing the contained object.
* testsuite/20_util/any/misc/any_cast_neg.cc: Adjust dg-error line.
* testsuite/20_util/any/modifiers/83658.cc: New test.

Added:
branches/gcc-7-branch/libstdc++-v3/testsuite/20_util/any/modifiers/83658.cc
Modified:
branches/gcc-7-branch/libstdc++-v3/ChangeLog
branches/gcc-7-branch/libstdc++-v3/include/std/any
   
branches/gcc-7-branch/libstdc++-v3/testsuite/20_util/any/misc/any_cast_neg.cc
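
The ordering described in the ChangeLog entry can be sketched with a much
simpler class.  This is not the libstdc++ implementation: Holder, Thrower and
the engaged_ flag (standing in for _M_manager) are made up for illustration,
and the holder stores a single fixed type on the heap.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// Test type whose constructor can be made to throw.
struct Thrower {
    explicit Thrower(bool fail) {
        if (fail) throw std::runtime_error("ctor failed");
    }
};

template <typename T>
class Holder {
public:
    Holder() = default;
    Holder(const Holder&) = delete;
    Holder& operator=(const Holder&) = delete;
    ~Holder() { reset(); }

    template <typename... Args>
    T& emplace(Args&&... args) {
        reset();
        // May throw; the holder is still marked empty at this point.
        ptr_ = new T(std::forward<Args>(args)...);
        // Mark as engaged only after construction has succeeded.
        engaged_ = true;
        return *ptr_;
    }

    void reset() {
        if (engaged_) {
            delete ptr_;
            ptr_ = nullptr;
            engaged_ = false;
        }
    }

    bool has_value() const { return engaged_; }

private:
    T* ptr_ = nullptr;
    bool engaged_ = false;  // stands in for _M_manager
};
```

If the flag were set before the construction, a throwing constructor would
leave the holder claiming to own an object that was never created, and the
destructor would then try to delete it.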

[Bug libstdc++/83833] chi_squared_distribution::param() forgot to change the member gamma_distribution

2018-01-29 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83833

--- Comment #9 from Jonathan Wakely  ---
Author: redi
Date: Mon Jan 29 14:07:27 2018
New Revision: 257146

URL: https://gcc.gnu.org/viewcvs?rev=257146&root=gcc&view=rev
Log:
PR libstdc++/83833 fix failing test on ia32

PR libstdc++/83833
* testsuite/26_numerics/random/chi_squared_distribution/83833.cc:
Add -ffloat-store to options for m68k and ia32.

Modified:
trunk/libstdc++-v3/ChangeLog
   
trunk/libstdc++-v3/testsuite/26_numerics/random/chi_squared_distribution/83833.cc

[Bug c++/83942] [8 Regression] False -Wunused-but-set-variable when const scoped enum is cast to int

2018-01-29 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83942

Jason Merrill  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jason at gcc dot gnu.org

[Bug tree-optimization/82965] [8 regression][armeb] gcc.dg/vect/pr79347.c starts failing after r254379

2018-01-29 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82965

--- Comment #7 from amker at gcc dot gnu.org ---
I am still trying to understand the issue.  The dump of the ifcvt pass is as below:


;;   basic block 2, loop depth 0, count 118111601 (estimated locally), maybe
hot
;;prev block 0, next block 3, flags: (NEW, REACHABLE, VISITED)
;;pred:   ENTRY [always]  count:118111601 (estimated locally)
(FALLTHRU,EXECUTABLE)
  # VUSE <.MEM_12(D)>
  c.1_17 = cD.1595;
  if (c.1_17 > 0)
goto ; [89.00%]
  else
goto ; [11.00%]
;;succ:   3 [89.0% (guessed)]  count:105119325 (estimated locally)
(TRUE_VALUE,EXECUTABLE)
;;5 [11.0% (guessed)]  count:12992276 (estimated locally)
(FALSE_VALUE,EXECUTABLE)

;;   basic block 3, loop depth 0, count 105119325 (estimated locally), maybe
hot
;;prev block 2, next block 4, flags: (NEW, REACHABLE, VISITED)
;;pred:   2 [89.0% (guessed)]  count:105119325 (estimated locally)
(TRUE_VALUE,EXECUTABLE)
  # VUSE <.MEM_12(D)>
  ;;...
;;succ:   4 [always]  count:105119325 (estimated locally)
(FALLTHRU,EXECUTABLE)

;;   basic block 4, loop depth 1, count 955630224 (estimated locally), maybe
hot
;;prev block 3, next block 6, flags: (NEW, REACHABLE, VISITED)
;;pred:   3 [always]  count:105119325 (estimated locally)
(FALLTHRU,EXECUTABLE)
;;6 [always]  count:850510900 (estimated locally)
(FALLTHRU,DFS_BACK,EXECUTABLE)
  # RANGE [0, 2147483647] NONZERO 2147483647
  # i_18 = PHI <0(3), i_14(6)>
  ;;...
  if (i_14 < c.1_17)
goto ; [89.00%]
  else
goto ; [11.00%]
;;succ:   6 [89.0% (guessed)]  count:850510900 (estimated locally)
(TRUE_VALUE,EXECUTABLE)
;;5 [11.0% (guessed)]  count:105119324 (estimated locally)
(FALSE_VALUE,EXECUTABLE)

;;   basic block 6, loop depth 1, count 850510900 (estimated locally), maybe
hot
;;prev block 4, next block 5, flags: (NEW, VISITED)
;;pred:   4 [89.0% (guessed)]  count:850510900 (estimated locally)
(TRUE_VALUE,EXECUTABLE)
  goto ; [100.00%]
;;succ:   4 [always]  count:850510900 (estimated locally)
(FALLTHRU,DFS_BACK,EXECUTABLE)

;;   basic block 5, loop depth 0, count 118111601 (estimated locally), maybe
hot
;;prev block 6, next block 1, flags: (NEW, REACHABLE, VISITED)
;;pred:   2 [11.0% (guessed)]  count:12992276 (estimated locally)
(FALSE_VALUE,EXECUTABLE)
;;4 [11.0% (guessed)]  count:105119324 (estimated locally)
(FALSE_VALUE,EXECUTABLE)
  ;;...
  return;
;;succ:   EXIT [always (guessed)]  count:118111601 (estimated locally)

Question is for edges  and  and local count as:
basic block 2, loop depth 0, count 118111601
basic block 4, loop depth 1, count 955630224
basic block 5, loop depth 0, count 118111601

Is the count of bb5 wrong?
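
One way to sanity-check the numbers is to add up the incoming edge counts
printed in the dump and compare them with the block counts.  The sketch below
does just that; the constant names are invented here, and whether the
resulting off-by-one is the discrepancy the question is about is an
assumption.

```cpp
#include <cassert>
#include <cstdint>

// Sum of the two incoming edge counts of a basic block.
constexpr std::int64_t incoming(std::int64_t a, std::int64_t b) {
    return a + b;
}

// Numbers copied from the ifcvt dump.
constexpr std::int64_t bb4_count = 955630224;
constexpr std::int64_t bb4_in = incoming(105119325, 850510900);  // bb3, bb6
constexpr std::int64_t bb5_count = 118111601;
constexpr std::int64_t bb5_in = incoming(12992276, 105119324);   // bb2, bb4

static_assert(bb4_in - bb4_count == 1,
              "bb4: incoming edges exceed the block count by one");
static_assert(bb5_count - bb5_in == 1,
              "bb5: the block count exceeds the incoming edges by one");
```

Both blocks differ from the sum of their predecessors' edge counts by exactly
one, in opposite directions.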

[Bug bootstrap/84017] [6/7/8 regression] Bootstrap failure on Solaris 10/x86 with gas/ld

2018-01-29 Thread ro at CeBiTec dot Uni-Bielefeld.DE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84017

--- Comment #6 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
> --- Comment #4 from ro at CeBiTec dot Uni-Bielefeld.DE ---
[...]
> I was reminded of ld's -z relaxreloc option (more on that separately).
> While it doesn't help in this case, it probably provides an option to
> enable comdat on some versions of Solaris 10 (though certainly not for GCC 8).

Just for the record: here's the patch that does this:

https://gcc.gnu.org/ml/gcc-patches/2018-01/msg02257.html

[Bug target/83831] [RX] Unused bclr,bnot,bset insns

2018-01-29 Thread olegendo at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83831

--- Comment #3 from Oleg Endo  ---
Created attachment 43270
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43270&action=edit
Patch for GCC 7

Tested with "make -k check" on rx-sim for c and c++ with no new failures.

[Bug middle-end/84067] [8 regression] gcc.dg/wmul-1.c regression on aarch64 after r257077

2018-01-29 Thread ktkachov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84067

--- Comment #8 from ktkachov at gcc dot gnu.org ---
(In reply to rguent...@suse.de from comment #7)
> On Mon, 29 Jan 2018, ktkachov at gcc dot gnu.org wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84067
> > 
> > --- Comment #6 from ktkachov at gcc dot gnu.org ---
> > (In reply to rguent...@suse.de from comment #5)
> > > On Mon, 29 Jan 2018, ktkachov at gcc dot gnu.org wrote:
> > > 
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84067
> > > > 
> > > > --- Comment #3 from ktkachov at gcc dot gnu.org ---
> > > > (In reply to Richard Biener from comment #2)
> > > > > So any hint on whether the code after r257077 is better or worse than 
> > > > > before?
> > > > 
> > > > Looks worse unfortunately:
> > > > For aarch64 at -O2 it generates:
> > > > foo:
> > > > mov w3, 44
> > > > mov w2, 40
> > > > mov w5, 1
> > > > mov w4, 2
> > > > smull   x3, w1, w3
> > > > smull   x2, w1, w2
> > > > str w5, [x0, x3]
> > > > add x2, x2, 400
> > > > add x1, x2, x1, sxtw 2
> > > > str w4, [x0, x1]
> > > > ret
> > > > 
> > > > whereas with r257077 it generates the shorter:
> > > > foo:
> > > > mov w3, 40
> > > > sxtw    x2, w1
> > > > mov w4, 1
> > > > smaddl  x0, w1, w3, x0
> > > > mov w3, 2
> > > > add x1, x0, x2, lsl 2
> > > > str w4, [x0, x2, lsl 2]
> > > > str w3, [x1, 400]
> > > > ret
> > > 
> > > So shorter is worse?  Might be because I don't understand the
> > > difference between the 'lsl 2' and the 'sxtw 2' or the cost
> > > of the [x1, 400] addressing.
> > 
> > Sorry, I messed up the writeup. Let me try again.
> > The shorter sequence (with the smaddl) is the good one and is produced
> > *without* r257077. After r257077 we generate the longer and worse sequence 
> > with
> > two smull.
> 
> I see the shorter sequence with TOT, r257077 included.  The testcase
> explicitly checks for no widen-mult-plus but we now have two:
> 
>[local count: 1073741825]:
>   _17 = Idx_6(D) w* 44;
>   _13 = Arr_7(D) + _17;
>   MEM[(int[10] *)_13] = 1;
>   _4 = WIDEN_MULT_PLUS_EXPR ;
>   _18 = WIDEN_MULT_PLUS_EXPR ;
>   _16 = Arr_7(D) + _18;
>   MEM[(int[10] *)_16] = 2;
>   return;
> 
> note the "shorter" sequence I see is
> 
> foo:
> mov x4, 400
> mov w3, 40
> mov w2, 44
> mov w5, 1
> smaddl  x3, w1, w3, x4
> mov w4, 2
> smull   x2, w1, w2
> add x1, x3, x1, sxtw 2
> str w5, [x0, x2]
> str w4, [x0, x1]
> ret
> 
> which doesn't 1:1 match either of yours.

Hmm, the exact instruction mix will depend a lot on the cpu tuning in question
because the RTX costs affect the widening multiplication expansion, but at the
tree level I see only one WIDEN_MULT_PLUS_EXPR with current ToT (with r257077):

   [local count: 1073741825]:
  _1 = (long unsigned int) Idx_6(D);
  _2 = Idx_6(D) w* 40;
  _3 = Arr_7(D) + _2;
  _12 = Idx_6(D) w* 4;
  _11 = Idx_6(D) w* 44;
  _13 = Arr_7(D) + _11;
  MEM[(int[10] *)_13] = 1;
  _4 = _2 + 400;
  _5 = Arr_7(D) + _4;
  _14 = WIDEN_MULT_PLUS_EXPR ;
  _16 = Arr_7(D) + _14;
  MEM[(int[10] *)_16] = 2;
  return;
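
For reference, the semantics behind `w*` and WIDEN_MULT_PLUS_EXPR can be
written out in C++.  This is a sketch with invented names (widen_mult,
widen_mult_plus) using signed 32-bit to 64-bit widening, which matches what
smull and smaddl compute on AArch64; the operand types in the actual dump may
differ.

```cpp
#include <cassert>
#include <cstdint>

// Widening multiply: both 32-bit operands are extended to 64 bits before
// the multiplication, so the product cannot wrap (compare smull).
std::int64_t widen_mult(std::int32_t a, std::int32_t b) {
    return static_cast<std::int64_t>(a) * static_cast<std::int64_t>(b);
}

// Widening multiply-add: the widened product plus a 64-bit addend in one
// operation (compare smaddl).
std::int64_t widen_mult_plus(std::int32_t a, std::int32_t b, std::int64_t c) {
    return widen_mult(a, b) + c;
}
```

With these definitions, `_2 + 400` for `_2 = Idx w* 40` is the same value as
widen_mult_plus(Idx, 40, 400).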

[Bug libstdc++/83833] chi_squared_distribution::param() forgot to change the member gamma_distribution

2018-01-29 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83833

--- Comment #10 from Jonathan Wakely  ---
Author: redi
Date: Mon Jan 29 14:44:48 2018
New Revision: 257149

URL: https://gcc.gnu.org/viewcvs?rev=257149&root=gcc&view=rev
Log:
PR libstdc++/83833 fix chi_squared_distribution::param(const param&)

Backport from mainline
2018-01-15  Jonathan Wakely  

PR libstdc++/83833
* include/bits/random.h (chi_squared_distribution::param): Update
gamma distribution parameter.
* testsuite/26_numerics/random/chi_squared_distribution/83833.cc: New
test.

Added:
   
branches/gcc-6-branch/libstdc++-v3/testsuite/26_numerics/random/chi_squared_distribution/83833.cc
Modified:
branches/gcc-6-branch/libstdc++-v3/ChangeLog
branches/gcc-6-branch/libstdc++-v3/include/bits/random.h

[Bug libstdc++/83833] chi_squared_distribution::param() forgot to change the member gamma_distribution

2018-01-29 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83833

--- Comment #11 from Jonathan Wakely  ---
Author: redi
Date: Mon Jan 29 14:45:00 2018
New Revision: 257150

URL: https://gcc.gnu.org/viewcvs?rev=257150&root=gcc&view=rev
Log:
PR libstdc++/83833 fix failing test on ia32

PR libstdc++/83833
* testsuite/26_numerics/random/chi_squared_distribution/83833.cc:
Add -ffloat-store to options for m68k and ia32.

Modified:
branches/gcc-7-branch/libstdc++-v3/ChangeLog
   
branches/gcc-7-branch/libstdc++-v3/testsuite/26_numerics/random/chi_squared_distribution/83833.cc

[Bug libstdc++/83833] chi_squared_distribution::param() forgot to change the member gamma_distribution

2018-01-29 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83833

Jonathan Wakely  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |6.5

--- Comment #12 from Jonathan Wakely  ---
Fixed for 6.5, 7.4 and 8.1

[Bug libstdc++/83626] std::experimental::filesystem::remove_all throws exception instead of returning 0 if path doesn't exist

2018-01-29 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83626

Jonathan Wakely  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |6.5

--- Comment #16 from Jonathan Wakely  ---
Fixed for 6.5, 7.3 and 8.1

[Bug lto/84105] New: [8 regression] Segmentation fault in pp_tree_identifier() during LTO

2018-01-29 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84105

Bug ID: 84105
   Summary: [8 regression] Segmentation fault in
pp_tree_identifier() during LTO
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

I got an ICE while building the linux kernel module net/sctp/sctp.ko with
i386-linux-gcc-8.0.1, currently using r257114. A slightly older gcc-8.0.0
(dated 20180107, exact revision unknown) doesn't have this problem.

  /bin/bash /git/arm-soc/scripts/gcc-ld -fuse-linker-plugin -flto=jobserver
-flto  -fno-strict-aliasing -fno-fat-lto-objects -Wno-attribute-alias
-fwhole-program  -fno-strict-aliasing -fdump-ipa-cgraph
-fdump-ipa-inline-details -fipa-cp-clone -r -m elf_i386 -T
/git/arm-soc/scripts/module-common.lds --build-id  -o net/sctp/sctp.ko
net/sctp/sctp.o net/sctp/sctp.mod.o ;  true
during IPA pass: inline
dump file: net/sctp/sctp.ko.ltrans0.079i.inline
/git/arm-soc/net/sctp/sm_sideeffect.c: In function 'sctp_do_sm':
/git/arm-soc/net/sctp/sm_sideeffect.c:1155:5: internal compiler error:
Segmentation fault
 int sctp_do_sm(struct net *net, enum sctp_event event_type,
 ^
0xa42b7f crash_signal
/home/arnd/git/gcc/gcc/toplev.c:325
0xaf0659 pp_tree_identifier(pretty_printer*, tree_node*)
/home/arnd/git/gcc/gcc/tree-pretty-print.c:4006
0xaf0966 dump_decl_name
/home/arnd/git/gcc/gcc/tree-pretty-print.c:261
0xaf42ea dump_generic_node(pretty_printer*, tree_node*, int, unsigned long,
bool)
/home/arnd/git/gcc/gcc/tree-pretty-print.c:1826
0xaf769a print_declaration(pretty_printer*, tree_node*, int, unsigned long)
/home/arnd/git/gcc/gcc/tree-pretty-print.c:
0xaf7997 print_generic_decl(_IO_FILE*, tree_node*, unsigned long)
/home/arnd/git/gcc/gcc/tree-pretty-print.c:122
0xb4603a dump_scope_block
/home/arnd/git/gcc/gcc/tree-ssa-live.c:647
0xb471b9 dump_scope_blocks(_IO_FILE*, unsigned long)
/home/arnd/git/gcc/gcc/tree-ssa-live.c:678
0xb471b9 remove_unused_locals()
/home/arnd/git/gcc/gcc/tree-ssa-live.c:870
0x97af44 execute_function_todo
/home/arnd/git/gcc/gcc/passes.c:1972
0x97b8b9 execute_todo
/home/arnd/git/gcc/gcc/passes.c:2048
0x97dac5 execute_one_ipa_transform_pass
/home/arnd/git/gcc/gcc/passes.c:2245
0x97dac5 execute_all_ipa_transforms()
/home/arnd/git/gcc/gcc/passes.c:2281
0x6d681c cgraph_node::expand()
/home/arnd/git/gcc/gcc/cgraphunit.c:2132
0x6d7b38 expand_all_functions
/home/arnd/git/gcc/gcc/cgraphunit.c:2275
0x6d7b38 symbol_table::compile()
/home/arnd/git/gcc/gcc/cgraphunit.c:2624
0x656c51 lto_main()
/home/arnd/git/gcc/gcc/lto/lto.c:3349
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

I have not been able to create a simple test case for it, but can provide steps
for reproducing, or help test patches. If necessary, I can do a bisection, but
maybe someone can see from the backtrace what is happening, or knows of a
duplicate bug report.

From what I can tell, the ICE is caused by a typedef inside of a function;
moving the typedef outside of the function avoids the problem. See the source
code at:

https://elixir.free-electrons.com/linux/v4.15/source/net/sctp/sm_sideeffect.c#L1172
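
For illustration only, the two shapes being compared can be sketched as
below. This is not a reproducer (no reduced test case exists per the report
above), and every identifier in it (chunk_name, timer_name, printfn_t,
lookup_local, lookup_file_scope) is a hypothetical stand-in rather than the
kernel's code. Variant A keeps a function-type typedef at block scope, so the
TYPE_DECL lives in the function's scope block, which is what the backtrace
shows being dumped; variant B hoists it to file scope, which is the
workaround described.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for per-event name helpers. */
static const char *chunk_name(int id) { (void)id; return "chunk"; }
static const char *timer_name(int id) { (void)id; return "timer"; }

/* Variant A: function-type typedef declared inside the function,
   the shape the report associates with the ICE. */
const char *lookup_local(size_t kind, int id)
{
    typedef const char *(printfn_t)(int);
    static printfn_t *table[] = { NULL, chunk_name, timer_name };
    printfn_t *fn = kind < sizeof(table) / sizeof(table[0]) ? table[kind] : NULL;

    return fn ? fn(id) : "unknown";
}

/* Variant B: the same typedef moved to file scope (the workaround). */
typedef const char *(printfn_file_t)(int);

const char *lookup_file_scope(size_t kind, int id)
{
    static printfn_file_t *table[] = { NULL, chunk_name, timer_name };
    printfn_file_t *fn = kind < sizeof(table) / sizeof(table[0]) ? table[kind] : NULL;

    return fn ? fn(id) : "unknown";
}

/* Both variants are semantically identical; only the typedef's scope differs. */
int variants_agree(size_t kind, int id)
{
    return strcmp(lookup_local(kind, id), lookup_file_scope(kind, id)) == 0;
}
```

Since the two variants behave the same at run time, any difference in
compiler behavior between them would point at how the block-scope declaration
is handled, not at the program's semantics.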

[Bug c++/84082] [7/8 Regression] ICE with broken template function definition

2018-01-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84082

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek  ---
Created attachment 43271
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43271&action=edit
gcc8-pr84082.patch

Untested fix.

[Bug fortran/84093] Invalid nested derived type constructor not rejected

2018-01-29 Thread neil.n.carlson at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84093

--- Comment #2 from Neil Carlson  ---
The forced cascade of keyword use is rather annoying, so perhaps someone was
thinking the current gfortran behavior is a useful extension, and it almost is.
But consider this example:

type :: parent
  type(parent), pointer :: next => null()
end type

type, extends(parent) :: child
  integer :: n
end type

type(child) :: c
type(parent), pointer :: p

allocate(p)
allocate(p%next)

c = child(parent=p,n=1)
if (.not.associated(c%next,p%next)) stop 1

c = child(p,1)
if (.not.associated(c%next,p)) stop 2

end

GFortran doesn't distinguish between the two constructor expressions, treating
the second the same as the first, when in fact they are quite different.

[Bug tree-optimization/84037] [8 Regression] Speed regression of polyhedron benchmark since r256644

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84037

Richard Biener  changed:

   What|Removed |Added

 CC||amker at gcc dot gnu.org

--- Comment #13 from Richard Biener  ---
So performance counters on my Broadwell machine say that there are zero hits
for this vectorized loop with LSD.UOPS while there are very many for the scalar
case.  This means the loop body is too large to trigger the LSD (and probably
also to fit the uop cache).

One of the issues is that we require 4 registers for the indexes into the loads

vmovq   (%rax,%r11,2), %xmm7
vpinsrq $1, (%rax,%r13), %xmm7, %xmm4
vmovq   (%rax), %xmm7
vpinsrq $1, (%rax,%r11), %xmm7, %xmm9
vmovq   (%rax,%r15), %xmm7
vpinsrq $1, (%rax,%r12), %xmm7, %xmm3
vmovq   (%rax,%r11,4), %xmm7
vpinsrq $1, (%rax,%r14), %xmm7, %xmm1
vinserti128 $0x1, %xmm4, %ymm9, %ymm9

from

  _1507 = (void *) ivtmp.760_1462;
  _792 = MEM[base: _1507, offset: 0B];
  _1508 = (void *) ivtmp.760_1462;
  _794 = MEM[base: _1508, index: _331, offset: 0B];
  _1509 = (void *) ivtmp.760_1462;
  _796 = MEM[base: _1509, index: _331, step: 2, offset: 0B];
  _1510 = (void *) ivtmp.760_1462;
  _1511 = _331 * 3;
  _798 = MEM[base: _1510, index: _1511, offset: 0B];
  _1512 = (void *) ivtmp.760_1462;
  _800 = MEM[base: _1512, index: _331, step: 4, offset: 0B];
  _1513 = (void *) ivtmp.760_1462;
  _1514 = _331 * 5;
  _802 = MEM[base: _1513, index: _1514, offset: 0B];
  _1515 = (void *) ivtmp.760_1462;
  _1516 = _331 * 6;
  _804 = MEM[base: _1515, index: _1516, offset: 0B];
  _1517 = (void *) ivtmp.760_1462;
  _1518 = _331 * 7;
  _806 = MEM[base: _1517, index: _1518, offset: 0B];
  vect_cst__808 = {_792, _794, _796, _798, _800, _802, _804, _806};

where IVOPTs did a reasonable job.  Later LIM hoists all the invariant
_331 * N indexes.  And IVOPTs failed to realize that _331 * 3 can be used
for _331 * 6 by using step == 2.  But in the end the register-optimal
decision is probably to strength-reduce this (the vectorizer generates
strength-reduced code).

We do end up spilling most IVs in this loop.
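
As a rough sketch of the three addressing strategies discussed above (plain
C, not GCC output; the function names and the LANES constant are
illustrative assumptions), the eight lane offsets can be computed with one
multiplied index per lane, with scaled reuse of a smaller set of indexes, or
with a single running offset:

```c
#include <stddef.h>

enum { LANES = 8 };

/* Form 1: one multiplied index per lane, as in the GIMPLE dump
   (stride * 3, stride * 5, stride * 6, stride * 7 are all separate values). */
void offsets_multiplied(size_t stride, size_t out[LANES])
{
    for (size_t k = 0; k < LANES; k++)
        out[k] = stride * k;
}

/* Form 2: keep only stride, stride*3, stride*5 and stride*7 live and reach
   the even multiples through the address-mode scale (2 or 4). */
void offsets_scaled(size_t stride, size_t out[LANES])
{
    size_t s3 = stride * 3, s5 = stride * 5, s7 = stride * 7;

    out[0] = 0;
    out[1] = stride;
    out[2] = stride * 2;   /* index stride, step 2 */
    out[3] = s3;
    out[4] = stride * 4;   /* index stride, step 4 */
    out[5] = s5;
    out[6] = s3 * 2;       /* index stride*3, step 2: the missed reuse */
    out[7] = s7;
}

/* Form 3: strength-reduced, a single running offset bumped by stride. */
void offsets_strength_reduced(size_t stride, size_t out[LANES])
{
    size_t off = 0;

    for (size_t k = 0; k < LANES; k++) {
        out[k] = off;
        off += stride;
    }
}

/* Returns 1 when all three forms yield the same offsets. */
int offsets_agree(size_t stride)
{
    size_t a[LANES], b[LANES], c[LANES];

    offsets_multiplied(stride, a);
    offsets_scaled(stride, b);
    offsets_strength_reduced(stride, c);
    for (size_t k = 0; k < LANES; k++)
        if (a[k] != b[k] || a[k] != c[k])
            return 0;
    return 1;
}
```

Form 2 needs one fewer precomputed index than Form 1, and Form 3 needs none
beyond the stride itself, at the cost of a dependent chain of additions.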

[Bug c++/84091] [8 Regression] ICE on valid C++ code: Segmentation fault

2018-01-29 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84091

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P3  |P1
 CC||jakub at gcc dot gnu.org,
   ||jason at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
Started with r257093.

[Bug c++/83503] [8 Regression] bogus -Wattributes for const and pure on function template specialization

2018-01-29 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83503

Jason Merrill  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||jason at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |jason at gcc dot gnu.org

[Bug target/83008] [performance] Is it better to avoid extra instructions in data passing between loops?

2018-01-29 Thread sergey.shalnov at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83008

--- Comment #33 from sergey.shalnov at intel dot com ---
Richard,
I'm not sure whether it is a regression or not. I see the code was visibly
refactored in 2013 in this commit:
https://github.com/gcc-mirror/gcc/commit/ee6e9ba576099aed29f1097195c649fc796ecf5e

However this fix is highly important for our workloads and really desirable for
GCC8.
For example, your patch plus SKX cost model changes give us ~+1% in geomean
with spec2017 intrate.

It would be really good for us to get this patch into GCC8.

The planned cost model changes (to be proposed separately) are:

diff --git a/gcc/config/i386/x86-tune-costs.h b/gcc/config/i386/x86-tune-costs.h
index e943d13..d5e6ef6 100644
--- a/gcc/config/i386/x86-tune-costs.h
+++ b/gcc/config/i386/x86-tune-costs.h
@@ -1557,7 +1557,7 @@ struct processor_costs skylake_cost = {
   {4, 4, 4},   /* cost of loading integer registers
   in QImode, HImode and SImode.
   Relative to reg-reg move (2).  */
-  {6, 6, 6},   /* cost of storing integer registers */
+  {6, 6, 3},   /* cost of storing integer registers */
   2,   /* cost of reg,reg fld/fst */
   {6, 6, 8},   /* cost of loading fp registers
   in SFmode, DFmode and XFmode */
@@ -1572,7 +1572,7 @@ struct processor_costs skylake_cost = {
   {6, 6, 6, 10, 20},   /* cost of loading SSE registers
   in 32,64,128,256 and 512-bit */
   {6, 6, 6, 10, 20},   /* cost of unaligned loads.  */
-  {8, 8, 8, 8, 16},/* cost of storing SSE registers
+  {8, 8, 8, 14, 24},   /* cost of storing SSE registers
   in 32,64,128,256 and 512-bit */
   {8, 8, 8, 8, 16},/* cost of unaligned stores.  */
   2, 2,/* SSE->integer and
integer->SSE moves */

I know that you have some concerns about the costs above; I'm saving that for
a later discussion, because your patch is a good foundation for further cost
tuning.

Sergey

[Bug libgomp/84086] [8 Regression] segfault in instantiate_scev_r for libgomp.fortran/examples-4/simd-2.f90 -O1

2018-01-29 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84086

--- Comment #4 from Richard Biener  ---
Author: rguenth
Date: Mon Jan 29 15:22:55 2018
New Revision: 257152

URL: https://gcc.gnu.org/viewcvs?rev=257152&root=gcc&view=rev
Log:
2018-01-29  Richard Biener  

PR tree-optimization/84086
* tree-ssanames.c: Include cfgloop.h and tree-scalar-evolution.h.
(flush_ssaname_freelist): When SSA names were released reset
the SCEV hash table.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-ssanames.c
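
The idea behind the log entry above, resetting a cache keyed by SSA names
once released names become reusable, can be sketched with a toy model. This
is not GCC's code: struct name_pool and all pool_* functions are
hypothetical, and the "analysis" is just value * 2. The point is only that a
memoized result for a recycled name would be stale unless the cache is
dropped when the freelist is flushed.

```c
#include <assert.h>
#include <string.h>

enum { MAX_NAMES = 16 };

/* Toy model, not GCC's data structures. */
struct name_pool {
    int value[MAX_NAMES];        /* what each name currently stands for */
    int cached[MAX_NAMES];       /* memoized analysis result per name */
    int cache_valid[MAX_NAMES];
    int pending_free[MAX_NAMES]; /* released, not yet reusable */
    int n_pending;
    int free_list[MAX_NAMES];    /* reusable names */
    int n_free;
    int next_fresh;
};

void pool_init(struct name_pool *p) { memset(p, 0, sizeof *p); }

/* Hand out a recycled name if one is available, else a fresh one. */
int pool_alloc(struct name_pool *p, int value)
{
    int id = p->n_free > 0 ? p->free_list[--p->n_free] : p->next_fresh++;

    assert(id < MAX_NAMES);
    p->value[id] = value;
    return id;
}

void pool_release(struct name_pool *p, int id)
{
    p->pending_free[p->n_pending++] = id;
}

/* Memoized stand-in for an expensive per-name analysis. */
int pool_analyze(struct name_pool *p, int id)
{
    if (!p->cache_valid[id]) {
        p->cached[id] = p->value[id] * 2;
        p->cache_valid[id] = 1;
    }
    return p->cached[id];
}

/* Make released names reusable; if any were released, drop the cache so
   a recycled name cannot pick up a result computed for its predecessor. */
void pool_flush_freelist(struct name_pool *p)
{
    if (p->n_pending == 0)
        return;
    while (p->n_pending > 0)
        p->free_list[p->n_free++] = p->pending_free[--p->n_pending];
    memset(p->cache_valid, 0, sizeof p->cache_valid);
}
```

Without the memset in pool_flush_freelist, analyzing a recycled name would
return the result cached for the name that previously held the same slot.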
