[Bug c/78352] GCC lacks support for the Apple "blocks" extension to the C family of languages

2020-11-08 Thread grobian at gentoo dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78352

--- Comment #14 from Fabian Groffen  ---
(In reply to Eric Gallager from comment #13)
> If we could get in touch with an actual lawyer to review which laws
> specifically are getting in the way here, that could be helpful. I won my
> election to the New Hampshire State Legislature so if there's any
> legislation I could pass to make it legal to apply those patches here in NH,
> I'd love to know how to write it.

FWIW: if Iain wrote a new patch, then we don't need Apple's original work which
from my experience, frankly is messy.  There's lots of stuff in there
intertwined, so going by a specification e.g. Clang's
(https://clang.llvm.org/docs/BlockLanguageSpec.html) is probably the best way
forward in any case.

[Bug c/78352] GCC lacks support for the Apple "blocks" extension to the C family of languages

2020-11-08 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78352

--- Comment #15 from Iain Sandoe  ---
(In reply to Fabian Groffen from comment #14)
> (In reply to Eric Gallager from comment #13)
> > If we could get in touch with an actual lawyer to review which laws
> > specifically are getting in the way here,

I would expect that the determination has been made by the FSF lawyers (but I
am not an authority here, just repeating the policy put to me when I started
work on the Darwin port, years ago).

> that could be helpful. I won my
> > election to the New Hampshire State Legislature 

congrats!

> > so if there's any
> > legislation I could pass to make it legal to apply those patches here in NH,
> > I'd love to know how to write it.

IMO the technical issues with reusing 4.2.1 code are so significant that it
would be a poor use of your time chasing a way to include stuff that we'd need
to rewrite anyway (see below)

> FWIW: if Iain wrote a new patch, then we don't need Apple's original work
> which from my experience, frankly is messy.

Indeed, it isn't suitable for the current source base - there have been a lot
of changes since 4.2.1.  As a secondary consideration, I also want to move
Objective-C style metadata generation until after LTO has run (and Apple blocks
also makes use of that style meta-data).

>  There's lots of stuff in there
> intertwined, so going by a specification e.g. Clang's
> (https://clang.llvm.org/docs/BlockLanguageSpec.html) is probably the best
> way forward in any case.

Which is what I was doing, plus a 1:1 comparison with clang's output (on the
grounds that the ABI is defined by the actual output regardless of what the
documentation says ;) )

Sorry that there hasn't been much progress on this - it *was* top of my GCC 11
TODO list, and then Apple Silicon came along and torpedoed that...

[Bug rtl-optimization/97459] __uint128_t remainder for division by 3

2020-11-08 Thread tkoenig at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459

--- Comment #14 from Thomas Koenig  ---
Created attachment 49520
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49520&action=edit
Numbers a, b so that 2^b ≡ 1 mod a up to b=64, larger b taken if several
solutions exist, plus the multiplicative inverse for 2^128

I've added the multiplicative inverse to the table, calculated with
maxima by inv_mod(x,2^128). Output is in hex, to make it easier to
break down into two numbers.

Is there any more info that I could provide?
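
For readers without maxima at hand, the inverse modulo 2^128 of an odd
constant can also be obtained with a few Newton steps. The following is only
a sketch under that assumption, not the method used for the attachment; the
name inverse_mod_2_128 is made up for illustration, and unsigned __int128 is
a GCC/Clang extension.

```cpp
typedef unsigned __int128 u128;

// Multiplicative inverse of an odd d modulo 2^128 (hypothetical helper).
// For odd d, d*d == 1 (mod 8), so x = d already has 3 correct low bits.
// Each step x *= 2 - d*x doubles the number of correct bits:
// 3 -> 6 -> 12 -> 24 -> 48 -> 96 -> 192, so six steps cover 128 bits.
constexpr u128 inverse_mod_2_128(u128 d)
{
  u128 x = d;
  for (int i = 0; i < 6; ++i)
    x *= 2 - d * x;   // unsigned arithmetic wraps modulo 2^128
  return x;
}

// Split a 128-bit value into two 64-bit halves, as the hex table does.
constexpr unsigned long long high64(u128 v) { return (unsigned long long)(v >> 64); }
constexpr unsigned long long low64(u128 v) { return (unsigned long long)v; }
```

For d = 13 the two halves come out as 0xC4EC4EC4EC4EC4EC and
0x4EC4EC4EC4EC4EC5, which are the constants that appear later in this PR.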

[Bug testsuite/97680] [11 Regression] new test case c-c++-common/zero-scratch-regs-10.c in r11-4578 has excess errors

2020-11-08 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97680

--- Comment #6 from Iain Sandoe  ---
(In reply to Iain Sandoe from comment #5)
> I added xfail-if for powerpc-darwin (8,9, 10 and 11).
> 
> https://gcc.gnu.org/pipermail/gcc-cvs/2020-November/336720.html
> 
> Since I don't think I will have time this cycle to implement it (there are
> much more pressing demands on the time) - at least the tests will then XPASS
> if/when the impl. is done. 

Unfortunately, that's not enough; the XFAIL only covers the run and we have to
skip the tests completely to avoid testsuite output noise (which I've done for
powerpc-darwin).

[Bug rtl-optimization/97459] __uint128_t remainder for division by 3

2020-11-08 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459

--- Comment #15 from Jakub Jelinek  ---
I plan to work on this early in stage3.
And we really shouldn't use any tables, GCC should figure it all out.
So, for double-word modulo by constant that would be expanded using a libcall,
go for x from the word bitsize to double-word bitsize and check if
(1max << x) % cst is 1 (and prefer what we've agreed on for 3), and fall back
to multiplications (see #c8) if there aren't any other options and the costs
don't say it is too costly.
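
The search described above can be sketched outside the compiler as follows.
This is only an illustration under stated assumptions, not GCC code: the name
largest_unit_shift and the explicit [lo, hi] range are invented here, and the
power of two is reduced step by step instead of shifting a wide constant.

```cpp
// Largest b in [lo, hi] with 2^b % cst == 1, or -1 if there is none
// (hypothetical helper; there is never a solution for even cst).
constexpr int largest_unit_shift(unsigned long long cst, int lo, int hi)
{
  if (cst < 2)
    return -1;
  unsigned __int128 p = 1;      // invariant: p == 2^b % cst
  int best = -1;
  for (int b = 1; b <= hi; ++b)
    {
      p = (p * 2) % cst;        // 128-bit product cannot overflow
      if (b >= lo && p == 1)
        best = b;               // keep the largest match seen so far
    }
  return best;
}
```

With the range 1..64 this gives 64 for 3, 60 for 13 and 9, and 63 for 7;
with the range 64..128 it gives 128 for 3 and 120 for 13.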

[Bug fortran/97589] Segmentation fault when allocating coarrays.

2020-11-08 Thread tkoenig at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589

--- Comment #3 from Thomas Koenig  ---
Simplified test case:

program main
  type foo
     real, allocatable, dimension(:) :: a[:]
  end type foo
  type (foo) :: x
  sync all
  allocate (x%a(10)[*])
end program main

[Bug c++/97755] New: Explicit default constructor is called during copy-list-initialization with a warning only

2020-11-08 Thread egor_suvorov at mail dot ru via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97755

            Bug ID: 97755
           Summary: Explicit default constructor is called during
                    copy-list-initialization with a warning only
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: egor_suvorov at mail dot ru
  Target Milestone: ---

Consider the following test case:
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/testsuite/g%2B%2B.dg/cpp0x/initlist40.C

// PR c++/54835, DR 1518
// { dg-do compile { target c++11 } }
struct A
{
  explicit A(int = 42);
};
int main()
{
  A a1 = { };     // { dg-error "explicit" }
  A a2 = { 24 };  // { dg-error "explicit" }
}

GCC fails to compile it, but the line with 'a1' emits only a warning:
"converting to 'A' from initializer list would use explicit constructor
'A::A(int)'". Hence, if I comment out the line with 'a2', compilation succeeds.

However, if I modify the test case slightly:

struct A
{
  explicit A();
  explicit A(int);
};
int main()
{
  A a1 = { };     // { dg-error "explicit" }
  A a2 = { 24 };  // { dg-error "explicit" }
}

Both messages become errors.

I believe it's a regression between GCC 5 (correctly fails both test cases) and
GCC 6 (emits warning instead of error): https://godbolt.org/z/1o81h1

Looks like the change was brought by this commit:
https://gcc.gnu.org/git/?p=gcc.git&a=commit;h=e7838ec9d2ea06e844ef23660862781b81a26329
from this bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54835

I'm suspicious that the code says "When converting from an init list we
consider explicit constructors, but actually trying to call one is an error.",
but then proceeds to call `pedwarn` instead of `error` in some cases.

[Bug c++/51242] [C++11] Unable to use strongly typed enums as bit fields

2020-11-08 Thread barry.revzin at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51242

Barry Revzin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |barry.revzin at gmail dot com

--- Comment #31 from Barry Revzin  ---
Apparently this was fixed in 9.3?

enum class Color { Red, Green, Blue };

struct X {
    Color c : 2;
};

auto x = X{.c=Color::Red};

warns on 9.2, but not anymore on 9.3 or 10.

[Bug c++/97755] Explicit default constructor is called during copy-list-initialization with a warning only

2020-11-08 Thread harald at gigawatt dot nl via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97755

Harald van Dijk  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |harald at gigawatt dot nl

--- Comment #1 from Harald van Dijk  ---
This may be in order to ensure that the following valid C++03 code is accepted
in C++11 mode as well, to limit the impact when the default language version
was changed:

  struct A {
    explicit A(int = 24);
  };
  int main() {
    A a[1] = {};
  }

This did not get diagnosed in GCC 5 in any mode. GCC 6 accepts it without a
warning in C++03 mode, and accepts it with a warning in C++11 mode.

[Bug rtl-optimization/97459] __uint128_t remainder for division by 3

2020-11-08 Thread tkoenig at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459

--- Comment #16 from Thomas Koenig  ---
(In reply to Jakub Jelinek from comment #15)
> I plan to work on this early in stage3.
> And we really shouldn't use any tables, GCC should figure it all out.
> So, for double-word modulo by constant that would be expanded using a
> libcall, go for x from the word bitsize to double-word bitsize and check if
> (1max << x) % cst
> is 1

It's probably better to search from high to low, to reduce the number
of necessary shifts for division by constants like 9 or 13.

> (and prefer what we've agreed on for 3), and fall back to
> multiplications (see #c8) if there aren't any other options and the costs
> don't say it is too costly.

I think for variants where the constants aren't power of two,

#define ONE ((__uint128_t) 1)
#define TWO_64 (ONE << 64)
#define MASK60 ((1ul << 60) - 1)

void
div_rem_13 (mytype n, mytype *div, unsigned int *rem)
{
  const mytype magic = TWO_64 * 14189803133622732012u
    + 5675921253449092805u * ONE; /* 0xC4EC4EC4EC4EC4EC4EC4EC4EC4EC4EC5 */
  __uint64_t a, b, c;
  unsigned int r;

  a = n & MASK60;
  b = (n >> 60);
  b = b & MASK60;
  c = (n >> 120);
  r = (a+b+c) % 13;
  n = n - r;
  *div = n * magic;
  *rem = r;
}

should be pretty efficient; there is only one shift which spans two
words.  (The assembly generated from the function looks weird
because of quite a few move instructions, but that should not be
an issue for code generated inline).

Regarding the approach in comment #8, I think I'll run some benchmarks
to see how well that works for other constants which don't fit
the pattern of being divisors for 2^n-1.
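
The approach can be cross-checked against the built-in / and % operators.
The sketch below restates the function in constexpr form so that the check
can run at compile time; it is not meant as a replacement for the code
above. The names rem13, div13 and agrees_with_builtin are made up for this
illustration, and the magic constant is written in hex instead of decimal.

```cpp
typedef unsigned __int128 u128;

constexpr u128 MASK60 = ((u128)1 << 60) - 1;
// Inverse of 13 modulo 2^128, the same value as "magic" above.
constexpr u128 MAGIC13 = ((u128)0xC4EC4EC4EC4EC4ECull << 64)
                         | 0x4EC4EC4EC4EC4EC5ull;

// 2^60 == 1 (mod 13), so n is congruent to the sum of its 60-bit chunks.
constexpr unsigned rem13(u128 n)
{
  unsigned long long a = (unsigned long long)(n & MASK60);
  unsigned long long b = (unsigned long long)((n >> 60) & MASK60);
  unsigned long long c = (unsigned long long)(n >> 120);
  return (unsigned)((a + b + c) % 13);   // the sum stays below 2^62
}

// n - rem13(n) is an exact multiple of 13; multiplying by the inverse
// modulo 2^128 yields the exact quotient.
constexpr u128 div13(u128 n) { return (n - rem13(n)) * MAGIC13; }

constexpr bool agrees_with_builtin(u128 n)
{
  return div13(n) == n / 13 && rem13(n) == n % 13;
}
```

The check holds for small values as well as for the largest 128-bit value,
where all three chunks are at their maximum.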

[Bug rtl-optimization/97459] __uint128_t remainder for division by 3

2020-11-08 Thread tkoenig at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459

--- Comment #17 from Thomas Koenig  ---

To be compilable, my previous code lacks

typedef __uint128_t mytype;

> #define ONE ((__uint128_t) 1)

[Bug rtl-optimization/97756] New: Inefficient handling of 128-bit arguments

2020-11-08 Thread tkoenig at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97756

            Bug ID: 97756
           Summary: Inefficient handling of 128-bit arguments
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

This is an offshoot from PR 97459.

The code

#define ONE ((__uint128_t) 1)
#define TWO_64 (ONE << 64)
#define MASK60 ((1ul << 60) - 1)

typedef __uint128_t mytype;

void
div_rem_13_v2 (mytype n, mytype *div, unsigned int *rem)
{
  const mytype magic = TWO_64 * 14189803133622732012u
    + 5675921253449092805u * ONE;
  unsigned long a, b, c;
  unsigned int r;

  a = n & MASK60;
  b = (n >> 60);
  b = b & MASK60;
  c = (n >> 120);
  r = (a+b+c) % 13;
  n = n - r;
  *div = n * magic;
  *rem = r;
}

when compiled on x86_64 on Zen with -O3 -march=native shows quite a bit of
register shuffling at the beginning:

   0:   49 89 f0                mov    %rsi,%r8
   3:   48 89 fe                mov    %rdi,%rsi
   6:   49 89 d1                mov    %rdx,%r9
   9:   48 ba ff ff ff ff ff    movabs $0xfffffffffffffff,%rdx
  10:   ff ff 0f
  13:   4c 89 c7                mov    %r8,%rdi
  16:   48 89 f0                mov    %rsi,%rax
  19:   49 89 c8                mov    %rcx,%r8
  1c:   48 89 f1                mov    %rsi,%rcx
  1f:   49 89 fa                mov    %rdi,%r10
  22:   48 0f ac f8 3c          shrd   $0x3c,%rdi,%rax
  27:   48 21 d1                and    %rdx,%rcx
  2a:   41 56                   push   %r14
  2c:   49 c1 ea 38             shr    $0x38,%r10
  30:   48 21 d0                and    %rdx,%rax
  33:   53                      push   %rbx
  34:   48 bb c5 4e ec c4 4e    movabs $0x4ec4ec4ec4ec4ec5,%rbx
  3b:   ec c4 4e
  3e:   4c 01 d1                add    %r10,%rcx
  41:   45 31 db                xor    %r11d,%r11d
  44:   48 01 c1                add    %rax,%rcx
  47:   48 89 c8                mov    %rcx,%rax
  4a:   48 f7 e3                mul    %rbx
  4d:   48 c1 ea 02             shr    $0x2,%rdx
  51:   48 8d 04 52             lea    (%rdx,%rdx,2),%rax
  55:   48 8d 04 82             lea    (%rdx,%rax,4),%rax
  59:   48 89 ca                mov    %rcx,%rdx
  5c:   48 b9 ec c4 4e ec c4    movabs $0xc4ec4ec4ec4ec4ec,%rcx
  63:   4e ec c4
  66:   48 29 c2                sub    %rax,%rdx
  69:   48 29 d6                sub    %rdx,%rsi
  6c:   49 89 d6                mov    %rdx,%r14
  6f:   4c 19 df                sbb    %r11,%rdi
  72:   48 0f af ce             imul   %rsi,%rcx
  76:   48 89 f2                mov    %rsi,%rdx
  79:   48 89 f8                mov    %rdi,%rax
  7c:   c4 e2 cb f6 fb          mulx   %rbx,%rsi,%rdi
  81:   48 0f af c3             imul   %rbx,%rax
  85:   49 89 31                mov    %rsi,(%r9)
  88:   48 01 c8                add    %rcx,%rax
  8b:   48 01 c7                add    %rax,%rdi
  8e:   49 89 79 08             mov    %rdi,0x8(%r9)
  92:   45 89 30                mov    %r14d,(%r8)
  95:   5b                      pop    %rbx
  96:   41 5e                   pop    %r14
  98:   c3                      retq

[Bug tree-optimization/97757] New: [11 Regression] fortran save_6.f90 fails with a segv for -flto -O >= 2

2020-11-08 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97757

            Bug ID: 97757
           Summary: [11 Regression] fortran save_6.f90 fails with a segv
                    for -flto -O >= 2
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iains at gcc dot gnu.org
  Target Milestone: ---

Most likely introduced in the range r11-4777 to r11-4781.

It doesn't seem to reproduce on Linux - but it shows on a stage#1 built with
debug - so probably will show on a darwin cross.

gcc/f951 /src-local/gcc-master/gcc/testsuite/gfortran.dg/save_6.f90 -fPIC
-quiet -dumpdir a- -dumpbase save_6.f90 -dumpbase-ext .f90
-mmacosx-version-min=10.12.0 -mtune=core2 -O2 -version -fno-automatic -flto
-fintrinsic-modules-path finclude -o a-save_6.s

looks like a GGC issue

GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
during IPA pass: modref
/src-local/gcc-master/gcc/testsuite/gfortran.dg/save_6.f90:54:3: internal
compiler error: Segmentation fault: 11
   54 | end
  |   ^
0x1017d33c5 crash_signal
        /src-local/gcc-master/gcc/toplev.c:330
0x1012e52d6 modref_tree::merge(modref_tree*, vec*)
        /src-local/gcc-master/gcc/ipa-modref-tree.h:420
0x1012e35b9 modref_propagate_in_scc
        /src-local/gcc-master/gcc/ipa-modref.c:2440
0x1012e3ac9 execute
        /src-local/gcc-master/gcc/ipa-modref.c:2549

=

Process 49712 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS
(code=EXC_I386_GPFLT)
frame #0: 0x0001012e52d6
f951`modref_tree::merge(this=0xa5a5a5a5a5a5a5a5, other=0x0001469008c0,
parm_map=0x7fff5fbff330) at ipa-modref-tree.h:420
   417   Return true if something has changed.  */
   418bool merge (modref_tree  *other, vec  *parm_map)
   419{
-> 420  if (!other || every_base)
   421return false;
   422  if (other->every_base)
   423{
Target 0: (f951) stopped.

[Bug tree-optimization/97757] [11 Regression] fortran save_6.f90 fails with a segv for -flto -O >= 2

2020-11-08 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97757

Iain Sandoe  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2020-11-08
             Status|UNCONFIRMED                 |NEW
                 CC|                            |hubicka at gcc dot gnu.org
             Target|                            |*-*-darwin*
           Keywords|                            |ice-on-valid-code

[Bug libstdc++/97758] New: bits/std_function.h: error: unknown type name 'type_info' when using -fno-exceptions -fno-rtti

2020-11-08 Thread romain.geissler at amadeus dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97758

            Bug ID: 97758
           Summary: bits/std_function.h: error: unknown type name
                    'type_info' when using -fno-exceptions -fno-rtti
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: romain.geissler at amadeus dot com
  Target Milestone: ---

Hi,

I am using the trunk from today (8th November, git revision
b642fca1c31b2e2175e0860daf32b4ee0d918085).

When trying to build clang with it I end up with this error (on Linux x86_64):

FAILED: lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/ParallelCG.cpp.o 
/workdir/build/final-system/llvm-build/./bin/clang++  -DGTEST_HAS_RTTI=0
-D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS
-D__STDC_LIMIT_MACROS -Ilib/CodeGen -I/workdir/src/llvm-12.0.0/llvm/lib/CodeGen
-Iinclude -I/workdir/src/llvm-12.0.0/llvm/include -isystem
/workdir/build/final-system/llvm-temporary-static-dependencies/install/include
-O2
-I/workdir/build/final-system/llvm-temporary-static-dependencies/install/include
-I/workdir/build/final-system/llvm-temporary-static-dependencies/install/include/ncursesw
-fPIC -fvisibility-inlines-hidden -Werror=date-time
-Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter
-Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic
-Wno-long-long -Wimplicit-fallthrough -Wcovered-switch-default
-Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor
-Wsuggest-override -Wstring-conversion -fdiagnostics-color -ffunction-sections
-fdata-sections
-fprofile-instr-generate="/workdir/build/final-system/llvm-build/tools/clang/stage2-instrumented-bins/profiles/%4m.profraw"
-flto -O3 -DNDEBUG -fno-exceptions -fno-rtti -std=c++14 -MD -MT
lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/ParallelCG.cpp.o -MF
lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/ParallelCG.cpp.o.d -o
lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/ParallelCG.cpp.o -c
/workdir/src/llvm-12.0.0/llvm/lib/CodeGen/ParallelCG.cpp
In file included from
/workdir/src/llvm-12.0.0/llvm/lib/CodeGen/ParallelCG.cpp:13:
In file included from
/workdir/src/llvm-12.0.0/llvm/include/llvm/CodeGen/ParallelCG.h:17:
In file included from
/opt/1A/toolchain/x86_64-v21.0.10/lib64/gcc/x86_64-1a-linux-gnu/11.0.0/../../../../include/c++/11.0.0/functional:59:
/opt/1A/toolchain/x86_64-v21.0.10/lib64/gcc/x86_64-1a-linux-gnu/11.0.0/../../../../include/c++/11.0.0/bits/std_function.h:190:31:
error: unknown type name 'type_info'
  __dest._M_access() = nullptr;
 ^
1 error generated.

Note that these llvm files are apparently compiled with -fno-exceptions
-fno-rtti, so this seems to be triggered by the recent changes to
std::function for builds without RTTI support.

Cheers,
Romain

[Bug libstdc++/97759] New: Could std::has_single_bit implementation be faster?

2020-11-08 Thread gcc-bugs at marehr dot dialup.fu-berlin.de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97759

            Bug ID: 97759
           Summary: Could std::has_single_bit implementation be faster?
           Product: gcc
           Version: 10.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gcc-bugs at marehr dot dialup.fu-berlin.de
  Target Milestone: ---

Hello gcc-team,

we are thrilled that C++20 offers efficient bit-manipulation functions and
that we could replace some of our own implementations with the standardized
ones, making the code more accessible.

I replaced our implementation and noticed that `std::has_single_bit` was slower
than what we had before by around 30%. (The other functions matched our
timings.)

Additionally, we have a (micro-)benchmark that compares the standard arithmetic
bit trick
(https://graphics.stanford.edu/~seander/bithacks.html#DetermineIfPowerOf2) with
the implementation where popcount == 1. We decided to use the arithmetic
version, because we measured that it was faster than popcount on our machines
(mostly Intel processors).

Interestingly, it seems that the popcount benchmark matches the
std::has_single_bit time-wise, so I guess that std::has_single_bit is
implemented via popcount.

Those timings can be reproduced on an unknown machine via quick-bench:
https://quick-bench.com/q/Y28keu_mSh25WwhO05T4SKrbHpk

I don't know how to fix this, but I would expect the optimizer to recognize
popcount == 1 and to know that there is a more efficient version. Or could the
implementation be changed to the arithmetic version, where again the optimizer
could decide to replace it by a popcount if that is more efficient on some
architecture?
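
For reference, the two variants being compared can be sketched as below.
This is an assumption-laden illustration, not the libstdc++ source: the
function names are made up, only unsigned long long is covered, and
__builtin_popcountll is a GCC/Clang builtin.

```cpp
// Arithmetic bit trick: a nonzero x has exactly one bit set when clearing
// its lowest set bit (x & (x - 1)) leaves nothing behind.
constexpr bool single_bit_arith(unsigned long long x)
{
  return x != 0 && (x & (x - 1)) == 0;
}

// Popcount form: exactly one bit set means the population count is 1.
constexpr bool single_bit_popcount(unsigned long long x)
{
  return __builtin_popcountll(x) == 1;
}

// Hypothetical helper: true if both forms agree for all values 0..limit.
constexpr bool forms_agree_up_to(unsigned long long limit)
{
  for (unsigned long long x = 0; x <= limit; ++x)
    if (single_bit_arith(x) != single_bit_popcount(x))
      return false;
  return true;
}
```

Both forms return the same result for every input, so which one is faster
depends only on the code generated for the target.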

Thank you!

[Bug tree-optimization/97760] New: GCC outputs wrong values when compiling the testcase with -O3

2020-11-08 Thread yangyang305 at huawei dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97760

            Bug ID: 97760
           Summary: GCC outputs wrong values when compiling the testcase
                    with -O3
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: yangyang305 at huawei dot com
  Target Milestone: ---

Hi, gcc-trunk outputs wrong values when compiling the attached testcase with
-O3.


gcc -O0 test.c -w && ./a.out

159,150,150
150

gcc -O3 test.c -w && ./a.out

159,123,123
123

GCC version: 11.0.0 20201106 (experimental)

[Bug tree-optimization/97760] GCC outputs wrong values when compiling the testcase with -O3

2020-11-08 Thread yangyang305 at huawei dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97760

--- Comment #1 from yangyang  ---
Created attachment 49521
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49521&action=edit
testcase

[Bug rtl-optimization/97705] [11 regression] cc.c-torture/unsorted/dump-noaddr.c.*r.ira fails after r11-4637

2020-11-08 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97705

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Kewen Lin :

https://gcc.gnu.org/g:ce4ae1f4893e322495c5d24b2f0e807a7f7cf92f

commit r11-4827-gce4ae1f4893e322495c5d24b2f0e807a7f7cf92f
Author: Kewen Lin 
Date:   Sun Nov 8 20:35:21 2020 -0600

ira: Recompute regstat as max_regno changes [PR97705]

As PR97705 shows, commit r11-4637 caused dump comparison differences in
the ira pass.  It exposed an issue with the newly introduced function
remove_scratches, which can increase the largest pseudo register number
when it succeeds.  Later code calls max_reg_num() to get the latest
max_regno and, when iterating over the register numbers, accesses data
structures that were allocated for the previous max_regno.  This can
cause out-of-bounds array accesses, and the failure can appear random
because the values beyond the array are arbitrary.

This patch frees, reinitializes and recomputes the relevant data
structures, namely regstat_n_sets_and_refs and reg_info_p, to ensure
we won't access beyond array bounds.

Bootstrapped/regtested on powerpc64le-linux-gnu P9 and
powerpc64-linux-gnu P8.

gcc/ChangeLog:

PR rtl-optimization/97705
* ira.c (ira): Refactor some regstat free/init/compute invocation
into lambda function regstat_recompute_for_max_regno, and call it
when max_regno increases as remove_scratches succeeds.

[Bug libstdc++/97759] Could std::has_single_bit be faster?

2020-11-08 Thread tkoenig at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97759

Thomas Koenig  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
           Severity|normal                      |enhancement
                 CC|                            |tkoenig at gcc dot gnu.org

--- Comment #1 from Thomas Koenig  ---
Could you post the benchmark and the exact architecture where the arithmetic
version is faster?

[Bug rtl-optimization/97705] [11 regression] cc.c-torture/unsorted/dump-noaddr.c.*r.ira fails after r11-4637

2020-11-08 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97705

Kewen Lin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #6 from Kewen Lin  ---
Should be fixed with latest trunk r11-4827.

[Bug rtl-optimization/97756] Inefficient handling of 128-bit arguments

2020-11-08 Thread tkoenig at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97756

--- Comment #1 from Thomas Koenig  ---
Actually, it was on a Ryzen 1700 (for the -march=native).

I'm at odds with architecture names...

[Bug c++/93008] Need a way to make inlining heuristics ignore whether a function is inline

2020-11-08 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93008

--- Comment #6 from Jan Hubicka  ---
I just noticed this PR and wonder whether there is anything to do on the
inliner side.  The inliner uses DECL_DECLARED_INLINE, which was invented to
distinguish implicit inlines from explicit ones.  So even if it is a bit
misnamed, it should mean "this is an inline hint for the inliner", and I guess
the front end needs to distinguish between constexpr and the normal places
where an inline hint still means "inline more"?

The inliner is really not at a level where it could completely ignore inline
hints without regressing various code.

I made inline weaker for -O2 in GCC 10, but for -O3 we still take it very
seriously and I do not see a way out of that: in many cases it is very hard to
predict how much optimization will happen after inlining, and a lot of code is
carefully crafted under the assumption that some specific inlining happens
(and a lot of such code is in C++).

[Bug tree-optimization/97761] New: [11 Regression] ICE in vectorizable_live_operation, at tree-vect-loop.c:8689

2020-11-08 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97761

            Bug ID: 97761
           Summary: [11 Regression] ICE in vectorizable_live_operation, at
                    tree-vect-loop.c:8689
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Keywords: ice-on-valid-code
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: asolokha at gmx dot com
  Target Milestone: ---
            Target: powerpc-*-linux-gnu

Created attachment 49522
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49522&action=edit
Testcase

gfortran-11.0.0-alpha20201108 snapshot
(g:b642fca1c31b2e2175e0860daf32b4ee0d918085) ICEs when compiling the attached
testcase w/ -mvsx -O1 -ftree-slp-vectorize -fvect-cost-model=unlimited:

% powerpc-e300c3-linux-gnu-gfortran-11.0.0 -mvsx -O1 -ftree-slp-vectorize
-fvect-cost-model=unlimited -c ar6dubil.f90
during GIMPLE pass: slp
ar6dubil.f90:11:15:

   11 |   subroutine ni (ps, bf)
  |   ^
internal compiler error: in vectorizable_live_operation, at
tree-vect-loop.c:8689
0x6f1f2c vectorizable_live_operation(vec_info*, _stmt_vec_info*,
gimple_stmt_iterator*, _slp_tree*, _slp_instance*, int, bool,
vec*)
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-loop.c:8689
0x10b8087 can_vectorize_live_stmts
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-stmts.c:10510
0x10df928 vect_transform_stmt(vec_info*, _stmt_vec_info*,
gimple_stmt_iterator*, _slp_tree*, _slp_instance*)
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-stmts.c:10894
0x726 vect_schedule_slp_node
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-slp.c:5437
0x111d0bc vect_schedule_scc
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-slp.c:5599
0x111ce2f vect_schedule_scc
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-slp.c:5580
0x111ce2f vect_schedule_scc
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-slp.c:5580
0x111ce2f vect_schedule_scc
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-slp.c:5580
0x111d40c vect_schedule_slp(vec_info*, vec<_slp_instance*, va_heap, vl_ptr>)
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-slp.c:5715
0x111ebba vect_slp_region
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-slp.c:4264
0x111ebba vect_slp_bbs
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-slp.c:4374
0x111fa9c vect_slp_function(function*)
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-slp.c:4460
0x112208b execute
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vectorizer.c:1437

[Bug lto/80379] Redundant note: code may be misoptimized unless -fno-strict-aliasing is used

2020-11-08 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80379

--- Comment #3 from Jan Hubicka  ---
The problem here is that the hint is output at decl merging, and
-fno-strict-aliasing is a function-local flag.  At that time we do not even
know which functions there will be, since the units are not streamed in yet.
This means that we do not know whether some unit has a function that is
compiled with -fno-strict-aliasing.  So suppressing the warning does not fit
the implementation very easily :(