[Bug d/113667] [14 Regression] libgphobos symbols missing

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113667

Richard Biener  changed:

   What|Removed |Added

   Keywords||ABI
   Priority|P3  |P1
   Target Milestone|--- |14.0

[Bug go/113668] [14 Regression] libgo soname bump needed for the GCC 14 release?

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113668

Richard Biener  changed:

   What|Removed |Added

   Keywords||ABI
 CC||rguenth at gcc dot gnu.org
   Target Milestone|--- |14.0

[Bug middle-end/113669] -fsanitize=undefined failed to check a signed integer overflow

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113669

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2024-01-31
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #2 from Richard Biener  ---
So confirmed.

[Bug tree-optimization/113670] ICE with vectors in named registers and -fno-vect-cost-model

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113670

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2024-01-31
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #3 from Richard Biener  ---
I'll hunt it down.

[Bug tree-optimization/99395] s116 benchmark of TSVC is vectorized by clang and not by gcc

2024-01-31 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395

--- Comment #14 from JuzheZhong  ---
Thanks Richard.

It seems that we can't fix this issue for now. Is that right?

If I understand correctly, do you mean we should wait until the SLP
representations are finished and then revisit this PR?

[Bug tree-optimization/99395] s116 benchmark of TSVC is vectorized by clang and not by gcc

2024-01-31 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395

--- Comment #15 from rguenther at suse dot de  ---
On Wed, 31 Jan 2024, juzhe.zhong at rivai dot ai wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395
> 
> --- Comment #14 from JuzheZhong  ---
> Thanks Richard.
> 
> It seems that we can't fix this issue for now. Is that right ?
> 
> If I understand correctly, do you mean we should wait after SLP 
> representations
> are finished and then revisit this PR?

Yes.

[Bug regression/113672] [14 Regression] FAIL: g++.dg/pch/line-map-3.C -g -I. -Dwith_PCH (test for excess errors)

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113672

Richard Biener  changed:

   What|Removed |Added

   Keywords||testsuite-fail
   Target Milestone|--- |14.0

[Bug tree-optimization/113673] [12/13/14 Regression] ICE: verify_flow_info failed: BB 5 cannot throw but has an EH edge with -Os -finstrument-functions -fnon-call-exceptions -ftrapv

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113673

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2

--- Comment #2 from Richard Biener  ---
Looks like an issue in bswap with regard to EH.

[Bug c++/113674] [11/12/13/14 Regression] [[____attr____]] causes internal compiler error: in decl_attributes, at attribs.cc:776

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113674

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-01-31

[Bug tree-optimization/113676] [12 Regression] Miscompilation tree-vrp __builtin_unreachable

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113676

Richard Biener  changed:

   What|Removed |Added

 Target||x86_64-*-*
Summary|[11/12 Regression]  |[12 Regression]
   |Miscompilation tree-vrp |Miscompilation tree-vrp
   |__builtin_unreachable   |__builtin_unreachable

--- Comment #1 from Richard Biener  ---
Needs -std=c++20.  I can't reproduce locally.

[Bug tree-optimization/99395] s116 benchmark of TSVC is vectorized by clang and not by gcc

2024-01-31 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395

--- Comment #16 from JuzheZhong  ---
(In reply to rguent...@suse.de from comment #15)
> On Wed, 31 Jan 2024, juzhe.zhong at rivai dot ai wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395
> > 
> > --- Comment #14 from JuzheZhong  ---
> > Thanks Richard.
> > 
> > It seems that we can't fix this issue for now. Is that right ?
> > 
> > If I understand correctly, do you mean we should wait after SLP 
> > representations
> > are finished and then revisit this PR?
> 
> Yes.

That seems to be a big refactoring effort.

I wonder if there is anything I can do to help with the SLP representations?

[Bug tree-optimization/113677] Missing `VEC_PERM_EXPR <{a, CST}, CST, {0, 1, 2, ...}>` optimization

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113677

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2024-01-31
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #3 from Richard Biener  ---
Yeah, most of the code in forwprop/match doesn't deal with the "new" permutes
where the result isn't the same length as the inputs.

[Bug target/113607] [14] RISC-V rv64gcv vector: Runtime mismatch at -O3

2024-01-31 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113607

--- Comment #23 from Robin Dapp  ---
> this is:
> 
> _429 = mask_patt_205.47_276[i] ? vect_cst__262[i] : (vect_cst__262 <<
> {0,..})[i];
> vect_iftmp.55_287 = mask_patt_209.54_286[i] ? _429 [i] : vect_cst__262[i]

But isn't it rather
_429 = mask_patt_205.47_276[i] ? (vect_cst__262[i] << vect_cst__262[i]) :
{0,..})[i]?

The else should be the last operand, shouldn't it?

On aarch64 we don't seem to emit a COND_SHL therefore this particular situation
does not occur.

However the simplification was introduced for aarch64:

(for cond_op (COND_BINARY)
 (simplify
  (vec_cond @0
   (cond_op:s @1 @2 @3 @4) @3)
  (cond_op (bit_and @1 @0) @2 @3 @4)))

It is supposed to simplify (in gcc.target/aarch64/sve/pre_cond_share_1.c)

  _256 = .COND_MUL (mask__108.48_193, vect_iftmp.45_187, vect_cst__190, { 0.0, ... });
  vect_prephitmp_151.50_197 = VEC_COND_EXPR ;

into COND_MUL (mask108 & mask101, vect_iftmp.45_187, vect_cst__190, { 0.0, ... });

But that doesn't look valid to me either.  No matter what _256 is, the result
for !mask101 should be vect_cst__190 and not 0.0.
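
The objection can be checked with a scalar, per-lane model. This is only a sketch under the reading given in this comment, namely that a conditional operation yields its last operand (the else value) when its mask is false. The helper names cond_mul, before and after are hypothetical and do not exist in GCC.

```cpp
// Per-lane model of .COND_MUL (mask, a, b, els): a * b where the mask is
// set, otherwise the else value.  Hypothetical helper, not a GCC function.
double cond_mul(bool mask, double a, double b, double els)
{
  return mask ? a * b : els;
}

// Original form: (vec_cond @0 (cond_op @1 @2 @3 @4) @3)
double before(bool m0, bool m1, double a, double b, double els)
{
  return m0 ? cond_mul(m1, a, b, els) : b;
}

// Simplified form: (cond_op (bit_and @1 @0) @2 @3 @4)
double after(bool m0, bool m1, double a, double b, double els)
{
  return cond_mul(m1 && m0, a, b, els);
}
```

In this model the two forms agree whenever m0 is set. When m0 is clear, before returns b (the @3 operand) while after returns els (the @4 operand), so they differ unless those two values happen to be equal.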

[Bug tree-optimization/113678] SLP misses up vec_concat

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113678

Richard Biener  changed:

   What|Removed |Added

   Last reconfirmed||2024-01-31
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
I think the SLP tree we discover is sound:

t2.c:11:14: note:   node 0x5db76f0 (max_nunits=8, refcnt=2) vector(8) char
t2.c:11:14: note:   op template: *a_7(D) = _1;
t2.c:11:14: note:   stmt 0 *a_7(D) = _1;
t2.c:11:14: note:   stmt 1 MEM[(char *)a_7(D) + 1B] = _2;
t2.c:11:14: note:   stmt 2 MEM[(char *)a_7(D) + 2B] = _3;
t2.c:11:14: note:   stmt 3 MEM[(char *)a_7(D) + 3B] = _4;
t2.c:11:14: note:   stmt 4 MEM[(char *)a_7(D) + 4B] = _1;
t2.c:11:14: note:   stmt 5 MEM[(char *)a_7(D) + 5B] = _2;
t2.c:11:14: note:   stmt 6 MEM[(char *)a_7(D) + 6B] = _3;
t2.c:11:14: note:   stmt 7 MEM[(char *)a_7(D) + 7B] = _4;
t2.c:11:14: note:   children 0x5db7778
t2.c:11:14: note:   node 0x5db7778 (max_nunits=8, refcnt=2) vector(8) char
t2.c:11:14: note:   op template: _1 = *b_6(D);
t2.c:11:14: note:   stmt 0 _1 = *b_6(D);
t2.c:11:14: note:   stmt 1 _2 = MEM[(char *)b_6(D) + 1B];
t2.c:11:14: note:   stmt 2 _3 = MEM[(char *)b_6(D) + 2B];
t2.c:11:14: note:   stmt 3 _4 = MEM[(char *)b_6(D) + 3B];
t2.c:11:14: note:   stmt 4 _1 = *b_6(D);
t2.c:11:14: note:   stmt 5 _2 = MEM[(char *)b_6(D) + 1B];
t2.c:11:14: note:   stmt 6 _3 = MEM[(char *)b_6(D) + 2B];
t2.c:11:14: note:   stmt 7 _4 = MEM[(char *)b_6(D) + 3B];
t2.c:11:14: note:   load permutation { 0 1 2 3 0 1 2 3 }

the issue is as so often

t2.c:11:14: note:   ==> examining statement: _1 = *b_6(D);
t2.c:11:14: missed:   BB vectorization with gaps at the end of a load is not supported
t2.c:3:19: missed:   not vectorized: relevant stmt not supported: _1 = *b_6(D);
t2.c:11:14: note:   Building vector operands of 0x5db7778 from scalars instead

where we are not applying much non-ad-hoc work to deal with those
"out-of-bound" accesses.  The choice here would be obvious in doing
a single vector(4) load instead.
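
As a rough scalar illustration of that suggestion (not vectorizer code), the load permutation { 0 1 2 3 0 1 2 3 } can be applied to the result of one 4-lane load, so nothing past b[3] is ever read. The function name load_with_permutation is made up for this sketch.

```cpp
#include <array>
#include <cstddef>

// Builds the eight stored lanes from a single four-element load.
std::array<char, 8> load_with_permutation(const char *b)
{
  // Load permutation taken from the SLP dump.
  static const std::size_t perm[8] = {0, 1, 2, 3, 0, 1, 2, 3};

  // One 4-lane load: only b[0]..b[3] are read, so there is no gap at the end.
  std::array<char, 4> loaded{};
  for (std::size_t i = 0; i < 4; ++i)
    loaded[i] = b[i];

  // Permute the loaded lanes into the 8-lane store operand.
  std::array<char, 8> result{};
  for (std::size_t i = 0; i < 8; ++i)
    result[i] = loaded[perm[i]];
  return result;
}
```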

[Bug c/113679] New: long long minus double with gcc -m32 produces different results than other compilers or gcc -m64

2024-01-31 Thread dilyan.palauzov at aegee dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113679

Bug ID: 113679
   Summary: long long minus double with gcc -m32 produces
different results than other compilers or gcc -m64
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dilyan.palauzov at aegee dot org
  Target Milestone: ---

diff.c is:

#include <stdio.h>
int main(void) {
  long long l = 9223372036854775806;
  double d = 9223372036854775808.0;
  printf("%f\n", (double)l - d);
  return 0;
}


With gcc (GCC) 13.2.1 20231205 (Red Hat 13.2.1-6), gcc (Ubuntu
9.4.0-1ubuntu1~20.04.2) 9.4.0, clang 16.0.4 and clang 17.0.5:

$ gcc -m64 -o diff diff.c && ./diff
0.00
$ gcc -m32 -o diff diff.c && ./diff
-2.00
$ clang -m64 -o diff diff.c && ./diff
0.00
$ clang -m32 -o diff diff.c && ./diff
0.00

With cl.exe 19.29.30153 (the first run is x86, 32-bit; the second is 64-bit):
C:\> CALL "C:\Program Files (x86)\Microsoft Visual
Studio\2019\Community\VC\Auxiliary\Build\vcvarsall.bat" x86 10.0.17763.0
C:\> cl diff.c >nul 2>nul & .\diff.exe
0.00

C:\> CALL "C:\Program Files (x86)\Microsoft Visual
Studio\2019\Community\VC\Auxiliary\Build\vcvarsall.bat" amd64 10.0.17763.0
C:\> cl diff.c >nul 2>nul & .\diff.exe
0.00

gcc -m32 produces a different result, compared to gcc -m64, clang 17 (32 and
64bit), and MSVC Visual Studio 2019 (32 and 64bit).

[Bug target/113679] long long minus double with gcc -m32 produces different results than other compilers or gcc -m64

2024-01-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113679

Andrew Pinski  changed:

   What|Removed |Added

  Component|c   |target

--- Comment #1 from Andrew Pinski  ---
I suspect the issue is excessive precision with x87 fp.

[Bug target/113679] long long minus double with gcc -m32 produces different results than other compilers or gcc -m64

2024-01-31 Thread dilyan.palauzov at aegee dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113679

--- Comment #2 from Дилян Палаузов  ---
This happens only without optimizations:

$  gcc -O0 -m32 -o diff diff.c && ./diff
-2.00
$  gcc -O1 -m32 -o diff diff.c && ./diff
0.00
$  gcc -O2 -m32 -o diff diff.c && ./diff
0.00
$  gcc -O3 -m32 -o diff diff.c && ./diff
0.00

[Bug target/113679] long long minus double with gcc -m32 produces different results than other compilers or gcc -m64

2024-01-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113679

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #3 from Andrew Pinski  ---

[apinski@xeond2 gcc]$ ~/upstream-gcc/bin/gcc -m32  tr56.c
[apinski@xeond2 gcc]$ ./a.out
-2.00
[apinski@xeond2 gcc]$ ~/upstream-gcc/bin/gcc -m32  tr56.c -fexcess-precision=standard
[apinski@xeond2 gcc]$ ./a.out
0.00
[apinski@xeond2 gcc]$ ~/upstream-gcc/bin/gcc -m32  tr56.c -msse2 -mfpmath=sse
[apinski@xeond2 gcc]$ ./a.out
0.00


Yes, it is due to the excess precision of x87. Use either
`-fexcess-precision=standard` or `-msse2 -mfpmath=sse` if you don't want to use
the excess precision of the x87 FP.

*** This bug has been marked as a duplicate of bug 323 ***

[Bug middle-end/323] optimized code gives strange floating point results

2024-01-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=323

Andrew Pinski  changed:

   What|Removed |Added

 CC||dilyan.palauzov at aegee dot org

--- Comment #231 from Andrew Pinski  ---
*** Bug 113679 has been marked as a duplicate of this bug. ***

[Bug target/113679] long long minus double with gcc -m32 produces different results than other compilers or gcc -m64

2024-01-31 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113679

--- Comment #4 from Jakub Jelinek  ---
Yeah, it is, that is how excess precision behaves.
Due to the cast applying just to l rather than l - d it returns 0.0 with
-fexcess-precision=standard, but if you change it to (double)(l - d) then it
will return -2.0 at all optimization levels with -fexcess-precision=standard.
-fexcess-precision=fast behaves depending on what instructions are actually
used and where the conversions to float or double happen due to storing of
expressions or subexpressions into memory, as documented.
If you don't like excess precision and have SSE2, you can use -msse2
-mfpmath=sse.
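
The arithmetic behind the two results can be sketched as follows. This is an illustration rather than a model of GCC's code generation: long double stands in for the x87 extended format, which only holds where long double has at least a 64-bit significand (as on x86 GNU/Linux), and the helper names are made up here.

```cpp
#include <limits>

// Subtraction on values already rounded to IEEE double.  The value
// 2^63 - 2 is not representable in double and rounds to 2^63, so the
// difference from 2^63 is 0.
double diff_in_double(long long l, double d)
{
  // volatile forces the converted value to be rounded to double precision
  volatile double as_double = static_cast<double>(l);
  return as_double - d;
}

// Subtraction in long double, standing in for x87 excess precision.
// With a 64-bit significand 2^63 - 2 is exact, so the difference is -2.
long double diff_in_long_double(long long l, double d)
{
  return static_cast<long double>(l) - static_cast<long double>(d);
}

// True when long double is wide enough to hold every long long exactly.
bool long_double_is_wider()
{
  return std::numeric_limits<long double>::digits >= 64;
}
```

Calling both helpers with 9223372036854775806 and 9223372036854775808.0 gives 0 from the first and, where long_double_is_wider() returns true, -2 from the second, matching the two outputs reported in this bug.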

[Bug tree-optimization/99395] s116 benchmark of TSVC is vectorized by clang and not by gcc

2024-01-31 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395

--- Comment #17 from rguenther at suse dot de  ---
On Wed, 31 Jan 2024, juzhe.zhong at rivai dot ai wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395
> 
> --- Comment #16 from JuzheZhong  ---
> (In reply to rguent...@suse.de from comment #15)
> > On Wed, 31 Jan 2024, juzhe.zhong at rivai dot ai wrote:
> > 
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395
> > > 
> > > --- Comment #14 from JuzheZhong  ---
> > > Thanks Richard.
> > > 
> > > It seems that we can't fix this issue for now. Is that right ?
> > > 
> > > If I understand correctly, do you mean we should wait after SLP 
> > > representations
> > > are finished and then revisit this PR?
> > 
> > Yes.
> 
> It seems to be a big refactor work.

It's not too bad if people wouldn't continue to add features not 
implementing SLP ...

> I wonder I can do anything to help with SLP representations ?

I hope to get back to this before stage1 re-opens and will post
another request for testing.  It's really mostly going to be making
sure all paths have coverage which means testing all the various
architectures - I can only easily test x86.  There's a branch
I worked on last year, refs/users/rguenth/heads/vect-force-slp,
which I use to hunt down cases not supporting SLP (it's a bit
overeager to trigger, and it has known holes so it's not really
a good starting point yet for folks to try other archs).

[Bug tree-optimization/113670] ICE with vectors in named registers and -fno-vect-cost-model

2024-01-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113670

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:924137b9012cee5603482242de08fbf0b2030f6a

commit r14-8645-g924137b9012cee5603482242de08fbf0b2030f6a
Author: Richard Biener 
Date:   Wed Jan 31 09:09:50 2024 +0100

tree-optimization/113670 - gather/scatter to/from hard registers

The following makes sure we're not taking the address of hard
registers when vectorizing apparent gathers or scatters to/from
them.

PR tree-optimization/113670
* tree-vect-data-refs.cc (vect_check_gather_scatter):
Make sure we can take the address of the reference base.

* gcc.target/i386/pr113670.c: New testcase.

[Bug tree-optimization/113676] [12 Regression] Miscompilation tree-vrp __builtin_unreachable

2024-01-31 Thread magnus.hegdahl at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113676

--- Comment #2 from Magnus Hokland Hegdahl  ---
Hi, here's a version that doesn't need -std=c++20 or argv:

https://godbolt.org/z/Y9ooY998e

#include <vector>

constexpr auto bit_ceil(unsigned x) -> unsigned {
    if (x <= 1) return 1U;
    int w = 32 - __builtin_clz(x - 1);
    return 1U << w;
}

int main(int argc, char **) {
    auto rounded_n = bit_ceil(static_cast<unsigned>(argc + 1));
    auto a = std::vector<int>(2UL * rounded_n);

    for (std::size_t i = rounded_n; i-- > 1;) {
        if (!(0 < i && i < rounded_n)) __builtin_unreachable();
        a[i] = 0;
    }
}

Exact compile command used with g++-12 (GCC) 12.3.0 on arch linux, x86_64:
g++-12 -O1 -ftree-vrp main.cpp
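
For reference, the condition guarding __builtin_unreachable() is true on every iteration of that loop, so the hint should never be reached. The sketch below checks this with ordinary code. It assumes a 32-bit unsigned and small inputs, relies on the GCC/Clang builtin __builtin_clz, and loop_condition_always_holds is a name invented for this illustration.

```cpp
#include <cstddef>

// Same rounding helper as in the reproducer.
unsigned bit_ceil(unsigned x)
{
  if (x <= 1)
    return 1U;
  int w = 32 - __builtin_clz(x - 1);
  return 1U << w;
}

// Walks the reproducer's loop and reports whether 0 < i < rounded_n
// held in every iteration.
bool loop_condition_always_holds(unsigned rounded_n)
{
  for (std::size_t i = rounded_n; i-- > 1;)
    if (!(0 < i && i < rounded_n))
      return false;
  return true;
}
```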

[Bug tree-optimization/113670] ICE with vectors in named registers and -fno-vect-cost-model

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113670

Richard Biener  changed:

   What|Removed |Added

  Known to fail|14.0|
   Target Milestone|--- |14.0
 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED
  Known to work||14.0

--- Comment #5 from Richard Biener  ---
Fixed for trunk.

[Bug tree-optimization/99395] s116 benchmark of TSVC is vectorized by clang and not by gcc

2024-01-31 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395

--- Comment #18 from JuzheZhong  ---
(In reply to rguent...@suse.de from comment #17)
> On Wed, 31 Jan 2024, juzhe.zhong at rivai dot ai wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395
> > 
> > --- Comment #16 from JuzheZhong  ---
> > (In reply to rguent...@suse.de from comment #15)
> > > On Wed, 31 Jan 2024, juzhe.zhong at rivai dot ai wrote:
> > > 
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395
> > > > 
> > > > --- Comment #14 from JuzheZhong  ---
> > > > Thanks Richard.
> > > > 
> > > > It seems that we can't fix this issue for now. Is that right ?
> > > > 
> > > > If I understand correctly, do you mean we should wait after SLP 
> > > > representations
> > > > are finished and then revisit this PR?
> > > 
> > > Yes.
> > 
> > It seems to be a big refactor work.
> 
> It's not too bad if people wouldn't continue to add features not 
> implementing SLP ...
> 
> > I wonder I can do anything to help with SLP representations ?
> 
> I hope to get back to this before stage1 re-opens and will post
> another request for testing.  It's really mostly going to be making
> sure all paths have coverage which means testing all the various
> architectures - I can only easily test x86.  There's a branch
> I worked on last year, refs/users/rguenth/heads/vect-force-slp,
> which I use to hunt down cases not supporting SLP (it's a bit
> overeager to trigger, and it has known holes so it's not really
> a good starting point yet for folks to try other archs).

OK. It seems that you are almost done with that, but it needs more testing on
various targets.

So, if I want to work on optimizing vectorization (starting with TSVC),
I should avoid touching the cases that fail to vectorize due to data
reference/dependence analysis (e.g. this PR's case, s116),

and avoid adding new features to the loop vectorizer, e.g. min/max reduction
with index (s315),

so as not to make your SLP refactoring work heavier.

Am I right?

[Bug target/113633] FAIL: gcc.dg/bf-ms-attrib.c execution test, wrong size for ms_struct

2024-01-31 Thread lh_mouse at 126 dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113633

LIU Hao  changed:

   What|Removed |Added

 CC||lh_mouse at 126 dot com

--- Comment #1 from LIU Hao  ---
My suggestion is that following what MSVC produces is the only way to go.

[Bug tree-optimization/113676] [12 Regression] Miscompilation tree-vrp __builtin_unreachable

2024-01-31 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113676

Jakub Jelinek  changed:

   What|Removed |Added

 CC||aldyh at gcc dot gnu.org,
   ||amacleod at redhat dot com,
   ||jakub at gcc dot gnu.org
   Keywords|needs-bisection |

--- Comment #3 from Jakub Jelinek  ---
Bisection with -O2 -ftree-vrp
#include <vector>

unsigned
bit_ceil (unsigned x)
{
  if (x <= 1)
    return 1U;
  int w = 32 - __builtin_clz (x - 1);
  return 1U << w;
}

int
main (int argc, char **)
{
  unsigned rounded_n = bit_ceil ((unsigned) (argc + 1));
  auto a = std::vector<int> (2UL * rounded_n);
  for (long unsigned int i = rounded_n; i-- > 1;)
    {
      if (!(0 < i && i < rounded_n))
        __builtin_unreachable ();
      a[i] = 0;
    }
}
shows this started with r12-155-gd8e1f1d24179690fd9c0f63c27b12e030010d9ea
and went away with r13-3596-ge7310e24b1c0ca67b1bb507c1330b2bf39e59e32
so nothing really backportable.

[Bug tree-optimization/113676] [12 Regression] Miscompilation tree-vrp __builtin_unreachable

2024-01-31 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113676

--- Comment #4 from Jakub Jelinek  ---
And with --param=vrp1-mode=vrp it segfaulted even with
r13-4276-gce917b0422c145779b83e005afd8433c0c86fb06 but the next revision
removed that parameter, so can't go further.

[Bug ipa/111444] [14 Regression] Wrong code at -O2/3/s on x86_64-gnu since r14-3226-gd073e2d75d9

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111444

--- Comment #8 from Richard Biener  ---
OK, so the issue is that we're recording the IPA result with the wrong VUSE
since we're calling vn_reference_lookup_2 with !data->last_vuse_ptr but
data->finish (vr->set, vr->base_set, v) inserts a hashtable entry with
data->last_vuse.  Note it's somewhat unexpected that vn_reference_lookup_2
performs hashtable insertion which is what causes the issue.  It's also
not as easy as using the updated vuse since if we're coming from translation
through a memcpy that would be wrong.  In fact we probably want to avoid
doing any insertion if there's something fishy going on (!data->last_vuse_ptr).

The best fix would likely be to pre-insert all the IPA-CP known constants
instead of trying to discover them "late".

I'm testing the easy fix for now.

[Bug tree-optimization/99395] s116 benchmark of TSVC is vectorized by clang and not by gcc

2024-01-31 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395

--- Comment #19 from rguenther at suse dot de  ---
On Wed, 31 Jan 2024, juzhe.zhong at rivai dot ai wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395
> 
> --- Comment #18 from JuzheZhong  ---
> (In reply to rguent...@suse.de from comment #17)
> > On Wed, 31 Jan 2024, juzhe.zhong at rivai dot ai wrote:
> > 
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395
> > > 
> > > --- Comment #16 from JuzheZhong  ---
> > > (In reply to rguent...@suse.de from comment #15)
> > > > On Wed, 31 Jan 2024, juzhe.zhong at rivai dot ai wrote:
> > > > 
> > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99395
> > > > > 
> > > > > --- Comment #14 from JuzheZhong  ---
> > > > > Thanks Richard.
> > > > > 
> > > > > It seems that we can't fix this issue for now. Is that right ?
> > > > > 
> > > > > If I understand correctly, do you mean we should wait after SLP 
> > > > > representations
> > > > > are finished and then revisit this PR?
> > > > 
> > > > Yes.
> > > 
> > > It seems to be a big refactor work.
> > 
> > It's not too bad if people wouldn't continue to add features not 
> > implementing SLP ...
> > 
> > > I wonder I can do anything to help with SLP representations ?
> > 
> > I hope to get back to this before stage1 re-opens and will post
> > another request for testing.  It's really mostly going to be making
> > sure all paths have coverage which means testing all the various
> > architectures - I can only easily test x86.  There's a branch
> > I worked on last year, refs/users/rguenth/heads/vect-force-slp,
> > which I use to hunt down cases not supporting SLP (it's a bit
> > overeager to trigger, and it has known holes so it's not really
> > a good starting point yet for folks to try other archs).
> 
> Ok. It seems that you almost done with that but needs more testing in
> various targets.
> 
> So, if I want to work on optimizing vectorization (start with TSVC),
> I should avoid touching the failed vectorized due to data reference/dependence
> analysis (e.g. this PR case, s116).

It depends on the actual case - the one in this bug at least looks like
half of it might be dealt with with the refactoring.

> and avoid adding new features into loop vectorizer, e.g. min/max reduction 
> with
> index (s315).

It's fine to add features if they work with SLP as well ;)  Note that
in the future SLP will also do the "single lane" case but it doesn't
do that on trunk.  Some features are difficult with multi-lane SLP
and probably not important in practice for that case, still handling
single-lane SLP will be important as otherwise the feature is lost.

> To not to make your SLP refactoring work heavier.
> 
> Am I right ?

Yes.  I've got early break vectorization to chase now, I was "finished"
with the parts I could exercise on x86_64 in autumn ...

[Bug debug/113637] ICE: in as_a, at machmode.h:381 with extern function declaration and _BitInt() used as VLA size

2024-01-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113637

--- Comment #4 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:457d2b59b58e5998e1e6967316d4e3e8f24edeed

commit r14-8651-g457d2b59b58e5998e1e6967316d4e3e8f24edeed
Author: Jakub Jelinek 
Date:   Wed Jan 31 10:56:15 2024 +0100

dwarf2out: Fix ICE on large _BitInt in loc_list_from_tree_1 [PR113637]

This spot uses SCALAR_INT_TYPE_MODE which obviously ICEs for large/huge
BITINT_TYPE types which have BLKmode.  But such large BITINT_TYPEs certainly
don't fit into DWARF2_ADDR_SIZE either, so we can just assume it would be
false if type has BLKmode.

2024-01-31  Jakub Jelinek  

PR debug/113637
* dwarf2out.cc (loc_list_from_tree_1): Assume integral types
with BLKmode are larger than DWARF2_ADDR_SIZE.

* gcc.dg/bitint-80.c: New test.

[Bug tree-optimization/113639] ICE: in handle_operand_addr, at gimple-lower-bitint.cc:2265 at -O with _BitInt() in a struct

2024-01-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113639

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:90ac839a470d61ffcd9eee0d7d37ca9c385dfefb

commit r14-8650-g90ac839a470d61ffcd9eee0d7d37ca9c385dfefb
Author: Jakub Jelinek 
Date:   Wed Jan 31 10:50:33 2024 +0100

lower-bitint: Fix up VIEW_CONVERT_EXPR handling in handle_operand_addr [PR113639]

Yet another spot where we need to treat VIEW_CONVERT_EXPR differently
from NOP_EXPR/CONVERT_EXPR.

2024-01-31  Jakub Jelinek  

PR tree-optimization/113639
* gimple-lower-bitint.cc (bitint_large_huge::handle_operand_addr):
For VIEW_CONVERT_EXPR set rhs1 to its operand.

* gcc.dg/bitint-79.c: New test.

[Bug target/113679] long long minus double with gcc -m32 produces different results than other compilers or gcc -m64

2024-01-31 Thread dilyan.palauzov at aegee dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113679

--- Comment #5 from Дилян Палаузов  ---
gcc -m64 -fexcess-precision=fast -o diff diff.c && ./diff
0.00
gcc -m32 -fexcess-precision=fast -o diff diff.c && ./diff
-2.00
clang -m32 -fexcess-precision=fast -o diff diff.c && ./diff
0.00
clang -m64 -fexcess-precision=fast -o diff diff.c && ./diff
0.00
gcc -m64 -fexcess-precision=standard -o diff diff.c && ./diff
0.00
gcc -m32 -fexcess-precision=standard -o diff diff.c && ./diff
0.00
clang -m32 -fexcess-precision=standard -o diff diff.c && ./diff
0.00
clang -m64 -fexcess-precision=standard -o diff diff.c && ./diff
0.00

If this excess precision has justification, why are the results different for
32 and 64bit code?  With

  printf("%f\n", (double)l - d);
  printf("%f\n", (double)(l - d));

there is indeed a difference:
$ gcc -m32 -fexcess-precision=standard -o diff diff.c && ./diff
0.00
-2.00

[Bug target/113679] long long minus double with gcc -m32 produces different results than other compilers or gcc -m64

2024-01-31 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113679

--- Comment #6 from Andrew Pinski  ---
Because 64bit uses the SSE2 fp instructions rather than x87 fp instructions.

[Bug rtl-optimization/113656] [x86] ICE in simplify_const_unary_operation, at simplify-rtx.cc:1954 with new -mavx10.1

2024-01-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113656

--- Comment #7 from GCC Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:b59775b642bb2b1ecd2e6d52c988b9c432117bc8

commit r14-8652-gb59775b642bb2b1ecd2e6d52c988b9c432117bc8
Author: Jakub Jelinek 
Date:   Wed Jan 31 10:56:56 2024 +0100

simplify-rtx: Fix up last argument to simplify_gen_unary [PR113656]

When simplifying e.g. (float_truncate:SF (float_truncate:DF (reg:XF))
or (float_truncate:SF (float_extend:XF (reg:DF)) etc. into
(float_truncate:SF (reg:XF)) or (float_truncate:SF (reg:DF)) we call
simplify_gen_unary with incorrect op_mode argument, it should be
the argument's mode, but we call it with the outer mode instead.
As these are all floating point operations, the argument always
has non-VOIDmode and so we can just use that mode (as done in similar
simplifications a few lines later), but neither FLOAT_TRUNCATE nor
FLOAT_EXTEND are operations that should have the same modes of operand
and result.  This bug hasn't been a problem for years because normally
op_mode is used only if the mode of op is VOIDmode, otherwise it is
redundant, but r10-2139 added an assertion in some spots that op_mode
is right even in such cases.

2024-01-31  Jakub Jelinek  

PR rtl-optimization/113656
* simplify-rtx.cc (simplify_context::simplify_unary_operation_1)
: Fix up last argument to simplify_gen_unary.

* gcc.target/i386/pr113656.c: New test.

[Bug libstdc++/90276] PSTL tests fail in Debug Mode

2024-01-31 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90276

Jonathan Wakely  changed:

   What|Removed |Added

   Last reconfirmed|2024-01-24 00:00:00 |2019-04-29 0:00

--- Comment #4 from Jonathan Wakely  ---
In testsuite/util/pstl/test_utils.h we have:

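template <typename IsReverse>
struct reverse_invoker
{
    template <typename... Rest>
    void
    operator()(Rest&&... rest)
    {
        // Random-access iterator
        iterator_invoker<std::random_access_iterator_tag, IsReverse>()(std::forward<Rest>(rest)...);

        // Forward iterator
        iterator_invoker<std::forward_iterator_tag, IsReverse>()(std::forward<Rest>(rest)...);

        // Bidirectional iterator
        iterator_invoker<std::bidirectional_iterator_tag, IsReverse>()(std::forward<Rest>(rest)...);
    }
};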

This is called with rvalue iterators e.g.

TestUtils::invoke_on_all_policies(check_minelement(), wseq.seq.cbegin(),
wseq.seq.cend());

In the body of reverse_invoker::operator() we forward them as rvalues which
causes them to be moved into the by-value parameters of
iterator_invoker::operator()

Then we forward them again, which causes them to be moved again. The debug
iterators abort at this point, because they're singular after the first move.

So the problem is that a moved-from __debug::vector::iterator is singular, and
therefore can't be moved or copied. I wonder if that's really what we want, or
if a moved-from iterator should have the value-initialized state instead of a
singular state.

The standard is clear that a singular iterator cannot be copied or moved,
unless it was value-initialized, see [iterator.requirements.general] p7.

In any case, the PSTL test harness should probably not be using moved-from
iterators more than once.
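
The double move can be reproduced without the PSTL harness. In the sketch below, Tracked is a made-up stand-in for a debug iterator that records when it has been moved from, sink takes its argument by value like iterator_invoker::operator(), and forward_twice forwards the same argument twice. None of these names come from libstdc++.

```cpp
#include <utility>

int stale_moves = 0;  // moves whose source had already been moved from

struct Tracked
{
  bool moved_from = false;

  Tracked() = default;
  Tracked(const Tracked &) = default;
  Tracked(Tracked &&other) noexcept
  {
    if (other.moved_from)
      ++stale_moves;  // a debug iterator would abort here
    other.moved_from = true;
  }
};

// By-value parameter: an rvalue argument is moved into it.
void sink(Tracked) {}

template <typename T>
void forward_twice(T &&t)
{
  sink(std::forward<T>(t));  // first move when T is not a reference
  sink(std::forward<T>(t));  // second move from the same object
}

// Forwards a temporary twice and returns how many stale moves happened.
int count_stale_moves_for_rvalue()
{
  stale_moves = 0;
  forward_twice(Tracked{});
  return stale_moves;
}
```

Passing an lvalue to forward_twice copies it both times, so no stale move is counted. Only the rvalue case moves from an object that has already been moved from.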

[Bug libstdc++/90276] PSTL tests fail in Debug Mode

2024-01-31 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90276

Jonathan Wakely  changed:

   What|Removed |Added

   See Also||https://github.com/llvm/llvm-project/issues/80126

--- Comment #5 from Jonathan Wakely  ---
Reported upstream: https://github.com/llvm/llvm-project/issues/80126

[Bug target/113679] long long minus double with gcc -m32 produces different results than other compilers or gcc -m64

2024-01-31 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113679

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek  ---
And while SSE/SSE2 has instructions for performing arithmetic in IEEE754
single and double formats, x87 does not: everything is done in extended
precision (unless the FPU is configured to use a smaller precision, but then it
doesn't support the extended precision long double on the other side), and
conversions to IEEE754 single/double have to be done when storing the extended
precision registers into memory.
So it is impossible to achieve the expected IEEE754 single and double
arithmetic behavior; one can get only something close to it (but with double
rounding problems) if all the temporaries are immediately stored into memory
and loaded from it again.
The -ffloat-store option does that to a limited extent (it doesn't convert
everything though), but still, the performance is terrible.
C allows extended precision and specifies how it should behave; that is the
-fexcess-precision=standard model (e.g. enabled by default for
-std=c{99,11,...} options as opposed to -std=gnu...). In that model excess
precision is used consistently, with some casts/assignments mandating rounding
to lower precisions. -fexcess-precision=fast is what gcc implemented before
the standard model was introduced: excess precision is used there as
long as something is kept in the FPU registers, and conversions are done when it
needs to be spilled to memory.
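
To make the difference concrete, here is a small sketch with hypothetical
helper names. It assumes a target where FLT_EVAL_METHOD is 0 and long double
has at least a 64-bit significand (e.g. x86-64 with SSE2). sub_in_double rounds
the integer to double before subtracting, while sub_in_extended subtracts in
long double and rounds once at the end, which approximates what x87 code does
when temporaries stay in registers.

```cpp
#include <cassert>
#include <cfloat>

// Reports how the implementation evaluates floating-point expressions:
// 0 means each type in its own precision, 2 means long double everywhere.
inline int eval_method() { return FLT_EVAL_METHOD; }

// a is rounded to double first, then the subtraction happens in double
// (on targets without excess precision).
inline double sub_in_double(long long a, double b)
{
    return static_cast<double>(a) - b;
}

// The subtraction happens in long double; the result is rounded to double
// only once, at the end.
inline double sub_in_extended(long long a, double b)
{
    return static_cast<double>(static_cast<long double>(a) - b);
}

inline void check_excess_precision()
{
    const long long a = 9007199254740993LL; // 2^53 + 1, not exact in double
    if (eval_method() == 0 && LDBL_MANT_DIG >= 64)
    {
        // (double)a rounds to 2^53, so the result is 2^53 - 1.
        assert(sub_in_double(a, 1.0) == 9007199254740991.0);
        // The exact difference is 2^53, which is representable.
        assert(sub_in_extended(a, 1.0) == 9007199254740992.0);
    }
}
```

With -m32 and x87 arithmetic, sub_in_double may well produce the second value
too, since the conversion of a to double need not round while the value sits in
an FPU register.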

[Bug tree-optimization/113639] ICE: in handle_operand_addr, at gimple-lower-bitint.cc:2265 at -O with _BitInt() in a struct

2024-01-31 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113639

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from Jakub Jelinek  ---
Fixed.

[Bug debug/113637] ICE: in as_a, at machmode.h:381 with extern function declaration and _BitInt() used as VLA size

2024-01-31 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113637

Jakub Jelinek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Jakub Jelinek  ---
Fixed.

[Bug rtl-optimization/113656] [x86] ICE in simplify_const_unary_operation, at simplify-rtx.cc:1954 with new -mavx10.1

2024-01-31 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113656

--- Comment #8 from Jakub Jelinek  ---
Fixed on the trunk so far.

[Bug target/111403] LoongArch: Wrong code with -O -mlasx -fopenmp-simd

2024-01-31 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111403

--- Comment #3 from Xi Ruoyao  ---
It no longer seems to happen with current trunk.  Let me do a bisection...

[Bug ipa/111444] [14 Regression] Wrong code at -O2/3/s on x86_64-gnu since r14-3226-gd073e2d75d9

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111444

--- Comment #9 from Richard Biener  ---
(In reply to Richard Biener from comment #8)
> The best fix would likely be to pre-insert all the IPA-CP known constants
> instead of trying to discover them "late".
> 
> I'm testing the easy fix for now.

Hmm.  gcc.dg/ipa/pr92497-1.c FAILs because of that.  We get

__attribute__((noinline))
int bar.constprop (struct a a)
{
  intD.6 a$aD.2808;
  intD.6 D.2807;
  struct a aD.2806;
  intD.6 _4;

   [local count: 1073741824]:
  # .MEM_5 = VDEF <.MEM_2(D)>
  aD.2806 = aD.2800;
  # VUSE <.MEM_5>
  a$a_3 = aD.2806.aD.2769;

here and thus translate through the aggregate copy - the result should then
be put on aD.2806 but of course only with .MEM_5.

Maybe we can and should always use the default def here but I'm slightly
uneasy with the ref adjustment, esp. since we're going to record
for the saved operands (if those exist - the path where it goes wrong
isn't translated).

[Bug ipa/111444] [14 Regression] Wrong code at -O2/3/s on x86_64-gnu since r14-3226-gd073e2d75d9

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111444

--- Comment #10 from Richard Biener  ---
Hmm, I have another fix.

[Bug tree-optimization/113630] [11/12/13/14 Regression] -fno-strict-aliasing introduces out-of-bounds memory access

2024-01-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113630

--- Comment #6 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:724b64304ff5c8ac08a913509afd6fde38d7b767

commit r14-8653-g724b64304ff5c8ac08a913509afd6fde38d7b767
Author: Richard Biener 
Date:   Wed Jan 31 11:28:50 2024 +0100

tree-optimization/113630 - invalid code hoisting

The following avoids code hoisting (but also PRE insertion) of
expressions that got value-numbered to another one that are not
a valid replacement (but still compute the same value).  This time
because the access path ends in a structure with different size,
meaning we consider a related access as not trapping because of the
size of the base of the access.

PR tree-optimization/113630
* tree-ssa-pre.cc (compute_avail): Avoid registering a
reference with a representation with not matching base
access size.

* gcc.dg/torture/pr113630.c: New testcase.

[Bug tree-optimization/113630] [11/12/13 Regression] -fno-strict-aliasing introduces out-of-bounds memory access

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113630

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2
Summary|[11/12/13/14 Regression]|[11/12/13 Regression]
   |-fno-strict-aliasing|-fno-strict-aliasing
   |introduces out-of-bounds|introduces out-of-bounds
   |memory access   |memory access
  Known to work||14.0

--- Comment #7 from Richard Biener  ---
Fixed on trunk so far.

[Bug tree-optimization/113134] gcc does not version loops with early break conditions that don't have side-effects

2024-01-31 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113134

--- Comment #19 from JuzheZhong  ---

The loop is:

bb 3 -> bb 4 -> bb 5
  |   |__⬆
  |__⬆

The condition in bb 3 is if (i_21 == 1001).
The condition in bb 4 is if (N_13(D) > i_18).

Look into lsplit:
This loop doesn't satisfy the check of:
if (split_loop (loop) || split_loop_on_cond (loop))

split_loop_on_cond tries to split a loop on a condition that is loop
invariant.  However, neither the condition in bb 3 nor the one in bb 4
is loop invariant.

I wonder whether we should add a new kind of loop splitter like:

diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc
index 04215fe7937..a4081b9b6f5 100644
--- a/gcc/tree-ssa-loop-split.cc
+++ b/gcc/tree-ssa-loop-split.cc
@@ -1769,7 +1769,8 @@ tree_ssa_split_loops (void)
   if (optimize_loop_for_size_p (loop))
continue;

-  if (split_loop (loop) || split_loop_on_cond (loop))
+  if (split_loop (loop) || split_loop_on_cond (loop)
+ || split_loop_for_early_break (loop))
{
  /* Mark our containing loop as having had some split inner loops.  */
  loop_outer (loop)->aux = loop;

[Bug libstdc++/99832] std::chrono::system_clock::to_time_t needs ABI tag for 32-bit time_t

2024-01-31 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99832

--- Comment #1 from Jonathan Wakely  ---
Maybe something like this:

diff --git a/libstdc++-v3/config/os/gnu-linux/os_defines.h
b/libstdc++-v3/config/os/gnu-linux/os_defines.h
index 0af29325388..f7c73560831 100644
--- a/libstdc++-v3/config/os/gnu-linux/os_defines.h
+++ b/libstdc++-v3/config/os/gnu-linux/os_defines.h
@@ -84,7 +84,13 @@
 // Since glibc 2.34 all pthreads functions are usable without linking to
 // libpthread.
 #  define _GLIBCXX_GTHREAD_USE_WEAK 0
-# endif
+// Since glibc 2.34 using -D_TIME_BITS=64 will enable 64-bit time_t
+// for "legacy ABIs", i.e. ones that historically used 32-bit time_t.
+// This internal glibc macro will be defined iff new 64-bit time_t is in use.
+#  ifdef __USE_TIME_BITS64
+#   define _GLIBCXX_TIME_BITS64 1
+#  endif
+# endif // glibc 2.34
 #endif // __linux__

 #endif
diff --git a/libstdc++-v3/include/bits/chrono.h
b/libstdc++-v3/include/bits/chrono.h
index 579c5a266be..a63782b92ff 100644
--- a/libstdc++-v3/include/bits/chrono.h
+++ b/libstdc++-v3/include/bits/chrono.h
@@ -1242,6 +1242,9 @@ _GLIBCXX_BEGIN_INLINE_ABI_NAMESPACE(_V2)
   now() noexcept;

   // Map to C API
+#ifdef _GLIBCXX_TIME_BITS64
+  [[__gnu__::__abi_tag__("__time64")]]
+#endif
   static std::time_t
   to_time_t(const time_point& __t) noexcept
   {
@@ -1249,6 +1252,9 @@ _GLIBCXX_BEGIN_INLINE_ABI_NAMESPACE(_V2)
   (__t.time_since_epoch()).count());
   }

+#ifdef _GLIBCXX_TIME_BITS64
+  [[__gnu__::__abi_tag__("__time64")]]
+#endif
   static time_point
   from_time_t(std::time_t __t) noexcept
   {

Alternatively, in  do:

#define _GLIBCXX_TIME_BITS64_ABI_TAG

and then in config/os/gnu-linux/os_defines.h:


#  ifdef __USE_TIME_BITS64
#   undef _GLIBCXX_TIME_BITS64_ABI_TAG
#   define _GLIBCXX_TIME_BITS64_ABI_TAG [[__gnu__::__abi_tag__("__time64")]]
#  endif

Then the chrono code can just use that unconditionally instead of using #ifdef

I think for musl, newer versions use 64-bit time_t unconditionally. I'm not
sure if we can (or need to) use the abi_tag there.
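
A rough sketch of the macro-based alternative, outside of libstdc++ and with
made-up names: DEMO_TIME64 stands in for whatever configuration check is
eventually used, and DEMO_TIME_ABI_TAG and demo_clock are purely illustrative.
The tag string here is also only an example.

```cpp
#include <cassert>
#include <ctime>

// Hypothetical configuration macro: define DEMO_TIME64 to apply the tag.
#ifdef DEMO_TIME64
# define DEMO_TIME_ABI_TAG [[gnu::abi_tag("time64")]]
#else
# define DEMO_TIME_ABI_TAG
#endif

struct demo_clock
{
    // When the macro expands to the attribute, the tag becomes part of the
    // mangled name, so the two time_t variants get distinct symbols.
    // When it expands to nothing, the declaration is unchanged.
    DEMO_TIME_ABI_TAG
    static std::time_t
    to_time_t(long long seconds) noexcept
    {
        return static_cast<std::time_t>(seconds);
    }
};

inline void check_demo_clock()
{
    assert(demo_clock::to_time_t(5) == static_cast<std::time_t>(5));
}
```

The point of the empty default definition is that the declaration can use the
macro unconditionally, with no #ifdef at the point of use.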

[Bug tree-optimization/113134] gcc does not version loops with early break conditions that don't have side-effects

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113134

--- Comment #20 from Richard Biener  ---
I think we want split_loop () to handle this case.  That means extending it to
handle loops with multiple exits.  OTOH after loop rotation to

  if (i_21 == 1001)
goto ; [1.00%]
  else
goto ; [99.00%]

   [local count: 1004539166]:
  i_18 = i_21 + 1;
  if (N_13(D) > i_18)
goto ; [94.50%]
  else
goto ; [5.50%]

it could also be IVCANON's job to rewrite the exit test so the bound is
loop invariant and it becomes a single exit.

There's another recent PR where an exit condition like i < N && i < M
should become i < MIN(N,M).
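
Written out by hand, the i < N && i < M rewrite looks roughly like the
following. The function names are made up for illustration, and this shows only
the source-level equivalence, not what any GCC pass currently does.

```cpp
#include <algorithm>
#include <cassert>

// Two exit tests are evaluated on every iteration.
inline long count_two_exits(long n, long m)
{
    long i = 0;
    while (i < n && i < m)
        ++i;
    return i;
}

// One exit test against a loop-invariant bound computed up front.
inline long count_single_exit(long n, long m)
{
    const long bound = std::min(n, m);
    long i = 0;
    while (i < bound)
        ++i;
    return i;
}

inline void check_single_exit()
{
    assert(count_two_exits(10, 7) == count_single_exit(10, 7));
    assert(count_two_exits(3, 1001) == count_single_exit(3, 1001));
    assert(count_two_exits(-5, 4) == count_single_exit(-5, 4));
}
```

Both bounds have to be loop invariant for this to work, which is why it does
not apply directly to an exit test such as i == 1001 combined with a different
comparison.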

[Bug libstdc++/99832] std::chrono::system_clock::to_time_t needs ABI tag for 32-bit time_t

2024-01-31 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99832

--- Comment #2 from Jonathan Wakely  ---
(In reply to Jonathan Wakely from comment #1)
> +// Since glibc 2.34 using -D_TIME_BITS=64 will enable 64-bit time_t
> +// for "legacy ABIs", i.e. ones that historically used 32-bit time_t.
> +// This internal glibc macro will be defined iff new 64-bit time_t is in
> use.

This is correct for current glibc releases, but in glibc master
__USE_TIME_BITS64 is defined unconditionally to 0 or 1 and tells you the size
of time_t, not whether it switched to 64-bit counter to the legacy ABI:
https://inbox.sourceware.org/libc-alpha/20240118131801.600373-1-adhemerval.zane...@linaro.org/

Yay.

[Bug libstdc++/90276] PSTL tests fail in Debug Mode

2024-01-31 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90276

--- Comment #6 from Jonathan Wakely  ---
Some of the tests FAIL for different reasons:

/home/jwakely/src/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/stl_algo.h:2051:
In function:
_FIter std::upper_bound(_FIter, _FIter, const _Tp&, _Compare) [with
_FIter = gnu_debug::_Safe_iterator*,
vector, allocator > > >, debug::vector,
allocator > >, random_access_iterator_tag>; _Tp = Num;
_Compare = main()::, Num)>]

Error: elements in iterator range [first, last) are not partitioned by the
predicate __comp and value __val.

Objects involved in the operation:
iterator "first" @ 0x7ffda0426810 {
  type = gnu_cxx::normal_iterator*, std::vector,
std::allocator > > > (mutable iterator);
  state = dereferenceable (start-of-sequence);
  references sequence with type 'std::debug::vector,
std::allocator > >' @ 0x7ffda0427730
}
iterator "last" @ 0x7ffda0426840 {
  type = gnu_cxx::normal_iterator*, std::vector,
std::allocator > > > (mutable iterator);
  state = dereferenceable;
  references sequence with type 'std::debug::vector,
std::allocator > >' @ 0x7ffda0427730
}
FAIL: 25_algorithms/pstl/alg_sorting/partial_sort.cc  -std=gnu++17 execution
test

[Bug rtl-optimization/113680] New: Missed optimization: Redundant cmp/test instructions when comparing (x - y) > 0

2024-01-31 Thread Explorer09 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113680

Bug ID: 113680
   Summary: Missed optimization: Redundant cmp/test instructions
when comparing (x - y) > 0
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: Explorer09 at gmail dot com
  Target Milestone: ---

Note: This issue is not limited to x86-64. I also tested it with ARM64 gcc in
Compiler Explorer (https://godbolt.org/) and it has the same "redundant cmp
instruction" problem.

This may be related to bug #3507 but I can't make sure it's the same bug. I
apologize if I reported a duplicate.

== Test code ==

```c
#include <stdio.h>

void func1(int x, int y) {
    int diff = x - y;
    if (diff > 0)
        putchar('>');
    if (diff < 0)
        putchar('<');
}

void func2(int x, int y) {
    if ((x - y) > 0)
        putchar('>');
    if ((x - y) < 0)
        putchar('<');
}

void func3(int x, int y) {
    if (x > y)
        putchar('>');
    if (x < y) {
        putchar('<');
    }
}

void func4(int x, int y) {
    int diff = x - y;
    if (diff > 0)
        putchar('>');
    if (x < y) {
        putchar('<');
    }
}
```

== Actual result ==

With x86-64 "gcc -Os" it generates the following.

In short, gcc can recognize func1() and func2() as completely identical, but
didn't recognize func1() and func2() can both optimize to func3().

func4() currently generates the worst assembly, but it might be another issue
to address (something messes up the register allocation algorithm).

```x86asm
func1:
        subl    %esi, %edi
        testl   %edi, %edi
        jle     .L2
        movl    $62, %edi
        jmp     .L4
.L2:
        je      .L1
        movl    $60, %edi
.L4:
        jmp     putchar
.L1:
        ret
func2:
        jmp     func1
func3:
        cmpl    %esi, %edi
        jle     .L8
        movl    $62, %edi
        jmp     .L10
.L8:
        jge     .L7
        movl    $60, %edi
.L10:
        jmp     putchar
.L7:
        ret
func4:
        pushq   %rbp
        movl    %edi, %ebp
        pushq   %rbx
        movl    %esi, %ebx
        pushq   %rcx
        cmpl    %esi, %edi
        jle     .L12
        movl    $62, %edi
        call    putchar
.L12:
        cmpl    %ebx, %ebp
        jge     .L11
        popq    %rdx
        movl    $60, %edi
        popq    %rbx
        popq    %rbp
        jmp     putchar
.L11:
        popq    %rax
        popq    %rbx
        popq    %rbp
        ret
```

== Expected result ==

func1(), func2(), func3() and func4() should all compile to identical code,
with func3() as the example of the best assembly. (No redundant "test"
instruction; the "sub" instruction can simplify into a "cmp".)

[Bug middle-end/113680] Missed optimization: Redundant cmp/test instructions when comparing (x - y) > 0

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113680

Richard Biener  changed:

   What|Removed |Added

  Component|rtl-optimization|middle-end
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-01-31
   Keywords||easyhack,
   ||missed-optimization
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
I don't think we have or had a (a - b) CMP 0 simplification pattern which
this seems to be about.  We have a +- CST CMP CST'.

Note the reverse, a < b ->  (a - b) < 0 isn't valid.
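
A small sketch of why the reverse direction fails. wrapping_sub is a
hypothetical helper that models a 32-bit subtraction with wraparound; it goes
through unsigned arithmetic to stay clear of signed-overflow undefined
behavior, and it assumes the conversion back to int32_t is modular (guaranteed
since C++20, and documented behavior for GCC before that).

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// a - b with 32-bit wraparound instead of undefined behavior.
inline std::int32_t wrapping_sub(std::int32_t a, std::int32_t b)
{
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(a)
                                     - static_cast<std::uint32_t>(b));
}

inline void check_reverse_is_invalid()
{
    const std::int32_t a = std::numeric_limits<std::int32_t>::min();
    const std::int32_t b = 1;

    // The comparison itself is well defined and true.
    assert(a < b);

    // The difference does not fit in 32 bits and wraps to INT32_MAX,
    // so testing its sign gives the opposite answer.
    assert(wrapping_sub(a, b) == std::numeric_limits<std::int32_t>::max());
    assert(!(wrapping_sub(a, b) < 0));
}
```

Going from (a - b) CMP 0 to a CMP b is a different matter for signed integers:
there the source already contains the subtraction, so inputs for which it
overflows are undefined and need not be preserved.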

[Bug ipa/111444] [14 Regression] Wrong code at -O2/3/s on x86_64-gnu since r14-3226-gd073e2d75d9

2024-01-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111444

--- Comment #11 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:cfb3f666562fb4ab896a05c234a697afb63627a4

commit r14-8655-gcfb3f666562fb4ab896a05c234a697afb63627a4
Author: Richard Biener 
Date:   Wed Jan 31 10:42:48 2024 +0100

tree-optimization/111444 - avoid insertions when skipping defs

The following avoids inserting expressions for IPA CP discovered
equivalences into the VN hashtables when we are optimistically
skipping may-defs in the attempt to prove it's redundant.

PR tree-optimization/111444
* tree-ssa-sccvn.cc (vn_reference_lookup_3): Do not use
vn_reference_lookup_2 when optimistically skipping may-defs.

* gcc.dg/torture/pr111444.c: New testcase.

[Bug ipa/111444] [14 Regression] Wrong code at -O2/3/s on x86_64-gnu since r14-3226-gd073e2d75d9

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111444

Richard Biener  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #12 from Richard Biener  ---
Fixed.

[Bug middle-end/113680] Missed optimization: Redundant cmp/test instructions when comparing (x - y) > 0

2024-01-31 Thread Explorer09 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113680

--- Comment #2 from Kang-Che Sung  ---
I forgot to mention that such an optimization is unsafe for floating point
(actually I knew that when I wrote my code). The `(a - b) < 0` optimization
shouldn't be performed with unsigned integers either. I request only
optimizations on signed integers.

[Bug libstdc++/90276] PSTL tests fail in Debug Mode

2024-01-31 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90276

--- Comment #7 from Jonathan Wakely  ---
__pstl::__tbb_backend::__merge_func::split_merging (which should be a reserved
name) does:

if (__nx < __ny)
{
__ym = _M_ys + __ny / 2;

if (_x_orig)
__xm = std::upper_bound(_M_x_beg + _M_xs, _M_x_beg + _M_xe,
*(_M_x_beg + __ym), _M_comp) - _M_x_beg;
else
__xm = std::upper_bound(_M_z_beg + _M_xs, _M_z_beg + _M_xe,
*(_M_z_beg + __ym), _M_comp) - _M_z_beg;
}

which aborts because the range is not correctly sorted w.r.t _M_comp, as
required by upper_bound.

The range looks like this:

$1 = std::__cxx1998::vector of length 1284, capacity 1284 = {{val = 0}, {val =
5290}, {val = 9862}, {val = 8699}, {val = 5471}, {val = 4810}, {
val = 6176}, {val = 1400}, {val = 5025}, {val = 3246}, {val = 2547}, {val =
8814}, {val = 2463}, {val = 8800}, {val = 3074}, {val = 5741}, {
val = 5234}, {val = 736}, {val = 4895}, {val = 6803}, {val = 2363}, {val =
5351}, {val = 6719}, {val = 7967}, {val = 732}, {val = 1399}, {val = 7586}, 
  {val = 4659}, {val = 3800}, {val = 6956}, {val = 4087}, {val = 9090}, {val =
2293}, {val = 8702}, {val = 2263}, {val = 7765}, {val = 3233}, {
val = 8440}, {val = 3918}, {val = 8259}, {val = 6439}, {val = 6465}, {val =
6794}, {val = 3656}, {val = 10018}, {val = 4621}, {val = 9397}, {
val = 4973}, {val = 584}, {val = 9046}, {val = 6530}, {val = 2474}, {val =
4118}, {val = 2970}, {val = 162}, {val = 4850}, {val = 9401}, {val = 7748}, 
  {val = 9509}, {val = 2923}, {val = 4425}, {val = 8349}, {val = 6766}, {val =
6719}, {val = 6773}, {val = 3783}, {val = 4205}, {val = 4759}, {
val = 6976}, {val = 8123}, {val = 2739}, {val = 3136}, {val = 4309}, {val =
4286}, {val = 6792}, {val = 4048}, {val = 8908}, {val = 664}, {
val = 3774}, {val = 9019}, {val = 9710}, {val = 111}, {val = 1214}, {val =
8581}, {val = 2996}, {val = 6409}, {val = 3152}, {val = 7150}, {
val = 3878}, {val = 7415}, {val = 10073}, {val = 3057}, {val = 238}, {val =
1314}, {val = 9776}, {val = 7011}, {val = 5097}, {val = 8734}, {
val = 6524}, {val = 1794}, {val = 6578}, {val = 9263}, {val = 9962}, {val =
5640}, {val = 3271}, {val = 1229}, {val = 4441}, {val = 6932}, {
val = 1893}, {val = 2968}, {val = 425}, {val = 6356}, {val = 2994}, {val =
6671}, {val = 4658}, {val = 743}, {val = 2801}, {val = 2563}, {val = 7893}, 
  {val = 1433}, {val = 4731}, {val = 2441}, {val = 4490}, {val = 4970}, {val =
8787}, {val = 3987}, {val = 6734}, {val = 3605}, {val = 7474}, {
val = 2979}, {val = 152}, {val = 8805}, {val = 1964}, {val = 10114}, {val =
4166}, {val = 10267}, {val = 6096}, {val = 3360}, {val = 1673}, {
val = 2742}, {val = 6328}, {val = 7130}, {val = 9098}, {val = 4075}, {val =
8554}, {val = 8509}, {val = 9850}, {val = 1077}, {val = 794}, {
val = 7465}, {val = 2510}, {val = 5525}, {val = 4659}, {val = 1753}, {val =
216}, {val = 3167}, {val = 493}, {val = 1704}, {val = 1525}, {val = 7967}, 
  {val = 4683}, {val = 6709}, {val = 6493}, {val = 1400}, {val = 1297}, {val =
5412}, {val = 6420}, {val = 7394}, {val = 8772}, {val = 2846}, {
val = 10136}, {val = 9853}, {val = 9976}, {val = 3709}, {val = 8682}, {val
= 8252}, {val = 1939}, {val = 8253}, {val = 4082}, {val = 7765}, {
val = 5439}, {val = 1345}, {val = 3012}, {val = 4851}, {val = 3098}, {val =
8260}, {val = 2771}, {val = 3591}, {val = 4717}, {val = 9328}, {
val = 1279}, {val = 9401}, {val = 5758}, {val = 2525}, {val = 5554}, {val =
1809}, {val = 7937}, {val = 1696}, {val = 9203}, {val = 1183}...}


This is indeed not partitioned:

(gdb) p __val
$2 = (const Num &) @0x77994e28: {val = 687}
(gdb) p __first[46]
$4 = (Num &) @0x779930c8: {val = 9397}
(gdb) p __first[47]
$5 = (Num &) @0x779930cc: {val = 4973}
(gdb) p __first[48]
$6 = (Num &) @0x779930d0: {val = 584}   <
(gdb) p __first[49]
$7 = (Num &) @0x779930d4: {val = 9046}

I think this needs to be reported upstream too.
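
For reference, the precondition that the debug check enforces can be expressed
with std::is_partitioned. The helper below is a hypothetical sketch, and the
sample values assume an ordinary less-than comparison on val, which may not be
what the test's comparator actually does.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// upper_bound(first, last, value, comp) requires the elements e with
// !comp(value, e) to all come before the elements for which comp(value, e)
// holds.
template <typename T, typename Compare>
bool valid_for_upper_bound(const std::vector<T>& v, const T& value,
                           Compare comp)
{
    return std::is_partitioned(v.begin(), v.end(),
                               [&](const T& e) { return !comp(value, e); });
}

inline void check_partition_precondition()
{
    // Values at __first[46..49] from the gdb session, searched for 687.
    const std::vector<int> excerpt{9397, 4973, 584, 9046};
    assert(!valid_for_upper_bound(excerpt, 687, std::less<int>()));

    // A sorted range satisfies the precondition.
    const std::vector<int> sorted{584, 4973, 9046, 9397};
    assert(valid_for_upper_bound(sorted, 687, std::less<int>()));
}
```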

[Bug libstdc++/90276] PSTL tests fail in Debug Mode

2024-01-31 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90276

--- Comment #8 from Jonathan Wakely  ---
https://github.com/llvm/llvm-project/issues/80136

[Bug modula2/111627] modula2: Excess test fails with a case-preserving-case-insensitive source tree.

2024-01-31 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111627

--- Comment #2 from Gaius Mulley  ---
Created attachment 57267
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57267&action=edit
Proposed fix

Here is a proposed patch.  The problem was fixed by renaming the conflicting
test names: some testsuite modules had names which matched library names
(but used a different case).

[Bug tree-optimization/113681] New: [14 Regression] ICE in tree_profiling, at tree-profile.cc:803 since r14-6201-gf0a90c7d7333fc

2024-01-31 Thread mjires at suse dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113681

Bug ID: 113681
   Summary: [14 Regression] ICE in tree_profiling, at
tree-profile.cc:803 since r14-6201-gf0a90c7d7333fc
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mjires at suse dot cz
CC: aoliva at gcc dot gnu.org
  Target Milestone: ---

Compiling reduced testcase c-c++-common/torture/strub-inlinable2.c results in
ICE since r14-6201-gf0a90c7d7333fc which introduced this test.

$ cat strub-inlinable2.c
inline void __attribute__((strub, always_inline)) inl_int_ali() {}
void bat() { inl_int_ali(); }


$ gcc strub-inlinable2.c -fbranch-probabilities
strub-inlinable2.c:2:14: error: calling ‘always_inline’ ‘strub’ ‘inl_int_ali’
in non-‘strub’ context ‘bat’
2 | void bat() { inl_int_ali(); }
  |  ^
during IPA pass: profile
strub-inlinable2.c:2:1: internal compiler error: in tree_profiling, at
tree-profile.cc:803
2 | void bat() { inl_int_ali(); }
  | ^~~~
0x181b321 tree_profiling
/home/mjires/git/GCC/master/gcc/tree-profile.cc:803
0x181bc4c execute
/home/mjires/git/GCC/master/gcc/tree-profile.cc:990
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.


$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/mjires/built/master/libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /home/mjires/git/GCC/master/configure
--prefix=/home/mjires/built/master --disable-bootstrap
--enable-languages=c,c++,fortran,lto --disable-multilib --disable-libsanitizer
--enable-checking : (reconfigured) /home/mjires/git/GCC/master/configure
--prefix=/home/mjires/built/master --disable-bootstrap
--enable-languages=c,c++,fortran,lto --disable-multilib --disable-libsanitizer
--enable-checking
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.1 20240131 (experimental) (GCC)

[Bug tree-optimization/110176] [11/12/13/14 Regression] wrong code at -Os and above on x86_64-linux-gnu since r11-2446

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110176

--- Comment #9 from Richard Biener  ---
With all VARYING we simplify

i_19 = (int) _2;
_6 = (int) _5;
Value numbering stmt = _7 = _6 <= i_19;
Applying pattern match.pd:6775, gimple-match-4.cc:1795
Match-and-simplified _6 <= i_19 to 1

where _5 is _Bool and _2 is unsigned int.  We match

 zext <= (int) 4294967295u

note that I see

Value numbering stmt = _2 = f$0_25;
Setting value number of _2 to 4294967295 (changed)
Value numbering stmt = i_19 = (int) _2;
Match-and-simplified (int) _2 to -1
RHS (int) _2 simplified to -1 
Not changing value number of i_19 from VARYING to -1
Making available beyond BB6 i_19 for value i_19

so it's odd we see the constant here, but ... we go

  (if (TREE_CODE (@10) == INTEGER_CST
   && INTEGRAL_TYPE_P (TREE_TYPE (@00))
   && !int_fits_type_p (@10, TREE_TYPE (@00)))
   (with
{
  tree min = lower_bound_in_type (TREE_TYPE (@10), TREE_TYPE (@00));
  tree max = upper_bound_in_type (TREE_TYPE (@10), TREE_TYPE (@00));
  bool above = integer_nonzerop (const_binop (LT_EXPR, type, max,
@10));
  bool below = integer_nonzerop (const_binop (LT_EXPR, type, @10,
min));
}
(if (above || below)

failing to see that we deal with a relational compare and a sign-change.

The original code from fold-const.cc had only INTEGER_TYPE support,
r6-4300-gf6c1575958f7bf made it cover all integral types (it half-way
supported BOOLEAN_TYPE already).  But the issue was latent I think.
One notable difference was that I think get_unwidened made sure to
convert a constant to the wider type while here we have @10 != @1
and the conversion is not applied.  We're doing it correctly in earlier code:

/* ???  The special-casing of INTEGER_CST conversion was in the original
   code and here to avoid a spurious overflow flag on the resulting
   constant which fold_convert produces.  */
(if (TREE_CODE (@1) == INTEGER_CST)

using @1 instead of @10.

Correcting that avoids the pattern from triggering in this wrong way.
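
In source terms, the comparison being folded corresponds roughly to the sketch
below. compare_widened is a made-up name, and the code assumes a 32-bit int and
modular unsigned-to-signed conversion, as on GCC. It shows that the correct
result for the constant 4294967295u is 0 rather than 1 for both values of the
_Bool operand.

```cpp
#include <cassert>

// Mirrors "_6 = (int) _5; i_19 = (int) _2; _7 = _6 <= i_19" where
// _5 is a bool and _2 is an unsigned int.
inline bool compare_widened(bool b, unsigned int u)
{
    const int lhs = static_cast<int>(b); // zero-extends to 0 or 1
    const int rhs = static_cast<int>(u); // 4294967295u becomes -1
    return lhs <= rhs;
}

inline void check_compare_widened()
{
    // After the sign change the constant is -1, so neither 0 nor 1 is <= it.
    assert(!compare_widened(false, 4294967295u));
    assert(!compare_widened(true, 4294967295u));
}
```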

[Bug other/113682] New: Branches in branchless binary search rather than cmov/csel/csinc

2024-01-31 Thread redbeard0531 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113682

Bug ID: 113682
   Summary: Branches in branchless binary search rather than
cmov/csel/csinc
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

I've been trying to eliminate unpredictable branches in a hot function where
perf counters show a high mispredict percentage. Unfortunately, I haven't been
able to find an incantation to get gcc to generate branchless code other than
inline asm, which I'd rather avoid. In this case I've even laid out the
critical lines so that they exactly match the behavior of the csinc and csel
instructions on arm64, but they are still not used.


Somewhat minimized repro:

typedef unsigned long size_t;
struct ITEM {char* addr; size_t len;};
int cmp(ITEM* user, ITEM* tree);

size_t bsearch2(ITEM* user, ITEM** tree, size_t treeSize) {
auto pos = tree;
size_t low = 0;
size_t high = treeSize;
while (low < high) {
size_t i = (low + high) / 2;
int res = cmp(user, tree[i]);

// These should be cmp + csinc + csel on arm
// and lea + test + cmov + cmov on x86.
low = res > 0 ? i + 1 : low; // csinc
high = res < 0 ? i: high; // csel

if (res == 0) return i;
}
return -1;
}


On arm64 that generates a conditional branch on res > 0:
bl  cmp(ITEM*, ITEM*)
cmp w0, 0
bgt .L15 // does low = i + 1 then loops
mov x20, x19
bne .L4 // loop


On x86_64 it does similar:
call    cmp(ITEM*, ITEM*)
test    eax, eax
jg      .L16
jne     .L17


Note that clang produces the desired codegen for both:

arm:
bl  cmp(ITEM*, ITEM*)
cmp w0, #0
csinc   x23, x23, x22, le
csel    x19, x22, x19, lt
cbnz    w0, .LBB0_1

x86:
call    cmp(ITEM*, ITEM*)@PLT
lea     rcx, [r12 + 1]
test    eax, eax
cmovg   r13, rcx
cmovs   rbx, r12
jne     .LBB0_1

(full output for all 4 available at https://www.godbolt.org/z/aWrKbYPTG. Code
snippets are from trunk, but it also reproduces on 13.2)

While ideally gcc would generate the branchless output for the supplied code,
if there is some (reasonable) incantation that would cause it to produce
branchless output, I'd be happy to have that too.

[Bug analyzer/113509] ICE: SIGSEGV in c_tree_printer (c-objc-common.cc:341) with -fanalyzer -fanalyzer-verbose-state-changes

2024-01-31 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113509

David Malcolm  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from David Malcolm  ---
Should be resolved by the above patch.

[Bug target/111677] [12/13/14 Regression] darktable build on aarch64 fails with unrecognizable insn due to -fstack-protector changes

2024-01-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111677

--- Comment #26 from GCC Commits  ---
The master branch has been updated by Alex Coplan :

https://gcc.gnu.org/g:0529ba8168c89f24314e8750237d77bb132bea9c

commit r14-8657-g0529ba8168c89f24314e8750237d77bb132bea9c
Author: Alex Coplan 
Date:   Tue Jan 30 10:22:48 2024 +

aarch64: Avoid out-of-range shrink-wrapped saves [PR111677]

The PR shows us ICEing due to an unrecognizable TFmode save emitted by
aarch64_process_components.  The problem is that for T{I,F,D}mode we
conservatively require mems to be in range for x-register ldp/stp.  That
is because (at least for TImode) it can be allocated to both GPRs and
FPRs, and in the GPR case that is an x-reg ldp/stp, and the FPR case is
a q-register load/store.

As Richard pointed out in the PR, aarch64_get_separate_components
already checks that the offsets are suitable for a single load, so we
just need to choose a mode in aarch64_reg_save_mode that gives the full
q-register range.  In this patch, we choose V16QImode as an alternative
16-byte "bag-of-bits" mode that doesn't have the artificial range
restrictions imposed on T{I,F,D}mode.

For T{F,D}mode in GCC 15 I think we could consider relaxing the
restriction imposed in aarch64_classify_address, as typically T{F,D}mode
should be allocated to FPRs.  But such a change seems too invasive to
consider for GCC 14 at this stage (let alone backports).

Fortunately the new flexible load/store pair patterns in GCC 14 allow
this mode change to work without further changes.  The backports are
more involved as we need to adjust the load/store pair handling to cater
for V16QImode in a few places.

Note that for the testcase we are relying on the torture options to add
-funroll-loops at -O3 which is necessary to trigger the ICE on trunk
(but not on the 13 branch).

gcc/ChangeLog:

PR target/111677
* config/aarch64/aarch64.cc (aarch64_reg_save_mode): Use
V16QImode for the full 16-byte FPR saves in the vector PCS case.

gcc/testsuite/ChangeLog:

PR target/111677
* gcc.target/aarch64/torture/pr111677.c: New test.

[Bug target/111677] [12/13 Regression] darktable build on aarch64 fails with unrecognizable insn due to -fstack-protector changes

2024-01-31 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111677

Alex Coplan  changed:

   What|Removed |Added

Summary|[12/13/14 Regression]   |[12/13 Regression]
   |darktable build on aarch64  |darktable build on aarch64
   |fails with unrecognizable   |fails with unrecognizable
   |insn due to |insn due to
   |-fstack-protector changes   |-fstack-protector changes

--- Comment #27 from Alex Coplan  ---
Fixed on trunk for GCC 14, keeping open for backports.

[Bug target/113357] [14 regression] m68k-linux bootstrap failure in stage2 due to segfault compiling unwind-dw2.c

2024-01-31 Thread mikpelinux at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113357

--- Comment #4 from Mikael Pettersson  ---
Confirmed:

04c9cf5c786b94fbe3f6f21f06cae73a7575ff7a is the first new commit
commit 04c9cf5c786b94fbe3f6f21f06cae73a7575ff7a
Author: Manolis Tsamis 
Date:   Mon Oct 16 13:08:12 2023 -0600

Implement new RTL optimizations pass: fold-mem-offsets

[Bug rtl-optimization/113682] Branches in branchless binary search rather than cmov/csel/csinc

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113682

Richard Biener  changed:

   What|Removed |Added

   Keywords||missed-optimization
  Component|other   |rtl-optimization
Version|unknown |14.0
 Target||aarch64, x86_64-*-*

--- Comment #1 from Richard Biener  ---
Since there's a loop exit involved (and the loop has multiple exits)
if-conversion is made difficult here.

You could try rotating manually producing a do { } while loop with
a "nicer" exit condition and see whether that helps.
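
One way to read the suggestion above, as a rough sketch only: the
lower-bound search below is rotated by hand into a do { } while loop
whose single exit is the length test at the bottom.  The function name
lower_bound_rotated and the loop shape are assumptions made for
illustration; this is not the testcase from the PR, and whether GCC
emits cmov/csel for it has not been checked here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: index of the first element of the sorted array
   a[0..n) that is not less than key, or n if there is none.  */
size_t
lower_bound_rotated (const int *a, size_t n, int key)
{
  const int *base = a;

  assert (a != NULL || n == 0);
  if (n == 0)
    return 0;

  /* Rotated form of "while (n > 1) { ... }": one guard up front, then a
     do { } while loop whose only exit is the length test at the bottom.  */
  if (n > 1)
    do
      {
        size_t half = n / 2;
        /* A select rather than an early exit; whether this becomes
           cmov/csel is up to the compiler.  */
        base = (base[half - 1] < key) ? base + half : base;
        n -= half;
      }
    while (n > 1);

  return (size_t) (base - a) + (size_t) (*base < key);
}
```

The loop never returns early on a match, so the only control flow left
inside the body is the conditional update of base.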

[Bug tree-optimization/113681] [14 Regression] ICE in tree_profiling, at tree-profile.cc:803 since r14-6201-gf0a90c7d7333fc

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113681

Richard Biener  changed:

   What|Removed |Added

   Keywords||error-recovery
   Target Milestone|--- |14.0

[Bug tree-optimization/113681] [14 Regression] ICE in tree_profiling, at tree-profile.cc:803 since r14-6201-gf0a90c7d7333fc

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113681

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P4

[Bug debug/92444] [11/12/13/14 regression] gcc generates wrong debug information at -O2 and -O3 since r10-4122-gf658ad3002a0af

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92444

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |11.5

[Bug target/105275] [12/13/14 regression] 525.x264_r and 538.imagick_r regressed on x86_64 at -O2 with PGO after r12-7319-g90d693bdc9d718

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105275

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |12.4

[Bug target/113679] long long minus double with gcc -m32 produces different results than other compilers or gcc -m64

2024-01-31 Thread dilyan.palauzov at aegee dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113679

--- Comment #8 from Дилян Палаузов  ---
-fexcess-precision=standard does not ensure consistent behaviour between gcc
13.2.1 20231205 (Red Hat 13.2.1-6) and clang 17.0.5.  -msse2 -mfpmath=sse does
for diff.c:

#include <stdio.h>
#include <math.h>
int main(void) {
  long long l = 9223372036854775806;
  double d = 9223372036854775808.0;
  printf("%f\n", (double)l - d);
  printf("%i\n", pow(3.3, 4.4) == 191.18831051580915);
  return 0;
}


$ gcc -lm -fexcess-precision=standard -m32 -o diff diff.c && ./diff
0.000000
0
$ clang -lm -fexcess-precision=standard -m32 -o diff diff.c && ./diff
0.000000
1
$ gcc -lm -fexcess-precision=standard -m64 -o diff diff.c && ./diff
0.000000
1
$ clang -lm -fexcess-precision=standard -m64 -o diff diff.c && ./diff
0.000000
1
$ gcc -lm -fexcess-precision=fast -m32 -o diff diff.c && ./diff
-2.000000
1
$ clang -lm -fexcess-precision=fast -m32 -o diff diff.c && ./diff
0.000000
1
$ gcc -lm -fexcess-precision=fast -m64 -o diff diff.c && ./diff
0.000000
1
$ clang -lm -fexcess-precision=fast -m64 -o diff diff.c && ./diff
0.000000
1
$ gcc -lm -msse2 -mfpmath=sse -m32 -o diff diff.c && ./diff
0.000000
1
$ clang -lm -msse2 -mfpmath=sse -m32 -o diff diff.c && ./diff
0.000000
1
$ gcc -lm -msse2 -mfpmath=sse -m64 -o diff diff.c && ./diff
0.000000
1
$ clang -lm -msse2 -mfpmath=sse -m64 -o diff diff.c && ./diff
0.000000
1

cl.exe also prints 0.000000 and 1

[Bug rtl-optimization/110390] [13/14 regression] ICE on valid code on x86_64-linux-gnu with sel-scheduling: in av_set_could_be_blocked_by_bookkeeping_p, at sel-sched.cc:3609 since r13-3596-ge7310e24b1

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110390

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |13.3

[Bug target/111170] [13/14 regression] Malformed manifest does not allow to run gcc on Windows XP (Accessing a corrupted shared library) since r13-6552-gd11e088210a551

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111170

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |13.3

[Bug target/113542] [14 Regression] gcc.target/arm/bics_3.c regression after change for pr111267

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113542

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug testsuite/113611] [14 Regression] gcc.dg/pr110279-1.c fails on cross build since gcc-14-5779-g746344dd538

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113611

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

[Bug rtl-optimization/113546] [13/14 Regression] aarch64: bootstrap-debug-lean broken with -fcompare-debug failure since r13-2921-gf1adf45b17f7f1

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113546

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |13.3

[Bug target/113641] [13/14 regression] 510.parest_r with PGO at O2 slower than GCC 12 (7% on Zen 3&2, 4% on CascadeLake) since r13-4272-g8caf155a3d6e23

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113641

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |13.3

[Bug target/113679] long long minus double with gcc -m32 produces different results than other compilers or gcc -m64

2024-01-31 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113679

--- Comment #9 from Jakub Jelinek  ---
That is not what I read from what you've posted, -fexcess-precision=standard is
consistent between the compilers, -fexcess-precision=fast is not (and doesn't
have to be), neither between different compilers, nor between different
optimization levels etc.

[Bug target/113679] long long minus double with gcc -m32 produces different results than other compilers or gcc -m64

2024-01-31 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113679

--- Comment #10 from Jakub Jelinek  ---
Oh, you mean the pow equality comparison.  I think you should study something
about floating point, errors, why equality comparisons of floating point values
are usually a bad idea etc.
There is no gcc bug, just bad user expectations.
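
To illustrate the point about equality comparisons, here is a minimal
sketch that compares against a relative tolerance instead of using ==.
The names nearly_equal and DEFAULT_REL_TOL and the choice of
4 * DBL_EPSILON are assumptions for illustration only; a suitable
tolerance depends on the computation being checked.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical default tolerance: a few units of double rounding error.  */
#define DEFAULT_REL_TOL (4 * DBL_EPSILON)

/* Hypothetical helper: nonzero if a and b differ by at most rel_tol
   times the larger of their magnitudes.  NaNs compare unequal, and
   values near zero would additionally need an absolute bound.  */
int
nearly_equal (double a, double b, double rel_tol)
{
  assert (rel_tol >= 0.0);
  double diff = a > b ? a - b : b - a;
  double abs_a = a < 0.0 ? -a : a;
  double abs_b = b < 0.0 ? -b : b;
  double scale = abs_a > abs_b ? abs_a : abs_b;
  return diff <= rel_tol * scale;
}
```

In diff.c the second printf could then test
nearly_equal (pow (3.3, 4.4), 191.18831051580915, DEFAULT_REL_TOL)
rather than exact equality.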

[Bug gcov-profile/113646] PGO hurts run-time of 538.imagick_r as much as 68% at -Ofast -march=native

2024-01-31 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113646

--- Comment #3 from Martin Jambor  ---
(In reply to Richard Biener from comment #1)
> Did you try with -fprofile-partial-training (is that default on?  it
> probably should ...).  Can you please try training with the rate data
> instead of train to rule out a mismatch?

With -fprofile-partial-training the znver4 LTO vs LTOPGO regression (on a newer
master) goes down from 66% to 54%.  

So far I did not find a way to easily train with the reference run (when I add
"train_with = refrate" to the config, I always get "ERROR: The workload
specified by train_with MUST be a training workload!")

[Bug tree-optimization/110176] [11/12/13/14 Regression] wrong code at -Os and above on x86_64-linux-gnu since r11-2446

2024-01-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110176

--- Comment #10 from GCC Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:22dbfbe8767ff4c1d93e39f68ec7c2d5b1358beb

commit r14-8658-g22dbfbe8767ff4c1d93e39f68ec7c2d5b1358beb
Author: Richard Biener 
Date:   Wed Jan 31 14:40:24 2024 +0100

middle-end/110176 - wrong zext (bool) <= (int) 4294967295u folding

The following fixes a wrong pattern that didn't match the behavior
of the original fold_widened_comparison in that get_unwidened
returned a constant always in the wider type.  But here we're
using (int) 4294967295u without the conversion applied.  Fixed
by doing as earlier in the pattern - matching constants only
if the conversion was actually applied.

PR middle-end/110176
* match.pd (zext (bool) <= (int) 4294967295u): Make sure
to match INTEGER_CST only without outstanding conversion.

* gcc.dg/torture/pr110176.c: New testcase.
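
For readers unfamiliar with the pattern name, the following is a hedged
sketch of the kind of source expression involved.  It is not the
committed testcase, and the function name is hypothetical.  With GCC,
converting 4294967295u to a 32-bit int yields -1 (the conversion is
implementation-defined in ISO C), so a zero-extended bool, which is 0
or 1, can never be less than or equal to it.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical example: b widens to 0 or 1, and the constant is -1 once
   the conversion to int has been applied, so the result must be 0.
   Folding the comparison as if the constant were still the unsigned
   value 4294967295 would make it always true.  */
int
bool_le_converted_constant (bool b)
{
  /* The sketch assumes a 32-bit int.  */
  assert (sizeof (int) * CHAR_BIT == 32);
  return (int) b <= (int) 4294967295u;
}
```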

[Bug regression/113672] [14 Regression] FAIL: g++.dg/pch/line-map-3.C -g -I. -Dwith_PCH (test for excess errors)

2024-01-31 Thread lhyatt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113672

Lewis Hyatt  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED
 CC||lhyatt at gcc dot gnu.org

--- Comment #1 from Lewis Hyatt  ---
Thanks, yes the test is problematic because the warnings it looks for are
platform dependent. There is a patch to address it here:
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/644487.html. It's being
tracked at the original PR, so marking as a dupe of that one.

*** This bug has been marked as a duplicate of bug 105608 ***

[Bug sanitizer/112644] [14 Regression] Some of the hwasan testcase fail after the recent merge

2024-01-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112644

--- Comment #9 from GCC Commits  ---
The master branch has been updated by Tamar Christina :

https://gcc.gnu.org/g:a73421bcf301911f2cbdb1c58316ddf3473ea6d5

commit r14-8659-ga73421bcf301911f2cbdb1c58316ddf3473ea6d5
Author: Tamar Christina 
Date:   Wed Jan 31 14:44:35 2024 +

libsanitizer: Sync fixes for asan interceptors from upstream

This cherry-picks and squashes the differences between commits

   
d3e5c20ab846303874a2a25e5877c72271fc798b..76e1e45922e6709392fb82aac44bebe3dbc2ea63
from LLVM upstream from compiler-rt/lib/hwasan/ to GCC on the changes
relevant
for GCC.

This is required to fix the linked PR.

As mentioned in the PR the last sync brought in a bug from upstream[1]
where operations became non-recoverable and as such the tests in AArch64
started failing.  This cherry picks the fix and there are minor updates
needed to GCC after this to fix the cases.

[1] https://github.com/llvm/llvm-project/pull/74000

PR sanitizer/112644
Cherry-pick llvm-project revision
672b71cc1003533460a82f06b7d24fbdc02ffd58,
5fcf3bbb1acfe226572474636714ede86fffcce8,
3bded112d02632209bd55fb28c6c5c234c23dec3 and
76e1e45922e6709392fb82aac44bebe3dbc2ea63.

[Bug tree-optimization/110176] [11/12/13 Regression] wrong code at -Os and above on x86_64-linux-gnu since r11-2446

2024-01-31 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110176

Richard Biener  changed:

   What|Removed |Added

  Known to work||14.0
Summary|[11/12/13/14 Regression]|[11/12/13 Regression] wrong
   |wrong code at -Os and above |code at -Os and above on
   |on x86_64-linux-gnu since   |x86_64-linux-gnu since
   |r11-2446|r11-2446

--- Comment #11 from Richard Biener  ---
Fixed on trunk so far.

[Bug preprocessor/105608] [11/12/13/14 Regression] ICE: in linemap_add with a really long defined macro on the command line r11-338-g2a0225e47868fbfc

2024-01-31 Thread lhyatt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105608

Lewis Hyatt  changed:

   What|Removed |Added

 CC||danglin at gcc dot gnu.org

--- Comment #14 from Lewis Hyatt  ---
*** Bug 113672 has been marked as a duplicate of this bug. ***

[Bug sanitizer/112644] [14 Regression] Some of the hwasan testcase fail after the recent merge

2024-01-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112644

--- Comment #10 from GCC Commits  ---
The master branch has been updated by Tamar Christina :

https://gcc.gnu.org/g:0debaceb11dee9781f9a8b320cb5893836324878

commit r14-8660-g0debaceb11dee9781f9a8b320cb5893836324878
Author: Tamar Christina 
Date:   Wed Jan 31 14:50:33 2024 +

hwasan: instrument new memory and string functions [PR112644]

Recent libhwasan updates[1] intercept various string and memory functions.
These functions have checking in them, which means there's no need to
inline the checking.

This patch marks said functions as intercepted, and adjusts a testcase
to handle the difference.  It also looks for HWASAN in a check in
expand_builtin.  This check originally is there to avoid using expand to
inline the behaviour of builtins like memset which are intercepted by
ASAN and hence which we rely on the function call staying as a function
call.  With the new reliance on function calls in HWASAN we need to do
the same thing for HWASAN too.

However, HWASAN and ASAN don't seem to instrument the same functions.

Looking into
libsanitizer/sanitizer_common/sanitizer_common_interceptors_memintrinsics.inc
it looks like the common ones are memset, memmove and memcpy.

The rest of the routines for asan seem to be defined in
compiler-rt/lib/asan/asan_interceptors.h.  compiler-rt/lib/hwasan/ does
not have such a file, but it does have
compiler-rt/lib/hwasan/hwasan_platform_interceptors.h, which looks like
it is forcing off everything but memset, memmove, memcpy, memcmp and bcmp.

As such I've taken those as the final list that hwasan currently supports.
This also means that on future updates this list should be cross checked.

[1]
https://discourse.llvm.org/t/hwasan-question-about-the-recent-interceptors-being-added/75351

gcc/ChangeLog:

PR sanitizer/112644
* asan.h (asan_intercepted_p): Intercept memset, memmove, memcpy and
memcmp.
* builtins.cc (expand_builtin): Include HWASAN when checking for
builtin inlining.

gcc/testsuite/ChangeLog:

PR sanitizer/112644
* c-c++-common/hwasan/builtin-special-handling.c: Update testcase.

Co-Authored-By: Matthew Malcomson 
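
As a rough illustration of the "final list" described in the commit
message above, the sketch below checks a function name against that
list.  The table and helper names are hypothetical and string-based
purely for readability; GCC's real check lives in asan.h and is not
reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Names taken from the commit message above.  */
static const char *const hwasan_intercepted_names[] = {
  "memset", "memmove", "memcpy", "memcmp", "bcmp"
};

/* Hypothetical helper: nonzero if NAME is on the list above.  */
int
hwasan_intercepts_name (const char *name)
{
  assert (name != NULL);
  size_t count = sizeof hwasan_intercepted_names
                 / sizeof hwasan_intercepted_names[0];
  for (size_t i = 0; i < count; i++)
    if (strcmp (name, hwasan_intercepted_names[i]) == 0)
      return 1;
  return 0;
}
```

As the commit message notes, a list like this would need to be cross
checked against upstream on future libsanitizer updates.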

[Bug sanitizer/112644] [14 Regression] Some of the hwasan testcase fail after the recent merge

2024-01-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112644

--- Comment #11 from GCC Commits  ---
The master branch has been updated by Tamar Christina :

https://gcc.gnu.org/g:0a640455928a050315f6addd88ace5d945eba130

commit r14-8661-g0a640455928a050315f6addd88ace5d945eba130
Author: Tamar Christina 
Date:   Wed Jan 31 14:51:36 2024 +

hwasan: Remove testsuite check for a complaint message [PR112644]

With recent updates to hwasan runtime libraries, the error reporting for
this particular check has been reworked.

I would question why it has lost this message.  To me it looks strange
that num_descriptions_printed is incremented whenever we call
PrintHeapOrGlobalCandidate whether that function prints anything or not.
(See PrintAddressDescription in libsanitizer/hwasan/hwasan_report.cpp).

The message is no longer printed because we increment this
num_descriptions_printed variable indicating that we have found some
description.

I would like to question this upstream, but it doesn't look that much of
a problem and if pressed for time we should just change our testsuite.
Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

gcc/testsuite/ChangeLog:

PR sanitizer/112644
* c-c++-common/hwasan/hwasan-thread-clears-stack.c: Update
testcase.

[Bug testsuite/113502] gcc.target/aarch64/vect-early-break-cbranch.c and gcc.target/aarch64/sve/vect-early-break-cbranch.c testcase are too sensitive

2024-01-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113502

--- Comment #2 from GCC Commits  ---
The master branch has been updated by Tamar Christina :

https://gcc.gnu.org/g:f7935beef7b02fbba0adf33fb2ba5c0a27d7e9ff

commit r14-8662-gf7935beef7b02fbba0adf33fb2ba5c0a27d7e9ff
Author: Tamar Christina 
Date:   Wed Jan 31 14:52:59 2024 +

AArch64: relax cbranch tests to accept inverted branches [PR113502]

Recently something in the midend had started inverting the branches by
inverting the condition and the branches.

While this is fine, it makes it hard to actually test.  In RTL I disable
scheduling and BB reordering to prevent this.  But in GIMPLE there seems to
be nothing I can do.  __builtin_expect seems to have no impact on the change
since I suspect this is happening during expand where conditions can be
flipped regardless of probability during compare_and_branch.

Since the mid-end has plenty of correctness tests, this weakens the backend
tests to just check that a correct looking sequence is emitted.

gcc/testsuite/ChangeLog:

PR testsuite/113502
* gcc.target/aarch64/sve/vect-early-break-cbranch.c: Ignore exact
branch.
* gcc.target/aarch64/vect-early-break-cbranch.c: Likewise.

[Bug c++/112580] [14 Regression]: g++.dg/modules/xtreme-header-4_b.C et al; ICE tree check: expected class 'type', have 'declaration'

2024-01-31 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112580

Patrick Palka  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112737
 CC||ppalka at gcc dot gnu.org

--- Comment #7 from Patrick Palka  ---
The xtreme-header-{4,5,6} fails need -mx32 (rather than -m32) on x86_64-linux. 
The error is:

.../x86_64-pc-linux-gnu/libstdc++-v3/include/format:3662:28: error: invalid use
of non-static data member
‘std::basic_format_args,
wchar_t> >::__as_base ::’
 3662 | __arg._M_val = _M_values[__i];
  |^

The xtreme-header{,2} fails are also tracked by PR112737, and the error on
x86_64-linux is:

/src/gcc/testsuite/g++.dg/modules/xtreme-header-2_a.H: error: conflicting
global module declaration 'template class _Cont, class _Rg,
class ... _Args> using std::ranges::__detail::_DeduceExpr1 = decltype
(_Cont<...auto...>(declval<_Rg>(), (declval<_Args>)()...))'
...

[Bug sanitizer/112644] [14 Regression] Some of the hwasan testcase fail after the recent merge

2024-01-31 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112644

Tamar Christina  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #12 from Tamar Christina  ---
Fixed. Thanks

[Bug testsuite/113502] gcc.target/aarch64/vect-early-break-cbranch.c and gcc.target/aarch64/sve/vect-early-break-cbranch.c testcase are too sensitive

2024-01-31 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113502

Tamar Christina  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 CC||tnfchris at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED

--- Comment #3 from Tamar Christina  ---
Fixed, thanks

[Bug c++/112737] [14 Regression] g++.dg/modules/xtreme-header-2_b.C -std=c++2b (test for excess errors)

2024-01-31 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112737

--- Comment #5 from Patrick Palka  ---
Reduced:

$ cat 112737.h
template<template<class> class _Cont>
using _DeduceExpr1 = decltype(_Cont{});

$ cat 112737_a.H
#include "112737.h"

$ cat 112737_b.C
import "112737_a.H";
#include "112737.h"

$ g++ -fmodules-ts 112737_a.H 112737_b.C
In file included from 112737_b.C:2:
112737.h:2:7: error: conflicting declaration of template
‘template<template<class> class _Cont> using _DeduceExpr1 = decltype
(_Cont<...auto...>{})’
2 | using _DeduceExpr1 = decltype(_Cont{});
  |   ^~~~
In file included from 112737_a.H:1,
of module ./112737_a.H, imported at 112737_b.C:1:
112737.h:2:7: note: previous declaration ‘template<template<class> class _Cont>
using _DeduceExpr1 = decltype (_Cont<...auto...>{})’
2 | using _DeduceExpr1 = decltype(_Cont{});
  |   ^~~~

[Bug c++/106052] ICE with -Wmismatched-tags with partially specialized friend struct of self type

2024-01-31 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106052

Marek Polacek  changed:

   What|Removed |Added

 CC||mpolacek at gcc dot gnu.org

--- Comment #3 from Marek Polacek  ---
Started with r10-7424-g04dd734b52de12:

commit 04dd734b52de121853e1ea6b3c197a598b294e23
Author: Martin Sebor 
Date:   Fri Mar 27 12:07:45 2020 -0400

c++: avoid -Wredundant-tags on a first declaration in use [PR 93824]

[Bug middle-end/113680] Missed optimization: Redundant cmp/test instructions when comparing (x - y) > 0

2024-01-31 Thread Explorer09 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113680

--- Comment #3 from Kang-Che Sung  ---
Oops. I made a typo in the test code. func4() shouldn't have that redundant
brace.

The corrected example:

```
#include <stdio.h>

void func4(int x, int y) {
    int diff = x - y;
    if (diff > 0)
        putchar('>');
    if (x < y)
        putchar('<');
}
```

[Bug c++/113683] New: explicit template instantiation wrongly checks private base class accessibility

2024-01-31 Thread schaumb at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113683

Bug ID: 113683
   Summary: explicit template instantiation wrongly checks private
base class accessibility
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: schaumb at gmail dot com
  Target Milestone: ---

The compiler wrongly checks that the (private) base class is accessible when
explicit template instantiation happens. (standard: C++20)

It shouldn't, see https://eel.is/c++draft/temp.spec#general-6

template<auto>
struct I{};

struct A {};

class B : A {
static const B b;
};

I need to instantiate the B::b object address with const A*.
But this line is failing:

template struct I<static_cast<const A*>(&B::b)>; // fails on static_cast


Simplified "real" example code: https://godbolt.org/z/zj9co5bMh

[Bug c++/112737] [14 Regression] g++.dg/modules/xtreme-header-2_b.C -std=c++2b (test for excess errors)

2024-01-31 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112737

--- Comment #6 from Patrick Palka  ---
Ah, this seems to be a general declaration matching issue not specific to
modules.  Here's a non-modules testcase:

template<template<class> class TT, class T>
decltype(TT{T()}) f(); // #1

template<template<class> class TT, class T>
decltype(TT{T()}) f(); // #2, should be considered a redeclaration of #1

template<class T> struct A { A(T); };

int main() {
  f<A, int>(); // ambiguity error
}

We (wrongly?) consider the return types of the two f's to be different, because
the CTAD placeholders refer to different TEMPLATE_DECLs (of a logically
equivalent ttp) and structural_comptypes uses pointer identity here.  Perhaps
we need to relax structural_comptypes in this case.

[Bug modula2/111627] modula2: Excess test fails with a case-preserving-case-insensitive source tree.

2024-01-31 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111627

--- Comment #3 from GCC Commits  ---
The master branch has been updated by Gaius Mulley :

https://gcc.gnu.org/g:4fd094835a8997cdcc3d18d7d297debe1527202d

commit r14-8663-g4fd094835a8997cdcc3d18d7d297debe1527202d
Author: Gaius Mulley 
Date:   Wed Jan 31 15:44:32 2024 +

PR modula2/111627 Excess test fails with a case-preserving-case-insensitive
source tree

This patch renames gm2 testsuite modules whose names conflict with library
modules.  The conflict is not seen on case preserving case sensitive file
systems.

gcc/testsuite/ChangeLog:

PR modula2/111627
* gm2/pim/pass/stdio.mod: Moved to...
* gm2/pim/pass/teststdio.mod: ...here.
* gm2/pim/run/pass/builtins.mod: Moved to...
* gm2/pim/run/pass/testbuiltins.mod: ...here.
* gm2/pim/run/pass/math.mod: Moved to...
* gm2/pim/run/pass/testmath.mod: ...here.
* gm2/pim/run/pass/math2.mod: Moved to...
* gm2/pim/run/pass/testmath2.mod: ...here.

Signed-off-by: Gaius Mulley 

[Bug c++/112737] [14 Regression] g++.dg/modules/xtreme-header-2_b.C -std=c++2b (test for excess errors)

2024-01-31 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112737

Patrick Palka  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |ppalka at gcc dot gnu.org

--- Comment #7 from Patrick Palka  ---
(In reply to Patrick Palka from comment #6)
> Perhaps we need to relax structural_comptypes in this case.

I guess I can submit a patch to that effect.

[Bug target/113679] long long minus double with gcc -m32 produces different results than other compilers or gcc -m64

2024-01-31 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113679

--- Comment #11 from Jakub Jelinek  ---
Anyway, seems clang is buggy:
clang -O2 -m32 -mno-sse -mfpmath=387 -fexcess-precision=standard
#include <float.h>

int
main ()
{
#if FLT_EVAL_METHOD == 2 && LDBL_MANT_DIG == 64 && DBL_MANT_DIG == 53
  if ((double) 191.18831051580915 == 191.18831051580915)
__builtin_abort ();
#endif
}
should always succeed, because if FLT_EVAL_METHOD is 2, it ought to be
evaluated as (long double) (double) 191.18831051580915L == 191.18831051580915L
and (double) 191.18831051580915L is 0x1.7e606a3c65c95p+7 while
191.18831051580915L is 0x1.7e606a3c65c9503ap+7L, so they aren't equal.
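
The rounding step described above can be checked in isolation with a
small sketch like the one below.  The helper name is hypothetical, and
the code only shows that rounding the long double constant to double
drops bits when long double is wider than double; it does not exercise
clang's handling of -fexcess-precision=standard.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper: returns 1 if rounding the long double constant
   to double changes its value, 0 otherwise.  */
int
rounding_to_double_changes_value (void)
{
  /* The reasoning about mantissa widths assumes a binary format.  */
  assert (FLT_RADIX == 2);
  /* volatile keeps the conversions as real run-time operations.  */
  volatile long double wide = 191.18831051580915L;
  volatile double narrow = (double) wide;
  return (long double) narrow != wide;
}
```

On targets where LDBL_MANT_DIG equals DBL_MANT_DIG the two values are
identical and the helper returns 0.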
