[Bug middle-end/121470] (unsigned short)0x8000 is handled unexpectedly; due to the way const_int is handled

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121470 --- Comment #10 from Segher Boessenkool --- (In reply to Andrew Pinski from comment #8) > (In reply to Segher Boessenkool from comment #7) > > Please stop the vandalism. This is NOT a dup. > > How is it not? > (unsigned char)0x80 vs (unsigned

[Bug middle-end/121470] (unsigned short)0x8000 is handled unexpectedly; due to the way const_int is handled

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121470 --- Comment #9 from Segher Boessenkool --- (In reply to Segher Boessenkool from comment #7) > Please stop the vandalism. This is NOT a dup. Of course this is not "how it always worked". We used to have RTL way earlier in the pipeline already.

[Bug middle-end/121470] (unsigned short)0x8000 is handled unexpectedly; due to the way const_int is handled

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121470 Segher Boessenkool changed: What|Removed |Added Resolution|DUPLICATE |--- Status|RESOLVED

[Bug middle-end/85344] constants with the sign bit set causes sign extension which is unexpected but not documented in the user documentation

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 --- Comment #25 from Segher Boessenkool --- The number is an integer constant. 32768 is 32768, not -32768. The value got that way (potentially, but not in this case even) because it was cast to an unsigned short. All of that is done way before

[Bug middle-end/85344] constants with the sign bit set causes sign extension which is unexpected but not documented in the user documentation

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 --- Comment #20 from Segher Boessenkool --- (In reply to Andrew Pinski from comment #18) > Simple answer: > When the INTEGER_CST (unsigned short) is expanded into a const_int, the sign > extend happens due to the rules of const_int. What does th
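The sign extension mentioned in the quoted comment can be sketched in plain C. This is only an illustration, assuming the rule from the GCC internals documentation that a const_int holds its value sign-extended from the precision of its mode. The helper name sign_extend_from_width is hypothetical and is not a GCC function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: interpret the low `width` bits of
   `value` as a two's complement number and widen it to 64 bits.
   `width` must be between 1 and 64.  */
int64_t sign_extend_from_width(uint64_t value, unsigned width)
{
    if (width >= 64)
        return (int64_t)value;

    uint64_t mask = (UINT64_C(1) << width) - 1;   /* keep the low bits  */
    uint64_t sign = UINT64_C(1) << (width - 1);   /* the sign bit       */

    value &= mask;
    /* Flip the sign bit, then subtract it again: values with the sign
       bit set end up negative, all others are unchanged.  */
    return (int64_t)((value ^ sign) - sign);
}

/* A few spot checks for a 16-bit and an 8-bit width.  */
void check_sign_extend(void)
{
    assert(sign_extend_from_width(0x8000, 16) == -32768);
    assert(sign_extend_from_width(0x7fff, 16) == 32767);
    assert(sign_extend_from_width(0x80, 8) == -128);
}
```

Under that assumption, the 16-bit pattern 0x8000 is stored as -32768 regardless of whether the source type was signed or unsigned, which is the behaviour the bug title calls unexpected.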

[Bug middle-end/85344] constants with the sign bit set causes sign extension which is unexpected but not documented in the user documentation

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 --- Comment #19 from Segher Boessenkool --- So, apparently force_reg was called here, and it went way down from there. It never should have ended up there, but it is a very common thing, there are tens of ways to get there, no clue what happened

[Bug middle-end/85344] constants with the sign bit set causes sign extension which is unexpected but not documented in the user documentation

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 --- Comment #17 from Segher Boessenkool --- When you cast to *signed* short instead, you get -32768, at tree level already. And that is correct. This is not the problem here. With the "unsigned short" code, f() here, you get +32768 at tree leve
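The source-level difference between the two casts can be shown with a small C sketch. It assumes a 16-bit short and GCC's documented modulo behaviour for out-of-range conversions to signed types; the function names are made up for this illustration and do not come from the bug report.

```c
#include <assert.h>

/* Hypothetical helpers: convert to the narrow type, then widen back to
   long so the resulting value can be inspected.  */
long via_signed_short(long v)
{
    return (short)v;            /* 0x8000 wraps to -32768 with GCC */
}

long via_unsigned_short(long v)
{
    return (unsigned short)v;   /* 0x8000 stays +32768 */
}

void check_short_casts(void)
{
    assert(via_signed_short(0x8000) == -32768);
    assert(via_unsigned_short(0x8000) == 32768);
}
```

At the C level the unsigned cast leaves the value 32768 untouched, so any negative value seen later has to come from a lower-level representation, not from the cast itself.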

[Bug middle-end/85344] constants with the sign bit set causes sign extension which is unexpected but not documented in the user documentation

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 --- Comment #16 from Segher Boessenkool --- There _is_ no const_int there yet, btw. There is no RTL at all yet!

[Bug middle-end/85344] constants with the sign bit set causes sign extension which is unexpected but not documented in the user documentation

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 --- Comment #15 from Segher Boessenkool --- What is this "TYPE_MODE"? Nothing here has type short_int, HImode: we have an integer constant value, 32768, which is cast to "unsigned short", which is a no-op: that results in an integer constant 327

[Bug middle-end/85344] constants with the sign bit set causes sign extension which is unexpected but not documented in the user documentation

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 --- Comment #13 from Segher Boessenkool --- (In reply to Andrew Pinski from comment #12) > (In reply to Segher Boessenkool from comment #11) > > (In reply to Andrew Pinski from comment #8) > > > Note this is documented in the internals documentat

[Bug middle-end/85344] constants with the sign bit set causes sign extension which is unexpected but not documented in the user documentation

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85344 --- Comment #11 from Segher Boessenkool --- (In reply to Andrew Pinski from comment #8) > Note this is documented in the internals documentation. What is? "We have a bug here"? I doubt it.

[Bug middle-end/121470] (unsigned short)0x8000 is handled unexpectedly; due to the way const_int is handled

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121470 Segher Boessenkool changed: What|Removed |Added Status|RESOLVED|REOPENED Resolution|DUPLIC

[Bug middle-end/121470] (unsigned short)0x8000 is handled unexpectedly; due to the way const_int is handled

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121470 Segher Boessenkool changed: What|Removed |Added Ever confirmed|0 |1 Resolution|DUPLICATE

[Bug rtl-optimization/121470] New: (unsigned short)0x8000 is expanded incorrectly

2025-08-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121470 Bug ID: 121470 Summary: (unsigned short)0x8000 is expanded incorrectly Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: r

[Bug target/119702] PPCLE: Inefficient auto-vectorization for 64-bit shifts on Power9

2025-08-07 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119702 --- Comment #19 from Segher Boessenkool --- (In reply to Avinash Jayakar from comment #17) > I looked at the slp vectorization pass that converts scalar gimple code to "straight-line parallelisation". Some "scalar" (whatever that means) things

[Bug target/119702] PPCLE: Inefficient auto-vectorization for 64-bit shifts on Power9

2025-08-07 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119702 --- Comment #18 from Segher Boessenkool --- (In reply to Surya Kumari Jangala from comment #16) > With the testcase in the "Description", we are seeing both a splat and a > shift being generated. Instead, a single add instruction is more efficie

[Bug target/119702] PPCLE: Inefficient auto-vectorization for 64-bit shifts on Power9

2025-08-05 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119702 --- Comment #15 from Segher Boessenkool --- (In reply to Avinash Jayakar from comment #14) > (In reply to Surya Kumari Jangala from comment #12) > > Ok. We also need to tackle the original issue, which is that a shift left > > can be optimized b

[Bug target/119702] PPCLE: Inefficient auto-vectorization for 64-bit shifts on Power9

2025-08-05 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119702 --- Comment #13 from Segher Boessenkool --- (In reply to Surya Kumari Jangala from comment #12) > Ok. We also need to tackle the original issue, which is that a shift left > can be optimized by generating a vector add. Perhaps tackle this issue

[Bug target/119702] PPCLE: Inefficient auto-vectorization for 64-bit shifts on Power9

2025-08-04 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119702 --- Comment #11 from Segher Boessenkool --- > Segher, is this a case of needing to add a combiner pattern to translate that > splat/shift into an add of itself? You only ever do "combiner patterns" to recognise something that combine generates

[Bug target/118480] Power9 target generates poor code for vector char splat immediate.

2025-08-02 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480 --- Comment #15 from Segher Boessenkool --- (In reply to Steven Munroe from comment #12) > Also from PowerISA 3.1C > > The result is placed into VSR[VRT+32], except if, for any > byte element in VSR[VRB+32], the low-order 3 bits are not > equal

[Bug target/118480] Power9 target generates poor code for vector char splat immediate.

2025-08-02 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480 --- Comment #14 from Segher Boessenkool --- (In reply to Steven Munroe from comment #11) > And as you point out the instructions vslo/vsro/vsl/vsr only care about bits > 121..127. Also older machines needed the byte splat for vsl/vsr. vslq look

[Bug target/118480] Power9 target generates poor code for vector char splat immediate.

2025-08-02 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480 --- Comment #13 from Segher Boessenkool --- (In reply to Segher Boessenkool from comment #9) > Both vsl and vslo actually look only at the right-most byte in the shift > amount argument (bits 125..127 resp. bits 121..124). In original AltiVec i

[Bug target/118480] Power9 target generates poor code for vector char splat immediate.

2025-08-01 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480 --- Comment #10 from Segher Boessenkool --- (In reply to Steven Munroe from comment #8) > It seems the evolution of the PowerISA and Vector intrinsics has not been > smooth. > > It is not obvious how to generate xxspltib from an intrinsic. > Ve

[Bug target/118480] Power9 target generates poor code for vector char splat immediate.

2025-08-01 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480 --- Comment #9 from Segher Boessenkool --- Both vsl and vslo actually look only at the right-most byte in the shift amount argument (bits 125..127 resp. bits 121..124). In original AltiVec it was required to hold the same value in every lane, b

[Bug target/118890] ubsan bootstrap failure for powerpc64le-unknown-linux-gnu

2025-07-31 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118890 --- Comment #6 from Segher Boessenkool --- So, is this all done now?

[Bug target/119702] PPCLE: Inefficient auto-vectorization for 64-bit shifts on Power9

2025-07-31 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119702 --- Comment #5 from Segher Boessenkool --- But of course we need -mcpu=power8 or later for that insn.

[Bug target/119702] PPCLE: Inefficient auto-vectorization for 64-bit shifts on Power9

2025-07-31 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119702 --- Comment #4 from Segher Boessenkool --- No, we should generate code as Peter says in #c1. Doing a shift is worse code.

[Bug libgcc/115242] libgcc unwinder does not handle vector registers, even if the target machine supports them.

2025-07-31 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115242 --- Comment #10 from Segher Boessenkool --- (In reply to Florian Weimer from comment #9) > (In reply to Segher Boessenkool from comment #8) > > Can we have a testcase please? > > The test case in glibc: https://sourceware.org/bugzilla/show_bug.

[Bug libgcc/115242] libgcc unwinder does not handle vector registers, even if the target machine supports them.

2025-07-30 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115242 --- Comment #8 from Segher Boessenkool --- Can we have a testcase please? It sounds like the glibc you used was misconfigured. Of course VSX registers are not restored by the GCC unwinder stuff if you configured GCC to not support VSX register

[Bug target/118480] Power9 target generates poor code for vector char splat immediate.

2025-07-30 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480 Segher Boessenkool changed: What|Removed |Added Last reconfirmed||2025-07-30 Ever confirmed|0

[Bug target/93738] [13/14/15/16 regression] test case gcc.target/powerpc/20050603-3.c fails

2025-07-27 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93738 --- Comment #15 from Segher Boessenkool --- (In reply to Kishan Parmar from comment #13) > Operand of 10 gets converted to below insn > > (and:SI (subreg:SI (lshiftrt:DI (reg:DI 129 [ x+-4 ]) > (const_int 12 [0xc])) 4) > (const_i

[Bug testsuite/119382] [15 Regression] gcc.target/powerpc/vsx-builtin-7.c fail starting with r15-7961-gdc47161c1f32c3

2025-07-25 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119382 --- Comment #11 from Segher Boessenkool --- The flag will help. But it isn't as permanent as you should like: it's not really more than a side effect. So it won't really vanquish the problem.

[Bug target/115800] PowerPC GCC cannot build a little endian compile if --with-cpu=power5 is used

2025-07-25 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115800 Segher Boessenkool changed: What|Removed |Added Status|RESOLVED|WAITING Resolution|WONTFIX

[Bug target/115800] PowerPC GCC cannot build a little endian compile if --with-cpu=power5 is used

2025-07-25 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115800 --- Comment #10 from Segher Boessenkool --- (In reply to Michael Meissner from comment #8) > Given powerpcle64 requires a minimum of power8, I'm not sure it is worth > making libgfortran and libstdc++ build using --with-cpu=power5. In the past,

[Bug target/115800] PowerPC GCC cannot build a little endian compile if --with-cpu=power5 is used

2025-07-24 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115800 --- Comment #9 from Segher Boessenkool --- (In reply to Andreas Schwab from comment #7) > It is generally assumed that powerpc64le-*-* implies POWER7+ (glibc even > requires POWER8+). This is independent of the older -mlittle support (which > d

[Bug target/93738] [13/14/15/16 regression] test case gcc.target/powerpc/20050603-3.c fails

2025-07-24 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93738 --- Comment #12 from Segher Boessenkool --- > However, this pattern is failing to match in some cases, > and we end up with two separate instructions: one for rotate and another for > insert. So this is *not* a combine problem at all you say? J

[Bug target/108958] Powerpcle could generate mtvsrdd for zero extend DI to TI mode, when the TImode is in a vector register

2025-07-22 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108958 --- Comment #4 from Segher Boessenkool --- (Btw, the subject says "powerpcle", but this is about something very different: powerpc64le. "powerpcle" is also a valid first component of a target triple! Almost no one used 32-bit PowerPC in wrong-e

[Bug testsuite/120805] [16 Regression] gcc.target/powerpc/p9-vec-length-epil-4.c fail starting with r16-1645-g309dbcea2cabb3

2025-07-22 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120805 --- Comment #14 from Segher Boessenkool --- Hi! (In reply to Avinash Jayakar from comment #12) > (In reply to Segher Boessenkool from comment #10) > > As a meta-comment: almost everything using scan-assembler-times is > > obfuscated. > > > > It

[Bug testsuite/120805] [16 Regression] gcc.target/powerpc/p9-vec-length-epil-4.c fail starting with r16-1645-g309dbcea2cabb3

2025-07-21 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120805 Segher Boessenkool changed: What|Removed |Added CC||segher at gcc dot gnu.org --- Comm

[Bug target/121095] [15 Regression] Possibly unnecessary PRE pass on aarch64 for fpmr

2025-07-18 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121095 Segher Boessenkool changed: What|Removed |Added CC||segher at gcc dot gnu.org --- Comm

[Bug target/118480] Power9 target generates poor code for vector char splat immediate.

2025-07-17 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480 --- Comment #4 from Segher Boessenkool --- It's the splitter at altivec.md:321

[Bug target/118480] Power9 target generates poor code for vector char splat immediate.

2025-07-17 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118480 --- Comment #3 from Segher Boessenkool --- Does xxspltib_constant_p return the wrong num_insns, or is the problem something lower, some splitter?

[Bug target/121007] [15 Regression] compiler hangs when building ffpmeg with -mcpu=power9 on ppc64le

2025-07-14 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121007 Segher Boessenkool changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug target/121007] [15/16 Regression] compiler hangs when building ffpmeg with -mcpu=power9 on ppc64le

2025-07-10 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121007 --- Comment #14 from Segher Boessenkool --- Thanks! If there is anything we (Power people) can do, please let us know!

[Bug target/117007] Poor optimization for small vector constants needed for vector shift/rotate/mask generation

2025-07-10 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117007 --- Comment #17 from Segher Boessenkool --- Hi! So, why do we not generate xxspltib where it would help. Please send a patch? Improvements will usually be to the xxspltib-generating code itself, not to the legacy code that generates the old (c

[Bug target/87949] PowerPC saves CR registers across calls

2025-07-10 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87949 Segher Boessenkool changed: What|Removed |Added Status|ASSIGNED|NEW

[Bug target/121007] [15/16 Regression] compiler hangs when building ffpmeg with -mcpu=power9 on ppc64le

2025-07-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121007 --- Comment #12 from Segher Boessenkool --- (In reply to Jakub Jelinek from comment #11) > Though even there is uninitialized read I guess from temp.a. > That said, LRA obviously shouldn't hang even on code which has UB at runtime. Of course.

[Bug target/121007] [15/16 Regression] compiler hangs when building ffpmeg with -mcpu=power9 on ppc64le

2025-07-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121007 --- Comment #9 from Segher Boessenkool --- Hrm, the insn here is just a mulldi instruction, a bog-standard integer multiplication (by a constant, 6 here). But insn 58 (where the problems start, "Changing address in insn 58 r218:DI&0xfff

[Bug target/121007] [15/16 Regression] compiler hangs when building ffpmeg with -mcpu=power9 on ppc64le

2025-07-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121007 --- Comment #8 from Segher Boessenkool --- (Also tested on powerpc-linux (where things just work), and on powerpc64-linux (the older ABI, correct-endian), where it fails just the same as on LE).

[Bug target/121007] [15/16 Regression] compiler hangs when building ffpmeg with -mcpu=power9 on ppc64le

2025-07-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121007 --- Comment #7 from Segher Boessenkool --- Cool, thanks! 121007.c:36:3: warning: 'v4' may be used uninitialized [-Wmaybe-uninitialized] No clue why it says "may be" there, it obviously *is* used uninitialised, this is the first time it is used

[Bug target/121007] [15/16 Regression] compiler hangs when building ffpmeg with -mcpu=power9 on ppc64le

2025-07-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121007 --- Comment #4 from Segher Boessenkool --- (In reply to Andrew Pinski from comment #1) > This is definitely sounding more and more like PR 93658. Yes, and maybe the error / fix / workaround will be similar: replace a VECTOR_MEM_ALTIVEC_P by VEC

[Bug target/121007] [15/16 Regression] compiler hangs when building ffpmeg with -mcpu=power9 on ppc64le

2025-07-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121007 --- Comment #3 from Segher Boessenkool --- Does anyone want to take this? Fame and fortune await! We need a reduced test case btw :-)

[Bug rtl-optimization/101882] [16 Regression] combine vs. insn with earlyclobber and input and output set to a hard register

2025-07-06 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101882 --- Comment #26 from Segher Boessenkool --- (In reply to chenglulu from comment #25) > > And if the input is non-sensical, the compiler output will be as well, or > > the > > compiler can give up in some cases. > > > I also don't quite agree t

[Bug rtl-optimization/120983] recog violates earlyclobber with user-defined hard register before reload (causing ICE on gcc.target/loongarch/bitwise-shift-reassoc-clobber.c)

2025-07-06 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120983 --- Comment #3 from Segher Boessenkool --- Please attach a testcase, and how to compile the code (-O2 etc.). Oh, and fill in the target field :-)

[Bug rtl-optimization/101882] [16 Regression] combine vs. insn with earlyclobber and input and output set to a hard register

2025-07-06 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101882 --- Comment #24 from Segher Boessenkool --- (In reply to Xi Ruoyao from comment #21) > (In reply to Segher Boessenkool from comment #20) > > (In reply to Peter Bergner from comment #17) > > > The reason operands 0, 1 and 4 all use the register r

[Bug rtl-optimization/101882] [16 Regression] combine vs. insn with earlyclobber and input and output set to a hard register

2025-07-06 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101882 --- Comment #23 from Segher Boessenkool --- It is a different target. Your issue has nothing at all to do with the problem we used to have. The root cause is very likely completely unrelated. Etc. etc. etc.

[Bug rtl-optimization/101882] [16 Regression] combine vs. insn with earlyclobber and input and output set to a hard register

2025-07-06 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101882 --- Comment #20 from Segher Boessenkool --- (In reply to Peter Bergner from comment #17) > The reason operands 0, 1 and 4 all use the register r23, is that each > operand is using the same pseudo, coming from variable "x", which is a user > defi

[Bug rtl-optimization/101882] [16 Regression] combine vs. insn with earlyclobber and input and output set to a hard register

2025-07-06 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101882 --- Comment #19 from Segher Boessenkool --- Hi Peter! (In reply to Peter Bergner from comment #18) > So the error message is coming from this hunk in my patch: > > + /* Both the earlyclobber operand and conflicting operand > +

[Bug rtl-optimization/101882] [16 Regression] combine vs. insn with earlyclobber and input and output set to a hard register

2025-07-05 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101882 --- Comment #16 from Segher Boessenkool --- It is allowed by recog(). Most likely your pattern is incorrect, but it is not completely impossible there is something wrong in genrecog.cc -- but that isn't combine either.

[Bug rtl-optimization/101882] [16 Regression] combine vs. insn with earlyclobber and input and output set to a hard register

2025-07-05 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101882 --- Comment #14 from Segher Boessenkool --- (match_operand:DI 1 "register_operand" "r0") That means either a general register ("r"), or the same thing as operand 0 (that's what "0" means)! So the backend explicitly allows it

[Bug target/113934] Switch avr to LRA

2025-06-27 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934 --- Comment #14 from Segher Boessenkool --- Congratulations, and thank you!

[Bug tree-optimization/120598] Compiler is unable to vectorise scalar code

2025-06-20 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120598 Segher Boessenkool changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2025-06-20 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 120598, which changed state. Bug 120598 Summary: Compiler is unable to vectorise scalar code https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120598 What|Removed |Added -

[Bug tree-optimization/120598] Compiler is unable to vectorise scalar code

2025-06-19 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120598 --- Comment #8 from Segher Boessenkool --- (In reply to Jeevitha from comment #6) > The following dot_product function gets vectorized with the latest GCC trunk > and gcc 15.1.0: > > #include > #include > extern float dot_product(const int16_

[Bug tree-optimization/120598] Compiler is unable to vectorise scalar code

2025-06-19 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120598 --- Comment #7 from Segher Boessenkool --- [I cannot read any of the attached code, but...] The proposed manually vectorised code converts 64-bit integers to IEEE SP floats, which is extremely lossy. I don't find it very surprising the compile
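The lossiness of converting 64-bit integers to IEEE single precision can be checked with a short C sketch. It assumes float is the IEEE binary32 format with a 24-bit significand; the helper names are hypothetical and are not taken from the attached code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert to float, forcing the result through a
   volatile object so no excess precision is carried along.  */
float to_single(int64_t x)
{
    volatile float f = (float)x;
    return f;
}

/* Returns 1 if x converts to float and back without change.
   Only meant for values well inside the int64_t range.  */
int survives_round_trip(int64_t x)
{
    return (int64_t)to_single(x) == x;
}

void check_round_trip(void)
{
    /* 2^24 is exactly representable, 2^24 + 1 is not.  */
    assert(survives_round_trip(INT64_C(16777216)));
    assert(!survives_round_trip(INT64_C(16777217)));
}
```

Any integer that needs more than 24 significant bits is rounded by the conversion, so most 64-bit values do not survive it.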

[Bug target/120681] PowerPC GCC turns off pc-relative addressing on power10 when -mcmodel=large is used

2025-06-17 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120681 --- Comment #2 from Segher Boessenkool --- What is this testcase meant to test? The only thing it *does* test is if this trivial piece of code compiles at all (it doesn't test if the code generated is correct, or anything else about it!) It ju

[Bug testsuite/120519] g++.target/powerpc/mvc-symbols1.C fail starting with r16-965-g83eee43e998d0a

2025-06-10 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120519 --- Comment #10 from Segher Boessenkool --- I was not cc:'ed. And I did not approve it. It should not have been committed. We have (minimal!) process for a reason. It would be chaos without it.

[Bug rtl-optimization/74585] powerpc64: Very poor code generation for homogeneous vector aggregates passed in registers

2025-06-10 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=74585 --- Comment #17 from Segher Boessenkool --- The stack is always in memory, AFAIK :-) Do we have any targets where it is not? Do we have any targets where BLKmode is not always in memory? That is something that should be documented btw :-) Any

[Bug testsuite/120519] g++.target/powerpc/mvc-symbols1.C fail starting with r16-965-g83eee43e998d0a

2025-06-10 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120519 Segher Boessenkool changed: What|Removed |Added CC||segher at gcc dot gnu.org --- Comm

[Bug rtl-optimization/74585] powerpc64: Very poor code generation for homogeneous vector aggregates passed in registers

2025-06-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=74585 --- Comment #15 from Segher Boessenkool --- The compiler now seems to assume in earlier passes that parameters and return values are passed in memory. This is very sub-optimal, all but the last passes cannot do much useful work this way.

[Bug target/115576] [14/15/16 regression] Worse code generated for simple struct conversion since r14-2386-gbdf2737cda53a8

2025-06-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115576 --- Comment #9 from Segher Boessenkool --- This belongs in simplify-rtx, not in combine.

[Bug target/108415] ICE in emit_library_call_value_1 at gcc/calls.cc:4181

2025-06-05 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108415 --- Comment #9 from Segher Boessenkool --- What is the current state here? We should simply not allow -mmodulo at all if we do not generate such insns (we do not have a -mcpu= that allows those). We do not want multiple ways to do things, certa

[Bug rtl-optimization/108273] Inconsistent dfa state between debug and non-debug

2025-06-05 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108273 --- Comment #10 from Segher Boessenkool --- The problem seems to be in generic scheduling code, not in the Power backend. Can someone confirm this, or point out where the problem is, or show the problem no longer exists? Whatever way we can re

[Bug middle-end/119600] HOST_WIDEST_FAST_INT should be used instead of long for BITMAP_WORD in bitmap.h

2025-05-21 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119600 Segher Boessenkool changed: What|Removed |Added CC||segher at gcc dot gnu.org --- Comm

[Bug target/108958] Powerpcle could generate mtvsrdd for zero extend DI to TI mode, when the TImode is in a vector register

2025-05-15 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108958 --- Comment #2 from Segher Boessenkool --- (A good patch is like: we currently generate X (because of Y Z A), but we could do B C D instead, and generate E).

[Bug target/108958] Powerpcle could generate mtvsrdd for zero extend DI to TI mode, when the TImode is in a vector register

2025-05-15 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108958 --- Comment #1 from Segher Boessenkool --- Sure. What do we need to improve on this? Please propose a patch :-)

[Bug target/97786] rs6000 isinf etc. are pretty horrible

2025-05-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97786 --- Comment #9 from Segher Boessenkool --- (Erm, tdc *is* 3.0, but setbc is 3.1, I can never ever get this right it seems! But setb is 3.0).

[Bug target/97786] rs6000 isinf etc. are pretty horrible

2025-05-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97786 --- Comment #8 from Segher Boessenkool --- (In reply to Surya Kumari Jangala from comment #7) > Hi Segher, > > Thanks for the pointers! > We can optimize the code further and remove the branch completely. > > For P10: > > xststdcdp 0,1,48

[Bug target/113939] Switch m68k to LRA

2025-05-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113939 --- Comment #11 from Segher Boessenkool --- (In reply to John Paul Adrian Glaubitz from comment #7) > (In reply to John Paul Adrian Glaubitz from comment #6) > > I suggest we switch m68k to LRA, so we can close this bug report. Plus file > > bu

[Bug target/117818] [12/13/14/15/16 regression] vec_add incorrectly generates vadduwm for vector char const inputs.

2025-05-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117818 --- Comment #8 from Segher Boessenkool --- We still support powerpc64-* just fine. And powerpc-linux (the 32-bit target) is tested just fine as well, and the community does support it. No one cares _too_ much about it anymore, but why let it d

[Bug target/97786] rs6000 isinf etc. are pretty horrible

2025-05-07 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97786 --- Comment #6 from Segher Boessenkool --- Hi Surya! Hrm yes, xststdcdp _does_ return a sign bit as well. Do we currently say that in RTL as well? Unfortunately we cannot just follow an xststdcdp by a setb, setb tests bit 1, but the tdp sets b

[Bug target/117207] [15/16 Regression] gcc.target/powerpc/pr103515.c fail starting with r15-4225-g70c3db511ba14f

2025-04-28 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117207 Segher Boessenkool changed: What|Removed |Added CC||segher at gcc dot gnu.org --- Comm

[Bug target/118541] Incorrect transformation to xscmpgtdp for Unordered Operations

2025-04-11 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118541 --- Comment #8 from Segher Boessenkool --- (The traditional FP comparisons we do use, i.e. fcmpu. We never used fcmpo, because it is problematic, it needs access to information that is not in the RTL at the point of the comparison, that informa

[Bug target/118541] Incorrect transformation to xscmpgtdp for Unordered Operations

2025-04-10 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118541 --- Comment #7 from Segher Boessenkool --- isgreater is not supposed to set floating point exception flags at all. So whether the comparison resulted in unordered (i.e., one of the arguments was a NaN) or not, isgreater should not set VXVC in p

[Bug target/119468] PPCLE: Inefficient implementation of __builtin_parityll

2025-04-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119468 --- Comment #3 from Segher Boessenkool --- prtyd and popcntb are executed similarly on all hardware: same execution pipes. The extsw we currently generate is not needed at all, a very common and well-known issue, generic as well (not really rs60

[Bug target/119468] PPCLE: Inefficient implementation of __builtin_parityll

2025-04-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119468 Segher Boessenkool changed: What|Removed |Added CC||segher at gcc dot gnu.org --- Comm

[Bug target/119629] mismatch between [power9-64] builtins and their instructions

2025-04-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119629 --- Comment #5 from Segher Boessenkool --- (In reply to Peter Bergner from comment #2) > (In reply to Peter Bergner from comment #1) > > > but the conditions that enable the expansion of > > > __builtin_scalar_byte_in_set > > > are those of [po

[Bug target/119629] mismatch between [power9-64] builtins and their instructions

2025-04-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119629 --- Comment #6 from Segher Boessenkool --- (In reply to Segher Boessenkool from comment #5) > This needs splitting up in parts. Maybe then some parts can be correct, > even! Of course that requires explanatory comments in the patch submission

[Bug target/119629] mismatch between [power9-64] builtins and their instructions

2025-04-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119629 --- Comment #4 from Segher Boessenkool --- Hi Alex, (In reply to Alexandre Oliva from comment #0) > This raises a number of problems: > > - instructions and expanders for these builtins don't have their conditions > tested, so they must necess

[Bug target/119629] mismatch between [power9-64] builtins and their instructions

2025-04-08 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119629 Segher Boessenkool changed: What|Removed |Added CC||segher at gcc dot gnu.org --- Comm

[Bug rtl-optimization/116398] [15 Regression] gcc.target/aarch64/ashltidisi.c fails since r15-268

2025-03-14 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116398 --- Comment #26 from Segher Boessenkool --- (In reply to Richard Sandiford from comment #23) > Yeah, I'd wondered about limiting it in all cases too. Definitely seems > worth trying. But given that we're in stage 4, maybe it would make sense t

[Bug rtl-optimization/116398] [15 Regression] gcc.target/aarch64/ashltidisi.c fails since r15-268

2025-03-14 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116398 --- Comment #19 from Segher Boessenkool --- (In reply to Jakub Jelinek from comment #16) > Tamar's explanation why #c0 gcc 14 code is better than gcc 15: > "the mov is a zero latency instruction. sxtw, asr and sbfx themselves are > aliases to th

[Bug rtl-optimization/116398] [15 Regression] gcc.target/aarch64/ashltidisi.c fails since r15-268

2025-03-14 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116398 --- Comment #22 from Segher Boessenkool --- (In reply to Richard Sandiford from comment #18) > I'd been reluctant to get involved in this for fear of creating friction or > being a cook too many, No, your input is much appreciated! > but: the

[Bug rtl-optimization/116398] [15 Regression] gcc.target/aarch64/ashltidisi.c fails since r15-268

2025-03-14 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116398 --- Comment #14 from Segher Boessenkool --- (In reply to Jakub Jelinek from comment #8) > Because as this PR shows, those 2->2 insn merges with no change on i2 can > make a lot of sense and allow combination on the second and following user > of

[Bug rtl-optimization/118638] [14 Regression] Miscompile with -Os and -O0/1/2/3 since r14-4810

2025-02-27 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118638 --- Comment #24 from Segher Boessenkool --- (In reply to Jakub Jelinek from comment #21) > I certainly plan to backport it to those releases as well. But it is just > latent there... Where "latent" means "our testcases do not show problems" th

[Bug rtl-optimization/116028] [15 Regression] gcc.dg/pr10474.c test failure since r15-1619-g3b9b8d6cfdf593

2025-02-09 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116028 Segher Boessenkool changed: What|Removed |Added CC||segher at gcc dot gnu.org --- Comm

[Bug target/118663] [15 Regression] ICE: in rs6000_emit_move, at config/rs6000/rs6000.cc:11091 during libgcc build

2025-01-26 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118663 Segher Boessenkool changed: What|Removed |Added Status|NEW |WAITING --- Comment #5 from Segher

[Bug middle-end/118556] size of asm not outputed with -dP

2025-01-21 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118556 Segher Boessenkool changed: What|Removed |Added CC||segher at gcc dot gnu.org --- Comm

[Bug tree-optimization/115825] [12/13/14 Regression] Loop unrolling increases code size with -Os

2025-01-16 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115825 --- Comment #27 from Segher Boessenkool --- > This is a GIMPLE pass which has no idea what the backend will expand > __builtin_darn() to. So you are saying >90% of builtins now need to say they are pure and const (which makes totally no sense f

[Bug tree-optimization/115825] [12/13/14 Regression] Loop unrolling increases code size with -Os

2025-01-15 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115825 --- Comment #25 from Segher Boessenkool --- No, darn does have a side effect: it returns a random number in the destination reg (_deliver_ _a_ _r_andom _n_umber). It does not touch memory at all. There are no call insns at all either, of cours
