[Bug target/120447] [16 Regression] cpython fails to compile on AArch64 after r16-446-g210d06502f22964c7214586c54f8eb54a6965bfd

2025-05-30 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120447 --- Comment #5 from Tamar Christina --- I could be mistaken, but VNx4QI is a partial vector, so every QI element occupies 32 bits (so we'd use a widening load here). I'm not sure this operation is valid for partial vectors as it means you're ta

[Bug tree-optimization/120357] [14/15/16 Regression] ICE in vect "error: definition in block 9 does not dominate use in block 3" with early break

2025-05-30 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120357 --- Comment #9 from Tamar Christina --- (In reply to Richard Biener from comment #8) > The following fixes this. I'm not 100% convinced but it does seem "obvious" > (but for the "peeled" case we seem to eventually create duplicate COND > reduct

[Bug target/120447] [16 Regression] cpython fails to compile on AArch64 after r16-446-g210d06502f22964c7214586c54f8eb54a6965bfd

2025-05-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120447 Tamar Christina changed: What|Removed |Added Status|WAITING |NEW Keywords|needs-source

[Bug target/120447] New: [16 Regression] cpython fails to compile on AArch64 after r16-446-g210d06502f22964c7214586c54f8eb54a6965bfd

2025-05-27 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120447 Bug ID: 120447 Summary: [16 Regression] cpython fails to compile on AArch64 after r16-446-g210d06502f22964c7214586c54f8eb54a6965bfd Product: gcc Version: 16.0

[Bug tree-optimization/120383] Improving early break unrolled sequences with Adv. SIMD

2025-05-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120383 --- Comment #2 from Tamar Christina --- (In reply to Richard Biener from comment #1) > Sure, I'm OK with an optab for it. So it's like (half-type)((unsigned)(a + > b) >> (sizeof(a)*4))? Yeah, and I was planning on if an optab was acceptable to

[Bug tree-optimization/120357] [14/15/16 Regression] ICE in vect "error: definition in block 9 does not dominate use in block 3" with early break

2025-05-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120357 --- Comment #7 from Tamar Christina --- (In reply to Richard Biener from comment #5) > Confirmed on trunk. I'll eventually have a look. Sorry I'm on holiday till Tuesday, I'm happy to take a look then if you prefer. I did not mean to dump my b

[Bug tree-optimization/120383] Improving early break unrolled sequences with Adv. SIMD

2025-05-21 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120383 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

[Bug tree-optimization/120383] New: Improving early break unrolled sequences with Adv. SIMD

2025-05-21 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120383 Bug ID: 120383 Summary: Improving early break unrolled sequences with Adv. SIMD Product: gcc Version: 16.0 Status: UNCONFIRMED Severity: normal Prior

[Bug middle-end/120352] New: scalar epiloque not needed for early break when exit block is invariant

2025-05-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120352 Bug ID: 120352 Summary: scalar epiloque not needed for early break when exit block is invariant Product: gcc Version: 16.0 Status: UNCONFIRMED Keywords: missed

[Bug tree-optimization/116855] [14 Regression] Unsafe early-break vectorization

2025-05-11 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116855 --- Comment #14 from Tamar Christina --- (In reply to Richard Biener from comment #13) > Too late for backporting to 14.3 IMO, also not sure how important it is - we > did not have an actual case where this caused problems AFAIK. early-break >

[Bug tree-optimization/120164] GCC fails vectorization when using conditional __builtin_prefetch

2025-05-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120164 --- Comment #9 from Tamar Christina --- (In reply to rguent...@suse.de from comment #8) > On Thu, 8 May 2025, tnfchris at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120164 > > > > --- Comment #7 from Tamar Christ

[Bug tree-optimization/120164] GCC fails vectorization when using conditional __builtin_prefetch

2025-05-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120164 --- Comment #7 from Tamar Christina --- (In reply to Richard Biener from comment #6) > (In reply to Tamar Christina from comment #5) > > The given example is an easy one to drop, but I wonder what would happen if > > the block had other instruct

[Bug tree-optimization/120164] GCC fails vectorization when using conditional __builtin_prefetch

2025-05-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120164 --- Comment #5 from Tamar Christina --- (In reply to Richard Biener from comment #4) > Note with "vectorizing" prefetches I meant adjusting the prefetched address, > "vectorizing" it as an induction but only prefetching on the first (or > last?)

[Bug tree-optimization/120164] GCC fails vectorization when using conditional __builtin_prefetch

2025-05-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120164 --- Comment #3 from Tamar Christina --- (In reply to Tamar Christina from comment #2) > (In reply to Richard Biener from comment #1) > > As of today this is a job for the vectorizer if-conversion pass then. > > > > OTOH I believe we should work

[Bug tree-optimization/120164] GCC fails vectorization when using conditional __builtin_prefetch

2025-05-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120164 --- Comment #2 from Tamar Christina --- (In reply to Richard Biener from comment #1) > As of today this is a job for the vectorizer if-conversion pass then. > > OTOH I believe we should work towards vectorizing the prefetches themselves > rathe

[Bug target/120157] No use of SVE early break vectorisation in FP loop

2025-05-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120157 --- Comment #5 from Tamar Christina --- (In reply to ktkachov from comment #4) > > Ah indeed, -msve-vector-bits= does do what I expected. Feel free to close > > this if it's not tracking anything new then. > > Ok. FWIW the original testcase for

[Bug target/120157] No use of SVE early break vectorisation in FP loop

2025-05-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120157 Tamar Christina changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Assignee|unassigne

[Bug target/120157] No use of SVE early break vectorisation in FP loop

2025-05-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120157 --- Comment #1 from Tamar Christina --- (In reply to ktkachov from comment #0) > Not sure if this is a target-specific issue or not. For input: > int f11(float *x, float val, int n) > { > int i; > for (i = 0; i < n; i++) { > if (

[Bug libstdc++/116140] [15/16 Regression] 5-10% slowdown of 483.xalancbmk and 523.xalancbmk_r since r15-2356-ge69456ff9a54ba

2025-05-01 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116140 --- Comment #20 from Tamar Christina --- We're currently working on it. The improvements come from architectures where the code vectorized. The performance losses come from those where it didn't vectorize, or the vectorizer generated inefficien

[Bug tree-optimization/119351] [14 Regression] Incorrect forall masking for AND reduction in early break

2025-04-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/118892] [14 Regression] ICE (segfault) in rebuild_jump_labels on aarch64-linux-gnu since r14-5289

2025-04-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118892 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/119921] [12/13/14/15/16 Regression] ICE building SVE ACLE in varasm

2025-04-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119921 Tamar Christina changed: What|Removed |Added Version|13.3.1 |16.0 Target Milestone|---

[Bug target/119921] New: [12/13/14/15/16 Regression] ICE building SVE ACLE in varasm

2025-04-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119921 Bug ID: 119921 Summary: [12/13/14/15/16 Regression] ICE building SVE ACLE in varasm Product: gcc Version: 13.3.1 Status: UNCONFIRMED Keywords: ice-on-valid-cod

[Bug tree-optimization/119881] support a large number of pointers in alias versioning

2025-04-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119881 --- Comment #2 from Tamar Christina --- (In reply to Richard Biener from comment #1) > I wonder where this matters in practice and my usual stance is educating > users > about __restrict or #pragma GCC ivdep or OMP simd safelen is better than >

[Bug tree-optimization/119872] [15/16 regression] wrong code at -O{1,2,s} since r15-1809-g735edbf1e2479f

2025-04-22 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119872 --- Comment #10 from Tamar Christina --- (In reply to rguent...@suse.de from comment #9) > On Mon, 21 Apr 2025, tnfchris at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119872 > > > > --- Comment #8 from Tamar Chri

[Bug tree-optimization/119872] [15/16 regression] wrong code at -O{1,2,s} since r15-1809-g735edbf1e2479f

2025-04-21 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119872 --- Comment #8 from Tamar Christina --- (In reply to Richard Biener from comment #7) > Please make sure to not "fix" something where the input is already wrong - > see the various issues where SCEV produces an invalid CHREC - forming a chrec > i

[Bug tree-optimization/119872] [15/16 regression] wrong code at -O{1,2,s} since r15-1809-g735edbf1e2479f

2025-04-21 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119872 Tamar Christina changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED

[Bug tree-optimization/119881] New: support alias analysis for large number of pointers

2025-04-20 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119881 Bug ID: 119881 Summary: support alias analysis for large number of pointers Product: gcc Version: unknown Status: UNCONFIRMED Keywords: missed-optimization Severity: nor

[Bug tree-optimization/114932] IVopts inefficient handling of signed IV used for addressing.

2025-04-20 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 --- Comment #27 from Tamar Christina --- (In reply to Tianyang Chou from comment #26) > (In reply to Tamar Christina from comment #0) > > Hi Tamar, > After reading the whole discussion, I still confused about how does the > immediate offset

[Bug tree-optimization/119860] New: needless vector unrolling causes less profitable vectorization

2025-04-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119860 Bug ID: 119860 Summary: needless vector unrolling causes less profitable vectorization Product: gcc Version: unknown Status: UNCONFIRMED Keywords: missed-optim

[Bug testsuite/119286] [15 Regression] GCN vs. "middle-end: delay checking for alignment to load [PR118464]"

2025-04-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119286 --- Comment #9 from Tamar Christina --- (In reply to Thomas Schwinge from comment #8) > Tamar, thanks! I confirm all fixed -- but one: > > (In reply to myself from comment #1) > > ..., and similarly -- but not identical! -- for '-march=gfx1100

[Bug tree-optimization/119858] [15/16 Regression] GCN vs. "middle-end: Fix incorrect codegen with PFA and VLS [PR119351]"

2025-04-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119858 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

[Bug tree-optimization/119351] [14 Regression] Incorrect forall masking for AND reduction in early break

2025-04-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 Tamar Christina changed: What|Removed |Added Priority|P1 |P2 --- Comment #23 from Tamar Christi

[Bug tree-optimization/119351] [14 Regression] Incorrect forall masking for AND reduction in early break

2025-04-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 Tamar Christina changed: What|Removed |Added Target Milestone|15.0|14.3 Summary|[15 Regressio

[Bug testsuite/119286] [15 Regression] GCN vs. "middle-end: delay checking for alignment to load [PR118464]"

2025-04-16 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119286 Tamar Christina changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target since r15-6807-g68326d5d1a593d

2025-04-13 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 Tamar Christina changed: What|Removed |Added Keywords|needs-reduction,| |needs-source

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target since r15-6807-g68326d5d1a593d

2025-04-10 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 --- Comment #18 from Tamar Christina --- (In reply to Richard Biener from comment #17) > I wonder if we can use > > BIT_FIELD_REF > > as the "reduction" step. Yeah that's the same comment Richard S suggested when we were talking to avoid th

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target since r15-6807-g68326d5d1a593d

2025-04-10 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 --- Comment #16 from Tamar Christina --- Ok, found the bug and c-vise is running for a testcase. The issue is as follows: For early break we need to know which value to start the scalar loop with if we take an early exit. Historically this me

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target since r15-6807-g68326d5d1a593d

2025-04-09 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 --- Comment #15 from Tamar Christina --- The following example reproduces the CFG but not the bad codegen: https://godbolt.org/z/Thzo7hz8P This generates the actual code I expected: _55 = {_2, _2, _2, _2}; _56 = {_11, _11, _11, _11}; _57

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target since r15-6807-g68326d5d1a593d

2025-04-09 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 --- Comment #14 from Tamar Christina --- There seems to be an off-by-one error in the pre-header when calculating the initial vector IV. The starting values are calculated as: sub z27.s, z23.s, z31.s

[Bug tree-optimization/119187] vectorizer should be able to SLP already vectorized code

2025-04-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119187 --- Comment #8 from Tamar Christina --- (In reply to ktkachov from comment #7) > Could this be extended to scale Neon intrinsics code to SVE by > re-vectorising and treating the 128-bit Neon lane as a Q-word element of a > wider SVE vector? I t

[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra

2025-04-08 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug middle-end/119577] RISC-V: Redundant vector IV roundtrip.

2025-04-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119577 --- Comment #2 from Tamar Christina --- (In reply to Richard Biener from comment #1) > IIRC it depends on the "kind" of early break whether we need the > first IV (scalar IV possible) or the last, but I don't remember exactly. First is always

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target since r15-6807-g68326d5d1a593d

2025-04-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 --- Comment #13 from Tamar Christina --- Sorry had a week off, looking into this again today.

[Bug target/118892] [14 Regression] ICE (segfault) in rebuild_jump_labels on aarch64-linux-gnu since r14-5289

2025-04-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118892 --- Comment #18 from Tamar Christina --- (In reply to Pavol Rusnak from comment #17) > Is the fix going to be backported from master to 14.x release? Possibly > targeting 14.3.0 release? Yep

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target

2025-03-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 --- Comment #9 from Tamar Christina --- --- static bool next_ci(int dimYY, int numCells, int nth, int ci_block, int* ci_x, int* ci_y, int* ci_b, int* ci) { while (*ci >= *ci_x * dimYY + *ci_y + 1) { *ci_y += 1; if (*ci_y

[Bug tree-optimization/115450] [15 Regression] cpu2017 502.gcc runtime miscompute on aarch64 with SVE since r15-1006-gd93353e6423eca

2025-03-28 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115450 --- Comment #11 from Tamar Christina --- (In reply to Richard Biener from comment #10) > Can anybody still reproduce this? I can't. I can reproduce the failure with the original commit but cannot with today's trunk.

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target

2025-03-27 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 --- Comment #8 from Tamar Christina --- Looking at it some more, I think the loop is valid to vectorize. But we don't seem to vectorize the reduction jumping back to the outerloop: ;; basic block 384, loop depth 3, count 8598980 (estimated lo

[Bug tree-optimization/119402] [14/15 Regression] `((-bool) & _6) & (~_6)` is not optimized to 0 on some targets since r14-5673

2025-03-25 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119402 --- Comment #3 from Tamar Christina --- (In reply to Jakub Jelinek from comment #2) > Started with r14-5673-g33c2b70dbabc02788caabcbc66b7baeafeb95bcf > With -O2 -mtune=generic it is fine even on the current trunk. Seems like it's due to missing

[Bug target/119108] [15 Regression] AArch64 Commit 'vect: Force alignment peeling ...' (r15-6807-g68326d5d1a593d) causes regression in Snappy workload for -mcpu=neoverse-v2.

2025-03-25 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108 --- Comment #12 from Tamar Christina --- Sorry for the slow response, had a few days off. The regression here can be reproduced through this example loop: https://godbolt.org/z/jnGe5x4P7 for the current loop in snappy what you want is -UALIGNE

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target

2025-03-25 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 --- Comment #7 from Tamar Christina --- Sorry for the delay, had a few days off. So looking at this again, it's happening When next_ci gets inlined into nbnxn_make_pairlist_part, the while loop while (next_ci(iGrid, nth, ci_block, &ci_x, &ci_y

[Bug tree-optimization/119393] [15 Regression] Worse vectorization of imagick_r hot loop on aarch64 since r15-5024-g2a2e6784074e1f

2025-03-20 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119393 --- Comment #3 from Tamar Christina --- Confirmed.

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target

2025-03-20 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 Tamar Christina changed: What|Removed |Added Last reconfirmed||2025-03-20 Ever confirmed|0

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target

2025-03-20 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 --- Comment #6 from Tamar Christina --- (In reply to ktkachov from comment #5) > (In reply to Tamar Christina from comment #4) > > While looking at the codegen it looks like GROMACS has a lot of loops that > > get vectorized now and it's showing

[Bug testsuite/119286] [15 Regression] GCN vs. "middle-end: delay checking for alignment to load [PR118464]"

2025-03-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119286 --- Comment #5 from Tamar Christina --- Still have one to fix.

[Bug target/115842] [15 Regression] 6.5% slowdown of 548.exchange2_r on Intel Ice Lake

2025-03-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115842 --- Comment #9 from Tamar Christina --- (In reply to Hongtao Liu from comment #8) > (In reply to Tamar Christina from comment #7) > > (In reply to Hongtao Liu from comment #6) > > > I noticed some double-counting of cost in group-candidate (reg

[Bug tree-optimization/114932] IVopts inefficient handling of signed IV used for addressing.

2025-03-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932 --- Comment #24 from Tamar Christina --- Hi, Yeah vectorization was one of the reasons for the slowdown. Do note however it's not entirely safe to backport that patch, as it exposes another bug which has a large fix. At least the top two comm

[Bug tree-optimization/119351] [15 Regression] Wrong code in GROMACS for AArch64 generic SVE VLS target

2025-03-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351 --- Comment #3 from Tamar Christina --- Confirmed, able to reproduce it now. Taking a look. -march=armv8-a+sve is enough FWIW.

[Bug tree-optimization/119187] vectorizer should be able to SLP already vectorized code

2025-03-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119187 --- Comment #5 from Tamar Christina --- (In reply to Richard Biener from comment #4) > > for (...) >a[32*i] = ..; >a[32*i+1] = ..; > ... >a[32*i + 31] = ...; > > to match the number of lanes in a HW vector. It shares some of the

[Bug target/115842] [15 Regression] 6.5% slowdown of 548.exchange2_r on Intel Ice Lake

2025-03-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115842 Tamar Christina changed: What|Removed |Added CC||tnfchris at gcc dot gnu.org --- Comme

[Bug testsuite/119286] [15 Regression] GCN vs. "middle-end: delay checking for alignment to load [PR118464]"

2025-03-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119286 Tamar Christina changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed|

[Bug target/118974] Use SVE cbranch sequence for Neon modes when TARGET_SVE

2025-03-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118974 --- Comment #3 from Tamar Christina --- and using the SVE CC regs: .L6: ldr q30, [x2, x0] cmple p15.s, p7/z, z30.s, #0 b.none .L2

[Bug target/118974] Use SVE cbranch sequence for Neon modes when TARGET_SVE

2025-03-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118974 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

[Bug target/119108] [15 Regression] AArch64 Commit 'vect: Force alignment peeling ...' (r15-6807-g68326d5d1a593d) causes regression in Snappy workload for -mcpu=neoverse-v2.

2025-03-11 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108 --- Comment #11 from Tamar Christina --- Actually I just realized that loop uses two pointers, and we can only peel for one unknown misalignment atm. This loop will instead be versioned, and because of the manual misalignment in the caller I don

[Bug target/119108] [15 Regression] AArch64 Commit 'vect: Force alignment peeling ...' (r15-6807-g68326d5d1a593d) causes regression in Snappy workload for -mcpu=neoverse-v2.

2025-03-11 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108 --- Comment #10 from Tamar Christina --- (In reply to Matthew Malcomson from comment #9) > (In reply to Tamar Christina from comment #8) > > Ok, so having looked at this I'm not sure the compiler is at fault here. > > > > Similar to the SVN cas

[Bug tree-optimization/119187] vectorizer should be able to SLP already vectorized code

2025-03-10 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119187 --- Comment #3 from Tamar Christina --- (In reply to Andrew Pinski from comment #2) > (In reply to Andrew Pinski from comment #1) > > There is another bug report for a similar thing but with SSE and AVX2. > > yes PR 95960. Ah yeah, I guess I w

[Bug target/119108] [15 Regression] AArch64 Commit 'vect: Force alignment peeling ...' (r15-6807-g68326d5d1a593d) causes regression in Snappy workload for -mcpu=neoverse-v2.

2025-03-10 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108 --- Comment #8 from Tamar Christina --- Ok, so having looked at this I'm not sure the compiler is at fault here. Similar to the SVN case the snappy code is misaligning the loads intentionally and loading 64 bits at a time from the 8-bit pointe

[Bug tree-optimization/119187] New: vectorizer should be able to SLP already vectorized code

2025-03-10 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119187 Bug ID: 119187 Summary: vectorizer should be able to SLP already vectorized code Product: gcc Version: unknown Status: UNCONFIRMED Keywords: missed-optimizatio

[Bug tree-optimization/118464] [15 Regression] gcc-15.0.0_pre20250112 ICE with opencv-4.10.0 using -O2/-ftree-loop-vectorize: memory_descriptor_ref.cpp:94:19: internal compiler error: in exact_div, at

2025-03-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118464 Tamar Christina changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug tree-optimization/116855] [14 Regression] Unsafe early-break vectorization

2025-03-07 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116855 Tamar Christina changed: What|Removed |Added Summary|[14/15 Regression] Unsafe |[14 Regression] Unsafe

[Bug middle-end/119145] [15 Regression] ICE in expanding IFN_MASK_CALL from vector math

2025-03-06 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119145 --- Comment #1 from Tamar Christina --- The vectorizer seems confused. Vectorization fails, but seems to fail during SLP transform so the ifc loop is kept, but the statements are not transformed. It then produces broken SSA: note: * Analysis

[Bug middle-end/119145] New: [15 Regression] ICE in expanding IFN_MASK_CALL from vector math

2025-03-06 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119145 Bug ID: 119145 Summary: [15 Regression] ICE in expanding IFN_MASK_CALL from vector math Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: ice-on-valid-c

[Bug target/119108] [15 Regression] AArch64 Commit 'vect: Force alignment peeling ...' (r15-6807-g68326d5d1a593d) causes regression in Snappy workload for -mcpu=neoverse-v2.

2025-03-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108 --- Comment #6 from Tamar Christina --- Ok, now really confirmed :) Interestingly the behavior on other uarches suggests this may be cost modelling. On Neoverse-V1 we get (without LTO): BM_UFlat/0/1 -4.60251 BM_UFlat/0/2 -2.34742 BM_UFlat/3/1

[Bug target/119108] [15 Regression] AArch64 Commit 'vect: Force alignment peeling ...' (r15-6807-g68326d5d1a593d) causes regression in Snappy workload for -mcpu=neoverse-v2.

2025-03-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108 --- Comment #5 from Tamar Christina --- Ah... It looks like somehow the build for /data/gcc/gcc-with-68326d5d1a5-install/ failed and it was silently picking up the distro compiler instead. Hence the difference in memmove only! I'll clean every

[Bug target/119108] [15 Regression] AArch64 Commit 'vect: Force alignment peeling ...' (r15-6807-g68326d5d1a593d) causes regression in Snappy workload for -mcpu=neoverse-v2.

2025-03-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108 --- Comment #4 from Tamar Christina --- (In reply to Matthew Malcomson from comment #3) > I only looked into VecSource/5/2, and unfortunately I looked into it on an > internal setup that compiles slightly differently. > > In that slightly diffe

[Bug target/119108] [15 Regression] AArch64 Commit 'vect: Force alignment peeling ...' (r15-6807-g68326d5d1a593d) causes regression in Snappy workload for -mcpu=neoverse-v2.

2025-03-05 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119108 Tamar Christina changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed|

[Bug target/118892] [14/15 Regression] ICE (segfault) in rebuild_jump_labels on aarch64-linux-gnu since r14-5289

2025-02-27 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118892 --- Comment #13 from Tamar Christina --- (In reply to Jakub Jelinek from comment #12) > E.g. the i386 backend usually uses force_reg in this case. If the operand > is a REG, it does nothing, if it is a SUBREG, it is forced into a temporary > an

[Bug rtl-optimization/119046] [15 Regression] Performance drop from not forming lane-wise FMLAs with Eigen library

2025-02-27 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119046 Tamar Christina changed: What|Removed |Added Blocks||114515 CC|

[Bug tree-optimization/119016] [15 regression] svn miscompiled with -O2 -mavx -fno-vect-cost-model since r15-6807-g68326d5d1a593d

2025-02-26 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119016 --- Comment #10 from Tamar Christina --- (In reply to rguent...@suse.de from comment #9) > On Wed, 26 Feb 2025, tnfchris at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119016 > > > > --- Comment #8 from Tamar Chri

[Bug tree-optimization/119016] [15 regression] svn miscompiled with -O2 -mavx -fno-vect-cost-model since r15-6807-g68326d5d1a593d

2025-02-26 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119016 --- Comment #8 from Tamar Christina --- (In reply to rguent...@suse.de from comment #7) > On Wed, 26 Feb 2025, tnfchris at gcc dot gnu.org wrote: > > > Because of the scalar code doing DI mode loads, and the misalignment being > > HImode, I do

[Bug tree-optimization/119016] [15 regression] svn miscompiled with -O2 -mavx -fno-vect-cost-model since r15-6807-g68326d5d1a593d

2025-02-26 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119016 --- Comment #6 from Tamar Christina --- At the start of the second iteration len = 2, so start becomes misaligned at 0x7fffe2f2 but the peeling iteration code checks (0x7fffe2f2 / 8) & 1 which is 0, so it doesn't peel to align it. Inde
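The arithmetic quoted in Comment #6 can be checked with a short sketch. This is not the vectorizer's code: `peel_check` is a hypothetical helper that only mirrors the expression `(addr / 8) & 1` as written in the comment, and the modulus of 8 is an assumption taken from the divisor in that expression, since the preview is truncated before the target alignment is stated.

```python
# Hypothetical helper: mirrors the expression quoted in the comment,
# (addr / 8) & 1, using integer division. Not GCC's implementation.
def peel_check(addr: int) -> int:
    return (addr // 8) & 1

addr = 0x7fffe2f2          # address quoted in the comment

check = peel_check(addr)   # the value the quoted check tests
offset = addr % 8          # byte offset within an 8-byte unit (assumed unit)

print(hex(addr), check, offset)  # prints: 0x7fffe2f2 0 2
```

The check evaluates to 0 even though the address sits 2 bytes past an 8-byte boundary, which matches the comment's observation that no peeling is triggered for this address.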

[Bug tree-optimization/119016] [15 regression] svn miscompiled with -O2 -mavx -fno-vect-cost-model since r15-6807-g68326d5d1a593d

2025-02-25 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119016 Tamar Christina changed: What|Removed |Added Priority|P3 |P1 Last reconfirmed|

[Bug tree-optimization/118976] [12/13/14/15 regression] Correctness Issue: SVE vectorization results in data corruption when cpu has 128bit vectors but compiled with -mcpu=neoverse-v1 (which is only f

2025-02-24 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118976 Tamar Christina changed: What|Removed |Added Status|UNCONFIRMED |NEW Component|target

[Bug target/118974] Use SVE cbranch sequence for Neon modes when TARGET_SVE

2025-02-21 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118974 Tamar Christina changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED

[Bug target/118942] New: [14/15 Regression] vld1q_s{8, 16}_x{3, 4} use incorrect pointer type

2025-02-19 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118942 Bug ID: 118942 Summary: [14/15 Regression] vld1q_s{8,16}_x{3,4} use incorrect pointer type Product: gcc Version: 14.2.0 Status: UNCONFIRMED Keywords: rejects-v

[Bug target/118892] [14/15 Regression] ICE (segfault) in rebuild_jump_labels on aarch64-linux-gnu since r14-5289

2025-02-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118892 --- Comment #11 from Tamar Christina --- (In reply to Richard Sandiford from comment #10) > (In reply to Tamar Christina from comment #9) > > I swear that was something that was fixed. But in any case, the simplest > > fix is to force it into a

[Bug target/118892] [14/15 Regression] ICE (segfault) in rebuild_jump_labels on aarch64-linux-gnu since r14-5289

2025-02-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118892 --- Comment #9 from Tamar Christina --- (In reply to Andrew Pinski from comment #8) > (In reply to Tamar Christina from comment #7) > > > > But operand1 is marked as `register_operand` which means whatever did the > > expansion didn't honor the

[Bug c++/118921] New: C++ frontend does not honor GCC pragma optimize

2025-02-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118921 Bug ID: 118921 Summary: C++ frontend does not honor GCC pragma optimize Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component:

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2025-02-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 Bug 26163 depends on bug 117270, which changed state. Bug 117270 Summary: [15 Regression] 9% exec time slowdown of 538.imagick_r on aarch64 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117270 What|Removed |Adde

[Bug target/117270] [15 Regression] 9% exec time slowdown of 538.imagick_r on aarch64

2025-02-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117270 Tamar Christina changed: What|Removed |Added Resolution|FIXED |--- Status|RESOLVED

[Bug target/118892] [14/15 Regression] ICE (segfault) in rebuild_jump_labels on aarch64-linux-gnu since r14-5289

2025-02-18 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118892 Tamar Christina changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2025-02-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 Bug 26163 depends on bug 118691, which changed state. Bug 118691 Summary: [15 Regression] gcc_r in SPECCPU 2017 miscompare on train dataset https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118691 What|Removed |Adde

[Bug middle-end/118691] [15 Regression] gcc_r in SPECCPU 2017 miscompare on train dataset

2025-02-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118691 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug tree-optimization/118464] [15 Regression] gcc-15.0.0_pre20250112 ICE with opencv-4.10.0 using -O2/-ftree-loop-vectorize: memory_descriptor_ref.cpp:94:19: internal compiler error: in exact_div, at

2025-02-17 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118464 --- Comment #14 from Tamar Christina --- Still being worked on, I'll send v3 of the patch today or tomorrow.

[Bug rtl-optimization/118611] LRA inserts unneeded reload on FMA chain

2025-02-13 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118611 --- Comment #8 from Tamar Christina --- Yeah, that makes sense. Thanks for working on it! We've been trying to reduce the different cases where we see this happening in the hope of providing more data to tune any possible heuristics. So the pa

[Bug rtl-optimization/118611] LRA inserts unneeded reload on FMA chain

2025-02-13 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118611 Tamar Christina changed: What|Removed |Added CC||acoplan at gcc dot gnu.org --- Commen

[Bug tree-optimization/118852] [15 regression] Train run of 502.gcc_r compiled with -Ofast -fprofile-generate -march=x86_64-v3 fails

2025-02-13 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118852 --- Comment #4 from Tamar Christina --- (In reply to ktkachov from comment #3) > FWIW I see this also on aarch64 I filed the AArch64 bug weeks ago https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118691, there we don't need -fprofile-generate to tr

[Bug target/118800] [13 regression] aarch64 -mcpu=native ICEs since PR113257 backport

2025-02-12 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118800 Tamar Christina changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug tree-optimization/118211] tree-vectorize: vectorize input.cc, find_end_of_line

2025-02-10 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118211 Bug 118211 depends on bug 118754, which changed state. Bug 118754 Summary: [15 Regression] FAIL: gcc.target/i386/pr106010-8c.c by r15-6807-g68326d5d1a593d https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118754 What|Removed

[Bug target/118753] [15 Regression] [meta-bug] GCC 15 Regression on x86

2025-02-10 Thread tnfchris at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118753 Bug 118753 depends on bug 118754, which changed state. Bug 118754 Summary: [15 Regression] FAIL: gcc.target/i386/pr106010-8c.c by r15-6807-g68326d5d1a593d https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118754 What|Removed