https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117718
--- Comment #3 from Michael Meissner ---
No, the issue is with DQ addressing (i.e. vector load/store with offset): we
can't guarantee that the external address will be properly aligned, with the
bottom 4 bits set to 0.
In theory, we have
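The alignment condition described in this comment (bottom 4 bits equal to 0) can be illustrated with a small portable check. This is only a sketch: the helper name dq_low_bits_clear is hypothetical and is not part of GCC; it just tests whether a value is a multiple of 16.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not a GCC function): returns true when the
   bottom 4 bits of VALUE are 0, i.e. VALUE is a multiple of 16.  */
bool dq_low_bits_clear(uint64_t value)
{
    return (value & UINT64_C(0xF)) == 0;
}

/* Small self-check of the helper above.  */
void dq_low_bits_demo(void)
{
    assert(dq_low_bits_clear(0));    /* 0 is trivially aligned */
    assert(dq_low_bits_clear(32));   /* 32 = 2 * 16 */
    assert(!dq_low_bits_clear(8));   /* bit 3 is set */
    assert(!dq_low_bits_clear(17));  /* bit 0 is set */
}
```

A compiler can evaluate this kind of test for a constant offset it knows, but not for an external address that is only resolved at link time, which is the situation the comment describes.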
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117729
Bug ID: 117729
Summary: On power10 consider using vector pair load/store in
prologue/epilog in saving vector registers
Product: gcc
Version: 15.0
Status: UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117721
Bug ID: 117721
Summary: Big endian test suite failures comparing default cpu
and --with-cpu=power7
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: nor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79137
Michael Meissner changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117487
Bug ID: 117487
Summary: Power8 optimizations for math library aren't done in
power9 or power10 (PR target/71977)
Product: gcc
Version: 15.0
Status: UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
--- Comment #11 from Michael Meissner ---
For singlebuff.c, there is a clear improvement when using the XXEVAL
instruction:
        XXEVAL  TRUNK  GCC14  GCC13  GCC12  GCC11
        ------  -----  -----  -----  -----  -----
-O3:    4.46    5.40
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
--- Comment #10 from Michael Meissner ---
There is an instruction that was added in power10 (XXEVAL) that does provide
fusion between VSX vectors that includes ANDC->XOR and XOR->XOR fusion. I have
coded up patches to support this and I will be
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
--- Comment #9 from Michael Meissner ---
I tried several of the options to change the code generation:
-mno-power10-fusion which disables doing the fusion pairing.
Combinations of -fno-schedule-insns and -fno-schedule-insns2.
-fno-sched-press
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
--- Comment #8 from Michael Meissner ---
I added an option to not do the combiner patterns until after reload, and it
does not seem to fire at all.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
--- Comment #5 from Michael Meissner ---
For the singlebuff.c benchmark, the numbers are:
Trunk (sources checked out October 5th): 5.40 seconds
GCC 14 (sources checked out October 21st): 5.40 seconds
GCC 13 (sources checked out October 21
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
--- Comment #6 from Michael Meissner ---
Note, in the first comment, I mis-read the instruction, and the instruction
being used is vector unsigned long long rotate left, and not vector unsigned
long long shift left.
I.e.:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
--- Comment #4 from Michael Meissner ---
I tracked down the commit that first made the slowdown visible:
commit 3a61ca1b9256535e1bfb19b2d46cde21f3908a5d (HEAD)
Author: Jan Hubicka
Date: Thu Jul 6 18:56:22 2023 +0200
Improve profile upda
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
Michael Meissner changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |meissner at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
--- Comment #2 from Michael Meissner ---
Created attachment 59406
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59406&action=edit
Singlebuff.c test
The singlebuff.c is a simpler test case than multibuff.c. However, the numbers
quoted an
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
Michael Meissner changed:
What|Removed |Added
Priority|P3 |P2
Version|15.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
Bug ID: 117251
Summary: SHA3 code for PowerPC has a major slow down
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: major
Priority: P3
Component: targe
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89213
Michael Meissner changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114742
--- Comment #4 from Michael Meissner ---
The minimum architecture for IEEE 128-bit support is power7, because it needs
the VSX registers to pass and return IEEE 128-bit values.
Now, in theory, IEEE 128-bit support could have required only Altiv
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107757
--- Comment #4 from Michael Meissner ---
Note, this code only shows up when the target CPU is power8.
For the following code:
#include <altivec.h>

vector long long lsb64()
{
  return vec_splats(1LL);
}
Both power9 and power10 generate:
xxspltib 34,1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89213
Michael Meissner changed:
What|Removed |Added
Attachment #58918|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89213
Michael Meissner changed:
What|Removed |Added
Attachment #45612|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107757
--- Comment #3 from Michael Meissner ---
As Segher says, the test is not quite correct. I would write it as:
vector long long lsb64_opt()
{
vector long long a = vec_splats(~0LL);
__asm__("vsrd %0,%1,%2":"=v"(a):"v"(a),"v"(a));
return
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115800
--- Comment #6 from Michael Meissner ---
Of course, it would also apply if you are building a BE compiler that has little
endian multilibs; you would run into the same situation.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115800
--- Comment #5 from Michael Meissner ---
And libstdc++-v3 errors are similar:
mkdir -p ./powerpc64le-unknown-linux-gnu/bits/stdc++.h.gch
/home/meissner/fsf-build-ppc64le/work171-p5/./gcc/xgcc -shared-libgcc
-B/home/meissner/fsf-build-ppc64le/wo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115800
--- Comment #4 from Michael Meissner ---
Libgfortran gives various errors that _Float128 is not supported on this
target.
libtool: compile: /home/meissner/fsf-build-ppc64le/work171-p5/./gcc/xgcc
-B/home/meissner/fsf-build-ppc64le/work171-p5/./
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115800
Bug ID: 115800
Summary: PowerPC GCC cannot build a little endian compile if
--with-cpu=power5 is used
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652
--- Comment #23 from Michael Meissner ---
This is one of those things where there is no right answer in part because we
need other things to flesh out the support.
The reason -mvsx was used is we need the VSX registers to build the IEEE
128-bit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94630
Michael Meissner changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101019
Michael Meissner changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99708
Michael Meissner changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104772
Bug 104772 depends on bug 99708, which changed state.
Bug 99708 Summary: __SIZEOF_FLOAT128__ not defined on powerpc64le-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99708
What|Removed |Added
-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960
Michael Meissner changed:
What|Removed |Added
CC||meissner at gcc dot gnu.org
--- Comm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652
--- Comment #19 from Michael Meissner ---
When I wrote the VSX support many years ago, I intended that -mvsx enable all
of ISA 2.06, which includes ISA 2.05, etc.
My intention was that there be 2 options for power7; one is the base ISA 2.06 support
f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70928
Michael Meissner changed:
What|Removed |Added
Ever confirmed|0 |1
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31418
Michael Meissner changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112886
Bug ID: 112886
Summary: We need a new print_operand output modifier for vector
double
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104698
Michael Meissner changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111778
Michael Meissner changed:
What|Removed |Added
Severity|normal |major
Priority|P3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111778
Bug ID: 111778
Summary: PowerPC constant code change uses an undefined shift
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Compon
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105325
Michael Meissner changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103498
Michael Meissner changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109067
Michael Meissner changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243
--- Comment #5 from Michael Meissner ---
Created attachment 54814
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54814&action=edit
Test case
This is a test case that shows the generation of fmaddfp and fnmsubfp.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105325
Michael Meissner changed:
What|Removed |Added
Assignee|acsawdey at gcc dot gnu.org|meissner at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109067
Bug ID: 109067
Summary: Powerpc GCC does not support __ibm128 complex
multiply/divide if long double is IEEE 128-bit.
Product: gcc
Version: 13.0
Status: UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108958
Bug ID: 108958
Summary: Powerpcle could generate mtvsrdd for zero extend DI to
TI mode, when the TImode is in a vector register
Product: gcc
Version: 13.0
Status: UNCONF
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108623
--- Comment #7 from Michael Meissner ---
Created attachment 54387
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54387&action=edit
Proposed patch combining Richard's patch and an assertion.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108623
Michael Meissner changed:
What|Removed |Added
Last reconfirmed||2023-02-01
Ever confirmed|0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108623
--- Comment #4 from Michael Meissner ---
I must have missed the spare bits. I think it is better to use the full 16
bits for precision. I also think your other changes to realign bit fields
greater than 1 bit.
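To illustrate why the width of a precision bit field matters, here is a minimal sketch. The struct names and the 10-bit width of the narrow field are assumptions chosen for illustration; this is not the real tree_type_common layout.

```c
#include <assert.h>

/* Hypothetical stand-ins; NOT the real tree_type_common layout.  */
struct narrow_type {
    unsigned int precision : 10;  /* assumed narrow field: holds 0..1023 */
    unsigned int spare : 6;       /* spare bits next to it */
};

struct wide_type {
    unsigned int precision : 16;  /* full 16 bits: holds 0..65535 */
};

/* Store BITS in the narrow field and read it back.  Assignment to an
   unsigned bit field reduces the value modulo 2^10.  */
unsigned int store_narrow(unsigned int bits)
{
    struct narrow_type t = { 0, 0 };
    t.precision = bits;
    return t.precision;
}

/* Store BITS in the 16-bit field and read it back.  */
unsigned int store_wide(unsigned int bits)
{
    struct wide_type t = { 0 };
    t.precision = bits;
    return t.precision;
}

/* Small self-check of the two helpers above.  */
void precision_field_demo(void)
{
    assert(store_narrow(1023) == 1023);  /* largest value that fits */
    assert(store_narrow(1024) == 0);     /* silently wraps to 0 */
    assert(store_wide(1024) == 1024);    /* fits in 16 bits */
}
```

In this sketch, folding the spare bits into the field removes the silent wrap-around without making the structure larger.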
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108623
Bug ID: 108623
Summary: We need to grow the precision field in
tree_type_common for PowerPC
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93738
Michael Meissner changed:
What|Removed |Added
CC||meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106345
Michael Meissner changed:
What|Removed |Added
CC||meissner at gcc dot gnu.org
--- Comm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106682
Bug ID: 106682
Summary: Powerpc test
gcc.target/powerpc/pr86731-fwrapv-longlong.c fails on
power8, passes on power9/power10
Product: gcc
Version: unknown
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106681
Bug ID: 106681
Summary: Powerpc test gcc.dg/pr104992.c fails on power10
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Componen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106680
Bug ID: 106680
Summary: Test gcc.target/powerpc/bswap64-4.c fails on 32-bit BE
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
C
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101169
Michael Meissner changed:
What|Removed |Added
Ever confirmed|0 |1
CC|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96983
Michael Meissner changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104868
--- Comment #8 from Michael Meissner ---
Matheus, try the patch I just attached to the PR that I posted to the
gcc-patches mailing list.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104868
Michael Meissner changed:
What|Removed |Added
CC||meissner at gcc dot gnu.org
--- Comm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104868
--- Comment #4 from Michael Meissner ---
In looking at it, the reason is the convert from DImode to TImode has several
constraints. The constraint that matters in this case has the output being an
Altivec register, while the input is a GPR regi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104253
Michael Meissner changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104698
--- Comment #3 from Michael Meissner ---
It goes beyond 'just use RTL'.
The problem is the code only generates an altivec instruction. So if the
__int128_t value is in a GPR, the compiler will need to do a move to the vector
registers (1 insn)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104698
Bug ID: 104698
Summary: Inefficient code for DI to TI sign extend on power10
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Compon
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104335
Michael Meissner changed:
What|Removed |Added
CC||asolokha at gmx dot com
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104256
Michael Meissner changed:
What|Removed |Added
Resolution|--- |DUPLICATE
Status|ASSIGNE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104256
--- Comment #1 from Michael Meissner ---
Created attachment 52463
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52463&action=edit
Proposed patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104256
Michael Meissner changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |meissner at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99197
Michael Meissner changed:
What|Removed |Added
CC||meissner at gcc dot gnu.org
R
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102059
--- Comment #31 from Michael Meissner ---
Created attachment 52383
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52383&action=edit
Simpler patch to fix the problem with power8-fusion.
This patch just ignores the -mpower8-fusion option in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104253
--- Comment #11 from Michael Meissner ---
The patch has been posted, I'm awaiting approval.
https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589469.html
BTW, the copy_to_mode_reg bug I mentioned earlier goes away with the patch.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104253
Michael Meissner changed:
What|Removed |Added
Attachment #52306|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104253
--- Comment #8 from Michael Meissner ---
Yes, you are right. I didn't remember which functions were generated by the
compiler, but I just did all of the conversion functions.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104124
--- Comment #3 from Michael Meissner ---
There are two things going on.
1) There is no vspltisd instruction, so we can't generate a single instruction
to load constants other than 0 or -1. Unfortunately, this was not added in
either power9 or
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104253
Michael Meissner changed:
What|Removed |Added
Status|NEW |ASSIGNED
Assignee|unassign
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104253
--- Comment #4 from Michael Meissner ---
Created attachment 52306
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52306&action=edit
Patch to use the correct names for __ibm128 converts if long double is IEEE
128-bit
The problem was interna
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104253
Michael Meissner changed:
What|Removed |Added
Last reconfirmed||2022-01-26
Status|UNCONF
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103763
Michael Meissner changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104136
Michael Meissner changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104136
--- Comment #5 from Michael Meissner ---
Fixed in commit f9063d12633c62a089115df032a19295854d8b06 on January 21, 2022.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104136
Michael Meissner changed:
What|Removed |Added
Attachment #52246|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104136
Michael Meissner changed:
What|Removed |Added
Attachment #52244|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104136
--- Comment #1 from Michael Meissner ---
Created attachment 52244
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52244&action=edit
Patch to mark XXSPLTIW and XXSPLTIDP as possibly being prefixed
If you compile module_advect_em.F90 with -O
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104136
Michael Meissner changed:
What|Removed |Added
Priority|P3 |P1
Severity|normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104136
Bug ID: 104136
Summary: Gcc cannot compile wrf_r for power10 using -Ofast
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102935
Michael Meissner changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102935
Michael Meissner changed:
What|Removed |Added
Attachment #52143|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102935
--- Comment #2 from Michael Meissner ---
Created attachment 52143
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52143&action=edit
Patch to update code generation test
The test wants to load all 1's into a vector register. On power8 it u
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102935
Michael Meissner changed:
What|Removed |Added
Status|UNCONFIRMED |ASSIGNED
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103763
--- Comment #1 from Michael Meissner ---
Created attachment 52141
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52141&action=edit
Patch to fix the insn count
Update the insn regex for power10.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103763
Michael Meissner changed:
What|Removed |Added
Ever confirmed|0 |1
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103498
Bug ID: 103498
Summary: Spec 2017 imagick_r is 2.62% slower on Power10 with
pc-relative addressing compared to not using
pc-relative addressing
Product: gcc
Vers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99921
Michael Meissner changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 103320, which changed state.
Bug 103320 Summary: [12 Regression] Spec 2017 benchmark roms_r fails on PowerPC
for -Ofast
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103320
What|Removed |
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103320
Michael Meissner changed:
What|Removed |Added
Resolution|--- |WONTFIX
Status|UNCONFIRM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103387
Michael Meissner changed:
What|Removed |Added
Severity|normal |major
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103317
Michael Meissner changed:
What|Removed |Added
Priority|P2 |P1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103318
Michael Meissner changed:
What|Removed |Added
Priority|P2 |P1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103320
Michael Meissner changed:
What|Removed |Added
Priority|P2 |P1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103320
Michael Meissner changed:
What|Removed |Added
CC||bergner at gcc dot gnu.org,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103320
Bug ID: 103320
Summary: Spec 2017 benchmark roms_r fails on PowerPC for -Ofast
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Comp