https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117718
--- Comment #3 from Michael Meissner ---
No, the issue is with DQ addressing (i.e. vector load/store with offset): we
can't guarantee that the external address will be properly aligned, i.e. that
the bottom 4 bits are set to 0.
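The alignment condition in this comment (bottom 4 bits equal to 0, i.e. a multiple of 16 bytes) can be sketched as a plain C check. This is only an illustration of the arithmetic; the helper names are hypothetical and are not taken from GCC or from the bug report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when the bottom 4 bits of the
   address are 0, i.e. the address is a multiple of 16 bytes. */
bool bottom_4_bits_clear(uintptr_t address)
{
  return (address & (uintptr_t)0xF) == 0;
}

/* Small self-check of the helper above. */
void check_bottom_4_bits(void)
{
  assert(bottom_4_bits_clear(0x1000));   /* multiple of 16 */
  assert(!bottom_4_bits_clear(0x1008));  /* only 8-byte aligned */
}
```

An address defined in another object file cannot be checked this way at compile time, which is the difficulty the comment describes.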
In theory, we have
Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: meissner at gcc dot gnu.org
Target Milestone: ---
GCC should consider using the load vector pair and store vector pair
instructions in the prologue
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: meissner at gcc dot gnu.org
Target Milestone: ---
I built GCC trunk on the gcc110 cfarm system. I got the following failures
when I built GCC without using --with-cpu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79137
Michael Meissner changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: meissner at gcc dot gnu.org
Target Milestone: ---
I was answering an email about something else, and I wanted to look up code
that I added in Januar
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
--- Comment #11 from Michael Meissner ---
For singlebuff.c, there is a clear improvement when using the XXEVAL
instruction:
        XXEVAL  TRUNK  GCC14  GCC13  GCC12  GCC11
        ------  -----  -----  -----  -----  -----
-O3:      4.46   5.40
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
--- Comment #10 from Michael Meissner ---
There is an instruction that was added in power10 (XXEVAL) that does provide
fusion between VSX vectors that includes ANDC->XOR and XOR->XOR fusion. I have
coded up patches to support this and I will be
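As a rough illustration of the two logic shapes named in this comment, the scalar C below writes an ANDC feeding an XOR and an XOR feeding an XOR. The function names and the operand arrangement are assumptions made for this sketch; it does not claim to show what code GCC emits or how the patches are structured.

```c
#include <assert.h>
#include <stdint.h>

/* ANDC feeding XOR: (b AND NOT c), then XOR with a. */
uint64_t andc_then_xor(uint64_t a, uint64_t b, uint64_t c)
{
  return a ^ (b & ~c);
}

/* XOR feeding XOR: three inputs combined with two XORs. */
uint64_t xor_then_xor(uint64_t a, uint64_t b, uint64_t c)
{
  return (a ^ b) ^ c;
}

/* Small self-check of both helpers. */
void check_three_input_logic(void)
{
  assert(andc_then_xor(0xF0, 0xFF, 0x0F) == 0x00);
  assert(xor_then_xor(1, 2, 4) == 7);
}
```

Each function takes three inputs and yields one result, which is the kind of expression a single three-operand logical instruction could cover.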
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
--- Comment #9 from Michael Meissner ---
I tried several of the options to change the code generation:
-mno-power10-fusion which disables doing the fusion pairing.
Combinations of -fno-schedule-insns and -fno-schedule-insns2.
-fno-sched-press
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
--- Comment #8 from Michael Meissner ---
I added an option to not do the combiner patterns until after reload, and it
does not seem to fire at all.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
--- Comment #5 from Michael Meissner ---
For the singlebuff.c benchmark, the numbers are:
Trunk (sources checked out October 5th): 5.40 seconds
GCC 14 (sources checked out October 21st): 5.40 seconds
GCC 13 (sources checked out October 21
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
--- Comment #6 from Michael Meissner ---
Note, in the first comment, I mis-read the instruction, and the instruction
being used is vector unsigned long long rotate left, and not vector unsigned
long long shift left.
I.e.:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
--- Comment #4 from Michael Meissner ---
I tracked down the commit that first made the slowdown visible:
commit 3a61ca1b9256535e1bfb19b2d46cde21f3908a5d (HEAD)
Author: Jan Hubicka
Date: Thu Jul 6 18:56:22 2023 +0200
Improve profile upda
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
Michael Meissner changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |meissner at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
--- Comment #2 from Michael Meissner ---
Created attachment 59406
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59406&action=edit
Singlebuff.c test
The singlebuff.c is a simpler test case than multibuff.c. However, the numbers
quoted an
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117251
Michael Meissner changed:
What|Removed |Added
Priority|P3 |P2
Version|15.0
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: meissner at gcc dot gnu.org
Target Milestone: ---
Created attachment 59405
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59405&action=edit
Multibuff.c test
The sha3 functions compiled for the powerpc has a s
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89213
Michael Meissner changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114742
--- Comment #4 from Michael Meissner ---
The minimum architecture for IEEE 128-bit support is power7, because it needs
the VSX registers to pass and return IEEE 128-bit values.
Now, in theory, IEEE 128-bit support could have required only Altiv
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107757
--- Comment #4 from Michael Meissner ---
Note, this code only shows up when the target CPU is power8.
For the following code:
vector long long lsb64()
{
  return vec_splats(1LL);
}
Both power9 and power10 generate:
xxspltib 34,1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89213
Michael Meissner changed:
What|Removed |Added
Attachment #58918|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89213
Michael Meissner changed:
What|Removed |Added
Attachment #45612|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107757
--- Comment #3 from Michael Meissner ---
As Segher says, the test is not quite correct. I would write it as:
vector long long lsb64_opt()
{
  vector long long a = vec_splats(~0LL);
  __asm__("vsrd %0,%1,%2" : "=v"(a) : "v"(a), "v"(a));
return
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115800
--- Comment #6 from Michael Meissner ---
Of course it would also apply if you are building a BE compiler that has little
endian multilibs, you would run into the same situation.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115800
--- Comment #5 from Michael Meissner ---
And libstdc++-v3 errors are similar:
mkdir -p ./powerpc64le-unknown-linux-gnu/bits/stdc++.h.gch
/home/meissner/fsf-build-ppc64le/work171-p5/./gcc/xgcc -shared-libgcc
-B/home/meissner/fsf-build-ppc64le/wo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115800
--- Comment #4 from Michael Meissner ---
Libgfortran gives various errors that _Float128 is not supported on this
target.
libtool: compile: /home/meissner/fsf-build-ppc64le/work171-p5/./gcc/xgcc
-B/home/meissner/fsf-build-ppc64le/work171-p5/./
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: meissner at gcc dot gnu.org
Target Milestone: ---
The libgfortran and libstdc++-v3 libraries cannot be built if you build a
little endian compiler and set the default
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652
--- Comment #23 from Michael Meissner ---
This is one of those things where there is no right answer in part because we
need other things to flesh out the support.
The reason -mvsx was used is we need the VSX registers to build the IEEE
128-bit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94630
Michael Meissner changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101019
Michael Meissner changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99708
Michael Meissner changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104772
Bug 104772 depends on bug 99708, which changed state.
Bug 99708 Summary: __SIZEOF_FLOAT128__ not defined on powerpc64le-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99708
What|Removed |Added
-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110960
Michael Meissner changed:
What|Removed |Added
CC||meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652
--- Comment #19 from Michael Meissner ---
When I wrote the VSX support many years ago, I intended that -mvsx enable all
of ISA 2.06, which includes ISA 2.05, etc.
My intention was that there be 2 options for power7; one is the base ISA 2.07 support
f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70928
Michael Meissner changed:
What|Removed |Added
Ever confirmed|0 |1
Last reconfirmed|
|RESOLVED
CC||meissner at gcc dot gnu.org
--- Comment #2 from Michael Meissner ---
I built the current GCC 14 development compiler using -O2 -funroll-loops
-funsafe-math-optimizations, and it built fine. I suspect it had been fixed
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: meissner at gcc dot gnu.org
Target Milestone: ---
I've been working with vector double support to provide faster memory latency
for specialized applications. While the work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104698
Michael Meissner changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
,
||meissner at gcc dot gnu.org,
||segher at gcc dot gnu.org
Build||powerpc64le-unknown-linux-g
||nu
Target||powerpc64le-unknown
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: meissner at gcc dot gnu.org
Target Milestone: ---
I was building a cross compiler to PowerPC on my x86_64 workstation with the
latest version of GCC on October 11th. I could not build the compiler on the
x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105325
Michael Meissner changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103498
Michael Meissner changed:
What|Removed |Added
Resolution|--- |FIXED
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109067
Michael Meissner changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70243
--- Comment #5 from Michael Meissner ---
Created attachment 54814
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54814&action=edit
Test case
This is a test case that shows the generation of fmaddfp and fnmsubfp.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105325
Michael Meissner changed:
What|Removed |Added
Assignee|acsawdey at gcc dot gnu.org|meissner at gcc dot
gnu.org
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: meissner at gcc dot gnu.org
Target Milestone: ---
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: meissner at gcc dot gnu.org
Target Milestone: ---
If you have a DImode variable (i.e. long) in a GPR, and you want to zero extend
it to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108623
--- Comment #7 from Michael Meissner ---
Created attachment 54387
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54387&action=edit
Proposed patch combining Richard's patch and an assertion.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108623
Michael Meissner changed:
What|Removed |Added
Last reconfirmed||2023-02-01
Ever confirmed|0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108623
--- Comment #4 from Michael Meissner ---
I must have missed the spare bits. I think it is better to use the full 16
bits for precision. I also think your other changes to realign bit fields
greater than 1 bit.
Priority: P3
Component: other
Assignee: unassigned at gcc dot gnu.org
Reporter: meissner at gcc dot gnu.org
Target Milestone: ---
The current patches that have been submitted to the PowerPC back end need to
grow the precision field in the tree_type_common
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93738
Michael Meissner changed:
What|Removed |Added
CC||meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106345
Michael Meissner changed:
What|Removed |Added
CC||meissner at gcc dot gnu.org
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: meissner at gcc dot gnu.org
Target Milestone: ---
I was doing builds on a power10 for patch submission, and I noticed the
Component: testsuite
Assignee: unassigned at gcc dot gnu.org
Reporter: meissner at gcc dot gnu.org
Target Milestone: ---
I was doing builds on a power10 system for patch submission, and I noticed the
following test fails when the test is compiled for power10, but it does not
fail
Component: testsuite
Assignee: unassigned at gcc dot gnu.org
Reporter: meissner at gcc dot gnu.org
Target Milestone: ---
I was doing some builds for submitting patches, and I did runs on BE systems as
well as LE systems.
I noticed the test gcc.target/powerpc/bswap64-4.c fails
||meissner at gcc dot gnu.org
Status|UNCONFIRMED |NEW
Last reconfirmed||2022-08-18
--- Comment #3 from Michael Meissner ---
The fold-vec-extract tests work fine on the development version of GCC 13 for
64-bit, but they are
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96983
Michael Meissner changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104868
--- Comment #8 from Michael Meissner ---
Matheus, try the patch I just attached to the PR that I posted to the
gcc-patches mailing list.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104868
Michael Meissner changed:
What|Removed |Added
CC||meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104868
--- Comment #4 from Michael Meissner ---
In looking at it, the reason is the convert from DImode to TImode has several
constraints. The constraint that matters in this case has the output being an
Altivec register, while the input is a GPR regi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104253
Michael Meissner changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104698
--- Comment #3 from Michael Meissner ---
It goes beyond 'just use RTL'.
The problem is the code only generates an altivec instruction. So if the
__int128_t value is in a GPR, the compiler will need to do a move to the vector
registers (1 insn)
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: meissner at gcc dot gnu.org
Target Milestone: ---
On power10, signed conversion from DImode to TImode is inefficient for GCC 11
and the current GCC 12. GCC 10 does not do this optimization.
On power10, GCC tries
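For readers unfamiliar with the mode names, a signed DImode to TImode conversion corresponds to widening a signed 64-bit integer to a 128-bit one. The sketch below assumes a GCC-compatible compiler with the __int128 extension on a 64-bit target; the function names are hypothetical and this is not the test case from the bug report.

```c
#include <assert.h>

/* Signed 64-bit to 128-bit conversion: the value is sign extended. */
__int128 sign_extend_di_to_ti(long long value)
{
  return value;
}

/* Returns the upper 64 bits of a 128-bit value. */
unsigned long long upper_64_bits(__int128 value)
{
  return (unsigned long long)((unsigned __int128)value >> 64);
}

/* Small self-check: negative inputs fill the upper half with ones. */
void check_sign_extension(void)
{
  assert(upper_64_bits(sign_extend_di_to_ti(-1)) == ~0ULL);
  assert(upper_64_bits(sign_extend_di_to_ti(5)) == 0ULL);
}
```

Compiling sign_extend_di_to_ti at -O2 for different -mcpu values is one way to compare the instruction sequences the compiler chooses.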
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104335
Michael Meissner changed:
What|Removed |Added
CC||asolokha at gmx dot com
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104256
Michael Meissner changed:
What|Removed |Added
Resolution|--- |DUPLICATE
Status|ASSIGNE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104256
--- Comment #1 from Michael Meissner ---
Created attachment 52463
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52463&action=edit
Proposed patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104256
Michael Meissner changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |meissner at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99197
Michael Meissner changed:
What|Removed |Added
CC||meissner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102059
--- Comment #31 from Michael Meissner ---
Created attachment 52383
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52383&action=edit
Simpler patch to fix the problem with power8-fusion.
This patch just ignores the -mpower8-fusion option in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104253
--- Comment #11 from Michael Meissner ---
The patch has been posted, I'm awaiting approval.
https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589469.html
BTW, the copy_to_mode_reg bug I mentioned earlier goes away with the patch.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104253
Michael Meissner changed:
What|Removed |Added
Attachment #52306|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104253
--- Comment #8 from Michael Meissner ---
Yes, you are right. I didn't remember which functions were generated by the
compiler, but I just did all of the conversion functions.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104124
--- Comment #3 from Michael Meissner ---
There are two things going on.
1) There is no vspltisd instruction, so we can't generate a single instruction
to load constants other than 0 or -1. Unfortunately, this was not added in
either power9 or
|unassigned at gcc dot gnu.org |meissner at gcc dot
gnu.org
--- Comment #5 from Michael Meissner ---
The other issue that I mentioned in note #2 is likely a different issue when
-mabi=ibmlongdouble is used. I didn't have the patch to automatically use IEEE
128-bit if the compiler used to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104253
--- Comment #4 from Michael Meissner ---
Created attachment 52306
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52306&action=edit
Patch to use the correct names for __ibm128 converts if long double is IEEE
128-bit
The problem was interna
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104253
Michael Meissner changed:
What|Removed |Added
Last reconfirmed||2022-01-26
Status|UNCONF
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103763
Michael Meissner changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104136
Michael Meissner changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104136
--- Comment #5 from Michael Meissner ---
Fixed in commit f9063d12633c62a089115df032a19295854d8b06 on January 21, 2022.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104136
Michael Meissner changed:
What|Removed |Added
Attachment #52246|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104136
Michael Meissner changed:
What|Removed |Added
Attachment #52244|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104136
--- Comment #1 from Michael Meissner ---
Created attachment 52244
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52244&action=edit
Patch to mark XXSPLTIW and XXSPLTIDP as possibly being prefixed
If you compile module_advect_em.F90 with -O
|critical
Host||powerpc64le-unknown-linux-g
||nu
Assignee|unassigned at gcc dot gnu.org |meissner at gcc dot
gnu.org
Target||powerpc64le
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: meissner at gcc dot gnu.org
Target Milestone: ---
Using the current trunk compiler (from January 18th, 2022), I cannot compile
the module_advect_em fortran module with either -Ofast or -O3 using my normal
spec build
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102935
Michael Meissner changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102935
Michael Meissner changed:
What|Removed |Added
Attachment #52143|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102935
--- Comment #2 from Michael Meissner ---
Created attachment 52143
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52143&action=edit
Patch to update code generation test
The test wants to load all 1's into a vector register. On power8 it u
||2022-01-07
CC||dje at gcc dot gnu.org,
||meissner at gcc dot gnu.org,
||segher at gcc dot gnu.org
Assignee|unassigned at gcc dot
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103763
--- Comment #1 from Michael Meissner ---
Created attachment 52141
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52141&action=edit
Patch to fix the insn count
Update the insn regex for power10.
||2022-01-07
Status|UNCONFIRMED |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |meissner at gcc dot
gnu.org
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: meissner at gcc dot gnu.org
Target Milestone: ---
I was doing some Spec 2017 rate runs on a single power10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99921
Michael Meissner changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 103320, which changed state.
Bug 103320 Summary: [12 Regression] Spec 2017 benchmark roms_r fails on PowerPC
for -Ofast
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103320
What|Removed |
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103320
Michael Meissner changed:
What|Removed |Added
Resolution|--- |WONTFIX
Status|UNCONFIRM
||2021-11-23
Priority|P3 |P1
CC||meissner at gcc dot gnu.org
Ever confirmed|0 |1
Status|UNCONFIRMED |NEW
--- Comment #1 from Michael Meissner
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103317
Michael Meissner changed:
What|Removed |Added
Priority|P2 |P1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103318
Michael Meissner changed:
What|Removed |Added
Priority|P2 |P1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103320
Michael Meissner changed:
What|Removed |Added
Priority|P2 |P1
,
||dje at gcc dot gnu.org,
||meissner at gcc dot gnu.org,
||segher at gcc dot gnu.org,
||wschmidt at gcc dot gnu.org
Component: regression
Assignee: unassigned at gcc dot gnu.org
Reporter: meissner at gcc dot gnu.org
Target Milestone: ---
The Spec 2017 benchmark roms_r compiles fine but produces the wrong output when
compiled with -Ofast options on both power9 and power10. In going back with