https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118259
--- Comment #4 from mjr19 at cam dot ac.uk ---
And using
seed=iand(seed*int(1103515245,selected_int_kind(18))+12345,z'7fff')
also works as expected. Converting the code to C shows the same behaviour as
the Fortran if seed is a static int
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118259
--- Comment #3 from mjr19 at cam dot ac.uk ---
That is a very interesting point. If I change the constants in the random
number generator to
seed=iand(seed*110+123,z'7fff')
then the answer with '-O3' is
0 0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118259
Bug ID: 118259
Summary: -O3 optimisation bug fixed with -fno-inline
Product: gcc
Version: 14.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: fo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117805
--- Comment #20 from mjr19 at cam dot ac.uk ---
I am not convinced that gfortran's current behaviour is wholly consistent with
what a mathematician would reasonably expect. When I was taught complex
arithmetic, multiplication by one and addition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117805
--- Comment #13 from mjr19 at cam dot ac.uk ---
(In reply to kargls from comment #11)
> On 11/28/24 04:54, rguenth at gcc dot gnu.org wrote:
>
> The Fortran standard stops at this point and does not specify the
> actual underlying algorithm.
Th
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117805
--- Comment #9 from mjr19 at cam dot ac.uk ---
(In reply to kargls from comment #6)
I agree that parts of the reasoning from J3 are a little surprising, but other
parts seem sound, and the conclusion is unambiguous.
(I also disagree with its cl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117805
--- Comment #5 from mjr19 at cam dot ac.uk ---
Compiling with -fno-signed-zeros does work surprisingly well.
I say "surprisingly", as I think that the change affects more than just signed
zeros, in that 3.0*(2.0,Inf) might be (6.0,Inf) or (NaN,I
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117805
--- Comment #2 from mjr19 at cam dot ac.uk ---
There will certainly be differences in some cases. If R=2.0 and Z=-0.0i the
answer might be (0.0,0.0) or (0.0,-0.0).
The point is that Fortran does not specify which of these is correct. Both are
re
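
As an illustration of the two candidate answers mentioned above, here is a minimal sketch of my own (not code from the bug report). It assumes an IEEE-754 processor, round-to-nearest, and compilation without value-changing options such as -Ofast or -fno-signed-zeros. The program and variable names are hypothetical. It evaluates R*Z once by scaling each component of Z, and once by promoting R to (R,+0.0) and applying the textbook complex product by hand.

```fortran
program signed_zero_demo
  implicit none
  integer, parameter :: dp = kind(1d0)
  real(dp)    :: r
  complex(dp) :: z, componentwise, promoted

  r = 2.0_dp
  z = cmplx(0.0_dp, -0.0_dp, kind=dp)   ! Z = (0.0,-0.0)

  ! Option 1: scale the real and imaginary parts of Z separately.
  ! The imaginary part is 2.0*(-0.0), which is -0.0 under IEEE-754.
  componentwise = cmplx(r*real(z, dp), r*aimag(z), kind=dp)

  ! Option 2: treat R as (R,+0.0) and write out the full product
  ! (a,b)*(c,d) = (a*c - b*d, a*d + b*c) with b = +0.0.
  ! The imaginary part is (-0.0) + (+0.0), which rounds to +0.0.
  promoted = cmplx(r*real(z, dp) - 0.0_dp*aimag(z), &
                   r*aimag(z) + 0.0_dp*real(z, dp), kind=dp)

  ! sign(1.0,x) exposes the sign bit of a zero when the processor
  ! distinguishes signed zeros (gfortran does so by default).
  print *, 'componentwise:', componentwise, ' sign of imag:', &
           sign(1.0_dp, aimag(componentwise))
  print *, 'promoted:     ', promoted, ' sign of imag:', &
           sign(1.0_dp, aimag(promoted))
end program signed_zero_demo
```

Under those assumptions the two results should differ only in the sign of the zero imaginary part. Which of them a compiler produces for the plain expression r*z is exactly the point under discussion, so the sketch spells out both by hand rather than relying on the built-in operator.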
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117805
Bug ID: 117805
Summary: complex type, -Ofast and IEEE-754
Product: gcc
Version: 14.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: fortran
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107294
--- Comment #7 from mjr19 at cam dot ac.uk ---
I was sufficiently confused having read the Standard to persuade Dr John Reid
to submit a request for clarification to J3, the Fortran Standards Committee.
The request is at https://j3-fortran.org/d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116128
--- Comment #5 from mjr19 at cam dot ac.uk ---
I think in general using partial sums improves accuracy.
If one assumes that all of the data have the same sign and similar magnitude,
then by the time the sum is nearly complete one is adding a sin
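
A minimal sketch of my own (not taken from the bug report) of how partial sums can help for data of the same sign and similar magnitude. It assumes that default real is IEEE-754 binary32, evaluated without excess precision and without reassociating options such as -Ofast. The array size and all names are hypothetical choices for the illustration.

```fortran
program partial_sum_demo
  implicit none
  integer, parameter :: n = 20000000     ! multiple of 4, larger than 2**24
  real, allocatable  :: a(:)
  real    :: s_ordered, s_part(4)
  integer :: i

  allocate(a(n))
  a = 1.0

  ! Strictly in-order summation with a single accumulator.
  ! Once the accumulator reaches 2**24 = 16777216, adding 1.0 no
  ! longer changes it in binary32, so the total stops growing.
  s_ordered = 0.0
  do i = 1, n
     s_ordered = s_ordered + a(i)
  end do

  ! Four interleaved partial sums, combined at the end.
  ! Each accumulator only has to reach n/4, which stays below 2**24.
  s_part = 0.0
  do i = 1, n, 4
     s_part = s_part + a(i:i+3)
  end do

  print *, 'in-order sum :', s_ordered
  print *, 'partial sums :', sum(s_part)
  print *, 'exact answer :', real(n)
end program partial_sum_demo
```

Under the stated assumptions the in-order loop should stall at 16777216, while the four-accumulator version should reach the exact total. The same four-accumulator layout is also the shape a vectorised reduction would take.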
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116128
--- Comment #3 from mjr19 at cam dot ac.uk ---
It seems that most of these are in-line expanded by gfortran-14.1, at least in
some cases.
  function foo(a,n)
    real(kind(1d0))::a(*),foo
    integer::n
    foo=sum(a(1:n))
  end function foo
and
funct
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114767
--- Comment #8 from mjr19 at cam dot ac.uk ---
If it is tricky to teach gfortran that it can flip the signs of alternate
elements in a vector trivially with an xor, would a possible step to an
improvement be to teach it that the cost of vpermpd (
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116128
--- Comment #1 from mjr19 at cam dot ac.uk ---
The same comment applies to maxval and minval, which vectorise with -Ofast only
for -mavx2, although the answer will be independent of the ordering of the
scalar min/max operations.
In contrast, ial
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116109
--- Comment #3 from mjr19 at cam dot ac.uk ---
It might be helpful if GCC considered this optimisation separately from
unrolling.
Traditional unrolling attempts to reduce the overhead of the (integer) loop
control instructions, but with floating
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116128
Bug ID: 116128
Summary: missed optimisation: fortran sum intrinsic performed
in order
Product: gcc
Version: 14.1.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116109
Bug ID: 116109
Summary: Missed optimisation: unnecessary register dependency
on reduction
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115709
--- Comment #3 from mjr19 at cam dot ac.uk ---
Created attachment 58558
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58558&action=edit
Demo of effect of vperm rearrangement
I still believe that my code is correct. To make what I propose
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115711
Bug ID: 115711
Summary: Fortran: extra malloc and copy with transfer
Product: gcc
Version: 14.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115710
Bug ID: 115710
Summary: fortran complex abs does not vectorise
Product: gcc
Version: 14.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: fortran
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115709
Bug ID: 115709
Summary: missed optimisation: vperms not reordered to eliminate
Product: gcc
Version: 14.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Co
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114324
--- Comment #8 from mjr19 at cam dot ac.uk ---
Ooops -- timings not ns/iteration as claimed, nor even comparable between the
m3spf and m4spf examples, but they are consistent within each example.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114324
--- Comment #7 from mjr19 at cam dot ac.uk ---
The patch to GCC 15 in commit
r15-1508-g59221dc587f369695d9b0c2f73aedf8458931f0f from pr 68855 has made a
significant improvement to the optimisation of these examples at -O3, causing
the -Ofast ver
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115563
--- Comment #6 from mjr19 at cam dot ac.uk ---
A further comment to aid others reading this report. It is not just unnecessary
brackets which used to prevent vectorisation, but also necessary ones.
subroutine foo(a,b,c,n)
complex (kind(1d0)) :
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115563
--- Comment #5 from mjr19 at cam dot ac.uk ---
I'm glad this was useful, and thanks for the impressively rapid fix. I stumbled
across this by chance whilst trying to construct a minimal example for a rather
different missed vectorisation case.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115563
Bug ID: 115563
Summary: Unnecessary brackets prevent fortran vectorisation
Product: gcc
Version: 14.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Compon
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107294
mjr19 at cam dot ac.uk changed:
What    |Removed    |Added
--------+-----------+------------------------
CC      |           |mjr19 at cam dot ac.uk
--- Comm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114767
--- Comment #7 from mjr19 at cam dot ac.uk ---
Another manifestation of this issue in GCC 13.1 and 14.1 is that the loop
do i=1,n
c(i)=a(i)*c(i)*(0d0,1d0)
enddo
takes about twice as long to run as
do i=1,n
c(i)=a(i)*(0d0,1d0)*c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114324
--- Comment #5 from mjr19 at cam dot ac.uk ---
Note that bug 114767 also turns out to be a case in which the inability to
alternate neg and nop along a vector leads to poor performance with some
operations on the complex type. That optimisation i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114767
--- Comment #6 from mjr19 at cam dot ac.uk ---
I was starting to wonder whether this issue might be related to that in bug
114324, which is a slightly more complicated example in which multiplication by
a purely imaginary number destroys vectoris
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114767
--- Comment #4 from mjr19 at cam dot ac.uk ---
An issue which I suspect is related is shown by
  subroutine zradd(c,n)
    integer :: i,n
    complex(kind(1d0)) :: c(*)
    do i=1,n
       c(i)=c(i)+1d0
    enddo
  end subroutine
If compiled with gfortran-1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114767
--- Comment #2 from mjr19 at cam dot ac.uk ---
Ah, I see. An inability to alternate negation with noop also means that
conjugation is treated suboptimally.
do i=1,n
c(i)=conjg(c(i))
enddo
Here gfortran-13 and -14 are differently subopt
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114767
Bug ID: 114767
Summary: gfortran AVX2 complex multiplication by (0d0,1d0)
suboptimal
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114324
--- Comment #4 from mjr19 at cam dot ac.uk ---
Created attachment 57713
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57713&action=edit
Second testcase, very similar to first
Thank you for looking into this. The real code in question has
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114324
Bug ID: 114324
Summary: AVX2 vectorisation performance regression with
gfortran 13/14
Product: gcc
Version: 13.1.0
Status: UNCONFIRMED
Severity: normal
35 matches