https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
--- Comment #18 from Christian Felter ---
Thanks for the great work!! I've tested the new version and found similar
results.
I think the wrong results (which are actually random results) mean there is
another related bug still, so I've opened b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
Richard Biener changed:
What      |Removed |Added
Status    |ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
--- Comment #16 from Richard Biener ---
Author: rguenth
Date: Fri Nov 17 13:36:37 2017
New Revision: 254869
URL: https://gcc.gnu.org/viewcvs?rev=254869&root=gcc&view=rev
Log:
2017-11-17 Richard Biener
PR fortran/83017
* tree-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
--- Comment #15 from Richard Biener ---
Author: rguenth
Date: Fri Nov 17 13:15:34 2017
New Revision: 254867
URL: https://gcc.gnu.org/viewcvs?rev=254867&root=gcc&view=rev
Log:
2017-11-17 Richard Biener
PR tree-optimization/83017
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
--- Comment #14 from Richard Biener ---
Ok, so the correctness issue is that 'tmp' is treated as shared by autopar.
Clearly, while the outer loop iterations are independent, the dependence in the
inner loop should make autopar privatize this array
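
(Illustration only, not part of the original comment.) A minimal C sketch of what privatizing such a scratch array by hand could look like. It assumes a kernel similar in spirit to the reporter's pi computation; the names slice_sum, estimate_pi, NSPLIT and CHUNK are hypothetical, and an OpenMP pragma stands in for what autopar would have to generate. Because 'tmp' is declared inside the function called from each outer iteration, every iteration (and therefore every thread) gets its own copy, so the write-then-read dependence in the inner loops never crosses iterations.

```c
#include <assert.h>

#define NSPLIT 4      /* number of independent outer iterations (assumed) */
#define CHUNK  1000   /* inner iterations per slice (assumed) */

/* Hypothetical kernel: integrate 4/(1+x^2) over one slice of [0,1]
   with the midpoint rule, staging the terms in a scratch array.  */
static double slice_sum(int slice)
{
    assert(slice >= 0 && slice < NSPLIT);

    /* Declared here, 'tmp' is private to each call; one shared copy
       written by all threads would produce effectively random sums.  */
    double tmp[CHUNK];
    const double h = 1.0 / (NSPLIT * CHUNK);

    for (int j = 0; j < CHUNK; j++) {
        double x = (slice * CHUNK + j + 0.5) * h;
        tmp[j] = 4.0 / (1.0 + x * x) * h;
    }

    /* The second inner loop reads what the first one wrote.  */
    double s = 0.0;
    for (int j = 0; j < CHUNK; j++)
        s += tmp[j];
    return s;
}

double estimate_pi(void)
{
    double part[NSPLIT];

    /* Outer iterations are independent: each writes only part[i].  */
    #pragma omp parallel for
    for (int i = 0; i < NSPLIT; i++)
        part[i] = slice_sum(i);

    /* Combine the partial sums serially, in a fixed order.  */
    double pi = 0.0;
    for (int i = 0; i < NSPLIT; i++)
        pi += part[i];
    return pi;
}
```

Built with gcc -fopenmp the pragma is honored; without that flag it is ignored and the loop simply runs serially, which should give the same value.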
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
--- Comment #13 from Richard Biener ---
Ok, so we do slightly better for the runtime test than for the static test:
  if (loop->inner)
    m_p_thread=2;
  else
    m_p_thread=MIN_PER_THREAD;
so with 2 threads we should have exac
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
--- Comment #12 from Dominique d'Humieres ---
> Please use -fopt-info-loop to verify the loop is parallelized. You have
> to use -floop-parallelize-all as well due to the cost model issue.
If I use the commented loop I get with/without the patc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
--- Comment #11 from rguenther at suse dot de ---
On November 16, 2017 5:42:02 PM GMT+01:00, "dominiq at lps dot ens.fr"
wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
>
>--- Comment #10 from Dominique d'Humieres
>---
>> Created att
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
--- Comment #10 from Dominique d'Humieres ---
> Created attachment 42621 [details]
> updated patch
AFAICT the patch does not fix the problem:
without the patch
PI 2.98876095
PI 3.14159274
4.742u 0.015s 0:04.77 99.5% 0+0k 0+0i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
Richard Biener changed:
What    |Removed   |Added
Status  |NEW       |ASSIGNED
Assignee|unassigned
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
--- Comment #8 from Richard Biener ---
Created attachment 42620
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42620&action=edit
patch
Otherwise untested patch. Note ivdep is mapped to safelen which isn't useful
for parallelization given
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
--- Comment #7 from Richard Biener ---
If I "fix" GCC to consider the loop you annotate parallel:
  do concurrent (i = 1:nsplit)
    pi(i) = sum(compute( low(i), high(i) ))
  end do
then we arrive at computing 4 iterations of that loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
--- Comment #6 from rguenther at suse dot de ---
On November 16, 2017 2:22:50 PM GMT+01:00, cfztol at hotmail dot com
wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
>
>--- Comment #5 from Christian Felter ---
>Okay, sounds like ther
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
--- Comment #5 from Christian Felter ---
Okay, sounds like there is hope. By the way, the problem also exists without a
function call. Declaring
real, dimension(nsplit) :: tmp
and replacing the loop with
do concurrent (i = 1:nsplit)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
--- Comment #4 from rguenther at suse dot de ---
On November 16, 2017 1:22:37 PM GMT+01:00, cfztol at hotmail dot com
wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
>
>--- Comment #3 from Christian Felter ---
>Ultimately, I wanted t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
--- Comment #3 from Christian Felter ---
Ultimately, I wanted to compute k like this
k = permutation( j )
where permutation is a 1D array of integers (from 1 to 4, e.g. [ 1, 4, 2, 1, 3,
... etc] ). This would allow an easy way of parallelizing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
Richard Biener changed:
What|Removed |Added
CC  |        |rguenth at gcc dot gnu.org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83017
Dominique d'Humieres changed:
What            |Removed     |Added
Status          |UNCONFIRMED |NEW
Last reconfirmed|
18 matches