https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57037

--- Comment #1 from Harald Anlauf <anlauf at gmx dot de> ---
(In reply to Harald Anlauf from comment #0)
> gfortran (using -Ofast -fprefetch-loop-arrays) exactly
> reproduces the performance of the Intel compiler without
> temporal stores.  It appears that this is an important
> optimization.

I tried a current snapshot from trunk (r219084) and found
that -fprefetch-loop-arrays now gives an additional boost,
matching Intel v15 for the above code, even without the
streaming stores.
