https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91778
Thomas Koenig <tkoenig at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2019-09-16 00:00:00         |
                 CC|                            |tkoenig at gcc dot gnu.org

--- Comment #3 from Thomas Koenig <tkoenig at gcc dot gnu.org> ---
(In reply to Mark Wieczorek from comment #0)
> I am writing about a possible bug in the gfortran GCC9 optimizer on macOS
> (installed via brew).
>
> Before going into the details, I note that my code (SHTOOLS/pyshtools) is
> widely used on many platforms and compilers. My code works with GCC8
> compiled with optimizations "-O" or "-O3", and it works fine with GCC9 when
> compiled _without_ optimizations. I was able to "fix" my code to work with
> GCC9, but I feel that what I am doing is avoiding a bug in the GCC9
> optimizer, and that I am not in fact "fixing" my code (perhaps I am
> wrong...).
>
> The problem is related to using the FFTW3 library, which is the most widely
> used FFT library for scientific computing. If this is a bug, then others
> will probably encounter similar problems. As my code is somewhat long (and
> given the lack of time I have now), I will just give you a summary of two
> problems. If necessary, I could try to write a "small" example that
> reproduces these problems when I have more free time later.

If it turns out that this is needed, please do. However...

> I start by describing how FFTW routines are use. First, you initialize the
> FFT operation and get pointers to all the input and output arrays, which are
> stored in the variable "plan":
>
>     call dfftw_plan_dft_c2r_1d(plan, nlong, coef, grid)

This sounds very suspicious. According to the Fortran standard, you
cannot stash away a pointer to a Fortran array unless that array is
marked as TARGET. Well, you can, but it's liable to break any time,
and apparently it did.

Can you show the declaration of dfftw_plan_dft_c2r_1d ?

> Then you perform the FFT simply by calling
>
>     call dfftw_execute(plan)
>
> The first problem boils down to this:
>
>     call dfftw_plan_dft_c2r_1d(plan, nlong, coef, grid)
>
>     coef(1) = dcmplx(coef0,0.0d0)   ! A
>     coef(2:lmax_comp+1) = coef(2:lmax_comp+1) / 2.0d0
>
>     call dfftw_execute(plan)   ! AA
>     gridglq(i,1:nlong) = grid(1:nlong)
>
>     coef(1) = dcmplx(coef0s,0.0d0)   ! B
>     coef(2:lmax_comp+1) = coefs(2:lmax_comp+1)/2.0d0
>
>     call dfftw_execute(plan)   ! BB
>     gridglq(i_s,1:nlong) = grid(1:nlong)
>
>
> The problem is that the optimizer thinks the line A is redundant with line B
> (the same variable is being defined twice).

And that is correct behavior. Try marking coef as TARGET or VOLATILE,
this should inhibit this optimization.

> The second problem I encountered is a little more mysterious. These are the
> _last_ 4 lines of the subroutine:
>
>     coef(lmax_comp+1) = coef(lmax_comp+1) + cilm(1,lmax_comp+1,lmax_comp+1)
>     coef(nlong-(lmax_comp-1)) = coef(nlong-(lmax_comp-1)) &
>         + cilm(2,lmax_comp+1,lmax_comp+1)
>
>     call dfftw_execute(plan)
>
>     griddh(i_eq,1:nlong) = grid(1:nlong)
>
> The problem is that the optimizer ignores the first two lines. The reason
> for this is probably because (1) the variable coef is not explicitly noted
> in the fftw call, and (2) the variable coef is not output in the subroutine.
> Thus, the optimizer probably thinks that it doesn't need to compute the
> first two lines

Sounds reasonable.

> So, in summary, I believe that the GCC9 optimizer is not working correctly
> because it doesn't realize that the function call dfftw_execute(plan)
> actually depends on the variables coef and grid. Given that my code has
> worked well with all other versions of GCC, I suspect that there has been a
> change in how the optimizer works.

I assume that your program was always non-conforming, and that gcc
has simply gotten better at finding optimization opportunities.
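
To illustrate the TARGET suggestion, here is a minimal sketch. It does
not use FFTW: make_plan and execute_plan are hypothetical stand-ins for
the planner and the executor, and the "transform" only copies real
parts. The assumption being illustrated is that, with TARGET on both the
actual arrays and the assumed-shape dummy arguments, a pointer saved
during planning stays associated with coef and grid after the call, so
the compiler has to assume that later calls may read and write them.

```fortran
module plan_stub
  ! Hypothetical stand-in for a planner/executor pair. NOT FFTW's API.
  implicit none
  private
  public :: make_plan, execute_plan
  integer, parameter :: dp = kind(1.0d0)
  ! Pointers stashed by make_plan and used later by execute_plan.
  complex(dp), pointer :: saved_in(:) => null()
  real(dp), pointer :: saved_out(:) => null()
contains
  subroutine make_plan(input, output)
    ! TARGET on assumed-shape dummies: pointers set here remain
    ! associated with the actual arguments if those are TARGET too.
    complex(dp), intent(inout), target :: input(:)
    real(dp), intent(inout), target :: output(:)
    saved_in => input
    saved_out => output
  end subroutine make_plan

  subroutine execute_plan()
    ! Placeholder for the transform: copy the real parts across.
    saved_out(1:size(saved_in)) = real(saved_in, dp)
  end subroutine execute_plan
end module plan_stub

program target_demo
  use plan_stub
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! TARGET tells the compiler these arrays may be reached via pointers.
  complex(dp), target :: coef(4)
  real(dp), target :: grid(4)
  real(dp) :: first(4), second(4)

  call make_plan(coef, grid)

  coef = (1.0_dp, 0.0_dp)   ! corresponds to line A in the report
  call execute_plan()       ! AA
  first = grid

  coef = (2.0_dp, 0.0_dp)   ! corresponds to line B in the report
  call execute_plan()       ! BB
  second = grid

  ! first(1) holds the value stored at A, second(1) the value stored at B.
  print *, first(1), second(1)
end program target_demo
```

Whether this carries over to the real dfftw_ routines depends on how
they are declared, which is why the declaration asked for above matters.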