Jason
I am sorry for my late reply. For my particular case, following Jeff's advice gave me the results I wanted. By the way, thanks, Jeff, for the link; it was quite useful. I compiled my serial and parallel programs with -fp-model precise -fp-model source (on Linux or OS X) to improve the consistency and reproducibility of floating-point results while limiting the impact on performance. My program is quite large, so I preferred this option because there are certainly other problems beyond reassociation alone.

Oscar Mojica
Geologist Ph.D. in Geophysics
SENAI CIMATEC Supercomputing Center
Lattes: http://lattes.cnpq.br/0796232840554652

________________________________
From: users <users-boun...@lists.open-mpi.org> on behalf of Jason Maldonis <maldo...@wisc.edu>
Sent: Wednesday, January 18, 2017 6:07 PM
To: Open MPI Users
Subject: Re: [OMPI users] Rounding errors and MPI

Hi Oscar,

I have similar issues that I was never able to fully track down in my code, but I think you have just identified the real problem. If you figure out the correct options, could you please let me know here? Compiler optimizations are important for our code, but if we can solve this issue with a compile option, that would be huge!

Thank you for sharing this,
Jason

Jason Maldonis
Research Assistant of Professor Paul Voyles
Materials Science Grad Student
University of Wisconsin, Madison
1509 University Ave, Rm 202
Madison, WI 53706
maldo...@wisc.edu

On Wed, Jan 18, 2017 at 1:38 PM, Jeff Hammond <jeff.scie...@gmail.com> wrote:

If compiling with -O0 solves the problem, then you should use -assume protect-parens and/or one of the options discussed in the PDF you will find at https://software.intel.com/en-us/articles/consistency-of-floating-point-results-using-the-intel-compiler. Disabling optimization is a heavy hammer that you don't want to use if you care about performance at all. If you are using Fortran and MPI, it seems likely that you care about performance.

Jeff

On Mon, Jan 16, 2017 at 8:31 AM, Oscar Mojica <o_moji...@hotmail.com> wrote:

Thanks, guys, for your answers. Actually, the optimization was not disabled, and that was the problem; compiling with -O0 solves it. Sorry.

Oscar Mojica
Geologist Ph.D. in Geophysics
SENAI CIMATEC Supercomputing Center
Lattes: http://lattes.cnpq.br/0796232840554652

________________________________
From: users <users-boun...@lists.open-mpi.org> on behalf of Yann Jobic <yann.jo...@univ-amu.fr>
Sent: Monday, January 16, 2017 12:01 PM
To: Open MPI Users
Subject: Re: [OMPI users] Rounding errors and MPI

Hi,

Is there an overlapping section in the MPI part? Otherwise, please check:
- the declared types of all the variables (consistency)
- correct initialization of the array "wave" (to zero)
- maybe use temporary variables, like

    real size1, size2, factor
    size1 = dx + dy
    size2 = dhx + dhy
    factor = dt*size2/(size1**2)

and then, in the big loop:

    wave(it,j,k) = wave(it,j,k)*factor

The code will also run faster.

Yann
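Assembled into a complete subroutine, Yann's suggestion might look like the minimal sketch below. The argument list, declarations, and loop bounds (nt, nhx, nz, k1, k2, and the shape of wave) are assumptions based on the loop quoted in Oscar's original message further down, not taken from the real code:

    subroutine scale_wave(wave, nt, nhx, nz, k1, k2, dt, dx, dy, dhx, dhy)
      implicit none
      integer, intent(in)    :: nt, nhx, nz, k1, k2
      real,    intent(in)    :: dt, dx, dy, dhx, dhy
      real,    intent(inout) :: wave(nt, nhx, nz)
      integer :: it, j, k
      real    :: factor

      ! Hoist the loop-invariant factor so each element update is a single
      ! multiply; this removes the per-element divides and the reassociation
      ! opportunities that value-unsafe optimizations may exploit differently
      ! depending on the loop bounds.
      factor = dt*(dhx + dhy)/((dx + dy)**2)

      do k = k1, k2
        do j = 1, nhx
          do it = 1, nt
            wave(it, j, k) = wave(it, j, k)*factor
          end do
        end do
      end do
    end subroutine scale_wave

Hoisting alone does not guarantee bitwise reproducibility across optimization levels, though; the -fp-model / -assume protect-parens options discussed above remain the robust fix.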
On 16/01/2017 at 14:28, Oscar Mojica wrote:

Hello everybody,

I'm having a problem with a parallel program written in Fortran. I have a 3D array that is split in two along the third dimension, so that two processes each perform some operations on part of the cube, using a subroutine. Each process also has the complete cube. Before each process calls the subroutine, I compare its sub-array with the corresponding part of the whole cube; they are identical. The subroutine simply performs point-wise operations in a loop, i.e.

    do k = k1, k2
      do j = 1, nhx
        do it = 1, nt
          wave(it,j,k) = wave(it,j,k)*dt/(dx+dy)*(dhx+dhy)/(dx+dy)
        end do
      end do
    end do

where wave is the 3D array and the other values are constants.

After leaving the subroutine, I notice that the values calculated by process 1 differ from the values I get if the whole cube is passed to the subroutine but the subroutine works only on that process's part, e.g.:

--- complete 2017-01-12 10:30:23.000000000 -0400
+++ half 2017-01-12 10:34:57.000000000 -0400
@@ -4132545,7 +4132545,7 @@
 -2.5386049E-04
 -2.9899486E-04
 -3.4697619E-04
- -3.7867704E-04
+ -3.7867710E-04
  0.0000000E+00
  0.0000000E+00
  0.0000000E+00

When I do this with more processes, the same thing happens on every process other than rank zero, which I find very strange. I am disabling optimization when compiling. In the end the results are visually the same, but not numerically identical. I am working in single precision.

Any idea what may be going on? I do not know whether this is related to MPI.

Oscar Mojica
Geologist Ph.D. in Geophysics
SENAI CIMATEC Supercomputing Center
Lattes: http://lattes.cnpq.br/0796232840554652

--
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/
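As a footnote to the thread: the last-digit discrepancy in Oscar's diff is characteristic of reassociation under value-unsafe optimization rather than of an MPI problem. The minimal, self-contained sketch below (with arbitrary illustrative constants, not values from Oscar's code) lets you check whether two algebraically equivalent orderings of the same single-precision expression give bitwise-identical results with your compiler and flags:

    program reassoc_demo
      implicit none
      real :: dt, dx, dy, dhx, dhy
      real :: a, b

      ! Arbitrary illustrative values, not taken from Oscar's code.
      dt  = 1.5e-3
      dx  = 12.5
      dy  = 13.7
      dhx = 7.3
      dhy = 9.1

      ! Expression exactly as written in the original loop body.
      a = dt/(dx+dy)*(dhx+dhy)/(dx+dy)

      ! An algebraically equivalent regrouping of the same expression,
      ! of the kind a compiler may introduce under value-unsafe
      ! optimization (e.g. -fp-model fast).
      b = dt*(dhx+dhy)/((dx+dy)**2)

      print *, 'as written :', a
      print *, 'regrouped  :', b
      print *, 'difference :', a - b
    end program reassoc_demo

Whether the printed difference is zero depends on the constants and on how each line is actually compiled; the point is that regroupings like this are what -fp-model precise / -fp-model source and -assume protect-parens constrain, which is why those options restored agreement in Oscar's case.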