Jason
I am sorry for my late reply. For my particular case, following Jeff's advice gave me the results I wanted. By the way, thanks, Jeff, for the link; it was quite useful. I compiled my serial and parallel programs with -fp-model precise -fp-model source (on Linux or OS X) to improve the consistency and reproducibility of floating-point results while limiting the impact on performance. My program is quite large, so I preferred this option because there are certainly other problems beyond reassociation alone.

Oscar Mojica
Geologist Ph.D. in Geophysics
SENAI CIMATEC Supercomputing Center
Lattes: http://lattes.cnpq.br/0796232840554652

________________________________
From: users <users-boun...@lists.open-mpi.org> on behalf of Jason Maldonis <maldo...@wisc.edu>
Sent: Wednesday, January 18, 2017 6:07 PM
To: Open MPI Users
Subject: Re: [OMPI users] Rounding errors and MPI

Hi Oscar,

I have similar issues that I was never able to fully track down in my code, but I think you have just identified the real problem. If you figure out the correct options, could you please let me know here? Compiler optimizations are important for our code, but if we can solve this issue with a compile option, that would be huge!

Thank you for sharing this,
Jason

Jason Maldonis
Research Assistant of Professor Paul Voyles
Materials Science Grad Student
University of Wisconsin, Madison
1509 University Ave, Rm 202
Madison, WI 53706
maldo...@wisc.edu

On Wed, Jan 18, 2017 at 1:38 PM, Jeff Hammond <jeff.scie...@gmail.com> wrote:

If compiling with -O0 solves the problem, then you should use -assume protect-parens and/or one of the options discussed in the PDF you will find at https://software.intel.com/en-us/articles/consistency-of-floating-point-results-using-the-intel-compiler. Disabling optimization is a heavy hammer that you don't want to use if you care about performance at all. If you are using Fortran and MPI, it seems likely that you care about performance.

Jeff

On Mon, Jan 16, 2017 at 8:31 AM, Oscar Mojica <o_moji...@hotmail.com> wrote:

Thanks, guys, for your answers. Actually, the optimization was not disabled, and that was the problem; compiling with -O0 solves it. Sorry.

Oscar Mojica
Geologist Ph.D. in Geophysics
SENAI CIMATEC Supercomputing Center
Lattes: http://lattes.cnpq.br/0796232840554652

________________________________
From: users <users-boun...@lists.open-mpi.org> on behalf of Yann Jobic <yann.jo...@univ-amu.fr>
Sent: Monday, January 16, 2017 12:01 PM
To: Open MPI Users
Subject: Re: [OMPI users] Rounding errors and MPI

Hi,

Is there an overlapping section in the MPI part? Otherwise, please check:
- the declared types of all the variables (consistency)
- correct initialization of the array "wave" (to zero)
- maybe use temporary variables, like

    real size1, size2, factor
    size1 = dx + dy
    size2 = dhx + dhy
    factor = dt*size2/(size1**2)

and then, in the big loop:

    wave(it,j,k) = wave(it,j,k)*factor

The code will also run faster.

Yann
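Assembled into a complete subroutine, Yann's suggestion might look like the minimal sketch below. The argument list, declarations, and loop bounds (nt, nhx, nz, k1, k2, and the shape of wave) are assumptions based on the loop quoted in Oscar's original message further down, not taken from the real code:

    subroutine scale_wave(wave, nt, nhx, nz, k1, k2, dt, dx, dy, dhx, dhy)
      implicit none
      integer, intent(in)    :: nt, nhx, nz, k1, k2
      real,    intent(in)    :: dt, dx, dy, dhx, dhy
      real,    intent(inout) :: wave(nt, nhx, nz)
      integer :: it, j, k
      real    :: factor

      ! Hoist the loop-invariant factor so each element update is a single
      ! multiply; this removes the per-element divides and the reassociation
      ! opportunities that value-unsafe optimizations may exploit differently
      ! depending on the loop bounds.
      factor = dt*(dhx + dhy)/((dx + dy)**2)

      do k = k1, k2
        do j = 1, nhx
          do it = 1, nt
            wave(it, j, k) = wave(it, j, k)*factor
          end do
        end do
      end do
    end subroutine scale_wave

Hoisting alone does not guarantee bitwise reproducibility across optimization levels, though; the -fp-model / -assume protect-parens options discussed above remain the robust fix.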
On 16/01/2017 at 14:28, Oscar Mojica wrote:

Hello everybody,

I'm having a problem with a parallel program written in Fortran. I have a 3D array that is split in two along the third dimension, so that two processes each perform some operations on part of the cube, using a subroutine. Each process also has the complete cube. Before each process calls the subroutine, I compare its sub-array with the corresponding part of the whole cube; they are identical. The subroutine simply performs point-wise operations in a loop, i.e.

    do k = k1, k2
      do j = 1, nhx
        do it = 1, nt
          wave(it,j,k) = wave(it,j,k)*dt/(dx+dy)*(dhx+dhy)/(dx+dy)
        end do
      end do
    end do

where wave is the 3D array and the other values are constants.

After leaving the subroutine, I notice that the values calculated by process 1 differ from the values I get if the whole cube is passed to the subroutine but the subroutine works only on that process's part, e.g.:

--- complete 2017-01-12 10:30:23.000000000 -0400
+++ half 2017-01-12 10:34:57.000000000 -0400
@@ -4132545,7 +4132545,7 @@
 -2.5386049E-04
 -2.9899486E-04
 -3.4697619E-04
- -3.7867704E-04
+ -3.7867710E-04
  0.0000000E+00
  0.0000000E+00
  0.0000000E+00

When I do this with more processes, the same thing happens on every process other than rank zero, which I find very strange. I am disabling optimization when compiling. In the end the results are visually the same, but not numerically identical. I am working in single precision.

Any idea what may be going on? I do not know whether this is related to MPI.

Oscar Mojica
Geologist Ph.D. in Geophysics
SENAI CIMATEC Supercomputing Center
Lattes: http://lattes.cnpq.br/0796232840554652

--
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/
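As a footnote to the thread: the last-digit discrepancy in Oscar's diff is characteristic of reassociation under value-unsafe optimization rather than of an MPI problem. The minimal, self-contained sketch below (with arbitrary illustrative constants, not values from Oscar's code) lets you check whether two algebraically equivalent orderings of the same single-precision expression give bitwise-identical results with your compiler and flags:

    program reassoc_demo
      implicit none
      real :: dt, dx, dy, dhx, dhy
      real :: a, b

      ! Arbitrary illustrative values, not taken from Oscar's code.
      dt  = 1.5e-3
      dx  = 12.5
      dy  = 13.7
      dhx = 7.3
      dhy = 9.1

      ! Expression exactly as written in the original loop body.
      a = dt/(dx+dy)*(dhx+dhy)/(dx+dy)

      ! An algebraically equivalent regrouping of the same expression,
      ! of the kind a compiler may introduce under value-unsafe
      ! optimization (e.g. -fp-model fast).
      b = dt*(dhx+dhy)/((dx+dy)**2)

      print *, 'as written :', a
      print *, 'regrouped  :', b
      print *, 'difference :', a - b
    end program reassoc_demo

Whether the printed difference is zero depends on the constants and on how each line is actually compiled; the point is that regroupings like this are what -fp-model precise / -fp-model source and -assume protect-parens constrain, which is why those options restored agreement in Oscar's case.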