[OMPI users] Rounding errors and MPI

2017-01-16 Thread Oscar Mojica
Hello everybody

I'm having a problem with a parallel program written in Fortran. I have a 3D 
array which is divided in two along the third dimension, so that two processes 
each perform some operations on their part of the cube, using a subroutine. Each 
process also has the complete cube. Before each process calls the subroutine, 
I compare its sub-array with the corresponding part of the whole cube; they are 
identical. The subroutine simply performs point-wise operations in a loop, i.e.


 do k = k1, k2
   do j = 1, nhx
     do it = 1, nt
       wave(it,j,k) = wave(it,j,k)*dt/(dx+dy)*(dhx+dhy)/(dx+dy)
     end do
   end do
 end do


where wave is the 3D array and the other values are constants.
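
To give an idea of how the work is split: each rank handles a contiguous range 
k1:k2 of the third dimension. A minimal sketch of that decomposition is below; 
the sizes, the even division between ranks and the program itself are made up 
for illustration, they are not my actual code.

 program split_sketch
   use mpi
   implicit none
   integer, parameter :: nt = 4, nhx = 4, nz = 8   ! made-up sizes
   real :: wave(nt, nhx, nz)
   integer :: ierr, myrank, nranks, chunk, k1, k2

   call MPI_Init(ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, myrank, ierr)
   call MPI_Comm_size(MPI_COMM_WORLD, nranks, ierr)

   wave = 1.0
   chunk = nz / nranks              ! assumes nz divides evenly among ranks
   k1 = myrank*chunk + 1            ! this rank only updates wave(:,:,k1:k2)
   k2 = (myrank + 1)*chunk
   print *, 'rank', myrank, 'owns k =', k1, 'to', k2

   call MPI_Finalize(ierr)
 end program split_sketch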


After leaving the subroutine I notice that there is a difference between the 
values calculated by process 1 and the values I get if the whole cube is passed 
to the subroutine but it only works on that process's part, i.e.


--- complete  2017-01-12 10:30:23.0 -0400
+++ half      2017-01-12 10:34:57.0 -0400
@@ -4132545,7 +4132545,7 @@
   -2.5386049E-04
   -2.9899486E-04
   -3.4697619E-04
-  -3.7867704E-04
+ -3.7867710E-04
0.000E+00
0.000E+00
0.000E+00



When I do this with more processes the same thing happens on all processes 
other than zero. I find it very strange. I am disabling optimization when 
compiling.

In the end the results are visually the same, but not numerically. I am working 
with single precision.


Any idea what may be going on? I do not know if this is related to MPI.



Oscar Mojica
Geologist Ph.D. in Geophysics
SENAI CIMATEC Supercomputing Center
Lattes: http://lattes.cnpq.br/0796232840554652


Re: [OMPI users] Rounding errors and MPI

2017-01-16 Thread Tim Prince via users
You might try inserting parentheses so as to specify your preferred order of 
evaluation. If using ifort, you would need -assume protect_parens.
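
For example, something along these lines (just a sketch of one possible 
grouping, not your actual routine; the argument list is guessed from your loop):

 ! One possible explicit grouping of the same update, so that every run
 ! evaluates it in the same order.  With ifort, compile with
 ! -assume protect_parens so optimization does not re-associate across
 ! the parentheses.
 subroutine scale_wave(wave, nt, nhx, nz, k1, k2, dt, dx, dy, dhx, dhy)
   implicit none
   integer, intent(in) :: nt, nhx, nz, k1, k2
   real, intent(in)    :: dt, dx, dy, dhx, dhy
   real, intent(inout) :: wave(nt, nhx, nz)
   integer :: it, j, k

   do k = k1, k2
     do j = 1, nhx
       do it = 1, nt
         wave(it,j,k) = wave(it,j,k) * ((dt*(dhx+dhy)) / ((dx+dy)*(dx+dy)))
       end do
     end do
   end do
 end subroutine scale_wave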

Sent via the ASUS PadFone X mini, an AT&T 4G LTE smartphone


Re: [OMPI users] Rounding errors and MPI

2017-01-16 Thread Yann Jobic

Hi,

Is there an overlapping section in the MPI part?

Otherwise, please check:
- declaration type of all the variables (consistency)
- correct initialization of the array "wave" (to zero)
- maybe use temporary variables like
    real size1, size2, factor
    size1 = dx + dy
    size2 = dhx + dhy
    factor = dt*size2/(size1**2)
  and then in the big loop:
    wave(it,j,k) = wave(it,j,k)*factor
  The code will also run faster.

Yann


Re: [OMPI users] Rounding errors and MPI

2017-01-16 Thread Oscar Mojica
Thanks guys for your answers.


Actually, optimization was not disabled, and that was the problem; compiling 
with -O0 solves it. Sorry.
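
For the record, a tiny standalone example of why this happens (made-up values, 
not my real code): the same single-precision expression, grouped two ways, can 
differ in the last bits, which is the kind of difference shown in the diff above.

 program reassoc_demo
   implicit none
   real :: x, dt, dx, dy, dhx, dhy, a, b

   ! Arbitrary single-precision values, for illustration only.
   x   = -3.7867704e-4
   dt  = 1.0e-3
   dx  = 12.5
   dy  = 12.5
   dhx = 7.3
   dhy = 7.7

   ! In exact arithmetic a == b, but the intermediate roundings differ,
   ! so the last bits may not match.
   a = x*dt/(dx+dy)*(dhx+dhy)/(dx+dy)
   b = x*((dt*(dhx+dhy))/((dx+dy)*(dx+dy)))

   print *, 'a =', a
   print *, 'b =', b
   print *, 'a - b =', a - b
 end program reassoc_demo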


Oscar Mojica
Geologist Ph.D. in Geophysics
SENAI CIMATEC Supercomputing Center
Lattes: http://lattes.cnpq.br/0796232840554652





[OMPI users] Library interposing works with 1.6.5 but not with 2.0.1 (Fortran)?

2017-01-16 Thread Alef Farah
Hi,

I contribute to a tracing library which uses PMPI. It's loaded with
LD_PRELOAD so it interposes libmpi, intercepting MPI_ calls. Since we
upgraded from OpenMPI 1.6.5 to OpenMPI 2.0.1 it seems to have stopped
intercepting calls from Fortran applications, although it continues
to work with C applications. For instance, adding breakpoints to MPI
calls with gdb in a certain Fortran application, one gets:

Breakpoint 1, 0x7794bae0 in PMPI_Init () from
/home/afh/install/openmpi-2.0.1/b/lib/libmpi.so.20.0.1
(gdb) where
#0  0x7794bae0 in PMPI_Init () from
/home/afh/install/openmpi-2.0.1/b/lib/libmpi.so.20.0.1
#1  0x7729d638 in pmpi_init__ () from
/home/afh/install/openmpi-2.0.1/b/lib/libmpi_mpifh.so.20
#2  0x00401197 in MAIN__ ()
#3  0x0040210f in main ()

This is with 2.0.1; notice there is no sign of the tracing library,
whereas with 1.6.5 it works as intended:

Breakpoint 1, 0x77bd3c34 in MPI_Init ()
from /home/afh/svn/akypuera/b/lib/libaky.so
(gdb) where
#0  0x77bd3c34 in MPI_Init ()
from /home/afh/svn/akypuera/b/lib/libaky.so
#1  0x7763f218 in pmpi_init__ () from /usr/lib/libmpi_f77.so.1
#2  0x004010d7 in MAIN__ ()
#3  0x0040204f in main ()

It seems that with OpenMPI 1.6.5 libmpi_f77 is used, whereas with 2.0.1
libmpi_mpifh is used and the calls are not intercepted for some reason.
Any ideas? The only change made to the library's code was matching MPI's
C API changes (added const qualifiers to read-only buffers), so I don't
think that had anything to do with it.
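
For reference, the Fortran side can be reproduced with something as small as 
the program below (a trivial test case, not the attached benchmark); running it 
under mpirun with the tracing library in LD_PRELOAD shows whether MPI_Init is 
intercepted.

 program interpose_check
   use mpi
   implicit none
   integer :: ierr, myrank

   ! With 1.6.5 + LD_PRELOAD this call is caught by the tracing library's
   ! MPI_Init; with 2.0.1 the Fortran wrapper goes straight to PMPI_Init,
   ! as the backtraces above show.
   call MPI_Init(ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, myrank, ierr)
   if (myrank == 0) print *, 'MPI initialized'
   call MPI_Finalize(ierr)
 end program interpose_check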

Links to the application (NAS EP benchmark) and the tracing library can
be found attached, as well as the output of ldd for various
configurations, and config.log for my OpenMPI 2.0.1 build.


attachments.tar.xz
Description: application/xz