[OMPI users] Interaction between Intel and OpenMPI floating point exceptions

2009-04-06 Thread Steve Lowder
Recently I've been running an MPI code that uses the LAPACK slamch 
routine to determine machine precision parameters.  This software is 
compiled with the latest Intel Fortran compiler, using the -fpe0 
option to trap certain floating point errors.  The slamch routine 
crashed and printed an OpenMPI stack trace reporting an underflow 
error, even though the Intel -fpe0 setting doesn't abort on 
underflow.  When this software is not compiled and linked with OpenMPI, 
it ignores the underflow and doesn't abort when compiled with -fpe0.


When I run the MPI version and set --mca opal_signal 6,7,11, the code 
doesn't abort on underflow.  I'd like to know if I'm interpreting this 
behavior correctly: it appears that the MPI and non-MPI cases handle 
underflow differently.  I'm assuming OpenMPI has a handler that processes 
the interrupts ahead of the Fortran RTL, stopping execution.  Otherwise 
the Fortran RTL handler would just ignore the underflow.  Do I sort of 
understand what is going on here?  Is there another solution short of 
the --mca opal_signal switch?


thanks
Steve


Re: [OMPI users] Interaction between Intel and OpenMPI floating point exceptions

2009-04-07 Thread Steve Lowder

Iain,
Thanks for the reply.  Yours sounds like a good suggestion for 
working around this.

Steve


Iain Bason wrote:


On Apr 6, 2009, at 7:22 PM, Steve Lowder wrote:

Recently I've been running an MPI code that uses the LAPACK slamch 
routine to determine machine precision parameters.  This software is 
compiled with the latest Intel Fortran compiler, using the -fpe0 
option to trap certain floating point errors.  The slamch routine 
crashed and printed an OpenMPI stack trace reporting an underflow 
error, even though the Intel -fpe0 setting doesn't abort on 
underflow.  When this software is not compiled and linked with 
OpenMPI, it ignores the underflow and doesn't abort when compiled 
with -fpe0.


When I run the MPI version and set --mca opal_signal 6,7,11, the code 
doesn't abort on underflow.  I'd like to know if I'm interpreting 
this behavior correctly: it appears that the MPI and non-MPI cases 
handle underflow differently.  I'm assuming OpenMPI has a handler that 
processes the interrupts ahead of the Fortran RTL, stopping 
execution.  Otherwise the Fortran RTL handler would just ignore the 
underflow.  Do I sort of understand what is going on here?  Is there 
another solution short of the --mca opal_signal switch?


Your analysis sounds about right to me.  There are Fortran intrinsic 
routines that can get those machine precision parameters instead of 
slamch.  Would it be feasible to modify the code to use them?


Iain
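
As a hedged illustration of the suggestion above: the standard Fortran 
inquiry intrinsics below report the same kind of machine parameters that 
slamch computes.  This is a minimal sketch, not code from the thread.  The 
program name is made up, and the mapping to individual slamch options is 
only approximate (for example, slamch('E') can differ from epsilon() by a 
factor of two, depending on the rounding convention the LAPACK version 
assumes).

```fortran
program machine_params
  implicit none

  ! Inquiry intrinsics depend only on the type and kind of their
  ! argument, not its value.  A default-real literal is used here to
  ! match the single precision "s" in slamch.
  print *, 'epsilon     =', epsilon(1.0)      ! spacing of reals near 1.0
  print *, 'tiny        =', tiny(1.0)         ! smallest positive normal number
  print *, 'huge        =', huge(1.0)         ! largest finite number
  print *, 'radix       =', radix(1.0)        ! base of the number model
  print *, 'digits      =', digits(1.0)       ! significant digits in that base
  print *, 'minexponent =', minexponent(1.0)  ! minimum model exponent
  print *, 'maxexponent =', maxexponent(1.0)  ! maximum model exponent
end program machine_params
```

Because these are inquiry functions that the compiler can resolve from the 
type alone, they should not need the run-time arithmetic that older slamch 
implementations use to probe for the underflow threshold, so there should 
be no underflow for a signal handler to catch.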
