On Apr 6, 2009, at 7:22 PM, Steve Lowder wrote:
Recently I've been running an MPI code that uses the LAPACK slamch
routine to determine machine precision parameters. This software is
compiled using the latest Intel Fortran compiler with the -fpe0
option set to trap certain floating-point errors. The
slamch routines crashed and printed an OpenMPI stack trace reporting
an underflow error; however, the Intel -fpe0 setting doesn't abort on
underflow. When this software is not compiled and linked with
OpenMPI, it ignores the underflow and doesn't abort when compiled
with -fpe0.
When I run the MPI version and set --mca opal_signal 6,7,11, the code
doesn't abort on underflow. I'd like to know if I'm interpreting
this behavior correctly: it appears that the MPI and non-MPI cases
handle underflow differently. I'm assuming OpenMPI has a handler
that processes the interrupts ahead of the Fortran RTL, stopping
execution. Otherwise the Fortran RTL handler would just ignore the
underflow. Do I sort of understand what is going on here? Is there
another solution short of the --mca opal_signal switch?

Your analysis sounds about right to me. Fortran has intrinsic
functions (epsilon, tiny, huge, and so on) that return those machine
precision parameters without calling slamch. Would it be feasible to
modify the code to use them?
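
As a rough sketch of what that could look like, the program below prints
the single-precision parameters using only standard Fortran 90 inquiry
intrinsics. The program name and the variable x are just placeholders, and
the values only roughly correspond to what slamch returns (its eps, for
instance, may differ from epsilon() by a factor of the radix depending on
the rounding assumption), so compare against your slamch results before
swapping them in.

```fortran
program machine_params
  implicit none
  ! Only the type and kind of x matter to the inquiry intrinsics;
  ! its value is never read. Default real matches slamch's precision.
  real :: x

  print *, 'epsilon     =', epsilon(x)      ! spacing between 1.0 and the next larger value
  print *, 'tiny        =', tiny(x)         ! smallest positive normalized number
  print *, 'huge        =', huge(x)         ! largest finite number
  print *, 'radix       =', radix(x)        ! base of the floating-point model
  print *, 'digits      =', digits(x)       ! number of base-radix digits in the mantissa
  print *, 'minexponent =', minexponent(x)  ! minimum model exponent
  print *, 'maxexponent =', maxexponent(x)  ! maximum model exponent
end program machine_params
```

Because these are inquiry functions evaluated from the type's numeric
model, they should not need to perform the iterative arithmetic that an
older slamch uses to probe the machine, which is where the underflow
would be raised.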
Iain