I just tried running "hello_f90.f90" and see the same behavior: 100% CPU usage, 
gradually increasing memory consumption, and failure to get past mpi_finalize. 
LD_LIBRARY_PATH is set as:

                /tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/openmpi-1.6.5/lib

The installation target for this version of OpenMPI is:

                /tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/openmpi-1.6.5

1045 fischega@lxlogin2[/data/fischega/petsc_configure/mpi_test/simple]> which mpirun
/tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/openmpi-1.6.5/bin/mpirun

Perhaps something strange is happening with GCC? I've tried simple (non-MPI) hello world C and Fortran programs, and they compile and run normally.
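
For reference, the kind of minimal MPI test I mean looks roughly like the following (a sketch along the lines of the hello_f90.f90 example, not the exact file shipped with OMPI):

      program hello
      ! Minimal MPI hello-world sketch: init, query rank/size, print,
      ! finalize. The hang described above shows up at mpi_finalize.
      use mpi
      implicit none
      integer :: rank, nprocs, ierr

      call mpi_init(ierr)
      call mpi_comm_rank(MPI_COMM_WORLD, rank, ierr)
      call mpi_comm_size(MPI_COMM_WORLD, nprocs, ierr)
      print *, 'Hello from rank', rank, 'of', nprocs
      call mpi_finalize(ierr)
      end program hello

Built with mpif90 and launched via "mpirun -np 1", a program like this prints its greeting and then never gets past mpi_finalize here, while CPU and memory use climb.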

From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
Sent: Sunday, January 19, 2014 11:36 AM
To: Open MPI Users
Subject: Re: [OMPI users] simple test problem hangs on mpi_finalize and 
consumes all system resources

The OFED warning about registration is something OMPI added at one point, when we isolated limited registration as a cause of jobs occasionally hanging, so you won't see that warning from other MPIs or from earlier versions of OMPI (I forget exactly when we added it).

The problem you describe doesn't sound like an OMPI issue - it sounds like 
you've got a memory corruption problem in the code. Have you tried running the 
examples in our example directory to confirm that the installation is good?

Also, check to ensure that your LD_LIBRARY_PATH is correctly set to pick up the OMPI libs you installed - most Linux distros come with an older version, and that can cause problems if you inadvertently pick them up.


On Jan 19, 2014, at 5:51 AM, Fischer, Greg A. <fisch...@westinghouse.com> wrote:


Hello,

I have a simple, 1-process test case that gets stuck on the mpi_finalize call. 
The test case is a dead-simple calculation of pi - 50 lines of Fortran. The 
process gradually consumes more and more memory until the system becomes 
unresponsive and needs to be rebooted, unless the job is killed first.
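
The core of the test looks roughly like this (a from-memory sketch, not the actual 50 lines - names and numeric details are illustrative):

      program calc_pi
      ! Sketch of the pi test: midpoint-rule integration of 4/(1+x**2)
      ! over [0,1], reduced to rank 0, then mpi_finalize.
      use mpi
      implicit none
      integer, parameter :: n = 1000000
      integer :: rank, nprocs, ierr, i
      double precision :: h, x, local_sum, pi

      call mpi_init(ierr)
      call mpi_comm_rank(MPI_COMM_WORLD, rank, ierr)
      call mpi_comm_size(MPI_COMM_WORLD, nprocs, ierr)

      h = 1.0d0 / dble(n)
      local_sum = 0.0d0
      do i = rank + 1, n, nprocs
         x = h * (dble(i) - 0.5d0)
         local_sum = local_sum + 4.0d0 / (1.0d0 + x*x)
      end do
      local_sum = h * local_sum

      call mpi_reduce(local_sum, pi, 1, MPI_DOUBLE_PRECISION, MPI_SUM, &
                      0, MPI_COMM_WORLD, ierr)
      if (rank == 0) print *, 'pi is approximately', pi

      call mpi_finalize(ierr)   ! execution never gets past this call
      end program calc_pi

Nothing exotic: mpi_init, a local loop, one mpi_reduce, and mpi_finalize - and it is the mpi_finalize call that never returns.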

In the attached output, I see the warning about OpenFabrics being configured to allow registering only part of physical memory. I've tried to chase this down with my administrator, so far to no avail. (I am aware of the relevant FAQ entry.) A different MPI installation on the same system, built with a different compiler, does not produce the OpenFabrics memory registration warning - which seems strange, because I thought this was a system configuration issue independent of MPI. Also curious in the output: LSF seems to think there are 7 processes and 11 threads associated with this job.

The particulars of my configuration are attached and detailed below. Does 
anyone see anything potentially problematic?

Thanks,
Greg

OpenMPI Version: 1.6.5
Compiler: GCC 4.6.1
OS: SuSE Linux Enterprise Server 10, Patchlevel 2

uname -a : Linux lxlogin2 2.6.16.60-0.21-smp #1 SMP Tue May 6 12:41:02 UTC 2008 
x86_64 x86_64 x86_64 GNU/Linux

LD_LIBRARY_PATH=/tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/openmpi-1.6.5/lib:/tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/gcc-4.6.1/lib64:/tools/lsf/7.0.6.EC/7.0/linux2.6-glibc2.3-x86_64/lib

PATH=/tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/python-2.7.6/bin:/tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/openmpi-1.6.5/bin:/tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/gcc-4.6.1/bin:/tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/git-1.7.0.4/bin:/tools/casl_sles10/vera_clean/gcc-4.6.1/toolset/cmake-2.8.11.2/bin:/tools/lsf/7.0.6.EC/7.0/linux2.6-glibc2.3-x86_64/etc:/tools/lsf/7.0.6.EC/7.0/linux2.6-glibc2.3-x86_64/bin:/usr/bin:.:/bin:/usr/scripts

Execution command: (executed via LSF - effectively "mpirun -np 1 test_program")
<output.txt><config.log.bz2><ompi_info.bz2>