OK, with Jeff's kind help, I solved this issue in a very simple way. 
Now I would like to report back the reason for this issue and the 
solution.

(1) The scenario under which this issue happened:

In my Open MPI environment, the $TMPDIR environment variable is 
set to a different scratch directory for each MPI process, even 
when some of the MPI processes are running on the same host. 
This is not a problem if we use the openib, self, and tcp BTLs 
for communication. However, if we use the sm BTL, then, as Jeff 
said:

"""
Open MPI creates its shared memory files in $TMPDIR. It implicitly 
expects all shared memory files to be found under the same 
$TMPDIR for all procs on a single machine.  

More specifically, Open MPI creates what we call a "session 
directory" under $TMPDIR that is an implicit rendezvous point for all 
processes on the same machine.  Some meta data is put in there, 
to include the shared memory mmap files.

So if the different processes have a different idea of where the
rendezvous session directory exists, they'll end up blocking waiting 
for others to show up at their (individual) rendezvous points... but 
that will never happen, because each process is waiting at their 
own rendezvous point.

"""

So in this case, each MPI process blocks in the shared-memory 
rendezvous, waiting for the other processes on the host to show 
up at its own session directory. Since they never do, the wait 
is never released, hence the hang in the MPI_Init call.
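
For anyone who wants to confirm this is what is happening, here 
is a minimal diagnostic sketch (the file name check_tmpdir.c and 
the mpirun options are only examples I made up) that prints each 
rank's host name and $TMPDIR, so you can see whether processes 
on the same node really disagree. Run it with the sm BTL 
disabled, e.g. "mpirun -mca btl tcp,self ./check_tmpdir", so 
that MPI_Init itself does not hang while you are checking:

/* check_tmpdir.c - print each rank's host and $TMPDIR */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, len;
    char host[MPI_MAX_PROCESSOR_NAME];
    const char *tmpdir;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(host, &len);

    tmpdir = getenv("TMPDIR");
    /* Ranks on the same host should print the same TMPDIR; if
     * they do not, the sm rendezvous described above cannot
     * succeed. */
    printf("rank %d on %s: TMPDIR=%s\n", rank, host,
           tmpdir ? tmpdir : "(unset)");

    MPI_Finalize();
    return 0;
}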

(2) Solution to this issue:

You may set $TMPDIR to the same directory for all MPI processes 
on the same host, if possible; or you can set OMPI_PREFIX_ENV 
to a common directory for the MPI processes on the same host 
while keeping your $TMPDIR setting. Either way is verified and 
works fine for me!
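
If changing the shell environment is inconvenient, another 
variant that I have NOT verified myself (so treat it only as a 
sketch; whether OMPI_PREFIX_ENV is honored this way may depend 
on your Open MPI version) would be to set the variable from the 
program itself, before MPI_Init, so that every process on a host 
agrees on the rendezvous directory:

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    /* "/tmp/ompi-common" is only an example path; it must exist,
     * be writable, and be the same for all processes on a given
     * host.  Untested sketch - the shell-level settings above are
     * what I actually verified. */
    setenv("OMPI_PREFIX_ENV", "/tmp/ompi-common", 1);

    MPI_Init(&argc, &argv);
    /* ... application code ... */
    MPI_Finalize();
    return 0;
}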

Thanks,
Yiguang
