Hi,

The only example that works is hello_c.c. The others (connectivity_c.c and 
ring_c.c, which use MPI_Send and MPI_Recv) block after the first MPI_Send / 
MPI_Recv: the first Send/Receive pair completes on all processes, but 
subsequent Send/Receive pairs block. My SLURM version is 2.1.0. It is also 
worth mentioning that all examples work when not using SLURM (launching 
with "mpirun -np 5 <example_app>"). Blocking occurs only when I try to run 
on multiple hosts with SLURM ("salloc -N5 mpirun <example_app>").

Adrian


________________________________
From: Jeff Squyres <jsquy...@cisco.com>
To: adrian sabou <adrian.sa...@yahoo.com>; Open MPI Users <us...@open-mpi.org>
Sent: Wednesday, February 1, 2012 10:32 PM
Subject: Re: [OMPI users] OpenMPI / SLURM -> Send/Recv blocking
 
On Jan 31, 2012, at 11:16 AM, adrian sabou wrote:

> Like I said, a very simple program.
> When launching this application with SLURM (using "salloc -N2 mpirun 
> ./<my_app>"), it hangs at the barrier.

Are you able to run the MPI example programs in examples/ ?

> However, it passes the barrier if I launch it without SLURM (using "mpirun 
> -np 2 ./<my_app>"). I first noticed this problem when my application hung 
> as I tried to send two successive messages from one process to another. Only 
> the first MPI_Send would work. The second MPI_Send would block indefinitely. 
> I was wondering whether any of you have encountered a similar problem, or may 
> have an idea as to what is causing the Send/Receive pair to block when using 
> SLURM. The exact output in my console is as follows:
>  
>         salloc: Granted job allocation 1138
>         Process 0 - Sending...
>         Process 1 - Receiving...
>         Process 1 - Received.
>         Process 1 - Barrier reached.
>         Process 0 - Sent.
>         Process 0 - Barrier reached.
>         (it just hangs here)
>  
> I am new to MPI programming and to OpenMPI and would greatly appreciate any 
> help. My OpenMPI version is 1.4.4 (although I have also tried it on 1.5.4), 
> my SLURM version is 0.3.3-1 (slurm-llnl 2.1.0-1),

I'm not sure what SLURM version that is -- my "srun --version" shows 2.2.4.  
0.3.3 would be pretty ancient, no?

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
