[OMPI users] Troubles using MPI_Isend/MPI_Irecv/MPI_Waitany and MPI_Allreduce

2011-10-20 Thread Pedro Gonnet
Hi all, I am currently working on a multi-threaded hybrid parallel simulation which uses both pthreads and OpenMPI. The simulation uses several pthreads per MPI node. My code uses the nonblocking routines MPI_Isend/MPI_Irecv/MPI_Waitany quite successfully to implement the node-to-node communication…
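
[Editor's note: a minimal sketch of the pattern Pedro describes, a nonblocking point-to-point exchange drained with MPI_Waitany. This is not his actual code; the neighbour ranks, buffer sizes, and tags are invented for illustration.]

/* Sketch only: each rank posts nonblocking receives and sends to two
 * invented neighbours and then drains them with MPI_Waitany. */
#include <mpi.h>
#include <stdio.h>

#define NBUF 64

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int left  = (rank - 1 + size) % size;   /* hypothetical neighbours */
    int right = (rank + 1) % size;

    double sendbuf[2][NBUF], recvbuf[2][NBUF];
    MPI_Request reqs[4];

    for (int i = 0; i < NBUF; i++)
        sendbuf[0][i] = sendbuf[1][i] = rank;

    /* Post all nonblocking operations up front. */
    MPI_Irecv(recvbuf[0], NBUF, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Irecv(recvbuf[1], NBUF, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);
    MPI_Isend(sendbuf[0], NBUF, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[2]);
    MPI_Isend(sendbuf[1], NBUF, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[3]);

    /* Drain the requests one at a time; process each message as it lands. */
    for (int done = 0; done < 4; done++) {
        int idx;
        MPI_Status status;
        MPI_Waitany(4, reqs, &idx, &status);
        if (idx < 2)   /* a receive completed; recvbuf[idx] is now usable */
            printf("rank %d: message %d arrived from %d\n",
                   rank, idx, status.MPI_SOURCE);
    }

    MPI_Finalize();
    return 0;
}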

Re: [OMPI users] Troubles using MPI_Isend/MPI_Irecv/MPI_Waitany and MPI_Allreduce

2011-10-20 Thread Pedro Gonnet
Short update: I just installed version 1.4.4 from source (compiled with --enable-mpi-threads), and the problem persists. I should also point out that if, in thread (ii), I wait for the nonblocking communication in thread (i) to finish, nothing bad happens. But this makes the nonblocking communication…
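
[Editor's note: since the code calls MPI from several pthreads, the initialization side matters too. The sketch below, which is an assumption about how such a code would start up rather than the poster's code, shows the usual check that the library actually grants MPI_THREAD_MULTIPLE; a build with --enable-mpi-threads is necessary but not sufficient on its own.]

/* Sketch only: request full thread support and verify what was provided. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "MPI_THREAD_MULTIPLE not available (got level %d)\n",
                provided);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    /* ... spawn pthreads that call MPI_Isend/MPI_Irecv/MPI_Waitany ... */
    MPI_Finalize();
    return 0;
}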

Re: [OMPI users] MPI_Waitany segfaults or (maybe) hangs

2011-10-20 Thread Jeff Squyres
Sorry for the delay in replying. Unfortunately, the "uninitialized values" kinds of warnings from valgrind are to be expected when using the OFED stack. Specifically, a bunch of memory in an OMPI process comes directly from OS-bypass kinds of mechanisms, which effectively translates into valgrind…

Re: [OMPI users] Application in a cluster

2011-10-20 Thread Jorge Jaramillo
Thanks for all your suggestions. Yes, indeed, what I'm trying to do is execute a serial program. All the documentation you mentioned was pretty useful. I have another question: if mpirun launches several copies of the program on the different hosts, does it mean that I must have a copy of the program…

Re: [OMPI users] Application in a cluster

2011-10-20 Thread Ralph Castain
On Oct 20, 2011, at 10:33 AM, Jorge Jaramillo wrote: > Thanks for all your suggestions. > Yes, indeed what I'm trying to do is execute a serial program. All the documentation you mention was pretty useful. > I have another question: if mpirun launches several copies of the program on the…

Re: [OMPI users] MPI_Waitany segfaults or (maybe) hangs

2011-10-20 Thread Francesco Salvadore
Dear Jeff, thanks for replying and for providing MPI implementation details. As you say, the possible problem is a subtle memory bug. In our code, MPI communications are limited to a few subroutines named cutman_ and sharing a similar structure involving a possible large number (1000 or…
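
[Editor's note: the cutman_* routines are not shown in the message, so the following is a hypothetical sketch of the structure Francesco describes, namely a potentially large array of requests posted in one routine and drained with MPI_Waitany. Function name, counts, tags, and partner ranks are placeholders.]

/* Hypothetical sketch of one such communication routine. */
#include <mpi.h>
#include <stdlib.h>

void exchange_patches(double **send, double **recv, int *counts,
                      int *partners, int npatches, MPI_Comm comm)
{
    /* Two requests per patch: one receive, one send. */
    MPI_Request *reqs = malloc(2 * npatches * sizeof(MPI_Request));

    /* Defensive: MPI_Waitany skips MPI_REQUEST_NULL entries, so
     * pre-filling the array makes an unfilled slot harmless. */
    for (int i = 0; i < 2 * npatches; i++)
        reqs[i] = MPI_REQUEST_NULL;

    for (int p = 0; p < npatches; p++) {
        MPI_Irecv(recv[p], counts[p], MPI_DOUBLE, partners[p], p,
                  comm, &reqs[2 * p]);
        MPI_Isend(send[p], counts[p], MPI_DOUBLE, partners[p], p,
                  comm, &reqs[2 * p + 1]);
    }

    for (int done = 0; done < 2 * npatches; done++) {
        int idx;
        MPI_Status status;
        MPI_Waitany(2 * npatches, reqs, &idx, &status);
        /* idx identifies which request finished; even indices are
         * receives, so recv[idx / 2] is now safe to read. */
        (void)status;
    }

    free(reqs);
}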

Re: [OMPI users] Application in a cluster

2011-10-20 Thread Gus Correa
Hi Jorge, Aha! A serial executable. I guessed it right! :) But Ralph certainly came up with the simpler solution: use mpirun. As for the other question: if you are using Torque/PBS to launch the job, put this line in your PBS script: cd $PBS_O_WORKDIR, which will put you in the work directory…