Sorry, I forgot to give more details on what versions I am using:

OpenMPI 1.4
Ubuntu 9.10, kernel 2.6.31-16-generic #53-Ubuntu
gcc (Ubuntu 4.4.1-4ubuntu8) 4.4.1

On Fri, Jan 15, 2010 at 15:47, Nicolas Bock <nicolasb...@gmail.com> wrote:

> Hello list,
>
> I am running a job on a 4 quadcore AMD Opteron. This machine has 16 cores,
> which I can verify by looking at /proc/cpuinfo. However, when I run a job
> with
>
> mpirun -np 16 -mca btl self,sm job
>
> I get this error:
>
> --------------------------------------------------------------------------
> At least one pair of MPI processes are unable to reach each other for
> MPI communications. This means that no Open MPI device has indicated
> that it can be used to communicate between these processes. This is
> an error; Open MPI requires that all MPI processes be able to reach
> each other. This error can sometimes be the result of forgetting to
> specify the "self" BTL.
>
> Process 1 ([[56972,2],0]) is on host: rust
> Process 2 ([[56972,1],0]) is on host: rust
> BTLs attempted: self sm
>
> Your MPI job is now going to abort; sorry.
> --------------------------------------------------------------------------
>
> By adding the tcp btl I can run the job. I don't understand why openmpi
> claims that a pair of processes can not reach each other, all processor
> cores should have access to all memory after all. Do I need to set some
> other btl limit?
>
> nick