Hello list,

I am running a job on a machine with four quad-core AMD Opterons. The machine has 16 cores, which I can verify by looking at /proc/cpuinfo. However, when I run a job with

    mpirun -np 16 -mca btl self,sm job

I get this error:

--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications. This means that no Open MPI device has indicated
that it can be used to communicate between these processes. This is
an error; Open MPI requires that all MPI processes be able to reach
each other. This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[56972,2],0]) is on host: rust
  Process 2 ([[56972,1],0]) is on host: rust
  BTLs attempted: self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------

By adding the tcp BTL I can run the job. I don't understand why Open MPI claims that a pair of processes cannot reach each other; all processor cores should have access to all memory, after all. Do I need to set some other BTL limit?

nick