Hello,

I get the following error when I try to run my programs with openmpi-1.6.0.
tyr hello_1 52 which mpiexec
/usr/local/openmpi-1.6_32_cc/bin/mpiexec
tyr hello_1 53
tyr hello_1 51 mpiexec --host tyr,sunpc1 -np 3 hello_1_mpi
Process 0 of 3 running on tyr.informatik.hs-fulda.de
Process 2 of 3 running on tyr.informatik.hs-fulda.de
[[4154,1],0][../../../../../openmpi-1.6/ompi/mca/btl/tcp/btl_tcp_endpoint.c:586:mca_btl_tcp_endpoint_start_connect]
from tyr.informatik.hs-fulda.de to: sunpc1
Unable to connect to the peer 127.0.0.1 on port 1024: Connection refused
Process 1 of 3 running on sunpc1.informatik.hs-fulda.de
[[4154,1],1][../../../../../openmpi-1.6/ompi/mca/btl/tcp/btl_tcp_endpoint.c:586:mca_btl_tcp_endpoint_start_connect]
from sunpc1.informatik.hs-fulda.de to: tyr
Unable to connect to the peer 127.0.0.1 on port 516: Connection refused
[sunpc1.informatik.hs-fulda.de:24555] *** An error occurred in MPI_Barrier
[sunpc1.informatik.hs-fulda.de:24555] *** on communicator MPI_COMM_WORLD
[sunpc1.informatik.hs-fulda.de:24555] *** MPI_ERR_INTERN: internal error
[sunpc1.informatik.hs-fulda.de:24555] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
...

I have no problems with just one host (in this case "127.0.0.1" should
work). Why didn't mpiexec use the IP addresses of the hosts in the above
example?

tyr hello_1 53 mpiexec --host tyr -np 2 hello_1_mpi
Process 0 of 2 running on tyr.informatik.hs-fulda.de
Now 1 slave tasks are sending greetings.
Greetings from task 1:
...

tyr hello_1 54 mpiexec --host sunpc1 -np 2 hello_1_mpi
Process 1 of 2 running on sunpc1.informatik.hs-fulda.de
Process 0 of 2 running on sunpc1.informatik.hs-fulda.de
Now 1 slave tasks are sending greetings.
Greetings from task 1:
...

The problem doesn't result from the heterogeneity of the two hosts,
because I get the same error with two Sparc systems or with two PCs.
I didn't have any problems with openmpi-1.2.4:

tyr hello_1 18 mpiexec -mca btl ^udapl --host tyr,sunpc1,linpc1 \
  -np 4 hello_1_mpi
Process 0 of 4 running on tyr.informatik.hs-fulda.de
Process 2 of 4 running on linpc1
Process 1 of 4 running on sunpc1.informatik.hs-fulda.de
Process 3 of 4 running on tyr.informatik.hs-fulda.de
Now 3 slave tasks are sending greetings.
Greetings from task 2:
...

tyr hello_1 19 which mpiexec
/usr/local/openmpi-1.2.4/bin/mpiexec

Do you have any ideas why it doesn't work with openmpi-1.6.0?
I configured the package with:

../openmpi-1.6/configure --prefix=/usr/local/openmpi-1.6_32_cc \
  LDFLAGS="-m32" \
  CC="cc" CXX="CC" F77="f77" FC="f95" \
  CFLAGS="-m32" CXXFLAGS="-m32 -library=stlport4" FFLAGS="-m32" \
  FCFLAGS="-m32" \
  CPP="cpp" CXXCPP="cpp" \
  CPPFLAGS="" CXXCPPFLAGS="" \
  C_INCL_PATH="" C_INCLUDE_PATH="" CPLUS_INCLUDE_PATH="" \
  OBJC_INCLUDE_PATH="" MPIHOME="" \
  --without-udapl --without-openib \
  --enable-mpi-f90 --with-mpi-f90-size=small \
  --enable-heterogeneous --enable-cxx-exceptions \
  --enable-orterun-prefix-by-default \
  --with-threads=posix --enable-mpi-thread-multiple \
  --enable-opal-multi-threads \
  --with-hwloc=internal --with-ft=LAM --enable-sparse-groups \
  |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.32_cc

Thank you very much for any help in advance.

Kind regards

Siegmar
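P.S.: In case the structure of the test program matters: hello_1_mpi is
essentially a standard MPI hello world. The following is only a minimal
sketch of what it does (simplified; the real source may differ in
details such as the message format). Each process prints its rank and
host, all processes meet in MPI_Barrier (where the error above occurs),
and the slave tasks then send a greeting to task 0. I build it with
"mpicc hello_1_mpi.c -o hello_1_mpi" and start it exactly as shown above.

/* hello_1_mpi.c -- simplified sketch, not the exact source */
#include <stdio.h>
#include <string.h>
#include "mpi.h"

#define BUF_SIZE 256

int main (int argc, char *argv[])
{
  int  ntasks, mytask, namelen, i;
  char processor_name[MPI_MAX_PROCESSOR_NAME];
  char greeting[BUF_SIZE];
  MPI_Status status;

  MPI_Init (&argc, &argv);
  MPI_Comm_size (MPI_COMM_WORLD, &ntasks);
  MPI_Comm_rank (MPI_COMM_WORLD, &mytask);
  MPI_Get_processor_name (processor_name, &namelen);

  /* every process reports where it runs */
  printf ("Process %d of %d running on %s\n",
          mytask, ntasks, processor_name);

  /* this is the call that aborts with MPI_ERR_INTERN in the run above,
   * because the processes cannot open TCP connections to each other */
  MPI_Barrier (MPI_COMM_WORLD);

  if (mytask == 0)
  {
    /* master collects one greeting from every slave task */
    printf ("Now %d slave tasks are sending greetings.\n", ntasks - 1);
    for (i = 1; i < ntasks; ++i)
    {
      MPI_Recv (greeting, BUF_SIZE, MPI_CHAR, MPI_ANY_SOURCE,
                0, MPI_COMM_WORLD, &status);
      printf ("Greetings from task %d:\n  %s\n",
              status.MPI_SOURCE, greeting);
    }
  }
  else
  {
    /* slave tasks send a short message to task 0 */
    snprintf (greeting, BUF_SIZE, "hello from %s", processor_name);
    MPI_Send (greeting, (int) strlen (greeting) + 1, MPI_CHAR,
              0, 0, MPI_COMM_WORLD);
  }

  MPI_Finalize ();
  return 0;
}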