Looks like this thread accidentally got dropped; sorry! More below.
> On May 4, 2019, at 10:40 AM, Eric F. Alemany via users <users@lists.open-mpi.org> wrote:
>
> Hi Gilles,
>
> Thank you for your message and your suggestion. As you suggested, I tried
>
>     mpirun -np 84 --hostfile hostsfile --mca routed direct ./openmpi_hello.c
>
> The command hangs with no message or error message until I hit "control + z".
> Then I get the same error message as before.
>
> To answer your question: here are the answers, which made me realize that
> the Master node's Open MPI version is 4.0.0, while on the other nodes (the
> computational nodes) the Open MPI version is 4.0.1; see the output of
> "ompi_info" below. Could that be the issue?

It *could* be, yes. It would be worth making all the versions consistent
(a quick per-node check is sketched at the end of this message).

> In my "hostsfile" there are 7 nodes. I followed the FAQ instructions but
> I am not sure if I created the "hostsfile" correctly. Each node in my
> cluster has 32 cores, except the Master node.

Your hostfile looks fine (an example layout is also sketched at the end).

This error is very, very strange to get on a real system. Can you try two
things:

1. Run with "mpirun --mca routed_base_verbose 100 ..." and send the full
   output (the complete command line is sketched at the end of this
   message).

2. Run with small sets of nodes and see if the problem is localized to
   specific nodes and/or sets of nodes.

--
Jeff Squyres
jsquy...@cisco.com
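A quick way to compare versions across the cluster is sketched below. It
assumes passwordless ssh to each node, that each line of hostsfile starts
with a hostname, and that ompi_info is in the default PATH of the remote
shell:

    # print the Open MPI version reported by each node in hostsfile
    for h in $(awk '{print $1}' hostsfile); do
        echo -n "$h: "
        ssh "$h" "ompi_info | grep 'Open MPI:'"
    done

Every line should report the same version. Run "ompi_info | grep 'Open
MPI:'" on the Master node as well, since it may not be listed in hostsfile.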
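For comparison, assuming the 7 entries in your hostsfile are the 32-core
compute nodes, a hostfile for a cluster like yours typically looks
something like this (the hostnames below are placeholders):

    # hostsfile: one line per node; "slots" caps the MPI processes per node
    node01 slots=32
    node02 slots=32
    # ... and so on, one line each, through node07

With 7 nodes at 32 slots each you have 224 slots, so "-np 84" fits
comfortably.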
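For item 1, the complete command line would look something like the sketch
below. It assumes openmpi_hello.c is compiled into an executable first
(mpirun launches a binary, not a .c source file) and that the binary sits
at the same path on every node, e.g. on a shared filesystem:

    # rebuild the test program, then run with routed-framework debug output
    mpicc -o openmpi_hello openmpi_hello.c
    mpirun -np 84 --hostfile hostsfile --mca routed_base_verbose 100 ./openmpi_hello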