Hi Gilles,

Thank you for your message and your suggestion. As you suggested, I tried:

    mpirun -np 84 --hostfile hostsfile --mca routed direct ./openmpi_hello.c

The command hangs with no output or error message until I hit Ctrl+Z. Then I get the same error message as before.

While answering your questions I realized that the Master node runs Open MPI 4.0.0, while the other (computational) nodes run Open MPI 4.0.1; see the "ompi_info" output below. Could that be the issue?

My "hostsfile" lists 7 nodes. I followed the FAQ instructions, but I am not sure I created the "hostsfile" correctly. Each node in my cluster has 32 cores, except the Master node.

hostsfile:

    radonc-phaser01 slots=12
    radonc-phaser02 slots=12
    radonc-phaser03 slots=12
    radonc-phaser04 slots=12
    radonc-phaser05 slots=12
    radonc-phaser06 slots=12
    radonc-phaser07 slots=12

lscpu gives the same output on all 7 nodes:

    CPU(s):              12
    On-line CPU(s) list: 0-11
    Thread(s) per core:  1
    Core(s) per socket:  6
    Socket(s):           2
    NUMA node(s):        2

There is no firewall running. I also configured password-less ssh on every node (including the Master node), so every node can ssh into every other node without a password.
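As a sanity check on the two things described above, here is a minimal Python sketch (not part of Open MPI) that sums the slots= entries of the hostsfile, to compare against the -np 84 request, and pulls the "Open MPI:" version line out of ompi_info output so that nodes can be compared. The function names total_slots and open_mpi_version are hypothetical helpers, and the sketch assumes every host line carries an explicit slots= value, as mine do.

```python
import re

# Hostfile contents copied from the message above.
HOSTSFILE_TEXT = """\
radonc-phaser01 slots=12
radonc-phaser02 slots=12
radonc-phaser03 slots=12
radonc-phaser04 slots=12
radonc-phaser05 slots=12
radonc-phaser06 slots=12
radonc-phaser07 slots=12
"""


def total_slots(hostfile_text):
    """Sum the explicit slots=N values in a hostfile.

    Assumption of this sketch: lines without slots= are skipped,
    which is not necessarily how mpirun itself treats them.
    """
    total = 0
    for line in hostfile_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        match = re.search(r"\bslots=(\d+)", line)
        if match:
            total += int(match.group(1))
    return total


def open_mpi_version(ompi_info_text):
    """Return the value of the 'Open MPI:' line of ompi_info output, or None."""
    match = re.search(r"^\s*Open MPI:\s*(\S+)", ompi_info_text, re.MULTILINE)
    return match.group(1) if match else None


if __name__ == "__main__":
    print("total slots:", total_slots(HOSTSFILE_TEXT))
    # Version strings as reported by ompi_info on the two kinds of node.
    versions = {
        "phaser-manager": open_mpi_version("Open MPI: 4.0.0\n"),
        "radonc-phaser01": open_mpi_version("Open MPI: 4.0.1\n"),
    }
    if len(set(versions.values())) > 1:
        print("version mismatch:", versions)
```

For the hostsfile above the slot total is 7 x 12 = 84, which matches the -np 84 request, so the process count itself should fit; the version comparison flags the 4.0.0 versus 4.0.1 difference between the Master node and the computational nodes.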
ompi@phaser-manager:~$ ompi_info

    Package: Open MPI root@phaser-manager Distribution
    Open MPI: 4.0.0
    Open MPI repo revision: v4.0.0
    Open MPI release date: Nov 12, 2018
    Open RTE: 4.0.0
    Open RTE repo revision: v4.0.0
    Open RTE release date: Nov 12, 2018
    OPAL: 4.0.0
    OPAL repo revision: v4.0.0
    OPAL release date: Nov 12, 2018
    MPI API: 3.1.0
    Ident string: 4.0.0
    Prefix: /usr/local/.openmpi
    Configured architecture: x86_64-unknown-linux-gnu
    Configure host: phaser-manager
    Configured by: root
    Configured on: Fri Nov 30 15:16:24 PST 2018
    Configure host: phaser-manager
    Configure command line: '--prefix=/usr/local/.openmpi'
    Built by: root
    Built on: Fri Nov 30 15:28:26 PST 2018
    Built host: phaser-manager

ompi@radonc-phaser01:~$ ompi_info

    Package: Open MPI root@radonc-phaser01 Distribution
    Open MPI: 4.0.1
    Open MPI repo revision: v4.0.1
    Open MPI release date: Mar 26, 2019
    Open RTE: 4.0.1
    Open RTE repo revision: v4.0.1
    Open RTE release date: Mar 26, 2019
    OPAL: 4.0.1
    OPAL repo revision: v4.0.1
    OPAL release date: Mar 26, 2019
    MPI API: 3.1.0
    Ident string: 4.0.1
    Prefix: /usr/local/.openmpi
    Configured architecture: x86_64-unknown-linux-gnu
    Configure host: radonc-phaser01
    Configured by: root
    Configured on: Sat Apr 13 22:14:28 PDT 2019
    Configure host: radonc-phaser01
    Configure command line: '--prefix=/usr/local/.openmpi'
    Built by: root
    Built on: Sat Apr 13 22:32:50 PDT 2019
    Built host: radonc-phaser01

Thank you again for your help.

Best,
Eric

_____________________________________________________________________________________________________
Eric F. Alemany
System Administrator for Research
IRT
Division of Radiation & Cancer Biology
Department of Radiation Oncology
Stanford University School of Medicine
Stanford, California 94305
Tel: 1-650-498-7969 (No Texting)
Fax: 1-650-723-7382
_______________________________________________ users mailing list users@lists.open-mpi.org https://lists.open-mpi.org/mailman/listinfo/users