Hi Gilles,

Thank you for your message and your suggestion. As you suggested, I tried:

mpirun -np 84 --hostfile hostsfile --mca routed direct ./openmpi_hello.c

The command hangs with no output or error message until I hit "Control + Z".
Then I get the same error message as before.

To answer your question: while gathering the information, I realized that the
Master node's Open MPI version is 4.0.0, while the other (computational)
node(s) run Open MPI 4.0.1. See the output of "ompi_info" below. Could that
be the issue?

In my "hostsfile" there are 7 nodes. I followed the FAQ instructions, but I am
not sure whether I created the "hostsfile" correctly. Each node in my cluster
has 32 cores, except the Master node.

hostsfile
radonc-phaser01 slots=12
radonc-phaser02 slots=12
radonc-phaser03 slots=12
radonc-phaser04 slots=12
radonc-phaser05 slots=12
radonc-phaser06 slots=12
radonc-phaser07 slots=12
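
A minimal sketch, assuming Python 3, of how the slot total in the hostsfile
above could be checked against the "-np 84" passed to mpirun. The function
name parse_hostfile is hypothetical, and treating a host without a "slots="
entry as 1 slot is a simplification of this sketch, not necessarily what
Open MPI does.

```python
def parse_hostfile(text):
    """Return {hostname: slots} for lines of the form 'host slots=N'."""
    slots = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # ignore comments and blank lines
        if not line:
            continue
        fields = line.split()
        count = 1  # sketch-only assumption when no slots= entry is present
        for field in fields[1:]:
            if field.startswith("slots="):
                count = int(field.split("=", 1)[1])
        slots[fields[0]] = count
    return slots


# The hostsfile contents quoted in this message.
HOSTSFILE = """\
radonc-phaser01 slots=12
radonc-phaser02 slots=12
radonc-phaser03 slots=12
radonc-phaser04 slots=12
radonc-phaser05 slots=12
radonc-phaser06 slots=12
radonc-phaser07 slots=12
"""

slots = parse_hostfile(HOSTSFILE)
total_slots = sum(slots.values())
requested = 84  # the value given to "mpirun -np"
print(len(slots), "hosts,", total_slots, "slots, requested", requested)
```

With 7 hosts at 12 slots each, the total matches the 84 processes requested,
so the process count itself does not oversubscribe the hostsfile.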

For all 7 nodes lscpu is the same
lscpu
CPU(s):              12
On-line CPU(s) list: 0-11
Thread(s) per core:  1
Core(s) per socket:  6
Socket(s):           2
NUMA node(s):        2
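
As a rough cross-check, assuming Python 3, the lscpu fields above can be
combined to confirm that "slots=12" matches the hardware: sockets times cores
per socket gives the physical core count. The helper name parse_lscpu is
hypothetical.

```python
def parse_lscpu(text):
    """Split 'Key:   value' lines from lscpu into a dictionary."""
    fields = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    return fields


# The lscpu excerpt quoted in this message.
LSCPU = """\
CPU(s):              12
On-line CPU(s) list: 0-11
Thread(s) per core:  1
Core(s) per socket:  6
Socket(s):           2
NUMA node(s):        2
"""

info = parse_lscpu(LSCPU)
physical_cores = int(info["Socket(s)"]) * int(info["Core(s) per socket"])
logical_cpus = physical_cores * int(info["Thread(s) per core"])
print("physical cores:", physical_cores, "logical CPUs:", logical_cpus)
```

Both values come out to 12 here, which agrees with the "CPU(s)" line and with
the slots value used in the hostsfile.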


There is no firewall running. I also configured password-less ssh on every
node (including the Master node). Every node can ssh into every other node
without a password.
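
A minimal sketch, assuming Python 3 and the OpenSSH client, of how the
password-less ssh setup could be verified non-interactively. The option
"BatchMode=yes" makes ssh fail instead of prompting for a password. The
function names are hypothetical, and this only tests connections from the
node where it is run, not every node-to-node pair.

```python
import subprocess


def ssh_check_command(host):
    """Build an ssh command that runs 'true' remotely without prompting."""
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, "true"]


def can_ssh(host):
    """Return True if the non-interactive ssh login to host succeeds."""
    try:
        result = subprocess.run(
            ssh_check_command(host), capture_output=True, timeout=15
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def report(hosts):
    """Print one line per host showing whether the login worked."""
    for host in hosts:
        print(host, "ok" if can_ssh(host) else "FAILED")


# The seven compute nodes listed in the hostsfile.
HOSTS = ["radonc-phaser%02d" % i for i in range(1, 8)]

# To run the check from the Master node: report(HOSTS)
```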


ompi@phaser-manager:~$ ompi_info
                 Package: Open MPI root@phaser-manager Distribution
                Open MPI: 4.0.0
  Open MPI repo revision: v4.0.0
   Open MPI release date: Nov 12, 2018
                Open RTE: 4.0.0
  Open RTE repo revision: v4.0.0
   Open RTE release date: Nov 12, 2018
                    OPAL: 4.0.0
      OPAL repo revision: v4.0.0
       OPAL release date: Nov 12, 2018
                 MPI API: 3.1.0
            Ident string: 4.0.0
                  Prefix: /usr/local/.openmpi
 Configured architecture: x86_64-unknown-linux-gnu
          Configure host: phaser-manager
           Configured by: root
           Configured on: Fri Nov 30 15:16:24 PST 2018
          Configure host: phaser-manager
  Configure command line: '--prefix=/usr/local/.openmpi'
                Built by: root
                Built on: Fri Nov 30 15:28:26 PST 2018
              Built host: phaser-manager

ompi@radonc-phaser01:~$ ompi_info
                 Package: Open MPI root@radonc-phaser01 Distribution
                Open MPI: 4.0.1
  Open MPI repo revision: v4.0.1
   Open MPI release date: Mar 26, 2019
                Open RTE: 4.0.1
  Open RTE repo revision: v4.0.1
   Open RTE release date: Mar 26, 2019
                    OPAL: 4.0.1
      OPAL repo revision: v4.0.1
       OPAL release date: Mar 26, 2019
                 MPI API: 3.1.0
            Ident string: 4.0.1
                  Prefix: /usr/local/.openmpi
 Configured architecture: x86_64-unknown-linux-gnu
          Configure host: radonc-phaser01
           Configured by: root
           Configured on: Sat Apr 13 22:14:28 PDT 2019
          Configure host: radonc-phaser01
  Configure command line: '--prefix=/usr/local/.openmpi'
                Built by: root
                Built on: Sat Apr 13 22:32:50 PDT 2019
              Built host: radonc-phaser01
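
A minimal sketch, assuming Python 3, of how the "Open MPI:" line could be
pulled out of each ompi_info output above and compared across nodes. The
function name open_mpi_version is hypothetical, and only the first lines of
each output are reproduced here. It reports the mismatch; it does not show
whether the mismatch causes the hang.

```python
import re


def open_mpi_version(ompi_info_text):
    """Return the value of the 'Open MPI:' line, or None if it is absent."""
    match = re.search(r"^\s*Open MPI:\s*(\S+)", ompi_info_text, re.MULTILINE)
    return match.group(1) if match else None


# First lines of the two ompi_info outputs quoted in this message.
MANAGER = """\
                 Package: Open MPI root@phaser-manager Distribution
                Open MPI: 4.0.0
  Open MPI repo revision: v4.0.0
"""
NODE01 = """\
                 Package: Open MPI root@radonc-phaser01 Distribution
                Open MPI: 4.0.1
  Open MPI repo revision: v4.0.1
"""

versions = {
    "phaser-manager": open_mpi_version(MANAGER),
    "radonc-phaser01": open_mpi_version(NODE01),
}
mismatch = len(set(versions.values())) > 1
for host, version in versions.items():
    print(host, version)
print("version mismatch:", mismatch)
```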




Thank you again for your help.

Best,
Eric

_____________________________________________________________________________________________________

Eric F.  Alemany
System Administrator for Research

IRT
Division of Radiation & Cancer  Biology
Department of Radiation Oncology

Stanford University School of Medicine
Stanford, California 94305

Tel: 1-650-498-7969  No Texting
Fax: 1-650-723-7382




