I recompiled the RHEL OpenMPI package to include the configure option --with-tm, and it compiled and is working fine.
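For reference, the rebuild went roughly as follows. This is a sketch from memory; the spec-file details vary between RHEL releases, and the Torque install path ("/usr" below) is site-specific:

=======8<--------CUT HERE----------
# Fetch and unpack the stock source RPM.
yumdownloader --source openmpi
rpm -ivh openmpi-1.10.6-*.src.rpm

# Edit ~/rpmbuild/SPECS/openmpi.spec and add to the %configure line:
#     --with-tm=/usr     (wherever the Torque headers/libraries live)

# Rebuild and install the result.
rpmbuild -ba ~/rpmbuild/SPECS/openmpi.spec
yum install ~/rpmbuild/RPMS/x86_64/openmpi-1.10.6-*.rpm
=======8<--------CUT HERE----------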
*# mpirun -V*
mpirun (Open MPI) 1.10.6

*# ompi_info | grep ras*
                 MCA ras: gridengine (MCA v2.0.0, API v2.0.0, Component v1.10.6)
                 MCA ras: loadleveler (MCA v2.0.0, API v2.0.0, Component v1.10.6)
                 MCA ras: simulator (MCA v2.0.0, API v2.0.0, Component v1.10.6)
                 MCA ras: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.6)
                 MCA ras: tm (MCA v2.0.0, API v2.0.0, Component v1.10.6)

As you can see, "tm" is present and, as you will see below, it is working (to an extent). But while OpenMPI receives the Torque-supplied node allocation, it does not obey that allocation. For example, the "pbs_hello" batch script:

=======8<--------CUT HERE----------
#!/bin/bash
#
# "pbs_hello" batch script to run "mpi_hello" on PBS nodes
#
#PBS -m n
#
# --------------------------------------
echo "PBS Job Number      " $(echo $PBS_JOBID | sed 's/\..*//')
echo "PBS batch run on    " $(hostname)
echo "Time it was started " $(date +%F_%T)
echo "Current Directory   " $(pwd)
echo "Submitted work dir  " $PBS_O_WORKDIR
echo "Number of Nodes     " $PBS_NP
echo "Nodefile List       " $PBS_NODEFILE
cat $PBS_NODEFILE
#env | grep ^PBS_
echo ---------------------------------------
cd "$PBS_O_WORKDIR"   # return to the correct sub-directory

# Run MPI displaying the node allocation maps.
mpirun --mca ras_base_verbose 5 --display-map --display-allocation hostname
=======8<--------CUT HERE----------

Submitting it to Torque to run one process on each of 5 dual-core machines:

*# qsub -l nodes=5:ppn=1:dualcore pbs_hello*

results in the following. Stderr shows that the "tm" component is being selected:

=======8<--------CUT HERE----------
[node21.emperor:07150] mca:base:select:( ras) Querying component [gridengine]
[node21.emperor:07150] mca:base:select:( ras) Skipping component [gridengine]. Query failed to return a module
[node21.emperor:07150] mca:base:select:( ras) Querying component [loadleveler]
[node21.emperor:07150] mca:base:select:( ras) Skipping component [loadleveler]. Query failed to return a module
[node21.emperor:07150] mca:base:select:( ras) Querying component [simulator]
[node21.emperor:07150] mca:base:select:( ras) Skipping component [simulator]. Query failed to return a module
[node21.emperor:07150] mca:base:select:( ras) Querying component [slurm]
[node21.emperor:07150] mca:base:select:( ras) Skipping component [slurm]. Query failed to return a module
[node21.emperor:07150] mca:base:select:( ras) Querying component [tm]
[node21.emperor:07150] mca:base:select:( ras) Query of component [tm] set priority to 100
[node21.emperor:07150] mca:base:select:( ras) Selected component [tm]
=======8<--------CUT HERE----------
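That, however, only covers the allocation side. As I understand it, the "ras" framework merely reads the node list, while the separate "plm" framework does the actual remote launch, so the launcher selection may also be worth checking. A sketch of the check I have in mind (untested, same verbosity trick as above):

=======8<--------CUT HERE----------
# Ask the plm (process launch) framework to report its component
# selection, the same way the ras framework did above.
mpirun --mca plm_base_verbose 5 --display-map hostname
=======8<--------CUT HERE----------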
Stdout, meanwhile, shows the requested allocation being picked up correctly:

=======8<--------CUT HERE----------
PBS Job Number       8988
PBS batch run on     node21.emperor
Time it was started  2017-09-29_15:52:21
Current Directory    /net/shrek.emperor/home/shrek/anthony
Submitted work dir   /home/shrek/anthony/mpi-pbs
Number of Nodes      5
Nodefile List        /var/lib/torque/aux//8988.shrek.emperor
node21.emperor
node25.emperor
node24.emperor
node23.emperor
node22.emperor
---------------------------------------

======================   ALLOCATED NODES   ======================
        node21: slots=1 max_slots=0 slots_inuse=0 state=UP
        node25.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
        node24.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
        node23.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
        node22.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
=================================================================

======================   ALLOCATED NODES   ======================
        node21: slots=1 max_slots=0 slots_inuse=0 state=UP
        node25.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
        node24.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
        node23.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
        node22.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
=================================================================

 Data for JOB [18928,1] offset 0

 ========================   JOB MAP   ========================

 Data for node: node21          Num slots: 1  Max slots: 0  Num procs: 1
        Process OMPI jobid: [18928,1] App: 0 Process rank: 0

 Data for node: node25.emperor  Num slots: 1  Max slots: 0  Num procs: 1
        Process OMPI jobid: [18928,1] App: 0 Process rank: 1

 Data for node: node24.emperor  Num slots: 1  Max slots: 0  Num procs: 1
        Process OMPI jobid: [18928,1] App: 0 Process rank: 2

 Data for node: node23.emperor  Num slots: 1  Max slots: 0  Num procs: 1
        Process OMPI jobid: [18928,1] App: 0 Process rank: 3

 Data for node: node22.emperor  Num slots: 1  Max slots: 0  Num procs: 1
        Process OMPI jobid: [18928,1] App: 0 Process rank: 4

 =============================================================

node21.emperor
node21.emperor
node21.emperor
node21.emperor
node21.emperor
=======8<--------CUT HERE----------

However, according to the "hostname" output (and as was visible in "pbsnodes"), *ALL 5 processes were run on the first node, vastly over-subscribing that node.*

Does anyone have any ideas as to what went wrong? *Why did OpenMPI not follow the node mapping it says it should be following?*

Additionally, OpenMPI on its own (without Torque) does appear to work as expected:

*# mpirun -host node21,node22,node23,node24,node25 hostname*
node24.emperor
node22.emperor
node21.emperor
node25.emperor
node23.emperor
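A further test I plan to try is exercising Torque's TM interface directly, with no OpenMPI involved, using "pbsdsh" (assuming the Torque client tools are installed on the nodes). If this also lands everything on node21, the fault would be on the Torque side rather than in OpenMPI:

=======8<--------CUT HERE----------
#!/bin/bash
# "pbs_tm_check" batch script: spawn "hostname" across the allocated
# slots via Torque's TM API itself, bypassing OpenMPI entirely.
#PBS -m n
pbsdsh hostname
=======8<--------CUT HERE----------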
Anthony Thyssen ( System Programmer )    <a.thys...@griffith.edu.au>
 --------------------------------------------------------------------------
  Rosscott's Law: The faster the computer,
                  The faster it can go wrong.
 --------------------------------------------------------------------------