I recompiled the RHEL OpenMPI package to include the configure option
--with-tm
and it built and runs fine.
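
For anyone wanting to do the same, the key step is the configure flag (a
minimal sketch only; the install prefix and the Torque location are
assumptions and will vary by system and RPM spec file):
=======8<--------CUT HERE----------
# Sketch: build OpenMPI with Torque (tm) support.
# --with-tm points at the Torque install prefix (headers + libs);
# adjust both prefixes to match your system.
./configure --prefix=/usr --with-tm=/usr
make -j4 && make install
=======8<--------CUT HERE----------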

*# mpirun -V*
mpirun (Open MPI) 1.10.6

*# ompi_info | grep ras*
       MCA ras: gridengine (MCA v2.0.0, API v2.0.0, Component v1.10.6)
       MCA ras: loadleveler (MCA v2.0.0, API v2.0.0, Component v1.10.6)
       MCA ras: simulator (MCA v2.0.0, API v2.0.0, Component v1.10.6)
       MCA ras: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.6)
       MCA ras: tm (MCA v2.0.0, API v2.0.0, Component v1.10.6)

As you can see, "tm" is present, and as you will see below, it is working
(to an extent).
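
The tm launcher component can be listed the same way (I am only showing the
allocator above; whether the "plm" side was also built may be worth
confirming, though that is just a guess on my part):

*# ompi_info | grep plm*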

But while OpenMPI receives the Torque-supplied node allocation, it does not
obey that allocation.

For example...

"pbs_hello" batch scrpt
=======8<--------CUT HERE----------
#!/bin/bash
#
# "pbs_hello" batch script to run "mpi_hello" on PBS nodes
#
#PBS -m n
#
# --------------------------------------
echo "PBS Job Number      " $(echo $PBS_JOBID | sed 's/\..*//')
echo "PBS batch run on    " $(hostname)
echo "Time it was started " $(date +%F_%T)
echo "Current Directory   " $(pwd)
echo "Submitted work dir  " $PBS_O_WORKDIR
echo "Number of Nodes     " $PBS_NP
echo "Nodefile List       " $PBS_NODEFILE
cat $PBS_NODEFILE
#env | grep ^PBS_
echo ---------------------------------------
cd "$PBS_O_WORKDIR"   # return to the correct sub-directory

# Run MPI displaying the node allocation maps.
mpirun --mca ras_base_verbose 5 --display-map --display-allocation hostname


=======8<--------CUT HERE----------

Submitting to Torque to run one process on each of 5 dual-core machines:

*# qsub -l nodes=5:ppn=1:dualcore pbs_hello*

Results in the following...

Stderr shows the "tm" component is being selected...
=======8<--------CUT HERE----------
[node21.emperor:07150] mca:base:select:(  ras) Querying component [gridengine]
[node21.emperor:07150] mca:base:select:(  ras) Skipping component [gridengine]. Query failed to return a module
[node21.emperor:07150] mca:base:select:(  ras) Querying component [loadleveler]
[node21.emperor:07150] mca:base:select:(  ras) Skipping component [loadleveler]. Query failed to return a module
[node21.emperor:07150] mca:base:select:(  ras) Querying component [simulator]
[node21.emperor:07150] mca:base:select:(  ras) Skipping component [simulator]. Query failed to return a module
[node21.emperor:07150] mca:base:select:(  ras) Querying component [slurm]
[node21.emperor:07150] mca:base:select:(  ras) Skipping component [slurm]. Query failed to return a module
[node21.emperor:07150] mca:base:select:(  ras) Querying component [tm]
[node21.emperor:07150] mca:base:select:(  ras) Query of component [tm] set priority to 100
[node21.emperor:07150] mca:base:select:(  ras) Selected component [tm]
=======8<--------CUT HERE----------

While Stdout shows it is picking up the requested allocation.
=======8<--------CUT HERE----------
PBS Job Number       8988
PBS batch run on     node21.emperor
Time it was started  2017-09-29_15:52:21
Current Directory    /net/shrek.emperor/home/shrek/anthony
Submitted work dir   /home/shrek/anthony/mpi-pbs
Number of Nodes      5
Nodefile List        /var/lib/torque/aux//8988.shrek.emperor
node21.emperor
node25.emperor
node24.emperor
node23.emperor
node22.emperor
---------------------------------------

======================   ALLOCATED NODES   ======================
        node21: slots=1 max_slots=0 slots_inuse=0 state=UP
        node25.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
        node24.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
        node23.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
        node22.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
=================================================================

======================   ALLOCATED NODES   ======================
        node21: slots=1 max_slots=0 slots_inuse=0 state=UP
        node25.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
        node24.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
        node23.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
        node22.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
=================================================================
 Data for JOB [18928,1] offset 0

 ========================   JOB MAP   ========================

 Data for node: node21  Num slots: 1    Max slots: 0    Num procs: 1
        Process OMPI jobid: [18928,1] App: 0 Process rank: 0

 Data for node: node25.emperor  Num slots: 1    Max slots: 0    Num procs: 1
        Process OMPI jobid: [18928,1] App: 0 Process rank: 1

 Data for node: node24.emperor  Num slots: 1    Max slots: 0    Num procs: 1
        Process OMPI jobid: [18928,1] App: 0 Process rank: 2

 Data for node: node23.emperor  Num slots: 1    Max slots: 0    Num procs: 1
        Process OMPI jobid: [18928,1] App: 0 Process rank: 3

 Data for node: node22.emperor  Num slots: 1    Max slots: 0    Num procs: 1
        Process OMPI jobid: [18928,1] App: 0 Process rank: 4

 =============================================================
node21.emperor
node21.emperor
node21.emperor
node21.emperor
node21.emperor
=======8<--------CUT HERE----------

However, according to the "hostname" output (and as was visible in
"pbsnodes"),

*ALL 5 processes were run on the first node, vastly over-subscribing that
node.*

Anyone have any ideas as to what went wrong?

*Why did OpenMPI not follow the node mapping it says it should be
following!*
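
In case it helps anyone diagnose this, the next check I can think of (an
assumption on my part about where the fault might lie) is to raise the
launcher verbosity as well, to see whether the "tm" launcher is actually
used to spawn the remote daemons or whether the launch silently stays on
the local node:
=======8<--------CUT HERE----------
# Sketch: add launcher (plm) verbosity alongside the allocator (ras)
# verbosity in the same batch script.
mpirun --mca plm_base_verbose 5 --mca ras_base_verbose 5 \
       --display-map --display-allocation hostname
=======8<--------CUT HERE----------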

Additionally, OpenMPI on its own (without Torque) does appear to work as
expected.

*# mpirun -host node21,node22,node23,node24,node25 hostname*
node24.emperor
node22.emperor
node21.emperor
node25.emperor
node23.emperor
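
One further cross-check I intend to try (untested as yet, so treat the
exact invocation as an assumption) is to bypass the tm allocator and hand
the Torque nodefile to mpirun explicitly from within the batch script:
=======8<--------CUT HERE----------
# Sketch: force the host list from Torque's nodefile instead of relying
# on the tm ras component; if this places one process per node, the
# allocation data is fine and the fault is in how mpirun uses it.
mpirun -machinefile $PBS_NODEFILE hostname
=======8<--------CUT HERE----------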


  Anthony Thyssen ( System Programmer )    <a.thys...@griffith.edu.au>
 --------------------------------------------------------------------------
   Rosscott's Law:  The faster the computer,
                    The faster it can go wrong.
 --------------------------------------------------------------------------