Hi,
PBS version: 2021.1.3
Open MPI: 5.0.3
I'd like to do a partial oversubscription with MPMD, like this:

  1.  Request 40 chunks in total
  2.  First group of 32 chunks with 128 cores per chunk, 1 MPI rank per core, 1 
OMP thread per MPI rank
  3.  Second group of 8 chunks with 128 cores per chunk, 64 MPI ranks per chunk, 
1 OMP thread per MPI rank

I tried to launch the MPMD job with these two combinations:

1. mpiexec  -n 4096 A :  -n 512 --npernode 64 B
2. mpiexec  -n 4096 A :  -n 512 --map-by ppr:64:node B


Neither of them worked.
It looks like Open MPI just launches B with 128 MPI processes per node instead 
of 64. I wonder if this is due to the list of nodes in PBS_NODEFILE, since the 
file contains 128x40 entries when using 
select=40:ncpus=128:mpiprocs=128.
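
For reference, a minimal sketch of how the entries per host in PBS_NODEFILE can be counted from inside the job. The helper name count_slots is hypothetical (it is not part of PBS or Open MPI), and the sketch assumes the node file has one host name per line:

```shell
#!/bin/sh
# Count how many entries (slots) each host has in a PBS node file.
# count_slots is a hypothetical helper name used only for this sketch.
count_slots() {
    # $1: path to a node file with one host name per line
    # Prints "<host> <count>" for every distinct host, sorted by host name.
    sort "$1" | uniq -c | awk '{ print $2, $1 }'
}

# Inside a PBS job, PBS_NODEFILE points at the node file for the job.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    count_slots "$PBS_NODEFILE"
fi
```

With select=40:ncpus=128:mpiprocs=128, I would expect every host to be reported with a count of 128.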
Is there an alternative approach, such as placement parameters or PBS select 
directives that produce the expected PBS_NODEFILE, other than modifying the 
PBS_NODEFILE with a pre-launch script?
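
To illustrate the select-directive idea, a request with two chunk types might look like the sketch below. It is untested and an assumption on my part: since PBS writes one PBS_NODEFILE line per MPI process of each chunk, this should list 128 entries for each of the first 32 chunks and 64 entries for each of the last 8. Whether Open MPI then honors those counts for B is not confirmed.

```shell
#!/bin/sh
# Sketch only (assumption, not verified on PBS 2021.1.3 with Open MPI 5.0.3):
# two chunk types joined with '+', each with its own mpiprocs value.
#PBS -l select=32:ncpus=128:mpiprocs=128+8:ncpus=128:mpiprocs=64

# Same MPMD launch as in combination 2 above; A and B are the two executables.
mpiexec -n 4096 A : -n 512 --map-by ppr:64:node B
```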
Any ideas?
Thanks for your time
Regards
Jerry
