Hi folks,

It seems like oversubscription is disabled by default in OpenMPI 1.8.7, at least when running on a PBS/torque-managed node. When I run a program in parallel on a node with 8 cores, I am not able to use more than 8 ranks:

> mic@aia272:~> mpirun --display-allocation -n 9 hostname
>
> ====================== ALLOCATED NODES ======================
> aia272: slots=8 max_slots=0 slots_inuse=0 state=UP
> =================================================================
> --------------------------------------------------------------------------
> There are not enough slots available in the system to satisfy the 9 slots
> that were requested by the application:
> hostname
>
> Either request fewer slots for your application, or make more slots available
> for use.
> --------------------------------------------------------------------------

However, if I specifically enable oversubscription through the rmaps_base_oversubscribe setting, it works again:

> mic@aia272:~> mpirun --display-allocation --mca rmaps_base_oversubscribe 1 -n 9 hostname
>
> ====================== ALLOCATED NODES ======================
> aia272: slots=8 max_slots=0 slots_inuse=0 state=UP
> =================================================================
> aia272
> aia272
> aia272
> aia272
> aia272
> aia272
> aia272
> aia272
> aia272

Now I am wondering, is this a bug or a feature? We recently upgraded from 1.6.x to 1.8.7, and as far as I remember, in 1.6.x oversubscription was enabled by default.

Regards,
Michael

P.S.: In ompi_info, both rmaps_base_no_oversubscribe and rmaps_base_oversubscribe are reported as “false”. Our prefix/etc/openmpi-mca-params.conf file is empty.
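
For reference, the --mca setting used above could presumably also be made persistent through that openmpi-mca-params.conf file instead of being passed on every mpirun call. A minimal sketch, assuming the usual "name = value" syntax of MCA parameter files and that the file is read from the same prefix mpirun was installed to (not verified on this installation):

```
# prefix/etc/openmpi-mca-params.conf
# Assumption: same parameter name and value as passed via --mca above.
rmaps_base_oversubscribe = 1
```

The same parameter can typically also be set per shell session through the environment variable OMPI_MCA_rmaps_base_oversubscribe, which avoids changing the site-wide default.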