Hi,

On 22.11.2013 at 17:32, Gans, Jason D wrote:

> I would like to run an instance of my application on every *core* of a small 
> cluster. I am using Torque 2.5.12 to run jobs on the cluster. The cluster in 
> question is a heterogeneous collection of machines that are all past their 
> prime. Specifically, the number of cores ranges from 2 to 8. Here is the Torque 
> "nodes" file:
> 
> n0000 np=2
> n0001 np=2
> n0002 np=8
> n0003 np=8
> n0004 np=2
> n0005 np=2
> n0006 np=2
> n0007 np=4
> 
> When I use openmpi-1.6.3, I can oversubscribe nodes but the tasks are 
> allocated to nodes without regard to the number of cores on each node 
> (specified by the "np=xx" in the nodes file). For example, when I run "mpirun 
> -np 24 hostname", mpirun places three instances of "hostname" on each node, 
> despite the fact that some nodes only have two processors and some have more.

Did you also request 24 cores when you submitted the job itself?

-- Reuti


> Is there a way to have OpenMPI "gracefully" oversubscribe nodes by allocating 
> instances based on the "np=xx" information in the Torque nodes file? Is this 
> a Torque problem?
> 
> p.s. I do get the desired behavior when I run *without* Torque and specify 
> the following machine file to mpirun:
> 
> n0000 slots=2
> n0001 slots=2
> n0002 slots=8
> n0003 slots=8
> n0004 slots=2
> n0005 slots=2
> n0006 slots=2
> n0007 slots=4
> 
> Regards,
> 
> Jason
> 
> 
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
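
The two listings quoted above differ only in the keyword ("np=" in the Torque
nodes file, "slots=" in the mpirun machine file), so one can be derived from
the other mechanically. A minimal sketch in Python, assuming the nodes file
uses only the "name np=N" form shown in this thread; the function name
torque_nodes_to_hostfile is hypothetical, and the default of one slot for a
line without "np=" is an assumption, not something stated in the thread:

```python
import re


def torque_nodes_to_hostfile(nodes_text):
    """Turn Torque 'name np=N' lines into Open MPI 'name slots=N' lines."""
    out = []
    for raw in nodes_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        name = fields[0]
        slots = 1  # assumption: one slot when no np= entry is present
        for field in fields[1:]:
            match = re.fullmatch(r"np=(\d+)", field)
            if match:
                slots = int(match.group(1))
        out.append(f"{name} slots={slots}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Sample input copied from the nodes file quoted in this thread.
    sample = "n0000 np=2\nn0002 np=8\nn0007 np=4\n"
    print(torque_nodes_to_hostfile(sample), end="")
```

The result could be written to a file and passed to mpirun as a machine file,
as in the run without Torque described above. Whether that produces the
desired placement inside a Torque job is not confirmed anywhere in this thread.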
