Re: [gridengine users] job cannot run in parallel environment "smp" because it only offers 2 slots

Bob Tupper Tue, 21 Feb 2012 11:31:08 -0800

You could change the consumable setting of h_vmem from YES to JOB.
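
A rough sketch of what that change might look like, in case it helps. The sample line below is an assumption about how h_vmem appears in your `qconf -sc` output (FORCED and 6G come from your description, the other columns are guesses), and the file name is hypothetical. It is also worth checking that your Grid Engine version accepts JOB as a consumable value before relying on this.

```shell
# Hypothetical h_vmem entry, laid out like a line of `qconf -sc` output:
# name  shortcut  type  relop  requestable  consumable  default  urgency
line='h_vmem h_vmem MEMORY <= FORCED YES 6G 0'

# Switch the sixth column (consumable) from YES to JOB for h_vmem only.
new_line=$(printf '%s\n' "$line" | awk '$1 == "h_vmem" { $6 = "JOB" } { print }')
echo "$new_line"

# On a real cluster the same edit could be applied roughly like this
# (left commented out so the sketch runs without Grid Engine installed):
#   qconf -sc > complexes.txt      # dump the current complex definitions
#   ...edit the h_vmem line in complexes.txt as above...
#   qconf -Mc complexes.txt        # load the modified definitions
# or interactively in an editor with: qconf -mc
```

With a per-job consumable, the h_vmem request should be debited once for the whole job rather than once per slot.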




On 02/21/2012 11:20 AM, Txema Heredia Genestar wrote:

Hello all,
I am having trouble running threaded jobs in SGE 6.1u4. In our cluster, h_vmem is defined as a consumable attribute on all nodes. It is mandatory: all jobs must request it, with a default value of 6 Gb. That constraint makes any "parallel" job sent to the cluster try to reserve a lot of memory (h_vmem * slots). This is fine for most parallel processes (MPI and the like). But sometimes we need to run "threaded" jobs, where all threads share one chunk of memory (everything on a single node). This leads to situations where I need to submit an 8-threaded job that requires, say, 10 Gb of memory, but it cannot be scheduled because no node can handle an 80 Gb request. When a memory request cannot be fulfilled, the typical message 'cannot run in PE "smp" because it only offers N slots' appears in qstat (where N is the maximum number of slots I would be able to use given the requested h_vmem size).
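
To make the arithmetic concrete, here is a minimal sketch of how the per-slot request adds up. The 48 Gb figure is only an assumed consumable capacity for one node, and the variable names are made up for illustration; the real capacity left on a node also depends on what other jobs have already reserved.

```shell
# Illustrative numbers only: 8 slots, 10 Gb of h_vmem requested per slot,
# and an assumed 48 Gb of h_vmem available on the node.
slots=8
h_vmem_gb=10
node_mem_gb=48

# With a per-slot consumable the scheduler debits h_vmem once per slot.
total_gb=$((slots * h_vmem_gb))

# Largest number of slots that fits, i.e. the N in "only offers N slots".
max_slots=$((node_mem_gb / h_vmem_gb))

echo "requested: ${total_gb} Gb, node offers at most ${max_slots} slots"
```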
This is the parallel environment I am trying to use:

# qconf -sp smp
pe_name           smp
slots             9999
user_lists        test_users
xuser_lists       NONE
start_proc_args   /bin/true
stop_proc_args    /bin/true
allocation_rule   $fill_up
control_slaves    FALSE
job_is_first_task FALSE
urgency_slots     min
The most annoying part of all this is that the behaviour is not consistent: this morning I was able to run a 6-threaded job requesting 10 Gb of memory on a 48 Gb node. But in the afternoon, the same job, submitted with the very same command to the same node, could not be run.
Does anyone have any suggestions on how to deal with this?

Thanks in advance,

Txema

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users

