Re: [gridengine users] subordinate_list not suspending tasks

Lars van der bijl Wed, 24 Oct 2012 00:59:05 -0700

On 23 October 2012 22:08, Reuti <[email protected]> wrote:
> Hi,
>
> On 23.10.2012 at 21:41, Lars van der bijl wrote:
>
>> I've got 2 queues
>>
>> $ qconf -sq final.q
>> qname                 final.q
>> hostlist              @allhosts
>> suspend_thresholds    NONE
>> nsuspend              1
>> suspend_interval      00:01:00
>> pe_list               make smp
>> rerun                 TRUE
>>
>>
>> $ qconf -sq quick.q
>> qname                 quick.q
>> hostlist              @allhosts
>> suspend_thresholds    NONE
>> nsuspend              1
>> suspend_interval      00:01:00
>> pe_list               make smp
>> rerun                 TRUE
>> subordinate_list      final.q=1
>>
>> we have about 325 procs and both queues have access to the same machines.
>>
>> what I'd expect to see is that if I have 200 slots running in final.q and I
>> submit a task to quick.q, it would suspend the task in final.q and
>> push the new task in front.
>> however, what I am seeing is that only 32 slots are being used, and
>> not all tasks are being pushed in front of final.q
>>
>> we only use parallel submission in case that makes a difference.
>>
>> what could I change to get this behavior?
>
> hard to tell from the information you posted, as I don't know how 32 are in 
> any way related to 325 procs without knowing more details. So some remarks, 
> maybe you can refine the setup or question then:
>
> - the subordinate_list will only work "per exechost" queue instance
> - in your current setup all slots from queue instance on a particular 
> exechost will be suspended as soon as one slot in quick.q is used
> - (maybe you are looking for a slot-wise subordination?)


The problem I have with the slot-wise setup is that you can only set one
slot value for the subordinate_list.
We have a lot of 8-core machines and a few with 4, 6 and 12 cores, so
those would have to go in separate queues, I'd imagine.

We frequently submit the same task with 4 cores or 8 cores, using "-pe
smp 4" or "-pe smp 8". This makes a slot-wise setup difficult to
configure, because if it's set to 2 slots per host then a task submitted
with 8 procs won't get suspended.

> - jobs in the quick.q don't have a higher priority
> - it's best not to submit to queues in SGE, but think of "request resources" 
> and SGE will select an appropriate queue for your job
>
> For your setup this could mean to define a BOOL complex "quick" as 
> "requestable FORCED" and attach it to the quick.q, then request "-l quick" 
> (which implies "-l quick=TRUE") and in addition attach a high "urgency" value 
> to this complex. Then they should go also to the top of the list. And only 
> "quick" will run in this queue.

Thanks, this is a different way of thinking about the problem for me.
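
As a rough sketch of how that suggestion might be set up (not tested
on our grid): the shortcut "qk" and the urgency value 1000 are
placeholders I picked, and the column order is assumed to match what
"qconf -sc" prints. The script only touches the configuration if qconf
is actually available.

```shell
#!/bin/sh
# Sketch only: "qk" and 1000 are placeholder values, not recommendations.
# Assumed column order (as printed by `qconf -sc`):
# name shortcut type relop requestable consumable default urgency
complex_line='quick qk BOOL == FORCED NO 0 1000'

if command -v qconf >/dev/null 2>&1; then
    # Dump the current complex list, append the new entry, load it back.
    tmp=$(mktemp)
    qconf -sc > "$tmp"
    echo "$complex_line" >> "$tmp"
    qconf -Mc "$tmp"
    rm -f "$tmp"
    # Attach the complex to quick.q so that only "-l quick" jobs run there.
    qconf -aattr queue complex_values quick=TRUE quick.q
else
    echo "qconf not found; would add complex: $complex_line"
fi

# Jobs would then request the resource instead of naming the queue, e.g.:
#   qsub -l quick -pe smp 4 job.sh
```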

To specify which hosts can run a type of job, we currently submit with
host groups, like so:

quick.q@@mantra

For other types of task we have a host group set up because we only
have 10 licenses for an application. A single machine can run more than
one of these tasks at a time, but the license is only consumed once per
host.
Is there a way to have this setup with a complex?

>
> -- Reuti

