On 23.01.2012, at 21:55, Andrew Pearson wrote:

> Thanks Reuti

You're welcome.


> OK - I made duplicates of all of my parallel environments, so that the slow 
> queue has a different PE list than the fast queue.  The submitted job now 
> runs on the correct queue.
> 
> However, in some sense I'm back to square one.  The reason I created two 
> queues and made them non-requestable is that I wanted to assign resources to 
> users, rather than have them choose them.  Now, the user can effectively 
> choose which queue to be in by choosing the correct parallel environment.  I 
> can't see a way to make the parallel environments non-requestable.

You can even leave the queue requestable. This is the way SGE usually works: 
a user requests resources and SGE chooses an appropriate queue that satisfies 
these requests.

Nevertheless: in case you want to enforce a policy, you can use a JSV (job 
submission verifier) to correct/remove the user's resource requests or to 
attach some of your own. In your case:

- if a queue is requested, remove the request
- if a specific PE is requested, append an asterisk to turn it into a wildcard

Instead of correcting the request, you could also just reject the job and 
output why it was declined.

====
#!/bin/bash

PATH=/bin:/usr/bin

jsv_on_start()
{
   return
}

jsv_on_verify()
{
   do_correct="false"
   do_wait="false"

   # If a PE was requested and its name does not already end in an
   # asterisk, append one so the request becomes a wildcard.
   pe_name=$(jsv_get_param pe_name)
   if [ "$pe_name" ]; then
      if ! [[ $pe_name =~ [*]$ ]]; then
         jsv_set_param pe_name "$pe_name*"
         do_correct="true"
      fi
   fi

   if [ "$do_wait" = "true" ]; then
      jsv_reject_wait "Job is rejected. It might be submitted later."
   elif [ "$do_correct" = "true" ]; then
      jsv_correct "Job was modified before it was accepted"
   else
      jsv_accept "Job is accepted"
   fi
   return
}

. ${SGE_ROOT}/util/resources/jsv/jsv_include.sh

jsv_main
====

which you can compare to the examples in 
$SGE_ROOT/usr/sge/util/resources/jsv.sh. If there is no asterisk at the end 
(BTW: the asterisk(s) could be anywhere in the string), one is appended (ok, 
you could always append one unconditionally, it won't hurt). See 
`man jsv_script_interface` for how to implement similar corrections (i.e. the 
removal of the queue requests):

   jsv_del_param q_hard
   jsv_del_param q_soft

in case they were set. The jsv_url in the cluster configuration also needs to 
point to this script:

$ qconf -sconf
...
jsv_url                      /home/reuti/jsv.sh

(Perl might be faster though).
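The wildcard handling itself is plain bash and can be exercised outside of 
SGE; a minimal stand-alone sketch (the function name is mine, not part of the 
JSV API):

```shell
#!/bin/bash
# Stand-alone version of the asterisk handling from the JSV above.
append_wildcard() {
   local pe_name="$1"
   if [[ $pe_name =~ [*]$ ]]; then
      echo "$pe_name"        # already ends in an asterisk: leave it alone
   else
      echo "${pe_name}*"     # turn the exact name into a wildcard
   fi
}

append_wildcard "mpi"      # prints: mpi*
append_wildcard "mpi*"     # prints: mpi*  (unchanged)
```

Note that the regex only checks the end of the string, so a name with an 
asterisk in the middle still gets one appended.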


> Even if this were possible however, if the user doesn't include a -pe line in 
> their submission script, I don't see how they would specify the number of 
> processors they need.

Is this a typo? If it's possible, the users can still use it to specify the 
necessary slot count.
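For illustration, here is how a wildcard PE request then matches the 
duplicated per-queue PEs (plain bash globbing standing in for SGE's own 
wildcard matching; the PE names `mpi_fast`/`mpi_slow` are made up):

```shell
#!/bin/bash
# Illustrative only: bash glob matching in place of SGE's PE wildcard matching.
pe_matches() {
   local request="$1" pe="$2"
   # Unquoted right-hand side so the asterisk acts as a glob pattern.
   if [[ $pe == $request ]]; then
      echo "match"
   else
      echo "no match"
   fi
}

pe_matches "mpi*" "mpi_fast"   # prints: match
pe_matches "mpi*" "mpi_slow"   # prints: match
pe_matches "smp"  "mpi_fast"   # prints: no match
```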

-- Reuti


> Sorry for my basic questions.  I'd appreciate any comments you have.
> 
> 
> 
> On Mon, Jan 23, 2012 at 2:57 PM, Reuti <[email protected]> wrote:
> On 23.01.2012, at 20:34, Andrew Pearson wrote:
> 
> > Hi.  I'm trying to move from load-based to sequence based scheduling, and I 
> > have a problem.  First, a little something about my setup:
> >
> > I have two sets of machines - 176 'fast' cores in 16-core nodes, and 90 
> > 'slow' cores in 2-core nodes.  I have two corresponding queues - slow.q and 
> > fast.q.  The queues are non-requestable.  fast.q looks at the @fast host 
> > group, which contains only the names of the fast nodes, and slow.q looks at 
> > the @slow host group, which contains only the names of the slow nodes.  In 
> > fast.q, I have slots = 16 and processors = 16, while in slow.q I have slots 
> > = 2 and processors = 2.  Finally, slow.q is seq_no 1 and fast.q is seq_no 2.
> >
> > Here's the problem:  If I submit a 120 processor job (so it's too large to 
> > fit on the slow cores), it still gets assigned to slow.q.  This in itself 
> > is bad - I want such a job to go directly to fast.q.  It gets worse though 
> > - because there aren't enough machines in slow.q, the remaining 30 threads 
> > end up on nodes in fast.q!  I don't understand how this second part is 
> > possible.  I've done qstat -f, and my 'fast' compute nodes definitely 
> > aren't listed as being members of slow.q.
> >
> > Any suggestions?  Thank you.
> 
> If the same PE is attached to more than one queue, it can collect slots from 
> any of them:
> 
> http://gridengine.org/pipermail/users/2012-January/002526.html
> 
> -- Reuti
> 
> 


_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
