These JSV scripts look very useful - I'll read about them.  Thanks for the
example.

On Mon, Jan 23, 2012 at 5:04 PM, Reuti <[email protected]> wrote:

> Am 23.01.2012 um 21:55 schrieb Andrew Pearson:
>
> > Thanks Reuti
>
> You're welcome.
>
>
> > OK - I made duplicates of all of my parallel environments, so that the
> slow queue has a different PE list than the fast queue.  The submitted job
> now runs on the correct queue.
> >
> > However, in some sense I'm back to square one.  The reason I created two
> queues and made them non-requestable is that I wanted to assign resources
> to users, rather than have them choose them.  Now, the user can effectively
> choose which queue to be in by choosing the correct parallel environment.
>  I can't see a way to make the parallel environments non-requestable.
>
> The queue you can even leave as requestable. This is the way SGE usually
> works: a user requests resources and SGE chooses an appropriate queue to
> satisfy these requests.
>
> Nevertheless: in case you want to enforce a policy, you can use a JSV to
> correct or remove the user's resource requests, or to attach some of your
> own. In your case:
>
> - if a queue is requested: remove the request
> - if a specific PE is requested: append an asterisk to its name
>
> Instead of correcting the request, you could also just reject the job
> and report why it was declined.
>
> ====
> #!/bin/bash
> # (bash, not plain sh: the [[ =~ ]] test below is a bashism)
>
> PATH=/bin:/usr/bin
>
> jsv_on_start()
> {
>   return
> }
>
> jsv_on_verify()
> {
>
>   do_correct="false"
>   do_wait="false"
>
>   pe_name=$(jsv_get_param pe_name)
>   if [ "$pe_name" ]; then
>      # append an asterisk unless the PE name already ends in one
>      if ! [[ $pe_name =~ [*]$ ]]; then
>         jsv_set_param pe_name "$pe_name*"
>         do_correct="true"
>      fi
>   fi
>
>   if [ "$do_wait" = "true" ]; then
>      jsv_reject_wait "Job is rejected. It might be submitted later."
>   elif [ "$do_correct" = "true" ]; then
>      jsv_correct "Job was modified before it was accepted"
>   else
>      jsv_accept "Job is accepted"
>   fi
>   return
> }
>
> . ${SGE_ROOT}/util/resources/jsv/jsv_include.sh
>
> jsv_main
> ===
>
> which you can compare to the example in
> $SGE_ROOT/util/resources/jsv/jsv.sh. If there is no asterisk at the end
> (BTW: the asterisk(s) could be anywhere in the string), one is appended
> (ok, you could always append one, it won't hurt). See `man
> jsv_script_interface` for implementing similar corrections (i.e. removal):
>
>   jsv_del_param q_hard
>   jsv_del_param q_soft
>
> in case it was set. The jsv_url in the configuration needs to point to
> this script too:
>
> $ qconf -sconf
> ...
> jsv_url                      /home/reuti/jsv.sh
>
> (Perl might be faster though).
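> To make the whole policy concrete, here is a minimal, runnable sketch that
> combines the queue removal (q_hard/q_soft) with the asterisk append, using a
> POSIX case statement instead of the bash-only [[ =~ ]] test. The jsv_*
> helpers are stubbed so the logic can be tried outside of SGE; in a real JSV
> they come from jsv_include.sh, and the simulated request values below
> ("slow.q", "mpi") are made up:

```shell
#!/bin/sh
# Stubs standing in for the real helpers sourced from jsv_include.sh.
# They keep the job's parameters in shell variables named param_<name>.
jsv_get_param() { eval "printf '%s' \"\$param_$1\""; }
jsv_set_param() { eval "param_$1=\$2"; }
jsv_del_param() { eval "unset param_$1"; }

jsv_on_verify()
{
   do_correct="false"

   # Drop any hard or soft queue request - SGE should pick the queue.
   for p in q_hard q_soft; do
      if [ -n "$(jsv_get_param $p)" ]; then
         jsv_del_param $p
         do_correct="true"
      fi
   done

   # Append an asterisk to the PE name unless it already ends in one.
   pe_name=$(jsv_get_param pe_name)
   if [ -n "$pe_name" ]; then
      case $pe_name in
         *\*) ;;                                 # already wildcarded
         *)   jsv_set_param pe_name "$pe_name*"
              do_correct="true" ;;
      esac
   fi
}

# Simulate: qsub -q slow.q -pe mpi 120 job.sh
param_q_hard="slow.q"
param_pe_name="mpi"
jsv_on_verify
echo "q_hard=${param_q_hard-<removed>} pe_name=$param_pe_name corrected=$do_correct"
# -> q_hard=<removed> pe_name=mpi* corrected=true
```

> In the real script, the final if/elif chain would then call jsv_correct
> because do_correct ended up "true".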
>
>
> > Even if this were possible, however, if the user doesn't include a -pe
> line in their submission script, I don't see how they would specify the
> number of processors they need.
>
> Is this a typo? If it's possible, the users can use it to specify the
> necessary slot count.
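> For example, with such a JSV in place a submission could still look like
> this (the PE name "mpi" and the slot count are only illustrative):
>
> $ qsub -pe mpi 120 job.sh
>
> and the JSV would rewrite the request to "-pe mpi* 120": the user still
> chooses the slot count, while SGE picks any queue whose attached PE
> matches the wildcard.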
>
> -- Reuti
>
>
> > Sorry for my basic questions.  I'd appreciate any comments you have.
> >
> >
> >
> > On Mon, Jan 23, 2012 at 2:57 PM, Reuti <[email protected]>
> wrote:
> > Am 23.01.2012 um 20:34 schrieb Andrew Pearson:
> >
> > > Hi.  I'm trying to move from load-based to sequence-based scheduling,
> and I have a problem.  First, a little something about my setup:
> > >
> > > I have two sets of machines - 176 'fast' cores in 16-core nodes, and
> 90 'slow' cores in 2-core nodes.  I have two corresponding queues - slow.q
> and fast.q.  The queues are non-requestable.  fast.q looks at the @fast
> host group, which contains only the names of the fast nodes, and slow.q
> looks at the @slow host group, which contains only the names of the slow
> nodes.  In fast.q, I have slots = 16 and processors = 16, while in slow.q I
> have slots = 2 and processors = 2.  Finally, slow.q is seq_no 1 and fast.q
> is seq_no 2.
> > >
> > > Here's the problem:  If I submit a 120-processor job (so it's too
> large to fit on the slow cores), it still gets assigned to slow.q.  This in
> itself is bad - I want such a job to go directly to fast.q.  It gets worse
> though - because there aren't enough machines in slow.q, the remaining 30
> threads end up on nodes in fast.q!  I don't understand how this second part
> is possible.  I've done qstat -f, and my 'fast' compute nodes definitely
> aren't listed as being members of slow.q.
> > >
> > > Any suggestions?  Thank you.
> >
> > If the same PE is attached to more than one queue, it can collect slots
> from any of them:
> >
> > http://gridengine.org/pipermail/users/2012-January/002526.html
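> > For instance (the PE names here are hypothetical), giving each queue its
> > own PE keeps the slots separate:
> >
> > $ qconf -sq slow.q | grep pe_list
> > pe_list               mpi_slow
> > $ qconf -sq fast.q | grep pe_list
> > pe_list               mpi_fast
> >
> > A job requesting -pe mpi_fast can then only collect slots from fast.q.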
> >
> > -- Reuti
> >
> >
>
>
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
