These JSV scripts look very useful - I'll read about them. Thanks for the example.
On Mon, Jan 23, 2012 at 5:04 PM, Reuti <[email protected]> wrote:

> On 23.01.2012, at 21:55, Andrew Pearson wrote:
>
> > Thanks Reuti
>
> You're welcome.
>
> > OK - I made duplicates of all of my parallel environments, so that the
> > slow queue has a different PE list than the fast queue. The submitted
> > job now runs on the correct queue.
> >
> > However, in some sense I'm back to square one. The reason I created
> > two queues and made them non-requestable is that I wanted to assign
> > resources to users, rather than have them choose them. Now, the user
> > can effectively choose which queue to be in by choosing the correct
> > parallel environment. I can't see a way to make the parallel
> > environments non-requestable.
>
> The queue you can even leave requestable. This is the way SGE usually
> works: a user requests resources, and SGE will choose an appropriate
> queue to satisfy these requests.
>
> Nevertheless: in case you want to enforce a policy, you can use a JSV to
> correct/remove resource requests of the user, or also to attach some of
> your own. In your case:
>
> - a queue is requested: remove the request
> - a specific PE is requested: append an asterisk to it
>
> Instead of correcting the request, you could also just output that the
> job is declined and why.
>
> ====
> #!/bin/bash
> # bash is required here: the [[ ... =~ ... ]] regex test below is not POSIX sh
>
> PATH=/bin:/usr/bin
>
> jsv_on_start()
> {
>    return
> }
>
> jsv_on_verify()
> {
>    do_correct="false"
>    do_wait="false"
>
>    pe_name=$(jsv_get_param pe_name)
>    if [ "$pe_name" ]; then
>       if ! [[ $pe_name =~ [*]$ ]]; then
>          jsv_set_param pe_name "$pe_name*"
>          do_correct="true"
>       fi
>    fi
>
>    if [ "$do_wait" = "true" ]; then
>       jsv_reject_wait "Job is rejected. It might be submitted later."
>    elif [ "$do_correct" = "true" ]; then
>       jsv_correct "Job was modified before it was accepted"
>    else
>       jsv_accept "Job is accepted"
>    fi
>    return
> }
>
> . ${SGE_ROOT}/util/resources/jsv/jsv_include.sh
>
> jsv_main
> ====
>
> which you can compare to the examples in
> $SGE_ROOT/util/resources/jsv/jsv.sh. If there is no asterisk at the end
> (BTW: the asterisk(s) could be anywhere in the string), one is appended
> (ok, you could always append one, it won't hurt). See `man
> jsv_script_interface` to implement similar corrections (i.e. removal):
>
> jsv_del_param q_hard
> jsv_del_param q_soft
>
> in case it was set. The jsv_url needs to be set, too, to point to this
> script:
>
> $ qconf -sconf
> ...
> jsv_url /home/reuti/jsv.sh
>
> (Perl might be faster though.)
>
> > Even if this were possible however, if the user doesn't include a -pe
> > line in their submission script, I don't see how they would specify
> > the number of processors they need.
>
> Is this a typo? If it's possible, the users can use it to specify the
> necessary slot count.
>
> -- Reuti
>
> > Sorry for my basic questions. I'd appreciate any comments you have.
> >
> > On Mon, Jan 23, 2012 at 2:57 PM, Reuti <[email protected]> wrote:
> > On 23.01.2012, at 20:34, Andrew Pearson wrote:
> >
> > > Hi. I'm trying to move from load-based to sequence-based
> > > scheduling, and I have a problem. First, a little something about
> > > my setup:
> > >
> > > I have two sets of machines - 176 'fast' cores in 16-core nodes,
> > > and 90 'slow' cores in 2-core nodes. I have two corresponding
> > > queues - slow.q and fast.q. The queues are non-requestable. fast.q
> > > looks at the @fast host group, which contains only the names of the
> > > fast nodes, and slow.q looks at the @slow host group, which
> > > contains only the names of the slow nodes. In fast.q, I have slots
> > > = 16 and processors = 16, while in slow.q I have slots = 2 and
> > > processors = 2. Finally, slow.q is seq_no 1 and fast.q is seq_no 2.
> > >
> > > Here's the problem: If I submit a 120-processor job (so it's too
> > > large to fit on the slow cores), it still gets assigned to slow.q.
> > > This in itself is bad - I want such a job to go directly to
> > > fast.q. It gets worse though - because there aren't enough machines
> > > in slow.q, the remaining 30 threads end up on nodes in fast.q! I
> > > don't understand how this second part is possible. I've done qstat
> > > -f, and my 'fast' compute nodes definitely aren't listed as being
> > > members of slow.q.
> > >
> > > Any suggestions? Thank you.
> >
> > If the same PE is attached to more than one queue, it can collect
> > slots from any of them:
> >
> > http://gridengine.org/pipermail/users/2012-January/002526.html
> >
> > -- Reuti
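For reference, the queue-specific PE duplicates mentioned at the top of the thread are ordinary PE definitions differing only in name (and possibly slot count). The fragment below is a sketch only - all names and values are assumptions; compare your own `qconf -sp <pe_name>` output:

```
pe_name            mpich_fast
slots              176
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $fill_up
control_slaves     FALSE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary FALSE
```

Each copy would be listed only in the pe_list of its matching queue (editable via `qconf -mq fast.q`), so that a wildcard request like `-pe mpich* 120` can only be satisfied within one queue.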
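The trailing-asterisk correction in Reuti's JSV can be tried outside of SGE. A minimal stand-alone sketch follows: the helper name `append_asterisk` is made up for illustration (it is not part of the JSV interface), and a POSIX `case` pattern stands in for the bash-only `[[ $pe_name =~ [*]$ ]]` test so it runs under any sh:

```shell
# Stand-alone sketch of the PE-name correction done in the JSV above.
# append_asterisk is a hypothetical helper, not part of the JSV API.
append_asterisk() {
    pe_name=$1
    case $pe_name in
        *"*")  # request already ends in a wildcard, e.g. "mpich*": keep it
            printf '%s\n' "$pe_name"
            ;;
        *)     # exact request like "mpich": widen it to "mpich*" so the
               # scheduler may pick any matching queue-specific PE copy
            printf '%s\n' "${pe_name}*"
            ;;
    esac
}

append_asterisk "mpich"    # prints: mpich*
append_asterisk "mpich*"   # prints: mpich*
```

In the real JSV the widened value would then be written back with `jsv_set_param pe_name`, as in the script quoted above.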
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
