Hi,

On 09.03.2012 at 18:34, Robert Hutton wrote:

> We currently have a functioning grid engine cluster with six 24-core
> machines. We currently have it set up with three consumable resources:
>
> - h_rt
> - h_vmem
> - virtual_free
>
> I've encouraged users to not think about which queue to schedule their
> jobs in, but instead specify the three consumable resources and have the
> scheduler determine the best queue.
>
> Currently, there is only one queue that all jobs get scheduled in,
> called longrun.q. But this means that very short jobs sometimes have to
> wait for very long jobs to finish before they can get any slots in that
> queue. What I'd like to do is have a second queue, shortrun.q, in which
> any short jobs get run, with slotwise preemption set up so that they can
> quickly run and then let the longer jobs continue.
>
> I set up this very thing, with a h_rt limit of 2 hours on the
> shortrun.q, but it had unexpected consequences when there were no long
> running jobs filling longrun.q and a user submitted a lot of very
> short-running jobs: longrun.q and shortrun.q both immediately filled
> with short jobs, with the jobs in longrun.q immediately preempted by
> those in shortrun.q.
>
> I suppose my question is: is there any way to prevent the short running
> jobs from entering longrun.q, or do I have to tell my users to use "-q
> shortrun.q" when they want to schedule jobs with h_rt less than two hours?

Instead of requesting a queue, it would be more flexible to request a
boolean complex which is attached to shortrun.q.

In fact, you are right that there is no minimum runtime which you could
define for a queue and which would be checked for jobs being sent to it.
I entered an RFE for it some time ago.

Other ways to solve it:

- change the sort order of the queues, so that shortrun.q will be used
  first. Long-running jobs will of course fail to be assigned to it, so
  they go to the other queue.

- request the boolean complex automatically in a JSV, if you detect that
  the user requested only a short runtime.

-- Reuti

> Thanks,
>
> Rob
>
> --
> Robert Hutton
> Senior Systems and Database Administrator
> Centre for Genomics and Global Health <http://cggh.org>
> The Wellcome Trust Centre for Human Genetics
> Roosevelt Drive
> Oxford
> OX3 7BN
> United Kingdom
> Tel: +44 (0)1865 287721
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
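
For the JSV suggestion in the reply, the decision itself can be sketched in a few lines. This is only the logic, not a working JSV: a real one would read and modify the job's resource requests through the JSV helper library shipped with Grid Engine. The complex name `short`, the function names, and the dict representation of the hard resource requests are all assumptions made for illustration. The 2-hour limit is the h_rt limit mentioned for shortrun.q, and only plain seconds and hours:minutes:seconds values are handled here.

```python
# Sketch of the "request the boolean complex automatically" decision.
# Assumptions (not from the thread): the complex is called "short", and the
# job's hard resource requests are available as a dict of strings.

SHORT_LIMIT_SECONDS = 2 * 60 * 60  # the 2-hour h_rt limit on shortrun.q
SHORT_COMPLEX = "short"            # hypothetical boolean complex name


def parse_h_rt(value):
    """Return an h_rt request in seconds, or None if it is unbounded or
    not in one of the two forms handled here ("3600" or "1:30:00")."""
    text = value.strip()
    if not text or text.upper() == "INFINITY":
        return None
    parts = text.split(":")
    if len(parts) not in (1, 3):
        return None
    try:
        numbers = [int(part) for part in parts]
    except ValueError:
        return None
    if any(number < 0 for number in numbers):
        return None
    if len(numbers) == 1:
        return numbers[0]
    hours, minutes, seconds = numbers
    return hours * 3600 + minutes * 60 + seconds


def add_short_complex(hard_requests):
    """Return a copy of the requests with the boolean complex added when
    the requested h_rt fits within the short queue's limit."""
    updated = dict(hard_requests)
    runtime = parse_h_rt(hard_requests.get("h_rt", ""))
    if runtime is not None and runtime <= SHORT_LIMIT_SECONDS:
        updated.setdefault(SHORT_COMPLEX, "TRUE")
    return updated
```

Jobs without an h_rt request, or with one above the limit, are left untouched, so they are scheduled exactly as before.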
