On 09.03.2017 18:52, Reuti wrote:
On 09.03.2017 at 17:41, Roberto Nunnari <roberto.nunn...@supsi.ch> wrote:
On 09.03.2017 15:14, Reuti wrote:
Hi,
On 09.03.2017 at 14:24, Roberto Nunnari <roberto.nunn...@supsi.ch> wrote:
Hi Reuti.
Hi William.
Here are the settings you asked for:
params MONITOR=1
max_reservation 32
default_duration 0:10:0
I cannot understand how what I see in ${SGE_ROOT}/${SGE_CELL}/common/schedule
can help me. Here is a short extract for a job submitted with -R y; it keeps
repeating without change:
...
3653372:1:RESERVING:1489043424:660:P:smp:slots:32.000000
3653372:1:RESERVING:1489043424:660:Q:long.q@node19.cluster:slots:32.000000
3653372:1:RESERVING:1489043424:660:P:smp:slots:32.000000
3653372:1:RESERVING:1489043424:660:Q:long.q@node19.cluster:slots:32.000000
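For readers following along, each record in the schedule file is a colon-separated line. Below is a minimal Python sketch that splits such a line into named fields. The field names follow my reading of sched_conf(5) (job ID, task ID, state, start time, duration, level, object, resource, utilization) and should be treated as an assumption; `ScheduleRecord`, `parse_schedule_line` and `start_as_utc` are hypothetical helper names, not part of SGE.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class ScheduleRecord(NamedTuple):
    """One line of the schedule file (field names are assumed, see lead-in)."""
    job_id: int
    task_id: int
    state: str          # e.g. RESERVING
    start_time: int     # seconds since the epoch
    duration: int       # seconds
    level: str          # single letter, e.g. P or Q as in the extract
    object_name: str    # e.g. the PE name or the queue instance
    resource: str       # e.g. slots
    utilization: float  # amount of the resource, e.g. 32.0


def parse_schedule_line(line: str) -> ScheduleRecord:
    """Split one colon-separated record into typed fields."""
    parts = line.strip().split(":")
    if len(parts) != 9:
        raise ValueError(f"expected 9 colon-separated fields, got {len(parts)}")
    job, task, state, start, duration, level, obj, resource, util = parts
    return ScheduleRecord(
        int(job), int(task), state, int(start), int(duration),
        level, obj, resource, float(util),
    )


def start_as_utc(record: ScheduleRecord) -> datetime:
    """Convert the record's epoch start time to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(record.start_time, tz=timezone.utc)
```

Applied to the second line of the extract, this yields state RESERVING, a duration of 660 seconds, and 32 slots on long.q@node19.cluster.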
What else is running in the cluster? Are there other blocked jobs which would
otherwise slip in? Do all of them request -l h_rt=…?
Hi.
There are always smaller jobs (without -R y) pending in the queue that get in
front of bigger jobs (with -R y).
The user of this big job doesn't use options like h_rt, mem_free, etc.,
but only asks for a particular node, i.e. hostname=node19.cluster
So essentially node19 should get drained over time.
Yes, I expect that over time slots on node19 will be reserved for the
job requesting reservation, as they become free when jobs running on
node19 exit.
(When no job requests -l h_rt=… and only the default duration applies [which
won't be enforced], SGE might look for another node to make the reservation.)
The other users usually use -l h_rt=.. and mem_free=.., and as theirs are serial
jobs or parallel jobs that ask for fewer resources, they slip in front of the
job that asks for more resources, even though it was submitted long before and
uses -R y.
What you can see, of course, is the possible back-filling of node19. Can you
check the h_rt requests of the other jobs already running on node19?
As long as the longest job on this node is still running, shorter jobs can be
back-filled, provided their runtime is shorter than the time this longest job
will continue to run.
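The back-filling rule described above can be sketched as a simple check: a pending job may slip in only if it is expected to finish no later than the point at which the reservation is due to start. This is an illustration, not SGE's actual scheduler code. All names are hypothetical, the reserving job is assumed to need the whole node, and jobs without h_rt are assumed to be charged the default_duration of 0:10:0 quoted earlier in the thread.

```python
from typing import Iterable, Optional

# Assumption: mirrors "default_duration 0:10:0" from the thread, in seconds.
DEFAULT_DURATION = 600


def estimated_reservation_start(now: int, running_end_times: Iterable[int]) -> int:
    """Time at which the whole node is expected to be free.

    Simplification: the reserving job needs every slot on the node, so it
    can only start once the longest running job has ended.
    """
    return max([now, *running_end_times])


def can_backfill(now: int, h_rt: Optional[int], reservation_start: int) -> bool:
    """True if a job started now would end before the reservation begins."""
    runtime = DEFAULT_DURATION if h_rt is None else h_rt
    return now + runtime <= reservation_start


if __name__ == "__main__":
    now = 1000
    # Two jobs currently running on the node, ending at t=4600 and t=2800.
    start = estimated_reservation_start(now, [4600, 2800])
    print(can_backfill(now, 3600, start))  # one-hour job, ends exactly at 4600
    print(can_backfill(now, 7200, start))  # two-hour job would delay the reservation
```

Under this model, as long as one long job keeps the reservation start far in the future, a steady stream of short jobs can legitimately keep filling the node.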
One more question: how can I tell that something is moving with the
reservation (i.e. see that the scheduler has started reserving slots) by looking
at the file ${SGE_ROOT}/${SGE_CELL}/common/schedule?
When you request a specific node, the reservation can't move to another node. I
saw this only in cases where the job with -R y may be scheduled freely inside
the cluster, the already running jobs have no h_rt (hence the default_duration
applies), and they run much longer than anticipated, so that at some point the
reservation can be fulfilled sooner by moving it to another node.
I don't mean moving from node to node. By "moving" I mean that something
happens in the scheduler, i.e. that the scheduler reserves a slot for the
pending job requesting reservation. In the schedule file I see only
lines with the word RESERVING, never something like RESERVED, nor any
small changes that tell me that something is changing. I always see
lines like these:
3653372:1:RESERVING:1489043424:660:P:smp:slots:32.000000
3653372:1:RESERVING:1489043424:660:Q:long.q@node19.cluster:slots:32.000000
I believe that if the scheduler reserves a slot, something in these
lines should change.
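One way to check whether anything changes between scheduling runs is to compare the RESERVING records of consecutive intervals. The Python sketch below does that on the text of the schedule file. It assumes that intervals are separated by a line consisting only of colons, which is my understanding of sched_conf(5) and should be verified against your own file; the function names are made up for illustration.

```python
from typing import List


def split_intervals(text: str) -> List[List[str]]:
    """Group schedule lines into scheduling intervals.

    Assumption: a line made up solely of ':' characters ends an interval.
    """
    intervals: List[List[str]] = []
    current: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if set(line) == {":"}:
            intervals.append(current)
            current = []
        else:
            current.append(line)
    if current:  # trailing interval without a closing separator
        intervals.append(current)
    return intervals


def reserving_records(interval: List[str], job_id: int) -> List[str]:
    """Sorted RESERVING lines of one job within one interval."""
    wanted = []
    for line in interval:
        parts = line.split(":")
        if len(parts) > 2 and parts[0] == str(job_id) and parts[2] == "RESERVING":
            wanted.append(line)
    return sorted(wanted)


def changed_intervals(text: str, job_id: int) -> List[int]:
    """Indices of intervals whose reservation differs from the previous one."""
    per_interval = [reserving_records(i, job_id) for i in split_intervals(text)]
    return [
        n for n in range(1, len(per_interval))
        if per_interval[n] != per_interval[n - 1]
    ]
```

For example, `changed_intervals(open(path).read(), 3653372)` with `path` pointing at the schedule file would return an empty list if the reservation records never change, and the indices of the affected intervals if, say, the start time shifts.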
Thank you.
--
Roberto Nunnari
Servizi Informatici Ti-Edu
Via Pobiette 11 - 6928 Manno - Switzerland
helpdesk email: mailto: h...@ti-edu.ch
direct email: mailto:roberto.nunn...@supsi.ch
tel: +41-58-6666561