On 09.03.2017 18:52, Reuti wrote:

On 09.03.2017 at 17:41, Roberto Nunnari <roberto.nunn...@supsi.ch> wrote:

On 09.03.2017 15:14, Reuti wrote:
Hi,

On 09.03.2017 at 14:24, Roberto Nunnari <roberto.nunn...@supsi.ch> wrote:

Hi Reuti.
Hi William.

Here are the settings you asked for:
params                            MONITOR=1
max_reservation                   32
default_duration                  0:10:0

I don't understand how what I see in ${SGE_ROOT}/${SGE_CELL}/common/schedule 
can help me. Here is a short extract for a job submitted with -R y; it 
keeps repeating without change:
...
3653372:1:RESERVING:1489043424:660:P:smp:slots:32.000000
3653372:1:RESERVING:1489043424:660:Q:long.q@node19.cluster:slots:32.000000
3653372:1:RESERVING:1489043424:660:P:smp:slots:32.000000
3653372:1:RESERVING:1489043424:660:Q:long.q@node19.cluster:slots:32.000000

What else is running in the cluster? Are there other jobs blocked which would 
otherwise slip in? Do all of them request -l h_rt=…?

Hi.

There are always smaller jobs (without -R y) pending in the queue that get in 
front of bigger jobs (with -R y).
The user of this big job doesn't use options like h_rt, mem_free, etc., 
but only asks for a particular node, i.e. hostname=node19.cluster

So essentially node19 should get drained over time.

Yes, I expect that over time the slots on node19 will be reserved for the job requesting the reservation, as they become free when jobs running on node19 exit.

(When no job requests -l h_rt=… and only the default duration applies [which won't 
be enforced], SGE might look for another node to make the reservation.)

The other users usually request -l h_rt=.. and mem_free=.., and as their jobs are 
serial jobs or parallel jobs that ask for fewer resources, they slip in front of 
the job that asks for more resources, even though it was submitted long before and 
uses -R y.

What you may be seeing, of course, is back-filling of node19. Can you 
check the h_rt requests of the other jobs already running on node19? 
As long as the longest job on this node is still running, shorter jobs can be 
filled in, provided their runtime is shorter than the time this longest job will 
continue to run.
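To make the back-filling condition above concrete, here is a minimal sketch in Python. It is a simplified model and not SGE's actual scheduler code: the function names are hypothetical, and it assumes that a job without an h_rt request is treated as running for default_duration (0:10:0, i.e. 600 seconds, in the settings quoted earlier).

```python
from typing import Optional

# default_duration from the quoted scheduler settings (0:10:0), in seconds.
DEFAULT_DURATION = 600


def assumed_runtime(h_rt: Optional[int]) -> int:
    """Runtime the scheduler would assume: the h_rt request, else the default."""
    return h_rt if h_rt is not None else DEFAULT_DURATION


def can_backfill(now: int, h_rt: Optional[int], reservation_start: int) -> bool:
    """True if a job started at `now` is expected to end before the reservation.

    Simplified model: the job may slip in only if its assumed end time does
    not run past the anticipated start of the reserved job.
    """
    return now + assumed_runtime(h_rt) <= reservation_start
```

With this model, a short job requesting h_rt=300 fits in front of a reservation that starts 600 seconds from now, while a job requesting h_rt=3600 does not.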



One more question: how can I tell that something is moving with the 
reservation (i.e. see that the scheduler has started reserving slots) by looking 
at the file ${SGE_ROOT}/${SGE_CELL}/common/schedule ?

When you request a specific node, the reservation can't move to another node. I 
have seen it move only when the job with -R y may be scheduled freely inside the 
cluster, the already running jobs have no h_rt (hence default_duration 
applies), and they run much longer than anticipated, so that at some point the 
reservation can be fulfilled sooner by moving to another node.

I don't mean moving from node to node. By "moving" I mean that something happens in the scheduler, i.e. that the scheduler reserves a slot for the pending job requesting the reservation. In the schedule file I see only lines with the word RESERVING, never something like RESERVED, and no small changes that tell me something is changing. I always see lines like these:
3653372:1:RESERVING:1489043424:660:P:smp:slots:32.000000
3653372:1:RESERVING:1489043424:660:Q:long.q@node19.cluster:slots:32.000000
I believe that if the scheduler reserves a slot, something in these lines should change.
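One way to check whether anything in these lines changes between scheduling runs is to parse them and compare the reservation start times. The sketch below is only an illustration: it assumes the nine colon-separated fields are job id, task id, state, start time (epoch seconds), duration (seconds), level (P, G, H or Q), object name, resource name and utilization, as I understand the MONITOR format from sched_conf(5). The names ScheduleEntry, parse_schedule_line and reservation_starts are hypothetical.

```python
from collections import namedtuple
from datetime import datetime, timezone

# Assumed field layout of one line in the schedule file (hypothetical names).
ScheduleEntry = namedtuple(
    "ScheduleEntry",
    "job_id task_id state start_time duration level object_name resource utilization",
)


def parse_schedule_line(line):
    """Parse one schedule line; return None for separators or malformed lines."""
    parts = line.strip().split(":")
    if len(parts) != 9 or not parts[0]:
        return None
    try:
        return ScheduleEntry(
            int(parts[0]), int(parts[1]), parts[2], int(parts[3]),
            int(parts[4]), parts[5], parts[6], parts[7], float(parts[8]),
        )
    except ValueError:
        return None


def reservation_starts(lines, job_id):
    """Distinct anticipated start times of a job's RESERVING entries, in order."""
    starts = []
    for line in lines:
        entry = parse_schedule_line(line)
        if entry and entry.job_id == job_id and entry.state == "RESERVING":
            if entry.start_time not in starts:
                starts.append(entry.start_time)
    return starts


def as_utc(epoch_seconds):
    """Render an epoch timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()
```

Feeding the whole schedule file to reservation_starts(open(path), 3653372) and getting back a single value would mean that the anticipated start time of the reservation never changed across the recorded scheduling runs.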

Thank you.

--
Roberto Nunnari
Servizi Informatici Ti-Edu
Via Pobiette 11 - 6928 Manno - Switzerland
helpdesk email: mailto: h...@ti-edu.ch
direct email: mailto:roberto.nunn...@supsi.ch
tel: +41-58-6666561
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
