On 15.05.2012 at 23:20, Jake Carroll wrote:

> Hi Reuti,
> 
> 
> Thank you for your responses. I appreciate your time. See inline:
> 
> 
> On 16/05/12 7:14 AM, "Reuti" <[email protected]> wrote:
> 
>> Hi,
>> 
>> On 15.05.2012 at 22:59, Jake Carroll wrote:
>> 
>>> A couple of quick questions this morning with some ROCKS/SGE scheduler
>>> semantics.
>>> 
>>>     1. I've got some new users who want to drive the cluster we have set
>>> up with the very maximum efficiency possible. I.e. a user can use as much
>>> of the cluster as is possible when they submit a job. With over 1000
>>> cores but many users, one of the things we did do was limit a user's
>>> ability to take up more than about 300 or 400 slots, such that they
>>> could only ever utilise maybe 20 to 30% of the cluster at any given
>>> time. My new users don't like this, and they want to be able to use 100%
>>> of the system, if it's free and no other jobs are running.
>> 
>> How long do their jobs run? Once a job is entitled to run by SGE, it will
>> stay in this state. So they could block the cluster for some time once
>> their jobs started.
> 
> They can indeed run for weeks.
> 
>> 
>> 
>>> Now, my understanding is that we could definitely remove that limit of
>>> 300 or 400 slots/jobs, but it'll have a couple of detrimental impacts:
>>> Primarily, it'll preclude any other user from starting jobs at any
>>> given time if their jobs are running, as there are no free slots.
>> 
>> Yes. Maybe this would work for you: a single user can only get 80% of the
>> cluster, even if there are free cores.
> 
> Yup. We are considering just giving them more allocation, rather than
> simply letting them use the entire cluster.
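Just as a pointer: such a per-user cap can be expressed as a resource quota set. A minimal sketch (the name and the 800-slot figure are illustrative, assuming a cluster of about 1000 slots; see `man sge_resource_quota`):

```
{
   name         max_slots_per_user
   description  "Cap any single user at roughly 80% of the slots"
   enabled      TRUE
   limit        users {*} to slots=800
}
```

Load it with e.g. `qconf -Arqs <file>`; the `{*}` form makes the limit apply to each user individually rather than to all users combined.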
> 
>> 
>> 
>>> 2. My users told me "no, no, you can simply put our jobs 'to sleep'
>>> when others in the queue log in to run their jobs."
>>> 
>>> Now, my understanding of that is, yes, that is possible (though, I
>>> don't know how it's implemented; fairshare policy queue / weight
>>> perhaps?) BUT it has the big drawback that when a user's job is "asleep",
>>> it will actually still keep ahold of the memory allocation on the node,
>>> thus, if another big mem job comes along and the node is memory
>>> over-subscribed, crashing scenarios will ensue! Can somebody confirm
>>> that kind of functionality/concern for me?
>> 
>> Correct. Used resources remain used. But you could allow some swap space,
>> and swapping out the sleeping job would only occur once.
> 
> OK. That is a good idea. Do you know how I can do this? Is there a
> specific part of the SGE man pages that I'll need to consult to learn
> about how to enable "sleep" states when other users try to run jobs? Never
> implemented that before.

Just increase the swap space on disk, and use only the real available memory 
for the jobs.
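One hedged way to set that up (assuming you use h_vmem as the memory limit; the host name and sizes below are illustrative):

```
# make h_vmem a consumable resource (opens an editor; edit the h_vmem line):
qconf -mc
#   h_vmem   h_vmem   MEMORY   <=   YES   YES   0   0

# advertise only the real RAM of each exec host, not RAM+swap:
qconf -me node01
#   complex_values   h_vmem=64G

# users then request memory per slot at submission time:
qsub -l h_vmem=4G job.sh
```

With h_vmem consumable and set to the physical RAM, SGE never over-commits real memory, and the swap space only comes into play when a suspended job gets paged out.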


>>> 3. My users want jobs to "persist" over the course of a cluster head
>>> node crash. Would I be right in saying that it's only possible to
>>> persist across crashes if the users are using CHECKPOINTING in their
>>> jobs? I've heard of it before, just never implemented it and don't know
>>> where to start.
>> 
>> You can reboot the head node of an SGE cluster which runs the qmaster
>> any time you like. The jobs aren't affected at all. For this
>> functionality no checkpointing is necessary. Checkpointing (if already
>> supported by the application outside of SGE) would allow surviving a
>> crash of an exechost.
> 
> OK. That's important to know. So it's a function of the application
> outside of SGE, and not so much SGE itself.

Yep. SGE just provides support for integrating an already existing checkpointing 
mechanism into SGE jobs.
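For completeness, hooking an application's own checkpointing into SGE looks roughly like this (the environment name and script paths are illustrative; see `man checkpoint`):

```
# define a checkpointing environment (opens an editor):
qconf -ackpt app_ckpt
#   ckpt_name          app_ckpt
#   interface          APPLICATION-LEVEL
#   ckpt_command       /path/to/checkpoint.sh
#   restart_command    /path/to/restart.sh

# submit with that environment; -c sx requests a checkpoint
# on suspension (s) and on execd shutdown (x):
qsub -ckpt app_ckpt -c sx job.sh
```

The actual saving and restoring of state is still the application's job; SGE only decides when to trigger the checkpoint and restart scripts.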

-- Reuti
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
