Hi Reuti,

Back on the list. Please see below -
On Tue, Jun 26, 2012 at 11:41 AM, Reuti <[email protected]> wrote:
> Am 26.06.2012 um 20:30 schrieb Ray Spence:
>
> > Reuti,
> >
> > Thank you so much for making RQS/h_vmem clear.
> >
> > I hope I'm not taking advantage of you here - I apologize if so.
>
> No, but please ask on the list or register. Therefore I didn't forward
> your last posting, as it was sent from an unknown address.
>
> > I have another question regarding slots. Our cluster has 4 nodes, each
> > with 32 cores. My assumption is that SGE should be able to run 128
> > total jobs at any time. I see only 1 job running per node, with many
> > jobs in qw. I think I need to change the "slots" value in the queue
> > config? Here is what I have (still early on in learning SGE config..)
> > from qconf -sq <queue>:
> >
> > slots   1,[scf-sm00.Stat.Berkeley.EDU=32], \
> >           [scf-sm01.Stat.Berkeley.EDU=32], \
> >           [scf-sm02.Stat.Berkeley.EDU=32], \
> >           [scf-sm03.Stat.Berkeley.EDU=32]
>
> It's a matter of taste: the above is correct. If you have identical
> nodes, you can even shorten it to:
>
> slots   32

Great, we do - I'll try that!

> It's the number of slots per queue instance.
>
> > Should the "1" be 32? 128? Or, where is it that I tell SGE to use all
> > 32 cores?
>
> As the default memory consumption is 248g, only one job can run at a
> time, I would say.

Ok, this is not what we want at all. It is more important to use all 32
cores/node than to attempt any RAM usage control. I'm going to back out
the h_vmem complex setting in order to run 128 jobs at a time. Should I
reset the h_vmem complex back to not consumable (NO), or keep it
consumable but set its default to, say, 1G? If I do that, won't users
have to request a higher h_vmem amount upon job submission?

> Try to submit jobs with "sleep 120" or so for which you requested less
> memory on the command line.
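For the archive, here is roughly what I plan to try. This is only a sketch,
untested so far - the queue name high.q, the file name, and the qsub line
are my assumptions:

```shell
# Sketch only: shorten the per-host slots list to a single value,
# since our nodes are identical.
qconf -mattr queue slots 32 high.q

# Keep h_vmem consumable but drop its default from 248g to 1G, so 32
# small jobs fit per node. qconf -sc dumps the complex list and
# qconf -Mc reloads it from a file.
qconf -sc > complexes.txt
# ...edit the h_vmem line in complexes.txt to read:
#   h_vmem  h_vmem  MEMORY  <=  YES  YES  1G  0
qconf -Mc complexes.txt

# Jobs needing more than the 1G default must then ask for it explicitly:
qsub -l h_vmem=8G myjob.sh
```

(No test here - these are cluster-configuration commands that only make
sense against a live qmaster.)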
> The actual used up memory can be checked:
>
> qhost -F h_vmem
>
> -- Reuti
>
> > Thank you again,
> > Ray
> >
> > On Tue, Jun 26, 2012 at 11:14 AM, Reuti <[email protected]> wrote:
> > Am 26.06.2012 um 19:42 schrieb Ray Spence:
> >
> > > Hi Reuti,
> > >
> > > I'll respond in-line:
> > >
> > > On Mon, Jun 25, 2012 at 4:21 PM, Reuti <[email protected]> wrote:
> > > Hi,
> > >
> > > Am 26.06.2012 um 00:57 schrieb Ray Spence:
> > >
> > > > I apologize for more questions, but I'm not getting to where our
> > > > group wants our new cluster to be. In order to limit all of a given
> > > > user's jobs in a specified queue to a total amount of physical RAM
> > > > (h_vmem), I see no other solution than an RQS. Is this true?
> > >
> > > Correct. h_vmem is a hard limit, while others prefer virtual_free as
> > > a guidance for SGE; the latter is not enforced:
> > >
> > > http://www.gridengine.info/2009/12/01/adding-memory-requirement-awareness-to-the-scheduler/
> > >
> > > I've read this info from you. When you say "Use the one you defined
> > > in your qsub command by requesting it with the -l option..." I take
> > > you to mean that once I've made a given memory complex (h_vmem,
> > > virtual_free, etc.) consumable (qconf -mc), then in order to enforce
> > > any limit on that complex, users must request a value for it at job
> > > submission. I think I'm repeating myself here.. Your info here is
> > > what led me to pose my question in the first place.
> > >
> > > > Using qconf -mq <queue> will limit each job in <queue> but not each
> > > > user's total memory footprint across all his jobs, correct?
> > >
> > > Correct.
> > >
> > > > The node-level limit does not do what we want here..
> > >
> > > Correct, it's the memory usage across all queues and resp. all jobs
> > > on a node.
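(A note for anyone finding this thread later: the per-job cap Reuti
confirms above, as opposed to the per-user RQS cap, lives in the queue
definition itself. A sketch, with high.q as my assumed queue name:)

```shell
# Sketch (assumed queue name high.q): cap each *individual* job at 8G of
# h_vmem via the queue's resource-limit list (man queue_conf, RESOURCE
# LIMITS). This says nothing about a user's total across all of their
# jobs -- that is what the RQS is for.
qconf -mattr queue h_vmem 8G high.q

# Verify the queue now carries the limit:
qconf -sq high.q | grep h_vmem
```

(Again no test - this is queue configuration, only meaningful against a
live qmaster.)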
> > > > I have this RQS in place:
> > > >
> > > > {
> > > >    name         high.q-h_vmem
> > > >    description  "high.q h_vmem limited to 128G"
> > >
> > > The quotation marks are not necessary.
> > >
> > > >    enabled      TRUE
> > > >    limit        users {*} queues high.q to h_vmem=128g
> > > > }
> > >
> > > You made h_vmem consumable and attached a value per exechost?
> > >
> > > Yes - via qconf -mc; here is what the h_vmem line looks like:
> > >
> > > h_vmem   h_vmem   MEMORY   <=   YES   YES   248g   0
> >
> > It should be set to a default you expect to be taken for a job. We set
> > it to 2g here, and users can increase the per-job limit to the one set
> > in the queue definition.
> >
> > > (Should the "default" value here be different than 248g? See below..
> > > Must it be 0? Must it NOT be 0?)
> > >
> > > And via qconf -me I've set h_vmem to be a little less (248G) than the
> > > installed RAM (256G) on each of the cluster's 4 nodes:
> > >
> > > qconf -se <cluster_node>
> > > hostname        <>
> > > load_scaling    NONE
> > > complex_values  slots=32,h_vmem=248G
> > > .....
> > >
> > > > which would seem to accomplish our goal. However, jobs submitted to
> > > > high.q against this RQS without stating h_vmem needs at submission,
> > > > but which are written to exceed the memory limit, do exceed the
> > > > memory limit.
> > >
> > > Correct, the RQS will check the job request for h_vmem, but there is
> > > no relation back, i.e. nothing by which the RQS would limit the job's
> > > actual memory. Specifying only an overall limit per user would even
> > > make it hard for the RQS to decide what limit to set per job at all.
> > > Or if the overall limit is passed: which job should be killed?
> > >
> > > > Worse, jobs submitted to high.q with an h_vmem request set below
> > > > the RQS limit, but which are written to exceed the limit,
> > > > successfully gobble up a forbidden amount of RAM.
> > >
> > > I don't get this sentence. Can you make an example?
> > > I have a simple shell script that runs the Linux tool stress, which
> > > asks the system for some amount of RAM; here is that line:
> > >
> > > /usr/bin/stress -v --cpu 1 --io 2 --vm 1 --vm-bytes 150G --vm-hang 0
> > >
> > > which ramps up to occupy 150GB by reading and dirtying RAM. The
> > > --vm-hang 0 part tells stress to simply stop and hang around
> > > indefinitely once it has occupied 150GB. This script succeeds if I
> > > do not state an h_vmem request at job submission
> >
> > ...as the default is 248g
> >
> > > or if I ask for h_vmem under the RQS limit of 128G.
> >
> > NB: g = base 1000, G = base 1024 (man sge_types)
> >
> > > It seems if RQS is satisfied upon job submission
> >
> > No, at job start.
> >
> > > then it does not monitor RAM usage once a job is running
> >
> > RQS will never monitor running jobs.
> >
> > > - you say as much in this response.
> >
> > You can check with:
> >
> > $ ulimit -aH
> > $ ulimit -aS
> >
> > what was set by SGE for the limits. In addition, SGE's execd (not the
> > RQS) will monitor the usage which was requested by -l h_vmem=... or
> > set by the default in the complex definition (man queue_conf, section
> > RESOURCE LIMITS). This is done by the execd, which doesn't know
> > anything about other users' jobs on other nodes.
> >
> > > Regarding RAM usage: I have tested and read enough on RQS and the
> > > various ways to configure SGE to conclude that RQS doesn't actually
> > > monitor RAM usage once a job has been submitted?
> >
> > It will monitor the requested RAM to decide whether any submitted job
> > is eligible to start. All running ones should never pass h_vmem if
> > added up.
> >
> > > But here (second sentence) you imply that with an h_vmem value in an
> > > RQS, SGE does indeed monitor a user's running jobs to see if the
> > > cumulative RAM usage exceeds the RQS
> >
> > Not the usage.
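(Noting Reuti's g vs. G point for the archive: at this size the two
suffixes differ by roughly 9 GB, which plain shell arithmetic shows.)

```shell
# man sge_types: a lowercase suffix is decimal (base 1000), an uppercase
# suffix is binary (base 1024), so h_vmem=128g and h_vmem=128G are
# genuinely different limits.
echo $((128 * 1000 * 1000 * 1000))   # 128g = 128000000000 bytes
echo $((128 * 1024 * 1024 * 1024))   # 128G = 137438953472 bytes
```

So requesting 128G against an RQS written as 128g would already be over
the quota.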
> > It's a consumable, so it will just add up all h_vmem requests at the
> > time of start of the job and allow it to run or not.
> >
> > -- Reuti
> >
> > > h_vmem limit? Is this true, but also that SGE will not kill any job
> > > to get a user's RAM footprint down below the RQS? The monitoring is
> > > used only to determine whether a submitted job may run, i.e. whether
> > > that submitted job's h_vmem request and that user's current RAM
> > > usage are together below the RQS limit?
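So if I follow: the scheduler's consumable bookkeeping is plain
subtraction of requests, not measurement of usage. A toy illustration
(the host total is from our setup; the 8G per-job figure is my
assumption):

```shell
# Consumable accounting adds up *requests*, never observed usage.
# A host advertising h_vmem=248G, with each job requesting 8G:
host_total=$((248 * 1024))       # MiB available on the host (248G)
per_job=$((8 * 1024))            # MiB requested per job (8G)
echo $((host_total / per_job))   # 31 such jobs can start; the 32nd waits
```

Whether a running job actually uses its 8G is irrelevant to this count;
only execd's per-job ulimit enforcement looks at real consumption.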
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
