Hi Reuti,

Back on the list. Please see below -
On Tue, Jun 26, 2012 at 11:41 AM, Reuti <[email protected]> wrote:
> Am 26.06.2012 um 20:30 schrieb Ray Spence:
>
> > Reuti,
> >
> > Thank you so much for making RQS/h_vmem clear.
> >
> > I hope I'm not taking advantage of you here - I apologize if so.
>
> No, but please ask on the list or register. Therefore I didn't forward
> your last posting, as it was sent from an unknown address.
>
> > I have another question regarding slots. Our cluster has 4 nodes, each
> > with 32 cores. My assumption is that SGE should be able to run 128
> > total jobs at any time. I see only 1 job running per node, with many
> > jobs in qw. I think I need to change the "slots" value in the queue
> > config? Here is what I have (still early on in learning SGE config..)
> > from qconf -sq <queue>:
> >
> > slots   1,[scf-sm00.Stat.Berkeley.EDU=32], \
> >           [scf-sm01.Stat.Berkeley.EDU=32], \
> >           [scf-sm02.Stat.Berkeley.EDU=32], \
> >           [scf-sm03.Stat.Berkeley.EDU=32]
>
> It's a matter of taste: the above is correct. If you have identical
> nodes, you can even shorten it to:
>
> slots   32

Great, we do - I'll try that!

> It's the number of slots per queue instance.
>
> > Should the "1" be 32? 128? Or, where is it that I tell SGE to use all
> > 32 cores?
>
> As the default memory consumption is 248g, only one job can run at a
> time, I would say.

Ok, this is not what we want at all. It is more important to use all 32
cores/node than to attempt any RAM usage control. I'm going to back out
the h_vmem complex setting in order to run 128 jobs at a time. Should I
reset the h_vmem complex back to not consumable (NO), or keep it
consumable but set its default to, say, 1G? If I do that, won't users
have to request a higher h_vmem amount upon job submission?

> Try to submit jobs with "sleep 120" or so for which you requested less
> memory on the command line.
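For the archive, here is roughly what I plan to try. This is only a sketch,
untested so far - the queue name high.q, the file name, and the qsub line
are my assumptions:

```shell
# Sketch only: shorten the per-host slots list to a single value,
# since our nodes are identical.
qconf -mattr queue slots 32 high.q

# Keep h_vmem consumable but drop its default from 248g to 1G, so 32
# small jobs fit per node. qconf -sc dumps the complex list and
# qconf -Mc reloads it from a file.
qconf -sc > complexes.txt
# ...edit the h_vmem line in complexes.txt to read:
#   h_vmem  h_vmem  MEMORY  <=  YES  YES  1G  0
qconf -Mc complexes.txt

# Jobs needing more than the 1G default must then ask for it explicitly:
qsub -l h_vmem=8G myjob.sh
```

(No test here - these are cluster-configuration commands that only make
sense against a live qmaster.)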
> The actual used up memory can be checked:
>
> qhost -F h_vmem
>
> -- Reuti
>
> > Thank you again,
> > Ray
> >
> > On Tue, Jun 26, 2012 at 11:14 AM, Reuti <[email protected]> wrote:
> > Am 26.06.2012 um 19:42 schrieb Ray Spence:
> >
> > > Hi Reuti,
> > >
> > > I'll respond in-line:
> > >
> > > On Mon, Jun 25, 2012 at 4:21 PM, Reuti <[email protected]> wrote:
> > > Hi,
> > >
> > > Am 26.06.2012 um 00:57 schrieb Ray Spence:
> > >
> > > > I apologize for more questions, but I'm not getting to where our
> > > > group wants our new cluster to be. In order to limit all of a given
> > > > user's jobs in a specified queue to a total amount of physical RAM
> > > > (h_vmem), I see no other solution than an RQS. Is this true?
> > >
> > > Correct. h_vmem is a hard limit, while others prefer virtual_free as
> > > a guidance for SGE; the latter is not enforced:
> > >
> > > http://www.gridengine.info/2009/12/01/adding-memory-requirement-awareness-to-the-scheduler/
> > >
> > > I've read this info from you. When you say "Use the one you defined
> > > in your qsub command by requesting it with the -l option..." I take
> > > you to mean that once I've made a given memory complex (h_vmem,
> > > virtual_free, etc.) consumable (qconf -mc), then in order to enforce
> > > any limit on that complex, users must request a value for it at job
> > > submission. I think I'm repeating myself here.. Your info here is
> > > what led me to pose my question in the first place.
> > >
> > > > Using qconf -mq <queue> will limit each job in <queue> but not each
> > > > user's total memory footprint across all his jobs, correct?
> > >
> > > Correct.
> > >
> > > > The node-level limit does not do what we want here..
> > >
> > > Correct, it's the memory usage across all queues and resp. all jobs
> > > on a node.
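(A note for anyone finding this thread later: the per-job cap Reuti
confirms above, as opposed to the per-user RQS cap, lives in the queue
definition itself. A sketch, with high.q as my assumed queue name:)

```shell
# Sketch (assumed queue name high.q): cap each *individual* job at 8G of
# h_vmem via the queue's resource-limit list (man queue_conf, RESOURCE
# LIMITS). This says nothing about a user's total across all of their
# jobs -- that is what the RQS is for.
qconf -mattr queue h_vmem 8G high.q

# Verify the queue now carries the limit:
qconf -sq high.q | grep h_vmem
```

(Again no test - this is queue configuration, only meaningful against a
live qmaster.)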
> > > > I have this RQS in place:
> > > >
> > > > {
> > > >    name         high.q-h_vmem
> > > >    description  "high.q h_vmem limited to 128G"
> > >
> > > The quotation marks are not necessary.
> > >
> > > >    enabled      TRUE
> > > >    limit        users {*} queues high.q to h_vmem=128g
> > > > }
> > >
> > > You made h_vmem consumable and attached a value per exechost?
> > >
> > > Yes - via qconf -mc; here is what the h_vmem line looks like:
> > >
> > > h_vmem   h_vmem   MEMORY   <=   YES   YES   248g   0
> >
> > It should be set to a default you expect to be taken for a job. We set
> > it to 2g here, and users can increase the per-job limit to the one set
> > in the queue definition.
> >
> > > (Should the "default" value here be different than 248g? See below..
> > > Must it be 0? Must it NOT be 0?)
> > >
> > > And via qconf -me I've set h_vmem to be a little less (248G) than the
> > > installed RAM (256G) on each of the cluster's 4 nodes:
> > >
> > > qconf -se <cluster_node>
> > > hostname        <>
> > > load_scaling    NONE
> > > complex_values  slots=32,h_vmem=248G
> > > .....
> > >
> > > > which would seem to accomplish our goal. However, jobs submitted to
> > > > high.q against this RQS without stating h_vmem needs at submission,
> > > > but which are written to exceed the memory limit, do exceed the
> > > > memory limit.
> > >
> > > Correct, the RQS will check the job request for h_vmem, but there is
> > > no relation back, i.e. nothing by which the RQS would limit the job's
> > > actual memory. Specifying only an overall limit per user would even
> > > make it hard for the RQS to decide what limit to set per job at all.
> > > Or if the overall limit is passed: which job should be killed?
> > >
> > > > Worse, jobs submitted to high.q with an h_vmem request set below
> > > > the RQS limit, but which are written to exceed the limit,
> > > > successfully gobble up a forbidden amount of RAM.
> > >
> > > I don't get this sentence. Can you make an example?
> > > I have a simple shell script that runs the Linux tool stress, which
> > > asks the system for some amount of RAM; here is that line:
> > >
> > > /usr/bin/stress -v --cpu 1 --io 2 --vm 1 --vm-bytes 150G --vm-hang 0
> > >
> > > which ramps up to occupy 150GB by reading and dirtying RAM. The
> > > --vm-hang 0 part tells stress to simply stop and hang around
> > > indefinitely once it has occupied 150GB. This script succeeds if I
> > > do not state an h_vmem request at job submission
> >
> > ...as the default is 248g
> >
> > > or if I ask for h_vmem under the RQS limit of 128G.
> >
> > NB: g = base 1000, G = base 1024 (man sge_types)
> >
> > > It seems if RQS is satisfied upon job submission
> >
> > No, at job start.
> >
> > > then it does not monitor RAM usage once a job is running
> >
> > RQS will never monitor running jobs.
> >
> > > - you say as much in this response.
> >
> > You can check with:
> >
> > $ ulimit -aH
> > $ ulimit -aS
> >
> > what was set by SGE for the limits. In addition, SGE's execd (not the
> > RQS) will monitor the usage which was requested by -l h_vmem=... or
> > set by the default in the complex definition (man queue_conf, section
> > RESOURCE LIMITS). This is done by the execd, which doesn't know
> > anything about other users' jobs on other nodes.
> >
> > > Regarding RAM usage: I have tested and read enough on RQS and the
> > > various ways to configure SGE to conclude that RQS doesn't actually
> > > monitor RAM usage once a job has been submitted?
> >
> > It will monitor the requested RAM to decide whether any submitted job
> > is eligible to start. All running ones should never pass h_vmem if
> > added up.
> >
> > > But here (second sentence) you imply that with an h_vmem value in an
> > > RQS, SGE does indeed monitor a user's running jobs to see if the
> > > cumulative RAM usage exceeds the RQS
> >
> > Not the usage.
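(Noting Reuti's g vs. G point for the archive: at this size the two
suffixes differ by roughly 9 GB, which plain shell arithmetic shows.)

```shell
# man sge_types: a lowercase suffix is decimal (base 1000), an uppercase
# suffix is binary (base 1024), so h_vmem=128g and h_vmem=128G are
# genuinely different limits.
echo $((128 * 1000 * 1000 * 1000))   # 128g = 128000000000 bytes
echo $((128 * 1024 * 1024 * 1024))   # 128G = 137438953472 bytes
```

So requesting 128G against an RQS written as 128g would already be over
the quota.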
> > It's a consumable, so it will just add up all h_vmem requests at the
> > time of start of the job and allow it to run or not.
> >
> > -- Reuti
> >
> > > h_vmem limit? Is this true, but also that SGE will not kill any job
> > > to get a user's RAM footprint down below the RQS? The monitoring is
> > > used only to determine whether a submitted job may run, i.e. whether
> > > that submitted job's h_vmem request and that user's current RAM
> > > usage are together below the RQS limit?
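So if I follow: the scheduler's consumable bookkeeping is plain
subtraction of requests, not measurement of usage. A toy illustration
(the host total is from our setup; the 8G per-job figure is my
assumption):

```shell
# Consumable accounting adds up *requests*, never observed usage.
# A host advertising h_vmem=248G, with each job requesting 8G:
host_total=$((248 * 1024))       # MiB available on the host (248G)
per_job=$((8 * 1024))            # MiB requested per job (8G)
echo $((host_total / per_job))   # 31 such jobs can start; the 32nd waits
```

Whether a running job actually uses its 8G is irrelevant to this count;
only execd's per-job ulimit enforcement looks at real consumption.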
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
