Sorry, a typo: "VET" should be "VIRT".

On Mon, Jun 30, 2014 at 4:47 PM, Feng Zhang <prod.f...@gmail.com> wrote:
> Guys,
>
> Just curious, how does h_vmem work on the processes of MPI jobs (or
> OpenMP, multi-threaded jobs)? I have some parallel jobs where the top
> command shows "VET" of 40GB, while the "RES" is only 100MB.
>
> On Mon, Jun 30, 2014 at 3:01 PM, Michael Stauffer <mgsta...@gmail.com> wrote:
>>> Message: 4
>>> Date: Mon, 30 Jun 2014 11:53:12 +0200
>>> From: Txema Heredia <txema.llis...@gmail.com>
>>> To: Derrick Lin <klin...@gmail.com>, SGE Mailing List
>>>         <users@gridengine.org>
>>> Subject: Re: [gridengine users] Enforce users to use specific amount of memory/slot
>>> Message-ID: <53b13388.5060...@gmail.com>
>>> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
>>>
>>>
>>> Hi Derrick,
>>>
>>> You could set h_vmem as a consumable attribute (consumable=yes) and
>>> give it a default value of 8GB. This way, whenever a job doesn't
>>> request any amount of h_vmem, it will automatically request 8GB per
>>> slot. This will affect all types of jobs.
>>>
>>> You could also define a JSV script that checks the username, and forces
>>> a -l h_vmem=8G for his/her jobs (
>>> jsv_sub_add_param('l_hard','h_vmem','8G') ). This will affect all jobs
>>> for that user, but could turn into a pain to manage.
>>>
>>> Or, you could set a different policy and allow all users to request the
>>> amount of memory they really need, so that jobs fit the node as well as
>>> possible. What is the point of forcing the user to reserve 63 additional
>>> cores when they only need 1 core and 500GB of memory? You could fit in
>>> that node one job like this, and, say, two 30-core-6GB-memory jobs.
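
The packing argument above can be checked with a few lines of Python. This is only a sketch of the arithmetic, not anything Grid Engine runs: the `Job` class and `fits` function are made-up names, and "30-core-6GB-memory" is assumed to mean 6GB in total per job.

```python
from dataclasses import dataclass

# Node size taken from the original question: 64 cores, 512GB.
NODE_CORES = 64
NODE_MEM_GB = 512


@dataclass(frozen=True)
class Job:
    cores: int
    mem_gb: int  # total memory for the whole job (assumption)


def fits(jobs, cores=NODE_CORES, mem_gb=NODE_MEM_GB):
    """Return True if the jobs together fit within one node's cores and memory."""
    used_cores = sum(j.cores for j in jobs)
    used_mem = sum(j.mem_gb for j in jobs)
    return used_cores <= cores and used_mem <= mem_gb


# One 1-core/500GB job plus two 30-core/6GB jobs: 61 cores, 512GB.
mix = [Job(1, 500), Job(30, 6), Job(30, 6)]
```

Under these assumptions `fits(mix)` is true, while forcing the big job to hold all 64 slots, as in `fits([Job(64, 500), Job(30, 6)])`, leaves no room for anything else.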
>>>
>>> Txema
>>>
>>>
>>>
>>> On 30/06/14 08:55, Derrick Lin wrote:
>>>
>>> > Hi guys,
>>> >
>>> > A typical node on our cluster has 64 cores and 512GB memory, so that's
>>> > about 8GB/core. Occasionally, we have some jobs that utilize only 1
>>> > core but 400-500GB of memory, which annoys lots of users. So I am
>>> > seeking a way to force jobs to run strictly below the 8GB/core
>>> > ratio or be killed.
>>> >
>>> > For example, the above job should ask for 64 cores in order to use
>>> > 500GB of memory (we have user quota for slots).
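
To make the 8GB/core rule concrete, here is a minimal sketch in Python of the check being asked for. It only illustrates the arithmetic and is not SGE configuration; `slots_needed` and `within_ratio` are hypothetical names.

```python
import math

GB_PER_SLOT = 8  # ratio from the message above: 512GB / 64 cores


def slots_needed(mem_gb, gb_per_slot=GB_PER_SLOT):
    """Smallest slot count whose memory share covers mem_gb (at least 1)."""
    return max(1, math.ceil(mem_gb / gb_per_slot))


def within_ratio(slots, mem_gb, gb_per_slot=GB_PER_SLOT):
    """True if the job's memory stays within slots * gb_per_slot."""
    return mem_gb <= slots * gb_per_slot
```

With these definitions, `within_ratio(1, 500)` is false and `within_ratio(64, 500)` is true; strictly, `slots_needed(500)` is 63, so asking for the whole 64-core node satisfies the rule with a little room to spare.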
>>> >
>>> > I have been trying to play around with h_vmem, setting it to
>>> > consumable and configuring this RQS:
>>> >
>>> > {
>>> >         name    max_user_vmem
>>> >         enabled true
>>> >         description     "Each user can utilize more than 8GB/slot"
>>> >         limit   users {bad_user} to h_vmem=8g
>>> > }
>>> >
>>> > but it seems to set the total vmem that bad_user can use per job.
>>> >
>>> > I would love to set it on users instead of queues or hosts because we
>>> > have applications that utilize the same set of nodes, and those apps
>>> > should be unlimited.
>>> >
>>> > Thanks
>>> > Derrick
>>
>>
>> I've been dealing with this too. I'm using h_vmem to kill processes that go
>> above the limit, and s_vmem set slightly lower by default to give
>> well-behaved processes a chance first to exit gracefully.
>>
>> The issue is that these use virtual memory, which is (always, more or less)
>> greater than resident memory, i.e. the actual RAM usage. And with Java apps
>> like Matlab, the amount of virtual memory reserved/used is HUGE compared to
>> resident, by 10x give or take. So it's really impractical, actually.
>> However, so far I've just set the default h_vmem and s_vmem values high
>> enough to accommodate JVM apps, and increased the per-host consumable
>> appropriately. We don't get fine-grained memory control, but it definitely
>> reins in out-of-control users/procs that otherwise might gobble up enough
>> RAM to slow down the entire node.
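
The gap between virtual and resident size (VIRT and RES in top) can be seen directly on Linux, where /proc/<pid>/status reports them as VmSize and VmRSS. The following is a rough Python sketch, assuming a Linux host with /proc mounted; the function names are made up, and `demo` simply reserves an anonymous mapping without touching it.

```python
import mmap


def parse_status(text):
    """Extract VmSize and VmRSS (in kB) from /proc/<pid>/status text."""
    out = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            out[key] = int(rest.split()[0])  # value is followed by "kB"
    return out


def memory_kb(pid="self"):
    """Read virtual and resident size for a process (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_status(fh.read())


def demo(size=256 * 1024 * 1024):
    """Map `size` bytes without touching them; return (before, after) readings."""
    before = memory_kb()
    region = mmap.mmap(-1, size)  # anonymous mapping: address space only
    after = memory_kb()
    region.close()
    return before, after
```

Calling `demo()` should show VmSize growing by roughly the mapped size while VmRSS barely moves, which is the same effect that makes a limit on virtual memory so much stricter than the resident usage suggests.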
>>
>> We may switch to UGE just for this reason, to get memory limits based on
>> resident memory, if it seems worth it in the end.
>>
>> -M
>>
>> _______________________________________________
>> users mailing list
>> users@gridengine.org
>> https://gridengine.org/mailman/listinfo/users
>>
