Thanks guys,

>> Correct. The limits in place when sgeexecd is started are used (i.e. those
of the root user).
I tried simply restarting sgeexecd, but it did not change anything.

In my /etc/security/limits.conf I have:
* soft nofile 18000
* hard nofile 20000

Shouldn't that apply to every account? The SGE daemons run as user
"sge".
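
For anyone comparing nodes, here is a minimal sketch of a check against the
two values above. It is meant to be run once in an SSH session and once as an
SGE job (qsub or qrsh) on the same node, so the two results can be compared.
check_nofile and the expected_* variables are illustrative names, not part of
SGE. One thing worth keeping in mind: limits.conf is read by pam_limits when a
PAM session is opened, so a process that was not started through such a
session simply inherits the limits of whatever started it.

```shell
# Expected values, copied from the limits.conf lines quoted above.
expected_soft=18000
expected_hard=20000

# check_nofile SOFT HARD
# Prints "OK" when both observed limits match the expected values,
# otherwise prints the observed values so the mismatch is visible.
check_nofile() {
    if [ "$1" = "$expected_soft" ] && [ "$2" = "$expected_hard" ]; then
        echo "OK"
    else
        echo "MISMATCH soft=$1 hard=$2"
    fi
}

# Report the soft (-S) and hard (-H) open-file limits of the current shell.
echo "$(uname -n): $(check_nofile "$(ulimit -Sn)" "$(ulimit -Hn)")"
```

If the SSH run prints OK and the SGE run prints MISMATCH on the same node, the
difference is coming from how the job's parent processes were started rather
than from limits.conf itself.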

>> Several ulimits can be set in the queue configuration, and so can be
different for each queue or exechost.

We don't have any ulimit settings in the queue configuration or anywhere else
in SGE; limits.conf is the only place where they are configured.

What is weird is that most of the compute nodes pick up the settings
correctly; only a few fail to pick them up.

Currently, my only workaround is to rebuild the affected compute node
(reinstall the OS, etc.), which corrects the issue.

>> Can you check the limits that are set in the sge_execd and sge_shepherd
processes (/proc/<pid>/limits)?

I tried to look it up, but I could not find the /proc/<pid> directory that
corresponds to sgeexecd.
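
For reference, here is a sketch of how that lookup can be done, assuming the
daemons show up under the process names sge_execd and sge_shepherd mentioned
above (pgrep -x matches the process name exactly). max_open_files and
report_sge_limits are illustrative helper names. Reading the limits of another
user's process may require root, and sge_shepherd is normally only present
while a job is running on the node.

```shell
# Read a /proc/<pid>/limits-style listing on stdin and print the soft and
# hard values from its "Max open files" row.
max_open_files() {
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }'
}

# For every running process named sge_execd or sge_shepherd, print its
# PID together with its open-file limits.
report_sge_limits() {
    for name in sge_execd sge_shepherd; do
        for pid in $(pgrep -x "$name"); do
            printf '%s pid=%s ' "$name" "$pid"
            max_open_files < "/proc/$pid/limits"
        done
    done
}
```

Running report_sge_limits on a good node and on a failing node should show
whether the execd itself was started with different limits.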

Cheers,
Derrick


On Thu, Jul 4, 2019 at 12:09 AM Skylar Thompson <skyl...@uw.edu> wrote:

> Can you check the limits that are set in the sge_execd and sge_shepherd
> processes (/proc/<pid>/limits)? It's possible that the user who ran the
> execd init script had limits applied, which would carry over to the execd
> process.
>
> On Wed, Jul 03, 2019 at 12:36:00PM +1000, Derrick Lin wrote:
> > Hi guys,
> >
> > We have custom settings for user open files in /etc/security/limits.conf
> > on all compute nodes. When we check whether the configuration is effective
> > by running "ulimit -a" over SSH on each node, it reflects the correct
> > settings.
> >
> > But when we ran the same command through SGE (both qsub and qrsh), we found
> > that some compute nodes do not reflect the correct settings, while the rest
> > are fine.
> >
> > I am wondering if this is SGE related? Any idea is welcome.
> >
> > Cheers,
> > Derrick
>
> > _______________________________________________
> > users mailing list
> > users@gridengine.org
> > https://gridengine.org/mailman/listinfo/users
>
>
> --
> -- Skylar Thompson (skyl...@u.washington.edu)
> -- Genome Sciences Department, System Administrator
> -- Foege Building S046, (206)-685-7354
> -- University of Washington School of Medicine
>