That's a GridEngine bug: event clients don't always get cleaned up properly. The workaround is to restart the qmaster when that happens. Be careful: sometimes restarting the service doesn't work, and you may need to kill the process. On the cluster I used to manage at JHU, we had a process that checked the output of qconf -secl and restarted the qmaster if it reported more than 900 event clients.
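A check like that can be sketched roughly as follows. This is not the script we ran, just a minimal illustration. It assumes that qconf -secl prints one line per event client beginning with a numeric ID, and the restart command is a placeholder that you would need to replace with whatever starts the qmaster on your system. It does not handle the case where the restart hangs and the process has to be killed.

```shell
#!/bin/sh
# Sketch of an event-client watchdog for the qmaster.
# Assumptions (not taken from a real deployment):
#   - qconf -secl prints one line per event client, starting with a numeric ID
#   - RESTART_CMD is a hypothetical placeholder; adjust it for your site

THRESHOLD="${THRESHOLD:-900}"
RESTART_CMD="${RESTART_CMD:-/etc/init.d/sgemaster restart}"

# Count the lines on stdin whose first field is a purely numeric ID.
# Header and separator lines are skipped because they do not match.
count_event_clients() {
    awk '$1 ~ /^[0-9]+$/ { n++ } END { print n + 0 }'
}

# Succeed (exit status 0) when the given count exceeds the threshold.
needs_restart() {
    [ "$1" -gt "$THRESHOLD" ]
}

# Query the qmaster and restart it if too many event clients are registered.
check_qmaster() {
    count=$(qconf -secl | count_event_clients)
    if needs_restart "$count"; then
        echo "event clients: $count > $THRESHOLD, restarting qmaster" >&2
        $RESTART_CMD
    fi
}

# Intended to be called periodically, e.g. from cron:
# check_qmaster
```

The threshold sits below the 979 limit from the startup log so that the restart happens before qsub starts failing.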
On Fri, Feb 21, 2020 at 1:35 AM Lana Deere <lana.de...@gmail.com> wrote:

> On CentOS 7 using SoGE 8.1.9, I'm getting an error using qsub:
> QSUB:Unable to initialize environment because of error: cannot register
> event client. Only 979 event clients are allowed in the system
>
> Supposedly I have this limit configured much higher:
> root# qconf -sconf | grep MAX_DYN_EC
> qmaster_params MAX_DYN_EC=25000,gdi_retries=5
>
> However, the qmaster at startup is reporting that it is not honoring the
> limit:
> |nr of dynamic event clients exceeds max file descriptor limit, setting
> MAX_DYN_EC=979
> |qmaster hard descriptor limit is set to 4096
> |qmaster soft descriptor limit is set to 1024
> |qmaster will use max. 1004 file descriptors for communication
> |qmaster will accept max. 979 dynamic event clients
> |starting up SGE 8.1.9 (lx-amd64)
>
> This is surprising to me since my system's file descriptor limit is set
> much higher than 1024/4096:
> root# pwd
> /etc/security/limits.d
> root# cat 99*nofile*conf
> * soft nofile 100000
> * hard nofile 100000
> root# ulimit -a -S | grep 'open files'
> open files (-n) 100000
>
> I hacked the script in /etc/init.d which starts the qmaster and it shows
> the higher limit. However, if I look at /proc/<qmaster pid>/limits I can
> see that it has the lower limits it reports. What I can't figure out is
> why it is seeing the lower limit. Anyone know whether there's a
> configuration parameter somewhere overriding the system limit? Any
> suggestions on how to make it get the system's limit?
>
> Thanks.
>
> .. Lana (lana.de...@gmail.com)
>
>
> _______________________________________________
> users mailing list
> users@gridengine.org
> https://gridengine.org/mailman/listinfo/users
>