Makes sense to me. By setting h_vmem to 0 you set the shell's address-space limit to 0, which means nothing can actually start, as every malloc() will fail with ENOMEM. Simple.
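A quick way to spot this kind of setting is to scan the complex configuration for consumables whose default is 0. The following is only a minimal sketch. It assumes the "default value" discussed in this thread is the default column of the complex list, and that the text has the eight-column layout printed by `qconf -sc`. The function name and the sample data are made up for illustration and are not taken from the cluster in question.

```python
# Illustrative sample in the eight-column layout of `qconf -sc`
# (name, shortcut, type, relop, requestable, consumable, default, urgency).
# These values are invented for the example.
SAMPLE = """\
#name    shortcut  type    relop requestable consumable default  urgency
#------------------------------------------------------------------------
h_vmem   h_vmem    MEMORY  <=    YES         YES        0        0
slots    s         INT     <=    YES         YES        1        1000
h_rt     h_rt      TIME    <=    YES         NO         0:0:0    0
"""


def find_zero_default_consumables(complex_text):
    """Return the names of consumable complexes whose default value is 0.

    Hypothetical helper: it only inspects text, it does not call qconf.
    """
    flagged = []
    for line in complex_text.splitlines():
        line = line.strip()
        # Skip blank lines and the commented header.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 8:
            # Not a complex definition in the assumed layout.
            continue
        name, consumable, default = fields[0], fields[5], fields[6]
        if consumable.upper() in ("YES", "JOB") and default == "0":
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    for name in find_zero_default_consumables(SAMPLE):
        print(f"{name}: consumable with default 0")
```

To try it against a real cluster, the output of `qconf -sc` could be saved to a file and passed in as the text. Any entry it reports would then be a candidate for a non-zero default, as was done here with 1G.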

>-----Original Message-----
>From: SGE-discuss [mailto:sge-discuss-boun...@liverpool.ac.uk] On Behalf Of juanesteban.jime...@mdc-berlin.de
>Sent: Thursday, June 01, 2017 9:49 AM
>To: Reuti <re...@staff.uni-marburg.de>
>Cc: SGE-discuss@liv.ac.uk <sge-disc...@liverpool.ac.uk>
>Subject: Re: [SGE-discuss] Another QRSH problem
>
>I finally tracked this one down. The global conf had a default value of 0 for
>h_vmem + requestable and consumable, and apparently GridEngine does not
>like that. Nothing in the docs says it should be anything other than 0. A user
>reported that qrsh worked with an -l h_vmem of 10G so I did the logical noodle
>dance and put a default value of 1G in the global conf. Now it works.
>
>Somebody explain why the error message related to this has absolutely,
>positively NOTHING to do with the root cause?
>
>Best regards,
>Juan Jimenez
>System Administrator, HPC
>MDC Berlin / IT-Dept.
>Tel.: +49 30 9406 2800
>
>________________________________________
>From: Reuti [re...@staff.uni-marburg.de]
>Sent: Tuesday, May 30, 2017 11:36
>To: Jimenez, Juan Esteban
>Cc: SGE-discuss@liv.ac.uk
>Subject: Re: [SGE-discuss] Another QRSH problem
>
>> On 30.05.2017 at 11:32, juanesteban.jime...@mdc-berlin.de wrote:
>>
>> Ok, now I understand how this works and what the limitations and
>> advantages of each option are. Our users are not forwarding X so we will
>> try to go back to built-in and see what happens. Does the qmaster need to
>> be restarted when I make the change to the conf?
>
>No. These entries are interpreted live. Just wait one or two minutes after the
>change before you use it, until all exechosts honor the new setting.
>
>-- Reuti
>
>> Best regards,
>> Juan Jimenez
>> System Administrator, BIH HPC Cluster
>> MDC Berlin / IT-Dept.
>> Tel.: +49 30 9406 2800
>>
>> On 29.05.17, 19:45, "SGE-discuss on behalf of JuanEsteban.Jimenez@mdc-berlin.de" <sge-discuss-boun...@liverpool.ac.uk on behalf of juanesteban.jime...@mdc-berlin.de> wrote:
>>
>> How does the shepherd bring up this separate sshd daemon? What arguments
>> are being used?
>>
>> Best regards,
>> Juan Jimenez
>> System Administrator, HPC
>> MDC Berlin / IT-Dept.
>> Tel.: +49 30 9406 2800
>>
>> ________________________________________
>> From: Reuti [re...@staff.uni-marburg.de]
>> Sent: Monday, May 29, 2017 18:14
>> To: Jimenez, Juan Esteban
>> Cc: SGE-discuss@liv.ac.uk
>> Subject: Re: [SGE-discuss] Another QRSH problem
>>
>>> On 29.05.2017 at 18:00, juanesteban.jime...@mdc-berlin.de wrote:
>>>
>>> On 29.05.17, 17:56, "Reuti" <re...@staff.uni-marburg.de> wrote:
>>>
>>>> On 29.05.2017 at 17:26, juanesteban.jime...@mdc-berlin.de wrote:
>>>>
>>>> I am getting this very specific error:
>>>>
>>>> debug1: ssh_exchange_identification: /usr/sbin/sshd: error while loading shared libraries: libselinux.so.1: failed to map segment from shared object
>>>
>>>> I don't have a specific idea as this seems to be a permission problem.
>>>> Are you running selinux and could disable it?
>>>
>>> SELINUX is disabled on all nodes.
>>
>> Ok.
>>
>>>
>>>> However, ssh works fine outside of qrsh. Every single test succeeds,
>>>> from all nodes to all nodes.
>>>
>>>> This is usually handled by the default running `sshd` on port 22, but
>>>> the one started by SGE runs on a different port.
>>>
>>> Started where??? On the node where the qrsh will be sent, by the exec
>>> daemon?
>>
>> It's the shepherd who will start it.
>>
>>> That’s a heck of a big clue! Is there a way to disable this and use the
>>> existing sshd?
>>
>> Not in the default setting.
>>
>> sgeadmin root     /usr/sge/bin/lx24-em64t/sge_execd
>> sgeadmin root      \_ sge_shepherd-224557 -bg
>> root     root          \_ sshd: reuti [priv]
>> reuti    reuti             \_ sshd: reuti@pts/0
>> reuti    reuti                 \_ -bash
>> reuti    reuti                     \_ ps -e f -o user,ruser,command
>>
>> This is completely unrelated to the default `sshd` to log in on port 22.
>> In the standard configuration it will use the same config files though.
>>
>> ==
>>
>> You could try to use wrappers for both entries and ignore the port, but
>> then you will lose job control and accounting (I have no clue whether this
>> will work). The detailed startup is explained here:
>>
>> https://arc.liv.ac.uk/SGE/htmlman/htmlman5/remote_startup.html
>>
>> -- Reuti
>> _______________________________________________
>> SGE-discuss mailing list
>> SGE-discuss@liv.ac.uk
>> https://arc.liv.ac.uk/mailman/listinfo/sge-discuss