I have a lot of problems with AD, Kerberos, SSSD, LDAP and GridEngine, but I think they are related to the fact that I connect to AD servers that do not synchronize with the master quickly enough. Once in a while I have to clear the SSSD cache and restart the SSSD services on all the nodes, and until the caches manage to repopulate, qrsh refuses to open a new shell unless I point it to a node that I know is working.
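For anyone wanting to script that workaround, here is a rough Python sketch. It assumes the cache is cleared with `sss_cache -E` and the service restarted with `systemctl restart sssd`, run as root over ssh; the node names and helper functions are hypothetical, and the actual procedure at a given site may differ (for example, removing the cache files by hand).

```python
import subprocess

# Assumed commands: `sss_cache -E` invalidates all cached SSSD entries and
# `systemctl restart sssd` restarts the daemon. Adjust for your site.
RESET_COMMANDS = [
    ["sss_cache", "-E"],
    ["systemctl", "restart", "sssd"],
]


def ssh_command(node, argv):
    """Wrap a command so that it runs on a remote node through ssh."""
    return ["ssh", node] + argv


def plan_reset(nodes):
    """Return every command needed to reset SSSD on all given nodes."""
    return [ssh_command(node, argv) for node in nodes for argv in RESET_COMMANDS]


def run_reset(nodes, dry_run=True):
    """Print the plan (dry run) or execute it; return the commands that failed."""
    failed = []
    for cmd in plan_reset(nodes):
        if dry_run:
            print(" ".join(cmd))
            continue
        if subprocess.run(cmd).returncode != 0:
            failed.append(cmd)
    return failed
```

Calling `run_reset(["node01", "node02"])` only prints the commands; passing `dry_run=False` executes them and returns the ones that exited non-zero, which gives a quick list of nodes to avoid with qrsh until their caches are back.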
Best regards,
Juan Jimenez
System Administrator, HPC
MDC Berlin / IT-Dept.
Tel.: +49 30 9406 2800

________________________________________
From: SGE-discuss [sge-discuss-boun...@liverpool.ac.uk] on behalf of Orion Poplawski [or...@cora.nwra.com]
Sent: Thursday, April 06, 2017 22:46
To: sge-disc...@liverpool.ac.uk
Subject: [SGE-discuss] Kerberos authentication

I've built the gss utils with 'aimk -gss' and am testing with security_mode set to kerberos.

In my first attempt I tried to make use of gssproxy to store the sge/qmaster principal, but unfortunately it appears that gssproxy is too old on EL7 to handle storing the delegated credential for us:

put_cred stderr: GSS-API error copying delegated creds to ccache: The operation or option is not available or unsuppo

Next attempt was to set:

KRB5_KTNAME=FILE:/var/spool/gridengine/sge.keytab

in the environment of the daemons and store the sge/host principals there. This avoids needing to run qmaster as root to access /etc/krb5.keytab. This needs an sge service principal for the qmaster and for each of the exec hosts, which seems appropriate.

Another issue I ran into is that I'm running in an IPA/Active Directory trust setup where the users are stored in the AD domain and the hosts are in the IPA domain. Therefore the code in gsslib_put_credentials that was using gss_compare_name() to compare users ended up comparing "orion" to "or...@ad.nwra.com". I changed that to also try using gss_localname() to convert the client principal to a local username and comparing that.

Also, the later code that called krb5_kuserok() segfaulted because it was erroneously casting gss_name_t to krb5_principal. I've started work on changing that to do the conversion properly, but as of now that is untested. There are also a bunch of memory leaks in this code that probably should be cleaned up, although at the moment this is all run in short-lived executables.
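The fallback comparison described above (compare the names directly, then map the client principal to a local username and compare again) can be illustrated with a small Python sketch. This is not the gsslib code: `naive_localname()` is a hypothetical, much simplified stand-in for gss_localname(), which in reality relies on the Kerberos library's local authorization rules, and the realm and user names are made up.

```python
def naive_localname(principal, trusted_realms):
    """Simplified stand-in for gss_localname(): strip a trusted realm.

    Only the plain user@REALM case is handled here; the real call
    consults the Kerberos library's own mapping rules.
    """
    user, sep, realm = principal.rpartition("@")
    if not sep:
        return principal  # already a bare local name
    if realm.upper() in trusted_realms:
        return user
    return None  # unknown realm: no local mapping


def same_user(job_owner, client_principal, trusted_realms):
    """Compare directly first, then fall back to the mapped local name."""
    if job_owner == client_principal:
        return True
    local = naive_localname(client_principal, trusted_realms)
    return local is not None and local == job_owner
```

With `trusted_realms = {"AD.EXAMPLE.COM"}`, `same_user("alice", "alice@AD.EXAMPLE.COM", trusted_realms)` succeeds through the fallback even though the direct comparison fails, which is the situation the cross-realm trust setup runs into.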
Finally, I needed to tweak my peopen() patch to run put_cred and delete_cred as root on the exec hosts since they need to change the ownership and remove files of the user running the job.

At least for a simple test case, this appears to be working now for me, so I'm fairly pleased. Next issue I expect to face is renewing and expiring user credentials for long running jobs.

--
Orion Poplawski
Technical Manager                          720-772-5637
NWRA, Boulder/CoRA Office             FAX: 303-415-9702
3380 Mitchell Lane                       or...@nwra.com
Boulder, CO 80301                   http://www.nwra.com
_______________________________________________
SGE-discuss mailing list
SGE-discuss@liv.ac.uk
https://arc.liv.ac.uk/mailman/listinfo/sge-discuss