Hi, we're currently migrating from RHEL6 to RHEL7, which also brings us the benefit of having systemd. However, we are observing problems with user applications that use e.g. XDG_RUNTIME_DIR, because SLURM apparently does not run the user application through the PAM stack. As a consequence, SLURM jobs inherit the XDG_* environment variables from the login nodes (where sshd properly sets them up), but on the compute nodes /run/user/$uid does not exist, leading to errors whenever a user application tries to access it.
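To illustrate the symptom, here is a minimal sketch of a check that could go at the top of a job script. It is only a stopgap under the assumption that the affected applications fall back gracefully when XDG_RUNTIME_DIR is unset; the function name sanitize_xdg_runtime_dir is made up for this example and is not part of SLURM or systemd.

```shell
#!/bin/sh
# Hypothetical helper (not part of SLURM or systemd): if XDG_RUNTIME_DIR was
# inherited from the login node but the directory is missing on this node,
# report it and unset the variable so applications do not try to use it.
sanitize_xdg_runtime_dir() {
    if [ -n "${XDG_RUNTIME_DIR:-}" ] && [ ! -d "$XDG_RUNTIME_DIR" ]; then
        echo "XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR does not exist here; unsetting it" >&2
        unset XDG_RUNTIME_DIR
    fi
}

# Run the check before starting the actual application.
sanitize_xdg_runtime_dir
```

This only hides the stale variable; it does not create /run/user/$uid or open a PAM session, so it is not a substitute for a proper fix.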
We have tried setting UsePam=1, but that did not help. I have found the following issue on the systemd project regarding exactly this problem: https://github.com/systemd/systemd/issues/3355
There, Lennart Poettering argues that it should be the responsibility of the scheduler software (i.e. SLURM) to run user code only within a proper PAM session.
My question: does SLURM support this? If yes, how? If not, what are the best practices for working around this problem on RHEL7/systemd installations? Surely other clusters must have already had the same issue...
Thanks in advance.

--
Maik Schmidt
HPC Services
Technische Universität Dresden
Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH)
Willers-Bau A116
D-01062 Dresden
Phone: +49 351 463-32836