Hi,

We encountered this issue some time ago (see: https://www.mail-archive.com/slurm-dev@schedmd.com/msg06628.html). You need to add pam_systemd to the slurm pam file, but pam_systemd will then try to take over slurm's cgroups. Our current solution is to add pam_systemd to the slurm pam file and, in addition, to save and restore the slurm cgroup locations. It's not pretty, but for now it works...
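
To make the "save" half of save/restore concrete, here is a minimal Python sketch that records which cgroup a process sits in for each controller by parsing /proc/<pid>/cgroup. It is an illustration only and not the script from the repository linked in this message. The function names (parse_proc_cgroup, saved_slurm_cgroups) are hypothetical, and the "/slurm/" path prefix used to recognise Slurm-managed cgroups is an assumption that may not match every site's layout.

```python
from typing import Dict


def parse_proc_cgroup(text: str) -> Dict[str, str]:
    """Map each controller name to its cgroup path.

    Each line of /proc/<pid>/cgroup has the form
    "hierarchy-ID:controller-list:cgroup-path".
    """
    locations: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first two colons only; the path may contain ':'.
        _hier_id, controllers, path = line.split(":", 2)
        # Co-mounted controllers are comma-separated, e.g. "cpu,cpuacct".
        # A cgroup v2 entry has an empty controller list ("0::/path").
        for controller in controllers.split(","):
            locations[controller] = path
    return locations


def saved_slurm_cgroups(text: str, prefix: str = "/slurm/") -> Dict[str, str]:
    """Keep only controllers whose path looks like a Slurm cgroup.

    Assumption: Slurm-managed cgroup paths start with "/slurm/".
    """
    return {
        controller: path
        for controller, path in parse_proc_cgroup(text).items()
        if path.startswith(prefix)
    }
```

Reading /proc/self/cgroup and passing its contents to saved_slurm_cgroups() before pam_systemd runs would give the mapping that needs to be put back afterwards. On cgroup v1, putting it back would typically mean writing the PID into the cgroup.procs file of the saved path under each controller's mount point, but that step depends on the local setup and is left out here.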

If you don't constrain the devices (i.e. don't have GPUs), you can probably do without the pam_exec script and use pam_systemd normally. We're using Debian, but the basics should be the same. I've put the script on GitHub, if you want to try it:
https://github.com/irush-cs/slurm-scripts

Yair.

On Mon, Jun 18, 2018 at 3:33 PM, John Hearns <hear...@googlemail.com> wrote:
> Your problem is that you are listening to Lennart Poettering...
> I cannot answer your question directly. However I am doing work at the
> moment with PAM and sssd.
> Have a look at the directory which contains the unit files. Go on
> /lib/systemd/system
> See that nice file named -.slice. Yes, that file is absolutely needed, it
> is not line noise.
> Now try to grep on the files in that directory, since you might want to
> create a new systemd unit file based on an existing one.
>
> Yes, a regexp guru will point out that this is trivial. But to me creating
> files that look like -.slice is putting your head in the lion's mouth.
>
> On 18 June 2018 at 14:15, Maik Schmidt <maik.schm...@tu-dresden.de> wrote:
>>
>> Hi,
>>
>> we're currently in the process of migrating from RHEL6 to 7, which also
>> brings us the benefit of having systemd. However, we are observing problems
>> with user applications that use e.g. XDG_RUNTIME_DIR, because SLURM
>> apparently does not really run the user application through the PAM stack.
>> The consequence is that SLURM jobs inherit the XDG_* environment variables
>> from the login nodes (where sshd properly sets them up), but on the compute
>> nodes, /run/user/$uid does not exist, leading to errors whenever a user
>> application tries to access it.
>>
>> We have tried setting UsePam=1, but that did not help.
>>
>> I have found the following issue on the systemd project regarding exactly
>> this problem: https://github.com/systemd/systemd/issues/3355
>>
>> There, Lennart Poettering argues that it should be the responsibility of
>> the scheduler software (i.e. SLURM) to run user code only within a proper
>> PAM session.
>>
>> My question: does SLURM support this? If yes, how?
>>
>> If not, what are best practices to circumvent this problem on
>> RHEL7/systemd installations? Surely other clusters must have already had the
>> same issue...
>>
>> Thanks in advance.
>>
>> --
>> Maik Schmidt
>> HPC Services
>>
>> Technische Universität Dresden
>> Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH)
>> Willers-Bau A116
>> D-01062 Dresden
>> Telefon: +49 351 463-32836