Hello, everyone.
I'm also fairly new to Slurm and still in a conceptual phase rather than
a test or production phase. Currently I am trying to find out which files
and directories to create where: on the host or in a network directory.
I'm a little confused about the descriptions in the slurm.conf manpage.
For example, the JobCheckpointDir should be accessible from both the
primary and backup controller. Now it is clear (at least I believe) that
this has to be done in the NCCR, for example. If the primary controller
goes down, the backup controller must be able to access it.
On the other hand, SlurmctldPidFile should also be available on both the
primary and backup controller. Since it is usually in /var/run, I
assume that this should be a local path. It should also be unique on
every controller.
The manpage is not quite clear in its description. What about the
SlurmctldLogFile, for example? Theoretically, both could write to the
same file.
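To make the question concrete, this is the kind of split I have in mind.
It is only a sketch: every path is a placeholder I made up, /shared/slurm
stands for some directory mounted on both controllers, and I have not
tested any of it.

```
# slurm.conf fragment (untested sketch; all paths are assumed placeholders)

# Needs to be accessible from both the primary and the backup controller,
# so it points at a hypothetical shared mount.
JobCheckpointDir=/shared/slurm/checkpoint

# Local path on each controller; every host keeps its own PID file.
SlurmctldPidFile=/var/run/slurmctld.pid

# Open question: local on each controller (as here) or one shared file?
SlurmctldLogFile=/var/log/slurm/slurmctld.log
```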
If anyone has any advice or would like to tell me how this was solved at
your site, I would be very grateful.
Best,
Marcus
--
Marcus Wagner, Dipl.-Inf.
IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wag...@itc.rwth-aachen.de
www.itc.rwth-aachen.de