Re: [slurm-users] Disable --no-allocate support for a node/SlurmD

René Sitt Wed, 14 Jun 2023 08:34:38 -0700

Hi,

Thanks for the suggestion.
However, as I understand it, this additionally requires trusting the node those scripts run on, which is, I guess, the one running SlurmCtlD.
The reason we are using Prolog scripts is that they run on the very node the job will run on. So we make that one "secure" (or at least harden it, e.g. by disabling SSH access and restricting any other connections). Then anything running on this node has a high trust level, e.g. the SlurmD and the Prolog script. If required, the node could be rebooted with a fixed image after each job, removing any potential compromise. That isn't feasible for the SlurmCtlD, as that would affect the whole cluster and unrelated jobs.
Hence the checks (for example filtering out interactive jobs, but also some additional authentication) should be done on the hardened node(s).
It would work if there wasn't a way to circumvent the Prolog. So ideally I'd like to have a configuration option for the SlurmD such that it doesn't accept such jobs.
As the SlurmD config is on the node it can also be considered secure.
So while I fully agree that those plugins are better suited and likely easier to use, I fear that it is much easier to prevent them from running, and hence to bypass those restrictions, than it is with something (local) at the level of the SlurmD.
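
A minimal sketch of the kind of node-local Prolog check described above (rejecting interactive jobs), under stated assumptions: the Prolog sees SLURM_JOB_ID in its environment, and the text passed in is the output of `scontrol show job <id>`, whose BatchFlag field is 0 for interactive jobs. The function names are hypothetical. Note that scontrol asks slurmctld for this information, so the sketch does not by itself remove the controller from the trust chain, and it does nothing about jobs that never trigger the Prolog.

```python
import re
from typing import Mapping, Optional


def parse_batch_flag(scontrol_output: str) -> Optional[int]:
    """Return the BatchFlag value found in `scontrol show job` text, or None."""
    match = re.search(r"\bBatchFlag=(\d+)", scontrol_output)
    return int(match.group(1)) if match else None


def prolog_exit_code(env: Mapping[str, str], scontrol_output: str) -> int:
    """Return 0 to let the job start, non-zero to make the Prolog fail.

    Hypothetical policy: fail closed. Anything that is not clearly a
    batch job (missing job id, missing or zero BatchFlag) is rejected.
    """
    if not env.get("SLURM_JOB_ID"):
        return 1  # no job id in the environment: refuse
    flag = parse_batch_flag(scontrol_output)
    if flag is None or flag == 0:
        return 1  # unknown or interactive job: refuse
    return 0  # batch job: allow
```

In a real Prolog, the environment would be os.environ and the text would come from running scontrol for that job id; the script would then exit with the returned code. What Slurm does with the node and the job after a failing Prolog depends on the site configuration.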

Please correct me if I misunderstood anything.

Ah okay, so your requirements include completely insulating (some) jobs from outside access, including root? I've seen this kind of requirement for e.g. working with non-defaced medical data. It's generally a tough problem imo, because this level of data security seems more or less incompatible with the idea of a multi-user HPC system.

I remember that at this year's ZKI-AK Supercomputing spring meeting, Sebastian Krey from GWDG presented the KISSKI ("KI-Servicezentrum für Sensible und Kritische Infrastrukturen", https://kisski.gwdg.de/ ) project, which works in this problem domain. Are you involved in that? The setup with containerization and 'node hardening' sounds very similar to me.

Re "preventing the scripts from running": I'd say that is about as easy as manipulating, in some other way, any job submission that goes through slurmctld (e.g. by editing slurm.conf). So without knowing your exact use case and requirements, I can't think of a simple solution.


Kind regards,
René Sitt

--
Dipl.-Chem. René Sitt
Hessisches Kompetenzzentrum für Hochleistungsrechnen
Philipps-Universität Marburg
Hans-Meerwein-Straße
35032 Marburg

Tel. +49 6421 28 23523
si...@hrz.uni-marburg.de
www.hkhlr.de
