> job_submit.lua allows you to view (and edit!) all job parameters that
> are known at submit time, including the option to refuse a configuration
> by returning `slurm.ERROR` instead of `slurm.SUCCESS`. The common way to
> filter for interactive jobs in job_submit.lua is to check whether
> job_desc.script is nil or an empty string (i.e. the job submission
> doesn't have a script attached to it). You can do a lot more within
> job_submit.lua - I know of multiple sites (including the cluster I'm
> maintaining) that use it to, for example, automatically sort jobs into
> the correct partition(s) according to their resource requirements.

Thanks for the suggestion.
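
For reference, the check described above could look roughly like the following. This is only a minimal sketch of a job_submit.lua, assuming the job_submit/lua plugin is enabled on the controller; the rejection message is made up for illustration, and a real site script would likely do more.

```lua
-- Minimal job_submit.lua sketch (illustrative, not a tested site config).
-- Runs inside slurmctld, which provides the global `slurm` table.

-- Called for every job submission.
function slurm_job_submit(job_desc, part_list, submit_uid)
    -- No batch script attached: treat the job as interactive and refuse it.
    if job_desc.script == nil or job_desc.script == "" then
        -- The message text is an assumption, chosen only for this example.
        slurm.log_user("Interactive jobs are not accepted on this cluster")
        return slurm.ERROR
    end
    return slurm.SUCCESS
end

-- Called when an existing job is modified; nothing is checked in this sketch.
function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end

return slurm.SUCCESS
```

Note that this script is evaluated by the controller, not on the compute node, which is exactly the trust question raised below.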

However, as I understand it, this additionally requires trusting the node those scripts run on,
which is, I guess, the one running SlurmCtlD.

> All in all, these two interfaces are (imho) much better suited for the
> kind of task you're suggesting (checking job parameters, refusing
> specific job configurations) than prolog scripts, since technically by
> the time the prolog scripts are starting, the job configuration has
> already been finalized and accepted by the scheduler.

The reason we are using Prolog scripts is that they run on the very node the job will run on. So we make that node "secure" (or at least harden it, e.g. by disabling SSH access and restricting any other connections). Anything running on this node then has a high trust level, e.g. the SlurmD and the Prolog script. If required, the node could be rebooted with a fixed image after each job, removing any potential compromise. That isn't feasible for the SlurmCtlD, as it would affect the whole cluster and unrelated jobs.

Hence the checks (for example filtering out interactive jobs, but also some additional authentication) should be done on the hardened node(s).

This would work if there weren't a way to circumvent the Prolog. So ideally I'd like to have a configuration option for the SlurmD such that it doesn't accept such jobs.
As the SlurmD config is on the node, it can also be considered secure.

So while I fully agree that those plugins are better suited and likely easier to use, I fear that it is much easier to prevent them from running, and hence to bypass those restrictions,
than it would be with something (local) at the level of the SlurmD.

Please correct me if I misunderstood anything.

Kind Regards,
Alexander Grund

