> job_submit.lua allows you to view (and edit!) all job parameters that
> are known at submit time, including the option to refuse a configuration
> by returning `slurm.ERROR` instead of `slurm.SUCCESS`. The common way to
> filter for interactive jobs in job_submit.lua is checking whether
> job_desc.script is nil or an empty string (i.e. the job submission
> doesn't have a script attached to it). You can do a lot more within
> job_submit.lua - I know of multiple sites (including the cluster I'm
> maintaining) that use it to, for example, automatically sort jobs into
> the correct partition(s) according to their resource requirements.
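
As a rough illustration of the check described above, a job_submit.lua
along these lines might reject submissions that carry no script. This is
only a sketch under assumptions: `is_interactive` is a made-up helper,
the rejection message is arbitrary, and the `slurm` fallback table is a
stub that exists solely so the file can be loaded outside of SlurmCtlD.

```lua
-- job_submit.lua: minimal sketch, not a tested production plugin.
-- Slurm provides the global `slurm` table when it loads this file. The
-- fallback below is a stub (an assumption for illustration) so the file
-- can also be loaded in a plain Lua interpreter.
slurm = slurm or {
    SUCCESS = 0,
    ERROR = -1,
    log_user = function(fmt, ...) print(string.format(fmt, ...)) end,
}

-- Hypothetical helper: a submission without a batch script is treated
-- as an interactive job.
local function is_interactive(job_desc)
    return job_desc.script == nil or job_desc.script == ""
end

-- Called on the controller for every job submission.
function slurm_job_submit(job_desc, part_list, submit_uid)
    if is_interactive(job_desc) then
        slurm.log_user("interactive jobs are not accepted on this cluster")
        return slurm.ERROR
    end
    return slurm.SUCCESS
end

-- Called on the controller when a job is modified; accepted as-is here.
function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end

-- Slurm's bundled example script ends with `return slurm.SUCCESS`;
-- add that as the last line when installing this as a real plugin.
```

Note that this runs inside SlurmCtlD, not on the compute node.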
Thanks for the suggestion.
However, as I understand it, this additionally requires trusting the
node on which those scripts run, which is, I guess, the one running
SlurmCtlD.
> All in all, these two interfaces are (imho) much better suited for the
> kind of task you're suggesting (checking job parameters, refusing
> specific job configurations) than prolog scripts, since technically by
> the time the prolog scripts are starting, the job configuration has
> already been finalized and accepted by the scheduler.
The reason we are using Prolog scripts is that they are running on the
very node the job will be running on.
So we make that one "secure" (or at least harden it by e.g. disabling
SSH access and restricting any other connections).
Then anything running on this node has a high trust level, e.g. the
SlurmD and the Prolog script.
If required, the node could be rebooted with a fixed image after each
job, removing any potential compromise.
That isn't feasible for the SlurmCtlD as that would affect the whole
cluster and unrelated jobs.
Hence the checks (for example filtering out interactive jobs, but also
some additional authentication) should be done on the hardened node(s).
It would work if there weren't a way to circumvent the Prolog. So ideally
I'd like to have a configuration option for the SlurmD such that it
doesn't accept such jobs.
As the SlurmD config is on the node, it can also be considered secure.
So while I fully agree that those plugins are better suited and likely
easier to use, I fear that it is much easier to prevent them from
running, and hence to bypass those restrictions, than it would be with
something (local) at the level of the SlurmD.
Please correct me if I misunderstood anything.
Kind Regards,
Alexander Grund