I believe I have solved this. I changed the configuration to replace:
TaskPlugin=task/affinity
with:
TaskPlugin=task/none
In my case, the login node, the head node, and all of the compute nodes are
running in their own containers, and Docker Compose is used to run all of
those containers to
There is a permission problem somewhere, but I don’t know where.
If I run as root, it works, but as the admin user it fails:
admin@slurmfrontend:~$ srun hostname
srun: error: task 0 launch failed: Slurmd could not execve job
slurmstepd: error: task_g_set_affinity: Operation not permitted
slurmstepd: error: _exec_wait_child_wait
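
For anyone hitting the same error, here is a rough way to probe whether CPU affinity calls are allowed at all inside a given container. This is only a sketch under assumptions: it assumes the task/affinity plugin ends up calling the Linux sched_setaffinity system call, and the helper name can_set_affinity is made up for this example. It re-applies the mask the process already has, so it should not change anything. A True result does not prove that slurmstepd will succeed, since slurmstepd sets affinity on a different process and possibly with a different mask, but a False result would point at the container rather than at Slurm.

```python
import os


def can_set_affinity(pid: int = 0) -> bool:
    """Return True if sched_setaffinity succeeds when re-applying the current mask.

    pid 0 means "the calling process". Linux-only (hypothetical helper).
    """
    cpus = os.sched_getaffinity(pid)      # CPUs this process may run on right now
    try:
        os.sched_setaffinity(pid, cpus)   # re-apply the same mask; changes nothing
    except PermissionError:               # EPERM or EACCES from the kernel
        return False
    return True


if __name__ == "__main__" and hasattr(os, "sched_setaffinity"):
    print("allowed CPUs:", sorted(os.sched_getaffinity(0)))
    print("sched_setaffinity permitted:", can_set_affinity())
```

Running it once as root and once as the unprivileged user inside a compute-node container would show whether the two accounts are treated differently.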