Now what would be causing this? The srun just hangs and these are the only logs from slurmctld:

[2024-02-24T23:23:26.003] error: Orphan StepId=463.extern reported on node node007
[2024-02-24T23:23:26.003] error: Orphan StepId=463.extern reported on node node006
[2024-02-24T23:23:26.003] error: Orphan StepId=463.extern reported on node node005
[2024-02-24T23:23:26.003] error: Orphan StepId=463.extern reported on node node009
[2024-02-24T23:23:26.003] error: Orphan StepId=463.extern reported on node node008
[2024-02-24T23:43:21.183] _slurm_rpc_complete_job_allocation: JobId=563 error Job/step already completing or completed
[465.extern] error: common_file_write_content: unable to open '/sys/fs/cgroup/system.slice/slurmstepd.scope/job_463/step_extern/user/cgroup.freeze' for writing: Permission denied

On Sat, Feb 24, 2024 at 12:09 PM Robert Kudyba <rkud...@fordham.edu> wrote:

> <<<Traditionally /tmp and /var/tmp have been 1777<<<
>
> Ah yes thanks for pointing that out. Hope this helps someone down the
> line...perhaps the error detection could be more explicit in slurmctld?
>
> On Sat, Feb 24, 2024, 12:07 PM Chris Samuel via slurm-users <
> slurm-users@lists.schedmd.com> wrote:
>
>> On 24/2/24 06:14, Robert Kudyba via slurm-users wrote:
>>
>> > For now I just set it to chmod 777 on /tmp and that fixed the errors. Is
>> > there a better option?
>>
>> Traditionally /tmp and /var/tmp have been 1777 (that "1" being the
>> sticky bit, originally invented to indicate that the OS should attempt
>> to keep a frequently used binary in memory but then adopted to indicate
>> special handling of a world writeable directory so users can only unlink
>> objects they own and not others).
>>
>> Hope that helps!
>>
>> All the best,
>> Chris
>> --
>> Chris Samuel :
>> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.csamuel.org_&d=DwICAg&c=aqMfXOEvEJQh2iQMCb7Wy8l0sPnURkcqADc2guUW8IM&r=X0jL9y0sL4r4iU_qVtR3lLNo4tOL1ry_m7-psV3GejY&m=1dr8K8YEcCyc4UDmIvmXWNuOled6fEZ424zSwluePPfhXD2Q5JVklrCrDUQU-mSW&s=ZbSiWLCu-81ZY1xhscjqczszYgOmqxUbVa6f2qUEd-o&e=
>> : Berkeley, CA, USA
>>
>>
>> --
>> slurm-users mailing list -- slurm-users@lists.schedmd.com
>> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
>>
>
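
For anyone finding this thread later, here is a minimal sketch of how the 777 versus 1777 difference discussed above could be checked on a node. It assumes Python 3 is available there. The helper names describe_mode and is_shared_tmp_ok are made up for illustration and are not part of Slurm, and this only inspects directory permissions; it says nothing about the cgroup.freeze error.

```python
import os
import stat


def describe_mode(path):
    """Return (octal permission string, sticky bit set?, world-writable?) for path."""
    perms = stat.S_IMODE(os.stat(path).st_mode)
    octal = format(perms, "04o")            # e.g. "1777" or "0777"
    sticky = bool(perms & stat.S_ISVTX)     # the leading "1" in 1777
    world_writable = bool(perms & stat.S_IWOTH)
    return octal, sticky, world_writable


def is_shared_tmp_ok(path):
    """True if path is world-writable AND has the sticky bit (the traditional /tmp setup)."""
    _, sticky, world_writable = describe_mode(path)
    return sticky and world_writable
```

Calling describe_mode("/tmp") and describe_mode("/var/tmp") on each compute node shows whether the sticky bit survived; a directory left at plain 777 reports the sticky flag as False, and the usual remedy is chmod 1777 on it.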
--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com