What could be causing this now? srun just hangs, and these are the only
log entries from slurmctld:
[2024-02-24T23:23:26.003] error: Orphan StepId=463.extern reported on node node007
[2024-02-24T23:23:26.003] error: Orphan StepId=463.extern reported on node node006
[2024-02-24T23:23:26.003] error: Orphan StepId=463.extern reported on node node005
[2024-02-24T23:23:26.003] error: Orphan StepId=463.extern reported on node node009
[2024-02-24T23:23:26.003] error: Orphan StepId=463.extern reported on node node008

[2024-02-24T23:43:21.183] _slurm_rpc_complete_job_allocation: JobId=563 error Job/step already completing or completed

[465.extern] error: common_file_write_content: unable to open '/sys/fs/cgroup/system.slice/slurmstepd.scope/job_463/step_extern/user/cgroup.freeze' for writing: Permission denied
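
For anyone who finds this thread because of the earlier /tmp issue quoted below: here is a minimal Python sketch for checking whether a directory has the traditional 1777 mode (world-writable plus sticky bit). The helper names `describe_mode` and `needs_fix` are made up for illustration and are not part of Slurm or any other tool.

```python
import os
import stat

# Traditional mode for /tmp and /var/tmp: rwxrwxrwt (world-writable + sticky bit).
STICKY_WORLD_WRITABLE = 0o1777


def describe_mode(path):
    """Return (mode bits, sticky bit set?, world-writable?) for path.

    Hypothetical helper for illustration only.
    """
    # S_IMODE keeps the permission bits plus setuid/setgid/sticky.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode, bool(mode & stat.S_ISVTX), bool(mode & stat.S_IWOTH)


def needs_fix(path):
    """True if path does not have exactly mode 1777."""
    mode, _, _ = describe_mode(path)
    return mode != STICKY_WORLD_WRITABLE
```

Calling `describe_mode("/tmp")` on each node shows whether the sticky bit is present. If `needs_fix("/tmp")` returns True, running `os.chmod("/tmp", 0o1777)` as root does the same thing as `chmod 1777 /tmp`. Whether this has anything to do with the cgroup.freeze error above is a separate question; the sketch only covers the /tmp mode.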

On Sat, Feb 24, 2024 at 12:09 PM Robert Kudyba <rkud...@fordham.edu> wrote:

> <<<Traditionally /tmp and /var/tmp have been 1777<<<
>
>
> Ah yes thanks for pointing that out. Hope this helps someone down the
> line...perhaps the error detection could be more explicit in slurmctld?
>
> On Sat, Feb 24, 2024, 12:07 PM Chris Samuel via slurm-users <
> slurm-users@lists.schedmd.com> wrote:
>
>> On 24/2/24 06:14, Robert Kudyba via slurm-users wrote:
>>
>> > For now I just set it to chmod 777 on /tmp and that fixed the errors.
>> > Is there a better option?
>>
>> Traditionally /tmp and /var/tmp have been 1777 (that "1" being the
>> sticky bit, originally invented to indicate that the OS should attempt
>> to keep a frequently used binary in memory but then adopted to indicate
>> special handling of a world writeable directory so users can only unlink
>> objects they own and not others).
>>
>> Hope that helps!
>>
>> All the best,
>> Chris
>> --
>> Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
>>
>>
>> --
>> slurm-users mailing list -- slurm-users@lists.schedmd.com
>> To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
>>
>