date:20240623

[slurm-users] Re: error: unpack_header: protocol_version 9472 not supported

2024-06-23 Thread Arnuld via slurm-users

I found the problem. This node was not trying to reach some other machine. It was the other way around: another machine (running a controller) had this node in its config, so that controller was trying to reach this node. That controller belonged to a different Slurm cluster. I removed the config from

[slurm-users] Re: Can Not Use A Single GPU for Multiple Jobs

2024-06-23 Thread Arnuld via slurm-users

> No, Slurm has to launch the batch script on compute node cores
> ... SNIP...
> Even with srun directly from a login node there's still processes that
> have to run on the compute node and those need at least a core
> (and some may need more, depending on the application).

Alright, understood.