[slurm-users] QOS MaxTRESPU node=X interpretation

2024-08-30 Thread David Magda via slurm-users
Hello, I have a question on how to interpret a Node=X MaxTRESPU value for a QOS: if (e.g.) X=4, and each node has (say) 64 CPUs or cores, and a particular job needs 32 cores, then would two jobs count as the equivalent of one node (2*32=64)? And if X=4 for the QOS, would that mean that eight of th
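For reference, a per-user limit like this is normally attached to the QOS with sacctmgr, roughly as in the sketch below (the QOS name "normal" and the value 4 are placeholders for illustration, not from the thread):

    # Limit each user to the equivalent of 4 nodes under this QOS
    # (hypothetical QOS name "normal"):
    sacctmgr modify qos normal set MaxTRESPerUser=node=4

    # The question above is how that node count is tallied when jobs
    # allocate only part of a node, e.g. two 32-core jobs on 64-core nodes.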

[slurm-users] Re: Node (anti?) Feature / attribute

2024-06-17 Thread David Magda via slurm-users
Could you post that snippet? > On Jun 14, 2024, at 14:33, Laura Hild via slurm-users > wrote: > > I wrote a job_submit.lua also. It would append "&centos79" to the feature > string unless the features already contained "el9," or if empty, set the > features string to "centos79" without the ampe
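The snippet itself is not reproduced in the archive; a minimal job_submit.lua sketch matching the description above might look like the following (the feature names "centos79" and "el9" come from the quoted text; everything else is an assumption, not the poster's actual code):

    -- job_submit.lua sketch: keep jobs on CentOS 7.9 nodes unless the
    -- submitter explicitly asked for "el9" nodes.
    function slurm_job_submit(job_desc, part_list, submit_uid)
       if job_desc.features == nil or job_desc.features == "" then
          -- no features requested: default to the old OS
          job_desc.features = "centos79"
       elseif not string.find(job_desc.features, "el9") then
          -- features requested, but not el9: also require the old OS
          job_desc.features = job_desc.features .. "&centos79"
       end
       return slurm.SUCCESS
    end

    function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
       return slurm.SUCCESS
    end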

[slurm-users] Re: Node (anti?) Feature / attribute

2024-06-17 Thread David Magda via slurm-users
This functionality in slurmd was added in August 2023, so it is not in the version we’re currently running: https://github.com/SchedMD/slurm/commit/0daa1fda97c125c0b1c48cbdcdeaf1382ed71c4f Perhaps something for the future. For now, job_submit.lua looks like the best candidate.

[slurm-users] Node (anti?) Feature / attribute

2024-06-14 Thread David Magda via slurm-users
Hello, What I’m looking for is a way for a node to continue to be in the same partition, and have the same QoS(es), but only be chosen if a particular capability is being asked for. This is because we are rolling something (OS upgrade) out slowly to a small batch of nodes at first, and then mor
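For background, the stock mechanism here is a node Feature plus a job --constraint, roughly as below (node names are placeholders; "el9" and "centos79" are taken from the replies above). The catch, and the reason for the "anti-feature" framing, is that a node carrying a Feature is still eligible for jobs that request no constraint at all:

    # slurm.conf: tag the upgraded nodes with a feature
    # (other node parameters omitted)
    NodeName=node[01-04] Features=el9

    # only jobs that opt in are limited to those nodes
    sbatch --constraint=el9 job.sh

    # but a plain "sbatch job.sh" can still land on node[01-04],
    # which is what the job_submit.lua approach above works around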

Re: [slurm-users] srun: error: io_init_msg_unpack: unpack error

2022-08-08 Thread David Magda
On Aug 6, 2022, at 15:13, Chris Samuel wrote: > > On 6/8/22 10:43 am, David Magda wrote: > >> It seems that the new srun(1) cannot talk to the old slurmd(8). >> Is this 'on purpose'? Does the backwards compatibility of the protocol not >> extend t

[slurm-users] srun: error: io_init_msg_unpack: unpack error

2022-08-06 Thread David Magda
Hello, We are testing the upgrade process of going from 20.11.9 to 22.05.2. The master server is running 22.05.2 slurmctld/slurmdbd, and the compute nodes are (currently) running slurm-20.11.9 slurmd. We are running this 'mixed environment' because our production cluster has a reasonable numb