[slurm-users] job_container/tmpfs and srun.

2024-01-09 Thread Phill Harvey-Smith
Hi all, On our setup we are using job_container/tmpfs to give each job its own temp space. Since our compute nodes have reasonably sized disks, for tasks that do a lot of disk I/O on users' data we have asked users to copy their data to the local disk at the beginning of the task and (if need

Re: [slurm-users] DBD_SEND_MULT_MSG - invalid uid error

2024-01-09 Thread Timony, Mick
You could enable debug logging on your slurm controllers to see if that provides some more useful info. I'd also check your firewall settings to make sure you're not blocking some traffic that you shouldn't. iptables -F will clear your local Linux firewall. I'd also triple check the UID on all t

[slurm-users] Beginner admin question: Prioritization within a partition based on time limit

2024-01-09 Thread Kenneth Chiu
I'm just learning about Slurm. I understand that different partitions can be prioritized separately, and can have different max time limits. I was wondering whether or not there was a way to have a finer-grained prioritization based on the time limit specified by a job, within a single pa

Re: [slurm-users] Beginner admin question: Prioritization within a partition based on time limit

2024-01-09 Thread Paul Edmon
Yeah, that's sort of the job of the backfill scheduler, as smaller jobs will fit better into the gaps. There are several options within the priority framework that you can use to dial in which jobs get which priority. I recommend reading through all those and finding the options that will work