[slurm-users] setting default working directory in prolog

2021-10-26 Thread Stefan Kelber
Hello List, I am a Slurm newbie and would like to set a job-specific working directory for its processing, as a default, to make use of the local disks on the compute nodes. I would like to do this with a job prolog (slurmd), thereby sparing users from having to take care of it in their batch scripts. The "SLURM_JOB_WORK
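One common way to approach this (a sketch, not the poster's actual setup) is a TaskProlog rather than the slurmd Prolog: Slurm applies `export NAME=value` lines printed on a TaskProlog's stdout to the task's environment. The base path and variable names below are assumptions; a TaskProlog cannot itself change the job's working directory, so jobs would still `cd "$JOB_SCRATCH"` (or use `--chdir`) themselves.

```shell
#!/bin/sh
# Hypothetical TaskProlog sketch: create a per-job scratch directory on
# node-local disk and export it into the job's environment.
# SLURM_SCRATCH_BASE and the directory layout are site-specific assumptions.
BASE="${SLURM_SCRATCH_BASE:-${TMPDIR:-/tmp}/slurm_scratch}"
JOBDIR="$BASE/job_${SLURM_JOB_ID:-unknown}"
mkdir -p "$JOBDIR"

# Lines of the form "export NAME=value" on a TaskProlog's stdout are
# added to the task's environment by slurmd.
echo "export TMPDIR=$JOBDIR"
echo "export JOB_SCRATCH=$JOBDIR"
```

Cleanup of the directory would belong in a matching Epilog; the TaskProlog runs as the job's user, so `mkdir` needs no special privileges.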

[slurm-users] backfill on overlapping partitions problem

2021-10-26 Thread Andrej Filipcic
Hi, We have a strange problem with backfilling: there is a large partition "cpu" and an overlapping partition "largemem", which is a subset of the "cpu" nodes. Now, user A is submitting low-priority jobs to "cpu", and user B high-priority jobs to "largemem". If there are queued jobs in "largemem" (draini
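For reference, an overlapping-partition layout of the kind described might look like the following slurm.conf fragment. This is a hypothetical sketch; the node names, counts, and PriorityTier values are assumptions, not the poster's configuration:

```
# slurm.conf (fragment) -- "largemem" nodes are a subset of "cpu" nodes
PartitionName=cpu      Nodes=node[001-100] PriorityTier=1  State=UP
PartitionName=largemem Nodes=node[081-100] PriorityTier=10 State=UP
```

With such a layout, pending high-priority "largemem" jobs reserve resources on the shared nodes, which is what can interact badly with backfill of the lower-priority "cpu" jobs.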

[slurm-users] errors requesting gpus

2021-10-26 Thread Benjamin Nacar
Hi, I'm setting up a Slurm cluster where some subset of the compute nodes will have GPUs. My slurm.conf contains, among other lines: [...] GresTypes=gpu [...] Include /etc/slurm/slurm.conf.d/allnodes [...] and the above-mentioned /etc/slurm/slurm.conf.d/allnodes file has the line NodeName=gpu1601 C
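A frequent cause of errors when requesting GPUs is that `GresTypes=gpu` is set but the node definition lacks a matching `Gres=` entry, or the node has no gres.conf mapping the GRES to device files. A hedged sketch of the pieces that must agree (the GPU count, CPU count, and device paths below are assumptions, not taken from the post):

```
# slurm.conf (fragment) -- the GRES type must be declared globally:
GresTypes=gpu

# node definition (e.g. in the included allnodes file) -- the node must
# advertise its GRES; gpu:2 is an assumed count:
NodeName=gpu1601 CPUs=32 Gres=gpu:2 State=UNKNOWN

# /etc/slurm/gres.conf on the node -- maps the GRES to device files:
Name=gpu File=/dev/nvidia[0-1]
```

If any one of the three is missing or inconsistent, requests like `--gres=gpu:1` are typically rejected or left pending.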

Re: [slurm-users] backfill on overlapping partitions problem

2021-10-26 Thread Matt Jay
Hi Andrej, Take a look at this, and see if it matches up with your issue (I'm not 100% sure based on your description): https://bugs.schedmd.com/show_bug.cgi?id=3881 The takeaway from that is the following (quote from SchedMD): " If there are _any_ jobs pending (regardless of the reason for the

Re: [slurm-users] slurm.conf syntax checker?

2021-10-26 Thread Marcus Wagner
Hi Diego, sorry for the delay. On 10/18/21 14:20, Diego Zuccato wrote: On 15/10/2021 06:02, Marcus Wagner wrote: Mostly, our problem was that we forgot to add or remove a node to/from the partitions/topology file, which caused slurmctld to refuse startup. So I wrote a simple checker for th
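The kind of check described, catching a partition that references an undefined node before slurmctld is restarted, can be sketched as below. This is a hypothetical helper, not the checker from the thread, and it does NOT expand bracketed hostlist ranges such as node[01-10] (a real version could expand them with `scontrol show hostnames`):

```shell
#!/bin/sh
# Minimal slurm.conf cross-check (sketch): warn if a PartitionName= line's
# Nodes= list names a node that has no NodeName= definition.
check_conf() {
    conf="$1"
    # Collect all defined node names, space-separated.
    nodes=$(awk -F'[= ]' '/^NodeName=/ {printf "%s ", $2}' "$conf")
    status=0
    # Pull the Nodes= value off every PartitionName= line and split on commas.
    for n in $(sed -n '/^PartitionName=/s/.*Nodes=\([^ ]*\).*/\1/p' "$conf" | tr ',' ' '); do
        case " $nodes" in
            *" $n "*) ;;   # node is defined
            *) echo "undefined node in partition list: $n"; status=1 ;;
        esac
    done
    return $status
}
```

Running such a check from a wrapper before restarting slurmctld avoids the startup denial the post describes.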