Re: [slurm-users] Question/Clarification: Batch array multiple tasks on nodes

2020-09-01 Thread Renfro, Michael
We set DefMemPerCPU in each partition to approximately the amount of RAM in a node divided by the number of cores in the node. For heterogeneous partitions, we use a lower limit, and we always reserve a bit of RAM for the OS, too. So for a 64 GB node with 28 cores, we default to 2000 M per CPU,
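
The rule of thumb above (node RAM divided by core count, minus a reserve for the OS) can be sketched as a small calculation. This is only an illustration: the function name and the 8 GB reserve are assumptions, not values from the post, and the poster's own figure of 2000 M is slightly more conservative than what this returns.

```python
def def_mem_per_cpu_mb(node_ram_mb: int, cores: int, os_reserve_mb: int) -> int:
    """Suggest a DefMemPerCPU value in MB for one node type.

    Subtracts a reserve for the OS from the node's RAM, then splits the
    remainder evenly across the cores, rounding down.
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    usable_mb = node_ram_mb - os_reserve_mb
    if usable_mb <= 0:
        raise ValueError("OS reserve leaves no usable memory")
    return usable_mb // cores


if __name__ == "__main__":
    # Hypothetical inputs: a 64 GB node (65536 MB), 28 cores, 8 GB kept for the OS.
    print(def_mem_per_cpu_mb(64 * 1024, 28, 8 * 1024))  # prints 2048
```

For a heterogeneous partition, one option consistent with the "lower limit" mentioned above would be to run this for each node type and take the smallest result.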

Re: [slurm-users] Question/Clarification: Batch array multiple tasks on nodes

2020-09-01 Thread Dana, Jason T.
Spencer, Thank you for your response! It does appear that the memory allocation was the issue. When I specify --mem=1, I am able to queue jobs on a single node. That being said, I was under the impression that the DefMemPerCPU, DefMemPerNode (what sbatch claims to default to), etc. values defa
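
A simplified model of why the memory request decides how many array tasks share a node: each task needs both CPUs and memory, and whichever runs out first sets the limit. The sketch below is not how Slurm's scheduler is implemented, and the node size (28 cores, 64000 MB) and function name are hypothetical; it only shows that a task claiming the whole node's memory leaves room for one task, while a tiny request such as --mem=1 makes the core count the limit.

```python
def array_tasks_per_node(node_cores: int, node_mem_mb: int,
                         cpus_per_task: int, mem_per_task_mb: int) -> int:
    """Estimate how many independent array tasks fit on one node at once.

    Simplified model: a task fits only if both its CPUs and its memory
    are still free, so the scarcer resource determines the count.
    """
    if cpus_per_task <= 0 or mem_per_task_mb <= 0:
        raise ValueError("per-task requests must be positive")
    fit_by_cpu = node_cores // cpus_per_task
    fit_by_mem = node_mem_mb // mem_per_task_mb
    return min(fit_by_cpu, fit_by_mem)


if __name__ == "__main__":
    # Task effectively requests all node memory: only one fits.
    print(array_tasks_per_node(28, 64000, 1, 64000))  # prints 1
    # Task requests 1 MB (as with --mem=1): cores become the limit.
    print(array_tasks_per_node(28, 64000, 1, 1))      # prints 28
```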

Re: [slurm-users] Question/Clarification: Batch array multiple tasks on nodes

2020-09-01 Thread Spencer Bliven
Jason, The array jobs are designed to behave like independent jobs (but are stored more efficiently internally to avoid straining the controller). So in principle slurm could schedule them one per node or multiple per node. The --nodes and --ntasks parameters apply to individual jobs in the arr

[slurm-users] Priority QOS with Preempt on Some Resources?

2020-09-01 Thread Jason Simms
Hello all, I have a couple of users, each of whom has contributed funds to purchase a node for the cluster, much like a condo system. Each node has 52 cores, so I'd like to provide each user with preempt access for up to 52 cores. I can configure that easily enough with a QOS for each user with Gr

[slurm-users] Question/Clarification: Batch array multiple tasks on nodes

2020-09-01 Thread Dana, Jason T.
Hello, I am new to Slurm and I am working on setting up a cluster. I am testing out running a batch execution using an array and am seeing only one task executed in the array per node. Even if I specify in the sbatch command that only one node should be used, it executes a single task on each o

Re: [slurm-users] Adding Users to Slurm's Database

2020-09-01 Thread Diego Zuccato
On 19/08/20 08:17, Loris Bennett wrote: > I'd be interested in the removal part. This seems to me to be the > trickiest bit, not so much technically, but from a policy point of view. We use Active Directory, so I created a script that iterates all the cluster-related AD groups and generates

[slurm-users] Special queues

2020-09-01 Thread Diego Zuccato
Hello all. We'd need to set up two QoS: - debug: low priority, max 15' wall time, unaccounted CPU time - hpc: should get "half node" allocation units for accounting purposes (both for CPU and RAM) For "debug" I tried (and obviously failed): # sacctmgr create qos debug MaxJobsPerUser=1 MaxSubmitJo