[slurm-users] Disable exclusive flag for users

2022-03-24 Thread pankajd
Hi, We have Slurm 21.08.6 and GPUs in our compute nodes. We want to restrict or disable the use of the "exclusive" flag in srun for users. How should we do it? -- Thanks and regards, PVD

Re: [slurm-users] Question about sbatch options: -n, and --cpus-per-task

2022-03-24 Thread Stephen Cousins
If you want to have the same number of processes per node, like: #PBS -l nodes=4:ppn=8 then what I am doing (maybe there is another way?) is: #SBATCH --ntasks-per-node=8 #SBATCH --nodes=4 #SBATCH --mincpus=8 This is because "--ntasks-per-node" is actually "maximum number of tasks per node" and
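The conversion described above can be sketched as a small helper. This is only an illustration of the mapping given in the message (nodes=N:ppn=P becomes --ntasks-per-node=P, --nodes=N, --mincpus=P); the function name pbs_nodes_to_sbatch is hypothetical and it handles only this one PBS form.

```python
import re


def pbs_nodes_to_sbatch(resource: str) -> list[str]:
    """Translate a PBS 'nodes=N:ppn=P' request into #SBATCH lines.

    Hypothetical helper: it follows the recipe from the message and
    rejects any other PBS resource syntax.
    """
    match = re.fullmatch(r"nodes=(\d+):ppn=(\d+)", resource.strip())
    if match is None:
        raise ValueError(f"unsupported PBS resource string: {resource!r}")
    nodes, ppn = (int(group) for group in match.groups())
    return [
        f"#SBATCH --ntasks-per-node={ppn}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --mincpus={ppn}",
    ]


if __name__ == "__main__":
    # Same request as in the message: 4 nodes with 8 processes each.
    print("\n".join(pbs_nodes_to_sbatch("nodes=4:ppn=8")))
```

Whether --mincpus is needed at a given site is something to check against the local configuration; the sketch simply reproduces what the poster reports doing.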

Re: [slurm-users] Question about sbatch options: -n, and --cpus-per-task

2022-03-24 Thread David Henkemeyer
Thank you! We recently converted from PBS, and I was converting “ppn=X” to “-n X”. Does it make more sense to convert “ppn=X” to “--cpus-per-task=X”? Thanks again David On Thu, Mar 24, 2022 at 3:54 PM Thomas M. Payerle wrote: > Although all three cases ( "-N 1 --cpus-per-task 64 -n 1", "-N 1

Re: [slurm-users] How to open a slurm support case

2022-03-24 Thread Fulcomer, Samuel
...it is a bit arcane, but it's not like we're funding lavish lifestyles with our support payments. I would prefer to see a slightly more differentiated support system, but this suffices... On Thu, Mar 24, 2022 at 6:06 PM Sean Crosby wrote: > Hi Jeff, > > The support system is here - https://bug

Re: [slurm-users] How to open a slurm support case

2022-03-24 Thread Sean Crosby
Hi Jeff, The support system is here - https://bugs.schedmd.com/ Create an account, log in, and when creating a request, select your site from the Site selection box. Sean From: slurm-users on behalf of Jeffrey R. Lang Sent: Friday, 25 March 2022 08:48 To: slu

Re: [slurm-users] How to open a slurm support case

2022-03-24 Thread Jason Booth
Jeff, I will reach out to you directly. -Jason On Thu, Mar 24, 2022 at 3:51 PM Jeffrey R. Lang wrote: > Can someone provide me with instructions on how to open a support case > with SchedMD? > > > > We have a support contract, but nowhere on their website can I find a > link to open a case w

[slurm-users] How to open a slurm support case

2022-03-24 Thread Jeffrey R. Lang
Can someone provide me with instructions on how to open a support case with SchedMD? We have a support contract, but nowhere on their website can I find a link to open a case with them. Thanks, Jeff

Re: [slurm-users] Question about sbatch options: -n, and --cpus-per-task

2022-03-24 Thread Thomas M. Payerle
Although all three cases ("-N 1 --cpus-per-task 64 -n 1", "-N 1 --cpus-per-task 1 -n 64", and "-N 1 --cpus-per-task 32 -n 2") will cause Slurm to allocate 64 cores to the job, there can (and will) be differences in other respects. The variable SLURM_NTASKS will be set to the argument of the -
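The arithmetic behind the three cases can be sketched as follows. It is a minimal illustration, not part of the original message: total_cpus and task_layout_from_env are hypothetical names, SLURM_CPUS_PER_TASK is brought in here as an assumption beyond the text, and falling back to 1 when a variable is unset is a simplification made for the example.

```python
import os


def total_cpus(ntasks: int = 1, cpus_per_task: int = 1) -> int:
    """CPUs requested on one node: number of tasks times CPUs per task."""
    return ntasks * cpus_per_task


# The three flag combinations from the thread, as (ntasks, cpus_per_task).
CASES = {
    "-N 1 --cpus-per-task 64 -n 1": (1, 64),
    "-N 1 --cpus-per-task 1 -n 64": (64, 1),
    "-N 1 --cpus-per-task 32 -n 2": (2, 32),
}


def task_layout_from_env(env=os.environ) -> tuple[int, int]:
    """Read the task layout a job script would see in its environment.

    Assumption: a missing variable is treated as 1 for this sketch.
    """
    ntasks = int(env.get("SLURM_NTASKS", "1"))
    cpus_per_task = int(env.get("SLURM_CPUS_PER_TASK", "1"))
    return ntasks, cpus_per_task


if __name__ == "__main__":
    for flags, (ntasks, cpus) in CASES.items():
        print(f"{flags}: {total_cpus(ntasks, cpus)} CPUs")
```

All three cases give the same product, which is why the allocation size alone does not distinguish them; the difference shows up in how many tasks are launched and how many CPUs each one may use.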

[slurm-users] Help with failing job execution

2022-03-24 Thread Jeffrey R. Lang
My site recently updated to Slurm 21.08.6 and for the most part everything went fine. Two Ubuntu nodes however are having issues. Slurmd cannot execve the jobs on the nodes. As an example: [jrlang@tmgt1 ~]$ salloc -A ARCC --nodes=1 --ntasks=20 -t 1:00:00 --bell --nodelist=mdgx01 --partitio

Re: [slurm-users] Question about sbatch options: -n, and --cpus-per-task

2022-03-24 Thread David Henkemeyer
“Will launch 64 instances of your application, each bound to a single cpu” This is true for srun, but not for sbatch. A while back, we did an experiment using “hostname” to verify. On Thu, Mar 24, 2022 at 12:47 PM Ralph Castain wrote: > Well, there is indeed a difference - and it is significa

Re: [slurm-users] Question about sbatch options: -n, and --cpus-per-task

2022-03-24 Thread Ralph Castain
Well, there is indeed a difference - and it is significant. > On Mar 24, 2022, at 12:32 PM, David Henkemeyer > wrote: > > Assuming -N is 1 (meaning, this job needs only one node), then is there a > difference between any of these 3 flag combinations: > > -n 64 (leaving cpus-per-task to be the

[slurm-users] Question about sbatch options: -n, and --cpus-per-task

2022-03-24 Thread David Henkemeyer
Assuming -N is 1 (meaning, this job needs only one node), then is there a difference between any of these 3 flag combinations: -n 64 (leaving cpus-per-task to be the default of 1) --cpus-per-task 64 (leaving -n to be the default of 1) --cpus-per-task 32 -n 2 As far as I can tell, there is no fun

Re: [slurm-users] Make sacct show short job state codes?

2022-03-24 Thread Ole Holm Nielsen
Here is an example command for getting parseable output from sacct of all completed jobs during a specific period of time: $ sacct -p -X -a -S 032322 -E 032422 -o JobID,User,State -s ca,cd,f,to,pr,oom The fields are separated by | and can easily be parsed by awk. Example output: JobID|User|St
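For readers who would rather parse the |-separated output in Python than in awk, here is a minimal sketch. The function name parse_sacct_parsable and the sample rows are made up for illustration; the only assumption taken from sacct itself is that -p terminates every line, including the header, with a trailing |.

```python
def parse_sacct_parsable(text: str) -> list[dict[str, str]]:
    """Turn 'sacct -p' style output into one dict per job line."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    # Drop the trailing "|" on the header before splitting into names.
    header = lines[0].rstrip("|").split("|")
    rows = []
    for line in lines[1:]:
        fields = line.split("|")
        # The trailing "|" produces one empty extra field; discard it.
        if len(fields) == len(header) + 1 and fields[-1] == "":
            fields = fields[:-1]
        rows.append(dict(zip(header, fields)))
    return rows


if __name__ == "__main__":
    # Hypothetical sample, not real accounting data.
    sample = "JobID|User|State|\n1001|alice|COMPLETED|\n1002|bob|TIMEOUT|\n"
    for row in parse_sacct_parsable(sample):
        print(row["JobID"], row["User"], row["State"])
```

In practice the text would come from running the sacct command above, for example through subprocess.run with capture_output=True and text=True.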

Re: [slurm-users] srun and --cpus-per-task

2022-03-24 Thread Hermann Schwärzler
Hi Durai, I see the same thing as you on our test-cluster that has ThreadsPerCore=2 configured in slurm.conf. The double-foo goes away with this: srun --cpus-per-task=1 --hint=nomultithread echo foo Having multithreading enabled leads to imho surprising behaviour of Slurm. My impression is that

Re: [slurm-users] Make sacct show short job state codes?

2022-03-24 Thread Brian Andrus
I don't think that is part of sacct options. Feature request maybe. Meanwhile, awk would be your friend here. Just post-process by piping the output to awk and doing the substitutions before printing the output. e.g.: sacct | awk '{sub("CANCELLED","CA");sub("RUNNING","RU");print}' Just add
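The same post-processing idea can be sketched in Python. This is an illustration rather than a sacct feature: shorten_states is a hypothetical name, and the table uses the short codes listed in the JOB STATE CODES section of the sacct man page, so RUNNING maps to R rather than the RU used in the awk one-liner.

```python
import re

# Long state name -> short code, as listed in the sacct man page.
SHORT_CODES = {
    "BOOT_FAIL": "BF",
    "CANCELLED": "CA",
    "COMPLETED": "CD",
    "DEADLINE": "DL",
    "FAILED": "F",
    "NODE_FAIL": "NF",
    "OUT_OF_MEMORY": "OOM",
    "PENDING": "PD",
    "PREEMPTED": "PR",
    "RUNNING": "R",
    "REQUEUED": "RQ",
    "RESIZING": "RS",
    "REVOKED": "RV",
    "SUSPENDED": "S",
    "TIMEOUT": "TO",
}

# Match whole words only, so FAILED is not rewritten inside another word.
_STATE_PATTERN = re.compile(r"\b(" + "|".join(SHORT_CODES) + r")\b")


def shorten_states(text: str) -> str:
    """Replace every long job state name in the text with its short code."""
    return _STATE_PATTERN.sub(lambda match: SHORT_CODES[match.group(1)], text)


if __name__ == "__main__":
    # Hypothetical sacct-like line, not real accounting data.
    print(shorten_states("1001 alice CANCELLED by 1000"))
```

Because the replacement changes the width of the State field, column alignment is lost; combining this with the |-delimited output of sacct -p avoids that problem.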

Re: [slurm-users] Make sacct show short job state codes?

2022-03-24 Thread Ole Holm Nielsen
Hi Chip, Use the sacct -p or --parsable option to get the complete output delimited by | /Ole On 3/24/22 14:12, Chip Seraphine wrote: I’m trying to shave a few columns off the output of some sacct output, and while it will happily accept the short codes (e.g. CA instead of CANCELLED) I ca

[slurm-users] Make sacct show short job state codes?

2022-03-24 Thread Chip Seraphine
I’m trying to shave a few columns off some sacct output, and while it will happily accept the short codes (e.g. CA instead of CANCELLED) I can’t find a way to get it to report them. Shaving down the columns using %N in --format just results in a truncated version of the long code,

[slurm-users] srun and --cpus-per-task

2022-03-24 Thread Durai Arasan
Hello Slurm users, We are experiencing strange behavior with srun executing commands twice only when setting --cpus-per-task=1 $ srun --cpus-per-task=1 --partition=gpu-2080ti echo foo srun: job 1298286 queued and waiting for resources srun: job 1298286 has been allocated resources foo foo This i

Re: [slurm-users] how to locate the problem when slurm failed to restrict gpu usage of user jobs

2022-03-24 Thread Sean Maxwell
cgroups can control access to devices (e.g. /dev/nvidia0), which is how I understand it to work. -Sean On Thu, Mar 24, 2022 at 4:27 AM wrote: > Well, this is indeed the point. We didn’t set ConstrainDevices=yes in > cgroup.conf. After adding this, gpu restriction works as expected. > > But wh

[slurm-users] Re: how to locate the problem when slurm failed to restrict gpu usage of user jobs

2022-03-24 Thread taleintervenor
Well, this is indeed the point. We didn’t set ConstrainDevices=yes in cgroup.conf. After adding this, GPU restriction works as expected. But what is the relation between GPU restriction and cgroup? I never heard that cgroup can limit GPU card usage. Isn’t it a feature of CUDA or the NVIDIA driver?
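A quick way to check a node for the missing setting is to scan cgroup.conf for it. The sketch below is only an illustration: cgroup_conf_constrains_devices is a hypothetical name, and the parsing rules (text after # is a comment, keys compared case-insensitively, last occurrence wins) are assumptions made for the example, not a statement of how Slurm reads the file.

```python
def cgroup_conf_constrains_devices(conf_text: str) -> bool:
    """Return True if the given cgroup.conf text sets ConstrainDevices=yes."""
    enabled = False
    for raw_line in conf_text.splitlines():
        # Assumption: everything after "#" is a comment.
        line = raw_line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.lower() == "constraindevices":
            # Assumption: a later occurrence overrides an earlier one.
            enabled = value.lower() == "yes"
    return enabled


if __name__ == "__main__":
    # Hypothetical file content, not taken from the thread.
    sample = "ConstrainCores=yes\nConstrainDevices=yes\n"
    print(cgroup_conf_constrains_devices(sample))
```

To check a real node, the text would be read from that node's cgroup.conf; its location depends on how Slurm was installed.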