Re: [slurm-users] Only 2 jobs will start per GPU node despite 4 GPU's being present

2020-08-13 Thread Jodie H. Sprouse
got 2 jobs currently running on each node that’s available. > > So maybe: > > NodeName=c0005 Name=gpu File=/dev/nvidia[0-3] CPUs=0-10,11-21,22-32,33-43 > > would work? > >> On Aug 7, 2020, at 12:40 PM, Jodie H. Sprouse wrote:
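
The quoted gres.conf suggestion above ties each GPU device to a block of CPUs. A minimal per-device sketch of the same idea (node name and ranges taken from the quoted line; whether the bracketed single-line form accepts a per-device CPUs list depends on the Slurm release, so the explicit form below is the safer spelling; CPUs= is the older keyword, newer releases call it Cores=):

    # /etc/slurm/gres.conf on the GPU node (sketch only)
    NodeName=c0005 Name=gpu File=/dev/nvidia0 CPUs=0-10
    NodeName=c0005 Name=gpu File=/dev/nvidia1 CPUs=11-21
    NodeName=c0005 Name=gpu File=/dev/nvidia2 CPUs=22-32
    NodeName=c0005 Name=gpu File=/dev/nvidia3 CPUs=33-43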

Re: [slurm-users] Only 2 jobs will start per GPU node despite 4 GPU's being present

2020-08-07 Thread Jodie H. Sprouse
CPUs=14-27 Name=gpu Type=tesla File=/dev/nvidia3 CPUs=14-27 to 'assign' all GPUs to the first 14 CPUs or second 14 CPUs (your config makes me think there are two 14-core CPUs, so cores 0-13 would probably be CPU1 etc.?) (What is the actual topology of the system (according to, say
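
Two quick checks can answer the topology question raised here (output formats vary by driver and Slurm version, so treat this as a sketch):

    nvidia-smi topo -m   # the "CPU Affinity" column shows which core range each GPU hangs off
    slurmd -C            # prints Sockets/CoresPerSocket/ThreadsPerCore as slurmd itself detects them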

Re: [slurm-users] Only 2 jobs will start per GPU node despite 4 GPU's being present

2020-08-07 Thread Jodie H. Sprouse
not schedule any more jobs to the GPUs. Needed to disable binding in job submission to schedule to all of them. Not sure that applies in your situation (don't know your system), but it's something to check? Tina On 07/08/2020 15:42, Jodie H. Sprouse wrote: > Good morning. > I ha
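
For reference, the binding mentioned here can be relaxed at submission time rather than in gres.conf; a hedged example, assuming a Slurm recent enough that sbatch/srun accept --gres-flags (job.sh is a placeholder):

    sbatch --gres=gpu:1 --gres-flags=disable-binding job.sh
    # disable-binding lets the job take a GPU even when the cores bound to it in gres.conf are busy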

Re: [slurm-users] Only 2 jobs will start per GPU node despite 4 GPU's being present

2020-08-07 Thread Jodie H. Sprouse
Good morning. I am having the same experience here. Wondering if you had a resolution? Thank you. Jodie On Jun 11, 2020, at 3:27 PM, Rhian Resnick <rresn...@fau.edu> wrote: We have several users submitting single GPU jobs to our cluster. We expected the jobs to fill each node and fu
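
One quick way to see whether the scheduler is actually packing GPUs onto a node is to compare configured versus allocated TRES (c0005 below is a placeholder node name):

    scontrol show node c0005 | grep -E 'CfgTRES|AllocTRES'
    # CfgTRES should report gres/gpu=4; AllocTRES shows how many GPUs running jobs currently hold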

[slurm-users] Slurm gpu vs cpu via partition, fair share and/or sos help

2020-07-20 Thread Jodie H. Sprouse
Good morning. I’m wondering if someone could point me in the right direction to fulfill a request on one of our small clusters. Cluster info: * 5 nodes with 4 gpus/28 cpus each node. * User 1 will submit only to cpus; all other 8 users will submit to gpus. * Only one account in the database with
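
One common layout for this kind of request is two overlapping partitions over the same five nodes, with a per-node core cap on the CPU-only partition so GPU jobs always have cores left over; a slurm.conf sketch (partition names, node names, and the cap of 24 of the 28 cores are assumptions, and fair-share can stay inside the single existing account):

    PartitionName=cpu Nodes=c000[1-5] MaxCPUsPerNode=24 Default=YES State=UP
    PartitionName=gpu Nodes=c000[1-5] State=UP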

Re: [slurm-users] Errors after removing partition

2019-07-26 Thread Jodie H. Sprouse
fyi… Joe is there now staining front entrance & fixing a few minor touchups, nailing baseboard in basement… Lock box is on the house now w/ key in it… On Jul 26, 2019, at 11:28 AM, Jeffrey Frey <f...@udel.edu> wrote: If you check the source code (src/slurmctld/job_mgr.c) this error is i
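
If the errors come from jobs that still reference the deleted partition, one recovery path is to point those jobs at a partition that still exists (the job id and partition name below are placeholders):

    scontrol update jobid=12345 partition=standard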

Re: [slurm-users] Slurm strigger configuration

2018-09-20 Thread Jodie H. Sprouse
Thank you to both Kilian and Chris. I now have the following running on the slurm server to report once when any of the nodes go into "Drain" state: sudo -u slurm bash -c "strigger --set -D -p /etc/slurm/triggers/slurm_admin_notify --flags=perm", with the trigger script calling /bin/mail -s "ClusterName DrainedNode:$*" our_admin_email_address
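
For completeness, the program that strigger points at could look roughly like this (a sketch only; the mail(1) invocation, path, and recipient mirror the snippet above and assume a working local mailer):

    #!/bin/bash
    # /etc/slurm/triggers/slurm_admin_notify -- strigger passes the drained node name(s) as arguments
    echo "Node(s) $* were set to drain" | /bin/mail -s "ClusterName DrainedNode:$*" our_admin_email_address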

[slurm-users] Slurm strigger configuration

2018-09-19 Thread Jodie H. Sprouse
Good morning. I’m struggling with getting strigger working correctly. My end goal sounds fairly simple: to get a mail notification if a node gets set into ‘drain’ mode. The man page for strigger states it must be run by the configured SlurmUser, which here is slurm: # scontrol show config | grep SlurmUser S
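
A minimal way to register and then verify such a trigger as the SlurmUser (assuming the notification script already exists at the path shown; PERM keeps the trigger registered after it fires instead of being one-shot):

    sudo -u slurm strigger --set --drained --program=/etc/slurm/triggers/slurm_admin_notify --flags=PERM
    sudo -u slurm strigger --get   # lists the trigger id, event type, and program that will run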