Re: [slurm-users] How to throttle sinfo/squeue/scontrol show so they don't throttle slurmctld

2020-08-17 Thread Steven Senator (slurm-dev-list)
The slurm scheduler only locks out user requests when specific data structures are locked due to modification, or potential modification. So, the most effective technique is to limit the time window when that will be happening by a combination of efficient traversal of the main scheduling loop (whe

Re: [slurm-users] How to queue jobs based on non-existent features

2020-08-14 Thread Steven Senator (slurm-dev-list)
We use a scenario that is analogous to yours using features. Features are defined in slurm.conf and are associated with nodes from which a job may be submitted, as an administratively configuration-managed, authoritative source. (NodeName=xx-login State=FUTURE AvailableFeatures=) (i.e. ={green,blue,

Re: [slurm-users] Change ExcNodeList on a running job

2020-06-04 Thread Steven Senator (slurm-dev-list)
Also consider the --no-kill ("-k") option to sbatch (and srun). The following is from the sbatch man page: -k, --no-kill [=off] Do not automatically terminate a job if one of the nodes it has been allocated fails. The user will assume the responsibilities for fau

Re: [slurm-users] Problem with permisions. CentOS 7.8

2020-05-28 Thread Steven Senator (slurm-dev-list)
What is in /var/log/munge/munged.log? Munge is quite strict about permissions in its whole hierarchy of control and configuration files, appropriately. On Thu, May 28, 2020 at 11:01 AM Rodrigo Santibáñez wrote: > > Hello, > > You could find the solution here > https://wiki.fysik.dtu.dk/niflheim/S

Re: [slurm-users] Slurm Upgrade from 17.02

2020-02-20 Thread Steven Senator (slurm-dev-list)
When upgrading to 18.08 it is prudent to add the following lines to your /etc/my.cnf, as per https://slurm.schedmd.com/accounting.html and https://slurm.schedmd.com/SLUG19/High_Throughput_Computing.pdf (slide #6): [mysqld] innodb_buffer_pool_size=1G innodb_log_file_size=64M innodb_lock_wait_timeout=90