We're running slurm-17.11.12 on Bright Cluster 8.1 and our node002 keeps
going into a draining state:
sinfo -a
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
defq*     up      infinite      1   drng node002

sinfo -N -o "%.20N %.15C %.10t %.10m %.15P %.15G %.35E"
NODELIST CPUS(A/I/
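For a node stuck in drng/drain, the recorded reason (the %E field above) usually says why. A quick way to check with standard Slurm commands, and to return the node to service once the cause is fixed:

```shell
# Show the drain reason recorded for node002
sinfo -R --nodes=node002

# Full node record, including State= and Reason=
scontrol show node node002

# After fixing the underlying issue, undrain the node
# (requires Slurm admin privileges)
scontrol update nodename=node002 state=resume
```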
Hi all,
by coincidence, I stumbled today over the troubleshooting slides
from SLUG 2019.
There, SchedMD explicitly tells us to use SIGUSR2 instead of restart /
reload / reconfig / SIGHUP.
See
https://slurm.schedmd.com/SLUG19/Troubleshooting.pdf
around slide 22 or so.
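In practice that rotation looks something like the following sketch; the log path is an assumption, so check SlurmctldLogFile in your slurm.conf. On recent Slurm versions, SIGUSR2 makes slurmctld reopen its log file without re-reading the configuration:

```shell
# Move the active log aside (path assumed; see SlurmctldLogFile in slurm.conf)
mv /var/log/slurmctld.log /var/log/slurmctld.log.1

# Ask slurmctld to reopen its log file; unlike SIGHUP or a reconfig,
# this does not reload the whole configuration
pkill -SIGUSR2 -x slurmctld
```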
Best
navin srivastava writes:
> Can I move the log file to some other location, so that a restart/reload of
> the Slurm service will start a new log file?
Yes, restarting it will start a new log file if the old one has been moved
away. However, a reconfig will also do, and you can trigger that by
sending the pro
Hi,
I wanted to understand how log rotation of slurmctld works.
In my environment I don't have any log rotation for slurmctld.log, and
the log file size has now reached 125 GB.
Can I move the log file to some other location, so that a restart/reload of
the Slurm service will start a new log file? I thi
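For a case like this 125 GB log, the usual fix is a logrotate rule rather than manual restarts. A sketch, assuming the default log path (adjust the path and schedule for your site); the postrotate signal relies on slurmctld's SIGUSR2 log-reopen support:

```
/var/log/slurmctld.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
    # SIGUSR2 makes slurmctld reopen its log file in place
    postrotate
        pkill -SIGUSR2 -x slurmctld || true
    endscript
}
```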
Hi,
I guess I found the problem.
It seems to come from this file:
src/plugins/accounting_storage/mysql/as_mysql_convert.c
in particular from here:
--- code ---
static int _convert_job_table_pre(mysql_conn_t *mysql_conn, char *cluster_name)
{
	int rc = SLURM_SUCCESS;
	char *query =