[slurm-users] Slurm - sacct: error: slurm_persist_conn_open_without_init: failed to open persistent connection to host:localhost:6819: Connection refused (Zainul Abiddin)

2021-02-02 Thread Michael Smith
A few things to check here:
* Ensure that your firewall ports are open – ports 6817/6818/6819/3306
* Make sure that munge is working correctly: $ munge -n | unmunge
* Make sure you go through the accounting web-page as well - https://slurm.schedmd.com/accounting.html
* In pa
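
The port check from the list above can be scripted. What follows is only a minimal sketch, not part of the original message: the helper names `port_is_open` and `check_ports` are hypothetical, and the port-to-service mapping assumes the stock Slurm defaults (6817 slurmctld, 6818 slurmd, 6819 slurmdbd) plus MariaDB/MySQL on 3306. A site that overrides these in slurm.conf or slurmdbd.conf would need to adjust the table.

```python
import socket

# Ports named in the message. The service labels assume default Slurm and
# MariaDB settings; they are an assumption about any particular site.
DEFAULT_PORTS = {
    6817: "slurmctld",
    6818: "slurmd",
    6819: "slurmdbd",
    3306: "MariaDB/MySQL",
}


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False


def check_ports(host, ports=DEFAULT_PORTS):
    """Map each port number to whether it accepted a TCP connection."""
    return {port: port_is_open(host, port) for port in ports}


if __name__ == "__main__":
    # "localhost" matches the host in the sacct error message.
    for port, is_open in check_ports("localhost").items():
        state = "open" if is_open else "closed or filtered"
        print(f"{port} ({DEFAULT_PORTS[port]}): {state}")
```

A refused connection on 6819 would be consistent with the sacct error in the subject line, but it does not distinguish a firewall rule from slurmdbd simply not running, so the daemon's own log is still worth checking.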

[slurm-users] Suspend/Resume, CGROUP and SIGTSTP

2021-01-29 Thread Michael Smith
I’ve set up SLURM to enable pre-emption so that high-priority jobs can take over resources from lower-priority jobs. As we use a lot of expensive EDA software, we want to get the best use of these expensive licenses. The software all uses the FlexLM license manager, and when a job is suspended

[slurm-users] error: DBD_SEND_MULT_MSG message from invalid uid 9920

2021-01-15 Thread Michael Smith
I’m new to SLURM and attempting to set up a new installation. I’ve built the 20.11.2 tools on CentOS 7, and now I’ve got the MariaDB running but the slurmdbd log file is full of: [2021-01-15T09:34:25.002] error: Processing last message from connection 10(192.168.1.16) uid(9920) [2021-01-15T09:3