You will also need to reinstall and restart slurmdbd with the updated binary.
Look in the slurmdbd logs to see what is happening there. I suspect it
had errors updating/creating the database and tables. If you have no
data in it yet, you can just DROP the database and restart slurmdbd.
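A minimal sketch of the log check suggested above, assuming the slurmdbd log uses the same bracketed-timestamp format as the slurmctld log quoted further down. The function names join_wrapped and problems are made up for illustration; they are not part of Slurm.

```python
import re

# Assumption: each log entry starts with a timestamp such as
# [2021-12-03T15:36:41.018]; lines without one continue the previous entry.
ENTRY_START = re.compile(r"^\[\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}\]\s*")


def join_wrapped(lines):
    """Merge continuation lines (no leading timestamp) into the entry before them."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def problems(lines):
    """Return only the entries logged at error or fatal level."""
    found = []
    for entry in join_wrapped(lines):
        message = ENTRY_START.sub("", entry)
        if message.startswith(("error:", "fatal:")):
            found.append(entry)
    return found
```

To use it, pass an open file object, for example problems(open("/var/log/slurmdbd.log")), where the path is the LogFile value from the slurmdbd.conf quoted below.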
Brian Andrus
On 12/3/2021 6:42 AM, Giuseppe G. A. Celano wrote:
Thanks for the answer, Brian. I have now added
--with-mysql_config=/etc/mysql/my.cnf, but the problem is still there
and now slurmctld does not work either, with the error:
[2021-12-03T15:36:41.018] accounting_storage/slurmdbd:
clusteracct_storage_p_register_ctld: Registering slurmctld at port
6817 with slurmdbd
[2021-12-03T15:36:41.019] error: _conn_readable: persistent connection
for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.019] error: _slurm_persist_recv_msg: only read
150 of 2613 bytes
[2021-12-03T15:36:41.019] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.020] error: _conn_readable: persistent connection
for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.020] error: _slurm_persist_recv_msg: only read
150 of 2613 bytes
[2021-12-03T15:36:41.020] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.020] error: _conn_readable: persistent connection
for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.020] error: _slurm_persist_recv_msg: only read
150 of 2613 bytes
[2021-12-03T15:36:41.020] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.020] error: DBD_GET_TRES failure: No error
[2021-12-03T15:36:41.021] error: _conn_readable: persistent connection
for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.021] error: _slurm_persist_recv_msg: only read 0
of 2613 bytes
[2021-12-03T15:36:41.021] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.021] error: DBD_GET_QOS failure: No error
[2021-12-03T15:36:41.021] error: _conn_readable: persistent connection
for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.021] error: _slurm_persist_recv_msg: only read
150 of 2613 bytes
[2021-12-03T15:36:41.021] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.021] error: DBD_GET_USERS failure: No error
[2021-12-03T15:36:41.022] error: _conn_readable: persistent connection
for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.022] error: _slurm_persist_recv_msg: only read 0
of 2613 bytes
[2021-12-03T15:36:41.022] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.022] error: DBD_GET_ASSOCS failure: No error
[2021-12-03T15:36:41.022] error: _conn_readable: persistent connection
for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.022] error: _slurm_persist_recv_msg: only read 0
of 2613 bytes
[2021-12-03T15:36:41.022] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.022] error: DBD_GET_RES failure: No error
[2021-12-03T15:36:41.022] fatal: You are running with a database but
for some reason we have no TRES from it. This should only happen if
the database is down and you don't have any state files.
On Thu, Dec 2, 2021 at 10:36 PM Brian Andrus <toomuc...@gmail.com> wrote:
Your Slurm needs to be built with MySQL support. If you have mysql-devel
installed it should pick it up; otherwise you can specify the
location with --with-mysql_config when you configure/build Slurm.
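One way to check whether the build picked up MySQL support is sketched below. It rests on two assumptions: that the plugin file is named accounting_storage_mysql.so (the usual Slurm plugin naming), and that the build looks for a mysql_config helper on PATH. Both function names are hypothetical.

```python
import os
import shutil


def mysql_plugin_present(plugin_dir):
    """True if the MySQL accounting plugin file exists in plugin_dir.

    Assumption: the plugin is installed as accounting_storage_mysql.so.
    """
    return os.path.isfile(os.path.join(plugin_dir, "accounting_storage_mysql.so"))


def mysql_config_path():
    """Path of the mysql_config helper found on PATH, or None if absent."""
    return shutil.which("mysql_config")
```

For example, mysql_plugin_present("/usr/lib/slurm") uses the PluginDir from the slurmdbd.conf quoted below. If it returns False after a rebuild, the MySQL development files were probably not found at configure time.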
Brian Andrus
On 12/2/2021 12:40 PM, Giuseppe G. A. Celano wrote:
Hi everyone,
I am having trouble getting /slurmdbd/ to work. This is the error
I get:
/error: Couldn't find the specified plugin name for
accounting_storage/mysql looking at all files
error: cannot find accounting_storage plugin for
accounting_storage/mysql
error: cannot create accounting_storage context for
accounting_storage/mysql
fatal: Unable to initialize accounting_storage/mysql accounting
storage plugin/
I have installed /mysql/ (/apt install mysql/) on Ubuntu 20.04.3
and followed the instructions on the slurm website
<https://slurm.schedmd.com/accounting.html>; /mysql/ is running
(/port 3306/) and these are the relevant parts in my /.conf/ files:
/slurm.conf/
# LOGGING AND ACCOUNTING
AccountingStorageHost=localhost
AccountingStoragePort=3306
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageUser=slurm
JobCompType=jobcomp/none
JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/linux
SlurmctldDebug=info
SlurmctldLogFile=/var/log/slurmctld.log
SlurmdDebug=info
SlurmdLogFile=/var/log/slurmd.log
/slurmdbd.conf/
AuthType=auth/munge
DbdAddr=localhost
DbdHost=localhost
DbdPort=3306
LogFile=/var/log/slurmdbd.log
PidFile=/var/run/slurmdbd.pid
PluginDir=/usr/lib/slurm
SlurmUser=slurm
StoragePass=password
StorageType=accounting_storage/mysql
StorageUser=slurm
StorageLoc=slurm_acct_db
I changed the port to 3306 because otherwise /slurmdbd /could not
communicate with /mysql/. If I run /sacct/, for example, I get:
/sacct: error: _slurm_persist_recv_msg: read of fd 3 failed: No error
sacct: error: _slurm_persist_recv_msg: only read 126 of 2616 bytes
sacct: error: slurm_persist_conn_open: No response to persist_init
sacct: error: Sending PersistInit msg: No error
JobID JobName Partition Account AllocCPUS State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
sacct: error: _slurm_persist_recv_msg: read of fd 3 failed: No error
sacct: error: _slurm_persist_recv_msg: only read 126 of 2616 bytes
sacct: error: Sending PersistInit msg: No error
sacct: error: DBD_GET_JOBS_COND failure: Unspecified error/
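A possible reading of the settings above, offered as an assumption rather than a confirmed diagnosis: DbdPort is the port slurmdbd itself listens on (typically 6819), while the database port is set separately with StoragePort (typically 3306). With DbdPort and AccountingStoragePort both set to 3306, sacct and slurmctld may be talking to MySQL directly rather than to slurmdbd, which would fit the short reads. The sketch below checks the two files for that pattern; parse_conf and port_warnings are hypothetical helper names, and the defaults are the commonly documented ones.

```python
def parse_conf(text):
    """Parse Key=Value lines into a dict with lower-cased keys, skipping # comments."""
    conf = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" in line:
            key, value = line.split("=", 1)
            conf[key.strip().lower()] = value.strip()
    return conf


def port_warnings(slurm_conf, slurmdbd_conf, default_dbd_port=6819, default_db_port=3306):
    """Return a list of warnings about suspicious port settings.

    Assumed defaults: slurmdbd listens on 6819, the database on 3306.
    """
    slurm = parse_conf(slurm_conf)
    dbd = parse_conf(slurmdbd_conf)
    dbd_port = int(dbd.get("dbdport", default_dbd_port))
    acct_port = int(slurm.get("accountingstorageport", default_dbd_port))
    db_port = int(dbd.get("storageport", default_db_port))

    warnings = []
    if dbd_port == db_port:
        warnings.append(
            "DbdPort (%d) is the same as the database port; "
            "slurmdbd and the database cannot share a port" % dbd_port
        )
    if acct_port != dbd_port:
        warnings.append(
            "AccountingStoragePort (%d) does not match DbdPort (%d)"
            % (acct_port, dbd_port)
        )
    return warnings
```

Run against the two files quoted above, this reports that DbdPort collides with the database port. Removing DbdPort and AccountingStoragePort so both fall back to their defaults, and adding StoragePort=3306 only if the database uses a non-default port, would be the change to try first.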
Does anyone have a suggestion to solve this problem? Thank you
very much.
Best,
Giuseppe