You will also need to reinstall and restart slurmdbd with the updated binary.
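
After a rebuild, one quick way to confirm that the MySQL plugin was actually produced is to look for it in the plugin directory. The sketch below assumes the plugin is installed as accounting_storage_mysql.so under the PluginDir from the slurmdbd.conf quoted further down (/usr/lib/slurm). The function name is only illustrative, and the install prefix on your system may differ.

```python
import os

# Assumption: Slurm plugins are normally installed as <type>_<name>.so,
# so accounting_storage/mysql would be accounting_storage_mysql.so.
PLUGIN_FILE = "accounting_storage_mysql.so"


def has_mysql_plugin(plugin_dir):
    """Return True if the MySQL accounting plugin file exists in plugin_dir."""
    return os.path.isfile(os.path.join(plugin_dir, PLUGIN_FILE))
```

If has_mysql_plugin("/usr/lib/slurm") comes back False, the build most likely still did not pick up the MySQL development files.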

Look in the slurmdbd logs to see what is happening there. I suspect it had errors updating/creating the database and tables. If you have no data in it yet, you can just DROP the database and restart slurmdbd.
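
To pull only the problem lines out of the log, something like the following may help. It is a minimal sketch that assumes slurmdbd log lines use the same "[timestamp] level: message" layout as the slurmctld output quoted below; find_problems is a made-up helper name, not part of Slurm.

```python
import re

# Matches lines such as:
# [2021-12-03T15:36:41.019] error: Sending PersistInit msg: No error
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>error|fatal):\s+(?P<msg>.*)$"
)


def find_problems(lines):
    """Return (timestamp, level, message) for every error or fatal line."""
    problems = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            problems.append((match["ts"], match["level"], match["msg"]))
    return problems
```

Feeding it the lines of /var/log/slurmdbd.log (the LogFile set in the quoted slurmdbd.conf) should surface any database or table creation errors without the informational noise.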

Brian Andrus

On 12/3/2021 6:42 AM, Giuseppe G. A. Celano wrote:
Thanks for the answer, Brian. I have now added --with-mysql_config=/etc/mysql/my.cnf, but the problem is still there, and now slurmctld does not work either. It fails with:

[2021-12-03T15:36:41.018] accounting_storage/slurmdbd: clusteracct_storage_p_register_ctld: Registering slurmctld at port 6817 with slurmdbd
[2021-12-03T15:36:41.019] error: _conn_readable: persistent connection for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.019] error: _slurm_persist_recv_msg: only read 150 of 2613 bytes
[2021-12-03T15:36:41.019] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.020] error: _conn_readable: persistent connection for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.020] error: _slurm_persist_recv_msg: only read 150 of 2613 bytes
[2021-12-03T15:36:41.020] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.020] error: _conn_readable: persistent connection for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.020] error: _slurm_persist_recv_msg: only read 150 of 2613 bytes
[2021-12-03T15:36:41.020] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.020] error: DBD_GET_TRES failure: No error
[2021-12-03T15:36:41.021] error: _conn_readable: persistent connection for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.021] error: _slurm_persist_recv_msg: only read 0 of 2613 bytes
[2021-12-03T15:36:41.021] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.021] error: DBD_GET_QOS failure: No error
[2021-12-03T15:36:41.021] error: _conn_readable: persistent connection for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.021] error: _slurm_persist_recv_msg: only read 150 of 2613 bytes
[2021-12-03T15:36:41.021] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.021] error: DBD_GET_USERS failure: No error
[2021-12-03T15:36:41.022] error: _conn_readable: persistent connection for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.022] error: _slurm_persist_recv_msg: only read 0 of 2613 bytes
[2021-12-03T15:36:41.022] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.022] error: DBD_GET_ASSOCS failure: No error
[2021-12-03T15:36:41.022] error: _conn_readable: persistent connection for fd 9 experienced error[104]: Connection reset by peer
[2021-12-03T15:36:41.022] error: _slurm_persist_recv_msg: only read 0 of 2613 bytes
[2021-12-03T15:36:41.022] error: Sending PersistInit msg: No error
[2021-12-03T15:36:41.022] error: DBD_GET_RES failure: No error
[2021-12-03T15:36:41.022] fatal: You are running with a database but for some reason we have no TRES from it.  This should only happen if the database is down and you don't have any state files.
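
The resets and short reads above could mean that slurmctld is not actually reaching slurmdbd on the configured port. According to the slurmdbd.conf man page, DbdPort is the port slurmdbd itself listens on (default 6819), while StoragePort is the port of the MySQL server (default 3306); AccountingStoragePort in slurm.conf should then point at DbdPort. The configs quoted below set both DbdPort and AccountingStoragePort to 3306, which MySQL is already using. The following sketch flags that kind of clash. Its helper names are made up, and the two defaults are hard-coded as assumptions.

```python
# Assumed defaults, taken from the Slurm man pages.
MYSQL_PORT = "3306"
DEFAULT_DBD_PORT = "6819"


def parse_conf(text):
    """Parse 'Key=Value' lines, skipping blanks and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip()
    return settings


def check_ports(slurm_conf, slurmdbd_conf):
    """Return a list of warnings about suspicious port settings."""
    ctld = parse_conf(slurm_conf)
    dbd = parse_conf(slurmdbd_conf)
    dbd_port = dbd.get("dbdport", DEFAULT_DBD_PORT)
    storage_port = dbd.get("storageport", MYSQL_PORT)
    acct_port = ctld.get("accountingstorageport", DEFAULT_DBD_PORT)

    warnings = []
    if dbd_port == storage_port:
        warnings.append(
            "DbdPort equals the database port (%s); slurmdbd and MySQL "
            "cannot both listen there" % dbd_port
        )
    if acct_port != dbd_port:
        warnings.append(
            "AccountingStoragePort (%s) does not match DbdPort (%s)"
            % (acct_port, dbd_port)
        )
    return warnings
```

Run against the two quoted config excerpts, this should report the DbdPort clash; with DbdPort and AccountingStoragePort both left at 6819 it should report nothing.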



On Thu, Dec 2, 2021 at 10:36 PM Brian Andrus <toomuc...@gmail.com> wrote:


    Your slurm needs to be built with MySQL support. If you have
    mysql-devel installed it should pick it up; otherwise you can
    specify the location with --with-mysql_config when you
    configure/build slurm.

    Brian Andrus

    On 12/2/2021 12:40 PM, Giuseppe G. A. Celano wrote:
    Hi everyone,

    I am having trouble getting slurmdbd to work. This is the error
    I get:

    error: Couldn't find the specified plugin name for accounting_storage/mysql looking at all files
    error: cannot find accounting_storage plugin for accounting_storage/mysql
    error: cannot create accounting_storage context for accounting_storage/mysql
    fatal: Unable to initialize accounting_storage/mysql accounting storage plugin

    I have installed mysql (apt install mysql) on Ubuntu 20.04.3
    and followed the instructions on the slurm website
    <https://slurm.schedmd.com/accounting.html>; mysql is running
    (port 3306) and these are the relevant parts of my .conf files:

    slurm.conf

    # LOGGING AND ACCOUNTING
    AccountingStorageHost=localhost
    AccountingStoragePort=3306
    AccountingStorageType=accounting_storage/slurmdbd
    AccountingStorageUser=slurm
    JobCompType=jobcomp/none
    JobAcctGatherFrequency=30
    JobAcctGatherType=jobacct_gather/linux
    SlurmctldDebug=info
    SlurmctldLogFile=/var/log/slurmctld.log
    SlurmdDebug=info
    SlurmdLogFile=/var/log/slurmd.log

    slurmdbd.conf

    AuthType=auth/munge
    DbdAddr=localhost
    DbdHost=localhost
    DbdPort=3306
    LogFile=/var/log/slurmdbd.log
    PidFile=/var/run/slurmdbd.pid
    PluginDir=/usr/lib/slurm
    SlurmUser=slurm
    StoragePass=password
    StorageType=accounting_storage/mysql
    StorageUser=slurm
    StorageLoc=slurm_acct_db

    I changed the port to 3306 because otherwise slurmdbd could not
    communicate with mysql. If I run sacct, for example, I get:

    sacct: error: _slurm_persist_recv_msg: read of fd 3 failed: No error
    sacct: error: _slurm_persist_recv_msg: only read 126 of 2616 bytes
    sacct: error: slurm_persist_conn_open: No response to persist_init
    sacct: error: Sending PersistInit msg: No error
    JobID           JobName  Partition    Account  AllocCPUS      State ExitCode
    ------------ ---------- ---------- ---------- ---------- ---------- --------
    sacct: error: _slurm_persist_recv_msg: read of fd 3 failed: No error
    sacct: error: _slurm_persist_recv_msg: only read 126 of 2616 bytes
    sacct: error: Sending PersistInit msg: No error
    sacct: error: DBD_GET_JOBS_COND failure: Unspecified error
    Does anyone have a suggestion to solve this problem? Thank you
    very much.

    Best,
    Giuseppe
