That SIGTERM message means something is telling slurmdbd to quit.

Check your cron jobs, maintenance scripts, etc. Slurmdbd is being told to shut down. If you are running it in the foreground, a ^C does that. If you run kill or killall on it, you will get that same message.
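
One sender that is easy to overlook is systemd itself: if the PIDFile= in the service unit does not match the PidFile= in slurmdbd.conf, systemd may give up waiting for the daemon to start and stop it, which would also produce the terminate message. This is only a possibility, not something the log in this thread confirms. A minimal sketch for comparing the two values follows; the function names and both file paths are assumptions and vary by distribution, and the lookup assumes the keys are spelled exactly PidFile and PIDFile with no trailing comments.

```shell
#!/bin/sh
# Sketch: compare the PID file path slurmdbd writes with the one systemd expects.
# Function names and the example paths below are hypothetical.

# Print the value of the first uncommented KEY=value line in FILE.
get_value() {
    key=$1
    file=$2
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" | head -n 1
}

# Compare PidFile (slurmdbd.conf) against PIDFile (systemd unit file).
compare_pidfiles() {
    conf=$1
    unit=$2
    conf_pid=$(get_value PidFile "$conf")
    unit_pid=$(get_value PIDFile "$unit")
    if [ "$conf_pid" = "$unit_pid" ]; then
        echo "match: $conf_pid"
    else
        echo "mismatch: slurmdbd.conf=$conf_pid unit=$unit_pid"
    fi
}

# Example call (paths are assumptions, adjust for your system):
# compare_pidfiles /etc/slurm-llnl/slurmdbd.conf /lib/systemd/system/slurmdbd.service
```

If the two paths differ, changing one of them to match the other and reloading systemd is a cheap thing to rule out before hunting through cron jobs.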

Brian Andrus

On 5/30/2024 6:53 AM, Radhouane Aniba via slurm-users wrote:
Yes, I can connect to my database using mysql --user=slurm --password=slurmdbpass slurm_acct_db, and after checking, there is no firewall blocking mysql.
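
To make sure a manual test uses the same parameters slurmdbd reads, the connection command can be derived from slurmdbd.conf instead of being typed by hand. A minimal sketch, assuming the Storage* keys are written as KEY=value with the spelling used in the configuration quoted further down; the function names are hypothetical and the path to slurmdbd.conf is passed in as an argument:

```shell
#!/bin/sh
# Sketch: build a mysql client command from the Storage* settings in slurmdbd.conf.
# Function names are hypothetical helpers, not part of Slurm.

# Print the value of the first uncommented KEY=value line in FILE.
conf_get() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" | head -n 1
}

# Print (not run) a mysql command that uses the configured host, port, user and database.
mysql_test_command() {
    conf=$1
    host=$(conf_get StorageHost "$conf")
    port=$(conf_get StoragePort "$conf")
    user=$(conf_get StorageUser "$conf")
    db=$(conf_get StorageLoc "$conf")
    # --password without a value makes the client prompt for it,
    # which keeps the password out of the shell history.
    echo "mysql --host=$host --port=$port --user=$user --password $db"
}
```

Running the printed command and entering the StoragePass value at the prompt exercises the same host, port, user and database that slurmdbd is configured with.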

Also, here is the output of slurmdbd -D -vvv (note that I can only run this with sudo):

sudo slurmdbd -D -vvv
slurmdbd: debug: Log file re-opened
slurmdbd: debug: Munge authentication plugin loaded
slurmdbd: debug2: mysql_connect() called for db slurm_acct_db
slurmdbd: debug2: Attempting to connect to localhost:3306
slurmdbd: debug2: innodb_buffer_pool_size: 134217728
slurmdbd: debug2: innodb_log_file_size: 50331648
slurmdbd: debug2: innodb_lock_wait_timeout: 50
slurmdbd: error: Database settings not recommended values: innodb_buffer_pool_size innodb_lock_wait_timeout
slurmdbd: Accounting storage MYSQL plugin loaded
slurmdbd: debug2: ArchiveDir = /tmp
slurmdbd: debug2: ArchiveScript = (null)
slurmdbd: debug2: AuthAltTypes = (null)
slurmdbd: debug2: AuthInfo = (null)
slurmdbd: debug2: AuthType = auth/munge
slurmdbd: debug2: CommitDelay = 0
slurmdbd: debug2: DbdAddr = localhost
slurmdbd: debug2: DbdBackupHost = (null)
slurmdbd: debug2: DbdHost = hannibal-hn
slurmdbd: debug2: DbdPort = 7032
slurmdbd: debug2: DebugFlags = (null)
slurmdbd: debug2: DebugLevel = 6
slurmdbd: debug2: DebugLevelSyslog = 10
slurmdbd: debug2: DefaultQOS = (null)
slurmdbd: debug2: LogFile = /var/log/slurmdbd.log
slurmdbd: debug2: MessageTimeout = 100
slurmdbd: debug2: Parameters = (null)
slurmdbd: debug2: PidFile = /run/slurmdbd.pid
slurmdbd: debug2: PluginDir = /usr/lib/x86_64-linux-gnu/slurm-wlm
slurmdbd: debug2: PrivateData = none
slurmdbd: debug2: PurgeEventAfter = 1 months*
slurmdbd: debug2: PurgeJobAfter = 12 months*
slurmdbd: debug2: PurgeResvAfter = 1 months*
slurmdbd: debug2: PurgeStepAfter = 1 months
slurmdbd: debug2: PurgeSuspendAfter = 1 months
slurmdbd: debug2: PurgeTXNAfter = 12 months
slurmdbd: debug2: PurgeUsageAfter = 24 months
slurmdbd: debug2: SlurmUser = root(0)
slurmdbd: debug2: StorageBackupHost = (null)
slurmdbd: debug2: StorageHost = localhost
slurmdbd: debug2: StorageLoc = slurm_acct_db
slurmdbd: debug2: StoragePort = 3306
slurmdbd: debug2: StorageType = accounting_storage/mysql
slurmdbd: debug2: StorageUser = slurm
slurmdbd: debug2: TCPTimeout = 2
slurmdbd: debug2: TrackWCKey = 0
slurmdbd: debug2: TrackSlurmctldDown= 0
slurmdbd: debug2: acct_storage_p_get_connection: request new connection 1
slurmdbd: debug2: Attempting to connect to localhost:3306
slurmdbd: slurmdbd version 19.05.5 started
slurmdbd: debug2: running rollup at Thu May 30 13:50:08 2024
slurmdbd: debug2: Everything rolled up


It goes on like this for some time and then it crashes with this message:

slurmdbd: Terminate signal (SIGINT or SIGTERM) received
slurmdbd: debug: rpc_mgr shutting down
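
To put a number on "some time", the start and terminate entries in the timestamped log file (LogFile = /var/log/slurmdbd.log in the output above) can be pulled out and compared against cron schedules or service timeouts. A minimal sketch, assuming each entry sits on a single line in the file and begins with a bracketed timestamp; the function name is hypothetical:

```shell
#!/bin/sh
# Sketch: list start and terminate events from a slurmdbd log file.
# signal_timeline is a hypothetical helper name.

# Print "<timestamp> started" or "<timestamp> terminated" for each matching line in the log.
signal_timeline() {
    awk '
        /slurmdbd version .* started/ { print substr($1, 2, length($1) - 2), "started" }
        /Terminate signal/            { print substr($1, 2, length($1) - 2), "terminated" }
    ' "$1"
}

# Example call:
# signal_timeline /var/log/slurmdbd.log
```

In the log quoted further down in this thread, the two entries are stamped 20:51:30.090 and 20:51:49.673, so the daemon was stopped a little under 20 seconds after it started.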


On Thu, May 30, 2024 at 8:18 AM mercan <ahmet.mer...@uhem.itu.edu.tr> wrote:

    Did you try to connect to the database using the mysql command?

    mysql --user=slurm --password=slurmdbpass slurm_acct_db

    C. Ahmet Mercan

    On 30.05.2024 14:48, Radhouane Aniba via slurm-users wrote:
    Thank you Ahmet,
    I don't have a firewall active.
    Because slurmdbd cannot connect to the database, I am not able to
    get it activated through systemctl. I will share the output of
    slurmdbd -D -vvv shortly, but overall it always says it is trying
    to connect to the db, then retries a couple of times and crashes.

    R.




    On Thu, May 30, 2024 at 2:51 AM mercan
    <ahmet.mer...@uhem.itu.edu.tr> wrote:

        Hi;

        Did you check whether you can connect to the db with your conf
        parameters from the head node:

        mysql --user=slurm --password=slurmdbpass slurm_acct_db

        Also, check and stop firewall and selinux, if they are running.

        Last, you can stop slurmdbd, then run it in a terminal with:

        slurmdbd -D -vvv

        Regards;

        C. Ahmet Mercan

        On 30.05.2024 00:05, Radhouane Aniba via slurm-users wrote:
        Hi everyone
        I am trying to get slurmdbd to run on my local home server
        but I am really struggling.
        Note: I am a novice Slurm user.
        My slurmdbd always times out even though all the details in
        the conf file are correct.

        My log looks like this

        [2024-05-29T20:51:30.088] Accounting storage MYSQL plugin loaded
        [2024-05-29T20:51:30.088] debug2: ArchiveDir = /tmp
        [2024-05-29T20:51:30.088] debug2: ArchiveScript = (null)
        [2024-05-29T20:51:30.088] debug2: AuthAltTypes = (null)
        [2024-05-29T20:51:30.088] debug2: AuthInfo = (null)
        [2024-05-29T20:51:30.088] debug2: AuthType = auth/munge
        [2024-05-29T20:51:30.088] debug2: CommitDelay = 0
        [2024-05-29T20:51:30.088] debug2: DbdAddr = localhost
        [2024-05-29T20:51:30.088] debug2: DbdBackupHost = (null)
        [2024-05-29T20:51:30.088] debug2: DbdHost = head-node
        [2024-05-29T20:51:30.088] debug2: DbdPort = 7032
        [2024-05-29T20:51:30.088] debug2: DebugFlags = (null)
        [2024-05-29T20:51:30.088] debug2: DebugLevel = 6
        [2024-05-29T20:51:30.088] debug2: DebugLevelSyslog = 10
        [2024-05-29T20:51:30.088] debug2: DefaultQOS = (null)
        [2024-05-29T20:51:30.088] debug2: LogFile = /var/log/slurmdbd.log
        [2024-05-29T20:51:30.088] debug2: MessageTimeout = 100
        [2024-05-29T20:51:30.088] debug2: Parameters = (null)
        [2024-05-29T20:51:30.088] debug2: PidFile = /run/slurmdbd.pid
        [2024-05-29T20:51:30.088] debug2: PluginDir = /usr/lib/x86_64-linux-gnu/slurm-wlm
        [2024-05-29T20:51:30.088] debug2: PrivateData = none
        [2024-05-29T20:51:30.088] debug2: PurgeEventAfter = 1 months*
        [2024-05-29T20:51:30.088] debug2: PurgeJobAfter = 12 months*
        [2024-05-29T20:51:30.088] debug2: PurgeResvAfter = 1 months*
        [2024-05-29T20:51:30.088] debug2: PurgeStepAfter = 1 months
        [2024-05-29T20:51:30.088] debug2: PurgeSuspendAfter = 1 months
        [2024-05-29T20:51:30.088] debug2: PurgeTXNAfter = 12 months
        [2024-05-29T20:51:30.088] debug2: PurgeUsageAfter = 24 months
        [2024-05-29T20:51:30.088] debug2: SlurmUser = root(0)
        [2024-05-29T20:51:30.089] debug2: StorageBackupHost = (null)
        [2024-05-29T20:51:30.089] debug2: StorageHost = localhost
        [2024-05-29T20:51:30.089] debug2: StorageLoc = slurm_acct_db
        [2024-05-29T20:51:30.089] debug2: StoragePort = 3306
        [2024-05-29T20:51:30.089] debug2: StorageType = accounting_storage/mysql
        [2024-05-29T20:51:30.089] debug2: StorageUser = slurm
        [2024-05-29T20:51:30.089] debug2: TCPTimeout = 2
        [2024-05-29T20:51:30.089] debug2: TrackWCKey = 0
        [2024-05-29T20:51:30.089] debug2: TrackSlurmctldDown= 0
        [2024-05-29T20:51:30.089] debug2: acct_storage_p_get_connection: request new connection 1
        [2024-05-29T20:51:30.089] debug2: Attempting to connect to localhost:3306
        [2024-05-29T20:51:30.090] slurmdbd version 19.05.5 started
        [2024-05-29T20:51:30.090] debug2: running rollup at Wed May 29 20:51:30 2024
        [2024-05-29T20:51:30.091] debug2: Everything rolled up
        [2024-05-29T20:51:49.673] Terminate signal (SIGINT or SIGTERM) received
        [2024-05-29T20:51:49.673] debug: rpc_mgr shutting down



        My config file looks like this:

        ArchiveEvents=yes
        ArchiveJobs=yes
        ArchiveResvs=yes
        ArchiveSteps=no
        ArchiveSuspend=no
        ArchiveTXN=no
        ArchiveUsage=no
        PurgeEventAfter=1month
        PurgeJobAfter=12month
        PurgeResvAfter=1month
        PurgeStepAfter=1month
        PurgeSuspendAfter=1month
        PurgeTXNAfter=12month
        PurgeUsageAfter=24month
        # Authentication info
        AuthType=auth/munge
        # slurmDBD info
        DbdAddr=localhost
        DbdHost=head-node
        DbdPort=7032
        SlurmUser=root
        MessageTimeout=100
        DebugLevel=5
        #DefaultQOS=normal,standby
        LogFile=/var/log/slurmdbd.log
        PidFile=/run/slurmdbd.pid
        #PrivateData=accounts,users,usage,jobs
        #TrackWCKey=yes
        #
        # Database info
        StorageType=accounting_storage/mysql
        StorageHost=localhost
        StoragePort=3306
        StoragePass=slurmdbpass
        StorageUser=slurm
        StorageLoc=slurm_acct_db
        I used standard names and passwords to get started and I
        will change them later.

        But every time I try to start slurmdbd.service it crashes and
        leaves the log that I shared above.

        I use these versions

        slurmdbd -V
        slurm-wlm 19.05.5
        mysql Ver 15.1 Distrib 10.3.39-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2
        Everything else is working properly except that I cannot get
        slurmdbd to work, and at this point I have exhausted everything
        I could think of to try :) Looking for some expert insights :)


        Any idea what I am doing wrong here? Also, I didn't compile
        any Slurm package. I used the binaries from the apt repos.

        Any help will be appreciated

        Cheers

        Rad

--
*Rad Aniba, PhD*

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
