Hi David,

I'd recommend the following, which I learned from bad experiences upgrading 
across the last major version.
        
1. Consider upgrading to mysql-server 5.5 or greater

2. Purge/archive unneeded jobs/steps before the upgrade, to make the upgrade as 
quick as possible:

slurmdbd.conf:

ArchiveDir=/common/adm/slurmdb_archive
ArchiveEvents=yes
ArchiveJobs=yes
ArchiveSteps=no
ArchiveResvs=no
ArchiveSuspend=no
PurgeEventAfter=1month
PurgeJobAfter=6months
PurgeResvAfter=2month
PurgeStepAfter=6months
PurgeSuspendAfter=2month


3. Take a fresh mysql dump after the archives occur:

mysqldump --all-databases > slurm_db.sql
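
Before relying on that dump, it may be worth a quick sanity check that it 
finished. A minimal sketch, assuming the dump was written with mysqldump's 
default comments enabled (so the file ends with a "-- Dump completed" line); 
dump_looks_complete is a hypothetical helper name, not part of MySQL or Slurm:

```shell
# Hypothetical helper: succeed only if the dump file exists, is non-empty,
# and carries mysqldump's trailing completion comment.
dump_looks_complete() {
    dump_file="$1"
    # -s: file exists and has a size greater than zero.
    [ -s "$dump_file" ] || return 1
    # mysqldump ends its output with "-- Dump completed ..." unless comments
    # were disabled (e.g. --skip-comments or --compact).
    tail -n 5 "$dump_file" | grep -q -e '^-- Dump completed'
}
```

Usage would look something like:

dump_looks_complete slurm_db.sql && echo "dump looks complete"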


4. Test the upgrade on another machine or VM that is representative of your 
environment (same rpms, configs, etc). Just take your newly created dump 
from production and load it into the test system:

mysql -u root < slurm_db.sql
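
Since an --all-databases dump carries every database on the server, it can 
help to confirm what is in the file before loading it on the test system. A 
rough sketch, assuming the accounting database uses Slurm's default name 
slurm_acct_db (check StorageLoc in your slurmdbd.conf); both function names 
are hypothetical:

```shell
# Hypothetical helper: print the name of each database the dump creates,
# taken from the "CREATE DATABASE ... `name` ..." lines mysqldump emits
# when run with --all-databases or --databases.
list_dump_databases() {
    sed -n 's/^CREATE DATABASE .*`\([^`]*\)`.*/\1/p' "$1"
}

# Hypothetical helper: succeed if the dump creates the named database.
dump_has_database() {
    list_dump_databases "$1" | grep -qxF -e "$2"
}
```

For example:

dump_has_database slurm_db.sql slurm_acct_db || echo "accounting db missing"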


Once you take care of any connection issues in mysql (allowing a different 
host to connect), you can fire up slurmdbd to perform the upgrade and see how 
long it takes and what hiccups you run into. Then you know, and can plan 
your maintenance window accordingly.

Hope that helps! Good luck!

Best,
Chris

—
Christopher Coffey
High-Performance Computing
Northern Arizona University
928-523-1167
 

On 9/26/18, 8:57 AM, "slurm-users on behalf of Baker D.J." 
<slurm-users-boun...@lists.schedmd.com on behalf of d.j.ba...@soton.ac.uk> 
wrote:

    Thank you for your reply. You're correct, the systemd commands aren't 
invoked, however upgrading the slurm rpm effectively pulls the rug from under 
/usr/sbin/slurmctld. The v17.02 slurm rpm provides /usr/sbin/slurmctld,
     but from v17.11 that executable is provided by the slurm-slurmctld rpm. 
    
    
    In other words, doing a minimal install of just the slurm and the slurmdbd 
rpms deletes the slurmctld executable. I haven't explicitly tested this, 
however I tested the upgrade on a compute node and experimented with
     the slurmd -- the logic should be the same. 
    
    
    I guess the question that comes to mind is: is it really a big deal 
if the slurmctld process is down whilst the slurmdbd is being upgraded? Bearing 
in mind that I will probably opt to suspend all running jobs and stop
     the partitions during the upgrade.
    
    
    Best regards,
    David
    
    ________________________________________
    From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of 
Chris Samuel <ch...@csamuel.org>
    Sent: 26 September 2018 11:26
    To: slurm-users@lists.schedmd.com
    Subject: Re: [slurm-users] Upgrading a slurm on a cluster, 17.02 --> 18.08 
    
    On Tuesday, 25 September 2018 11:54:31 PM AEST Baker D. J.  wrote:
    
    > That will certainly work, however the slurmctld (or in the case of my test
    > node, the slurmd) will be killed. The logic is that at v17.02 the slurm 
rpm
    > provides slurmctld and slurmd. So upgrading that rpm will destroy/kill the
    > existing slurmctld or slurmd processes.
    
    If you do that with the --noscripts then will it really kill the process?  
    Nothing should invoke the systemd commands with that, should it?  Or do you 
    mean taking the libraries, etc., out from underneath the running process 
    will cause it to crash?
    
    Might be worth testing that on a VM to see if it will happen.
    
    Best of luck!
    Chris
    -- 
     Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
    
    
    
    
    
    
    
    
    
