From: slurm-users On Behalf Of John
DeSantis
Sent: 18 May 2022 15:39
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] SLURM upgrade from 20.11.3 to 20.11.9
misidentification of job steps
Hello,
It also appears that random jobs are being identified as using too much memory,
d
work on Monday.
Hello,
It also appears that random jobs are being identified as using too much memory,
despite being well within limits.
For example, a job is running that requested 2048 MB per CPU and all processes
are within the limit. But, the job is identified as being over limit when it
isn't. Please
Hello,
Due to the recent CVE posted by Tim, we did upgrade from SLURM 20.11.3 to
20.11.9.
Today, I received a ticket from a user with their output files populated with the
"slurmstepd: error: Exceeded job memory limit" message. But, the jobs are
still running and it seems that the controller
Slurmdbd has an issue and from the logs is still trying to load the old
version:
[2021-01-22T14:17:18.430] MySQL server version is: 5.5.68-MariaDB
[2021-01-22T14:17:18.433] error: Database settings not recommended values:
innodb_buffer_pool_size innodb_log_file_size innodb_lock_wait_timeout
[2021-0
On 24/12/20 6:24 am, Paul Edmon wrote:
We then have a test cluster that we install the release on and run a few
test jobs to make sure things are working, usually MPI jobs as they tend
to hit most of the features of the scheduler.
One thing I meant to mention last night was that we use Reframe
We are the same way, though we tend to keep pace with minor releases.
We typically wait until the .1 release of a new major release before
considering upgrade so that many of the bugs are worked out. We then
have a test cluster that we install the release on and run a few test jobs
to make sure
On Friday, 18 December 2020 10:10:19 AM PST Jason Simms wrote:
> Thanks to several helpful members on this list, I think I have a much better
> handle on how to upgrade Slurm. Now my question is, do most of you upgrade
> with each major release?
We do, though not immediately and not without a deg
Hi Jason,
Ultimately each site decides how/why to do it; in my case I tend to do big
"forklift upgrades", so I'm running 18.08 on the current cluster and will
go to latest SLURM for my next cluster build. But you may have good
reasons to upgrade slurm more often on your existing cluster. I don't
Hello all,
Thanks to several helpful members on this list, I think I have a much
better handle on how to upgrade Slurm. Now my question is, do most of you
upgrade with each major release?
I recognize that, normally, if something is working well, then don't
upgrade it! In our case, we're running 2
On 11/5/20 7:14 AM, navin srivastava wrote:
Thank you all for the response.
but my question here is
I have already built a new server slurm 20.2 with the latest DB. my
question is, shall i do a mysqldump into this server from existing server
running with version slurm version 17.11.8 and the
Hi Navin,
On 11/4/20 10:14 pm, navin srivastava wrote:
I have already built a new server slurm 20.2 with the latest DB. my
question is, shall i do a mysqldump into this server from existing
server running with version slurm version 17.11.8
This won't work - you must upgrade your 17.11 datab
Thank you all for the response.
but my question here is
I have already built a new server slurm 20.2 with the latest DB. my
question is, shall i do a mysqldump into this server from existing server
running with version slurm version 17.11.8 and then i will upgrade all
client with 20.x followed b
On 11/2/20 2:25 PM, navin srivastava wrote:
Currently we are running slurm version 17.11.x and wanted to move to 20.x.
We are building the New server with Slurm 20.2 version and planning to
upgrade the client nodes from 17.x to 20.x.
wanted to check if we can upgrade the Client from 17.x to 2
We have hit this when we naively ran using the service and it timed out
and borked the database. Fortunately we had a backup to go back to.
Since then we have run it straight from the command line. Like yours
our production DB is now 23 GB for 6 months worth of data so major
schema updates t
On 11/2/20 7:31 am, Paul Edmon wrote:
e. Run slurmdbd -Dv to do the database upgrade. Depending on the
upgrade this can take a while because of database schema changes.
I'd like to emphasise the importance of doing the DB upgrade in this way,
do not use systemctl for this as if systemd run
We haven't really had MPI ugliness with the latest versions. Plus we've
been rolling our own PMIx and building against that which seems to have
solved most of the cross compatibility issues.
-Paul Edmon-
On 11/2/2020 10:38 AM, Fulcomer, Samuel wrote:
Our strategy is a bit simpler. We're migrat
Our strategy is a bit simpler. We're migrating compute nodes to a new
cluster running 20.x. This isn't an upgrade. We'll keep the old slurmdbd
running for at least enough time to suck the remaining accounting data into
XDMoD.
The old cluster will keep running jobs until there are no more to run.
W
We don't follow the recommended procedure here but rather build RPMs and
upgrade using those. We haven't had any issues. Here is our procedure:
1. Build rpms from source using a version of the slurm.spec file that we
maintain. It's the version SchedMD provides but modified with some
specific
Hello all,
I am going to reveal the degree of my inexperience here, but am I perhaps
the only one who thinks that Slurm's upgrade procedure is too complex? Or,
at least maybe not explained in enough detail?
I'm running a CentOS 8 cluster, and to me, I should be able simply to
update the Slurm pac
In general I would follow this:
https://slurm.schedmd.com/quickstart_admin.html#upgrade
Namely:
Almost every new major release of Slurm (e.g. 19.05.x to 20.02.x)
involves changes to the state files with new data structures, new
options, etc. Slurm permits upgrades to a new major release from
We're doing something similar. We're continuing to run production on 17.x
and have set up a new server/cluster running 20.x for testing and MPI app
rebuilds.
Our plan had been to add recently purchased nodes to the new cluster, and
at some point turn off submission on the old cluster and switch e
From: slurm-users on behalf of
Christopher J Cawley
Sent: Monday, November 2, 2020 8:33 AM
To: Slurm User Community List
Subject: Re: [slurm-users] Slurm Upgrade
I do not think so.
In any case, make sure that you stop services
and make a backup of the database
...@gmu.edu
From: slurm-users on behalf of navin
srivastava
Sent: Monday, November 2, 2020 8:25 AM
To: Slurm User Community List
Subject: [slurm-users] Slurm Upgrade
Dear All,
Currently we are running slurm version 17.11.x and wanted to move to 20.x.
We are
Dear All,
Currently we are running slurm version 17.11.x and wanted to move to 20.x.
We are building the New server with Slurm 20.2 version and planning to
upgrade the client nodes from 17.x to 20.x.
wanted to check if we can upgrade the Client from 17.x to 20.x directly or
we need to go through
When upgrading to 18.08 it is prudent to add following lines into your
/etc/my.cnf as per
https://slurm.schedmd.com/accounting.html
https://slurm.schedmd.com/SLUG19/High_Throughput_Computing.pdf (slide #6)
[mysqld]
innodb_buffer_pool_size=1G
innodb_log_file_size=64M
innodb_lock_wait_timeout=90
done.
Regards,
Ricardo Gregorio
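For anyone who wants to check an existing my.cnf against the three values quoted above before starting the upgrade, here is a minimal Python sketch. It assumes the quoted values are to be treated as minimums, and that the [mysqld] section can be read with Python's configparser; the function name settings_below_recommended is made up for this example and is not part of Slurm or MariaDB.

```python
import configparser

# Values quoted in the message above; treating them as minimums is an assumption.
RECOMMENDED = {
    "innodb_buffer_pool_size": "1G",
    "innodb_log_file_size": "64M",
    "innodb_lock_wait_timeout": "90",
}

UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def to_number(value):
    """Convert '64M', '1G' or '90' to a plain integer."""
    value = value.strip().upper()
    if value and value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)


def settings_below_recommended(cnf_text):
    """Return the names of settings that are missing or below the quoted values."""
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    parser.read_string(cnf_text)
    section = parser["mysqld"] if parser.has_section("mysqld") else {}
    low = []
    for key, wanted in RECOMMENDED.items():
        actual = section.get(key)
        if actual is None or to_number(actual) < to_number(wanted):
            low.append(key)
    return low


if __name__ == "__main__":
    sample = "[mysqld]\ninnodb_buffer_pool_size=1G\ninnodb_log_file_size=32M\n"
    print(settings_below_recommended(sample))
```

The sketch only reads text passed to it; a real my.cnf with include directives or settings spread over several files would need more handling than this.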
-Original Message-
From: slurm-users On Behalf Of Ole Holm
Nielsen
Sent: 19 February 2020 14:41
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] Slurm Upgrade from 17.02
On 2/19/20 3:10 PM, Ricardo Gregorio wrote:
> I am putting together
Hi Ricardo,
If I remember right, you can only upgrade two versions further. So you
WILL have to upgrade to 18.08, even if you want to use 19.05 or the
coming 20.02
17.02 -> 17.11 -> 18.08 -> 19.05 -> 20.02
  ^                 ^
  |                 |
  |- you are here   |- "farthest jump" to a ne
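The two-versions-at-a-time rule described above can be sketched in a few lines of Python. This is only an illustration: the release list is the one from the diagram, and plan_upgrade is a hypothetical helper, not something shipped with Slurm. It takes the largest allowed jump at each step, which is one valid plan among several.

```python
# Major releases as listed in the diagram above, oldest first.
RELEASES = ["17.02", "17.11", "18.08", "19.05", "20.02"]

# Per the thread: an upgrade may skip at most to the second release ahead.
MAX_JUMP = 2


def plan_upgrade(current, target, releases=RELEASES, max_jump=MAX_JUMP):
    """Return the list of releases to install, starting with the current one."""
    start = releases.index(current)
    end = releases.index(target)
    if end < start:
        raise ValueError("target release is older than the current one")
    path = [current]
    pos = start
    while pos < end:
        # Jump as far as the rule allows without overshooting the target.
        pos = min(pos + max_jump, end)
        path.append(releases[pos])
    return path


if __name__ == "__main__":
    print(plan_upgrade("17.02", "19.05"))
    print(plan_upgrade("17.02", "20.02"))
```

For a site on 17.02 this gives 18.08 as the mandatory intermediate stop, matching the diagram.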
On 19/2/20 6:10 am, Ricardo Gregorio wrote:
I am putting together an upgrade plan for slurm on our HPC. We are
currently running old version 17.02.11. Would you guys advise us
upgrading to 18.08 or 19.05?
Slurm versions only support upgrading from 2 major versions back, so you
could only upg
On 2/19/20 3:10 PM, Ricardo Gregorio wrote:
I am putting together an upgrade plan for slurm on our HPC. We are
currently running old version 17.02.11. Would you guys advise us upgrading
to 18.08 or 19.05?
You should be able to upgrade 2 Slurm major versions in one step. The
18.08 version is
hi all,
I am putting together an upgrade plan for slurm on our HPC. We are currently
running old version 17.02.11. Would you guys advise us upgrading to 18.08 or
19.05?
I understand we will have to also upgrade the version of mariadb from 5.5 to
10.X and pay attention to 'long db upgrade from
Hi all,
After the upgrade of our cluster from slurm 17.11.12 to 19.05.2 we started
noticing that jobs above ~ 10 nodes start failing with:
[2019-09-23T16:51:34.310] debug: Checking credential with 640 bytes of sig
data
[2019-09-23T16:51:34.311] error: Credential signature check: Credential
data s
33 matches