Hi Lech,
Thanks! I added the 18.08 Release Notes reference to
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#database-upgrade-from-slurm-17-02-and-older
I've already upgraded from 17.11 to 18.08 without your patch, and this
went smoothly as expected. We upgraded from 17.02 to 17.11 l
Hi Ole,
your summary is correct as far as I can tell and will hopefully help some users.
One thing I’d add is the remark from the 18.08 Release Notes (
https://github.com/SchedMD/slurm/blob/slurm-18.08/RELEASE_NOTES ), which adds
mysql 5.5 to the list.
They’ve mentioned that mysql 5.5 is the default in RHEL7/CentOS7.
Hi Lech,
I've tried to summarize your work on the Slurm database upgrade patch in
my Slurm Wiki page:
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#database-upgrade-from-slurm-17-02-and-older
Could you kindly check if my notes are correct and complete? Hopefully
this Wiki will also h
Lech,
Thanks for the explanation. Now that you've explained it like that, I
understand SchedMD's decision. I was misreading the situation. I was
under the impression that this affected *all* db upgrades, not just
those from one old version to a slightly less old version.
Prentice
On 4/4/19 7:07
> Upgrading more than 2 releases isn't supported, so I don't believe the 19.05
> slurmdbd will have the code in it to upgrade tables from earlier than 17.11.
I haven’t found any mention of this in the upgrade section of the QuickStart
guide (see https://slurm.schedmd.com/quickstart_admin.html#up
On 4/4/19 4:07 am, Lech Nieroda wrote:
Furthermore, upgrades shouldn’t skip more than one release, as that would lead
to the loss of state files and other important information, so users probably won’t
upgrade from 17.02 to 19.05 directly. If they did that, then yes, the patch
would be applicable
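In practice that means stepping through each major release, e.g. like this (a
sketch with RPM-based commands; the package names and versions are only
illustrative, and slurmdbd performs the table conversion on its first start
after each step):

   systemctl stop slurmdbd
   yum upgrade 'slurm*-17.11*'    # 17.02 -> 17.11
   systemctl start slurmdbd       # wait for the conversion to finish
   systemctl stop slurmdbd
   yum upgrade 'slurm*-18.08*'    # 17.11 -> 18.08
   systemctl start slurmdbd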
That’s correct, but let’s keep in mind that it only concerns the upgrade process
and not production runtime, which has certain implications.
The affected database structures were introduced in 17.11 and the upgrade
problem affects only versions 17.02 or prior, so it wouldn’t be a problem for
users who have already upgraded to 17.11 or later.
On Wednesday, 3 April 2019 6:33:17 AM PDT Prentice Bisbal wrote:
> Anyone else as disappointed by this as I am? I get that it's too late to
> add something like this to 17.11 or 18.08, but it seems like SchedMD
> isn't even interested in looking at this for 19.x
Not really surprising, it's not ap
the dev stated that they’d rather keep that warning than fix the issue, so
I’m not sure if that’ll be enough to convince them.
Anyone else as disappointed by this as I am? I get that it's too late to
add something like this to 17.11 or 18.08, but it seems like SchedMD
isn't even interested in looking at this for 19.x
Hi Ole,
since we aren’t using RHEL7/CentOS7 we haven’t tested it with mysql 5.5 and
it’d probably carry more weight if someone running that OS would test it and
add an appropriate comment. You are welcome to try it out.
That being said, the release notes explicitly mention that versions 5.1 and 5.5 are affected.
Hi Lech,
Maybe you could add your arguments to the bug report
https://bugs.schedmd.com/show_bug.cgi?id=6796 hoping that SchedMD may be
convinced that this is a useful patch for future versions of Slurm, also
for MySQL/MariaDB versions 5.5 and newer.
Best regards,
Ole
On 4/3/19 1:17 PM, Lech Nieroda wrote:
Hi Ole,
> On 03.04.2019 at 12:53, Ole Holm Nielsen wrote:
> SchedMD already decided that they won't fix the problem:
Yes, I guess it’s a bit late in the release lifecycles. Nevertheless it’s a
pity, as there are certainly a lot of users around who’d rather not upgrade
their distribution default mysql.
Hi Lech,
Thanks for submitting the patch to SchedMD:
https://bugs.schedmd.com/show_bug.cgi?id=6796
SchedMD already decided that they won't fix the problem:
Thank you for the submission, but I will not be merging this upstream at this
time.
Support for the 17.11 release is nearly ended, and
Hello Chris,
I’ve submitted the bug report together with a patch.
We don’t have a support contract but I suppose they’ll at least read it ;)
The code is identical for 18.08.x and 19.05.x, it’s just a different offset.
Kind regards,
Lech
> On 02.04.2019 at 15:18, Ole Holm Nielsen wrote:
Hi Lech,
IMHO, the Slurm user community would benefit the most from your
interesting work on MySQL/MariaDB performance, if your patch could be
made against the current 18.08 and the coming 19.05 releases. This
would ensure that your work is carried forward.
Would you be able to make patches
That’s probably it.
Sub-queries are known for potential performance issues, so one wonders why the
devs didn’t extract it accordingly and make the code more robust, or at least
compatible with RHEL/CentOS 6, rather than just including that remark in the
release notes.
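To make the "extraction" concrete, here is a generic illustration of the kind
of rewrite meant above; the table and column names are made up for this
example and are not the actual Slurm conversion SQL:

   -- Dependent sub-query: an old optimizer such as MySQL 5.1 may
   -- re-execute this once per row of the outer table.
   SELECT j.id_job
   FROM job_table j
   WHERE j.id_assoc IN (SELECT a.id_assoc
                        FROM assoc_table a
                        WHERE a.deleted = 0);

   -- The same result written as a join, which old optimizers
   -- generally execute far more efficiently.
   SELECT j.id_job
   FROM job_table j
   JOIN assoc_table a ON a.id_assoc = j.id_assoc
   WHERE a.deleted = 0;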
> On 02.04.2019 at 07:20,
On Monday, 1 April 2019 7:55:09 AM PDT Lech Nieroda wrote:
> Further analysis of the query has shown that the mysql optimizer has chosen
> the wrong execution plan. This may depend on the mysql version, ours was
> 5.1.69.
I suspect this is the issue documented in the release notes for 17.11:
ht
We’ve run into exactly the same problem, i.e. an extremely long upgrade process
to the 17.11.x major release. Luckily, we’ve found a solution.
The first approach was to tune various innodb options, like increasing the
buffer pool size (8G), the log file size (64M) or the lock wait timeout (900)
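For reference, those settings look roughly like this in my.cnf (the values are
the ones quoted above, adjust to your hardware; note that on MySQL 5.1/5.5
changing innodb_log_file_size also requires a clean shutdown and removal of
the old ib_logfile* files before restarting):

   [mysqld]
   innodb_buffer_pool_size  = 8G
   innodb_log_file_size     = 64M
   innodb_lock_wait_timeout = 900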
On 07/18/2018 10:56 AM, Roshan Thomas Mathew wrote:
We ran into this issue trying to move from 16.05.3 -> 17.11.7 with 1.5M
records in job table.
In our first attempt, MySQL reported "ERROR 1206 The total number of
locks exceeds the lock table size" after about 7 hours.
Increased InnoDB Buffer Pool size
We ran into this issue trying to move from 16.05.3 -> 17.11.7 with 1.5M
records in job table.
In our first attempt, MySQL reported "ERROR 1206 The total number of locks
exceeds the lock table size" after about 7 hours.
Increased InnoDB Buffer Pool size -
https://dba.stackexchange.com/questions/27
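For anyone hitting the same ERROR 1206: InnoDB allocates its lock structures
inside the buffer pool, and on MySQL 5.5 innodb_buffer_pool_size cannot be
changed at runtime, so the fix is roughly this (the 4G figure is only an
example):

   -- check the current value (in bytes)
   SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

   # then raise it in my.cnf and restart mysqld before retrying slurmdbd
   [mysqld]
   innodb_buffer_pool_size = 4G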
On Wed, 28 Feb 2018 06:51:15 +1100
Chris Samuel wrote:
> On Wednesday, 28 February 2018 2:13:41 AM AEDT Miguel Gila wrote:
>
> > Microcode patches were not applied to the physical system, only the
> > kernel was upgraded, so I'm not sure whether the performance hit
> > could come from that or no
On Wednesday, 28 February 2018 2:13:41 AM AEDT Miguel Gila wrote:
> Microcode patches were not applied to the physical system, only the kernel
> was upgraded, so I'm not sure whether the performance hit could come from
> that or not.
Yes it would, it's the kernel changes that cause the impact. M
Microcode patches were not applied to the physical system, only the kernel was
upgraded, so I'm not sure whether the performance hit could come from that or
not.
Reducing the size of the DB to make the upgrade process complete in a
reasonable time is like shooting a mosquito with a shotgun. Yea
Good thought Chris. Yet in our case our system does not have the
spectre/meltdown kernel fix.
Just to update everyone, we performed the upgrade successfully after we purged
more job/step data first. We did the following to ensure the purge happened
right away per Hendryk's recommendation:
Ar
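The purge settings in question live in slurmdbd.conf; a sketch along these
lines (the retention periods are placeholders, pick your own). If I remember
correctly, giving the value in hours makes slurmdbd run the purge hourly
instead of once a month, which is what makes it happen right away:

   # slurmdbd.conf -- example purge settings, retention values are
   # placeholders only
   PurgeEventAfter=2160hours      # ~3 months
   PurgeJobAfter=8760hours        # ~12 months
   PurgeResvAfter=2160hours
   PurgeStepAfter=8760hours
   PurgeSuspendAfter=720hours     # ~1 month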
On Friday, 23 February 2018 8:04:50 PM AEDT Miguel Gila wrote:
> Interestingly enough, a poor vmare VM (2CPUs, 3GB/RAM) with MariaDB 5.5.56
> outperformed our central MySQL 5.5.59 (128GB, 14core, SAN) by a factor of
> at least 3 on every table conversion.
Wild idea completely out of left field..
On 22-02-2018 21:27, Christopher Benjamin Coffey wrote:
Thanks Paul. I didn't realize we were tracking energy :(. Looks like the best
way to stop tracking energy is to specify what you want to track with
AccountingStorageTRES? I'll give that a try.
Perhaps it's a good idea for a lot of sites
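For anyone wanting to try the same, the relevant slurm.conf settings look
roughly like this; this is only a sketch, and on some Slurm versions the
default TRES (which include energy) cannot be removed from the list, so check
the slurm.conf man page for your version first:

   # list only the TRES you want stored in the accounting database
   AccountingStorageTRES=cpu,mem,node
   # and make sure no energy gathering plugin is configured at all
   AcctGatherEnergyType=acct_gather_energy/none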
We recently ran a similar exercise: when updating from 17.02.7 to 17.11.03-2,
we had to stop the upgrade on our production DB (shared with other databases)
after nearly half a day. It had reached the job table for a system with 6
million jobs and still had to go through another one with >7 mill
Thanks Paul. I didn't realize we were tracking energy :(. Looks like the best
way to stop tracking energy is to specify what you want to track with
AccountingStorageTRES? I'll give that a try.
Best,
Chris
—
Christopher Coffey
High-Performance Computing
Northern Arizona University
928-523-1167
Typically the long db upgrades are only for major version upgrades.
Most of the time minor versions don't take nearly as long.
At least with our upgrade from 17.02.9 to 17.11.3 the upgrade only took
1.5 hours with 6 months worth of jobs (about 10 million jobs). We don't
track energy usage though.
We experienced the same problem. On our two new clusters with smaller
databases (<1 million jobs), the upgrade from 17.02.9 to 17.11.2 and
17.11.3 was quick and smooth. On the third, older cluster, where we have a
larger database (>30 million jobs) the upgrade was a mess, both in mysql
and mariadb.
Hi Chris,
we were faced with exactly the same problem - the upgrade from 16.05.11 to
17.11.3 took more than 24 hours without finalizing the conversion of the job
table. Finally, we cancelled the process, went back to the "old" version
16.05.11 and restored the database. At that time we had 10.5 million
jobs
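For anyone attempting the same rollback, the dump/restore cycle is essentially
this (assuming the common database name slurm_acct_db, and with slurmdbd
stopped during both steps):

   # before the upgrade:
   mysqldump --single-transaction --quick slurm_acct_db > slurm_acct_db.sql
   # if the conversion has to be abandoned, recreate and restore:
   mysql -e 'DROP DATABASE slurm_acct_db; CREATE DATABASE slurm_acct_db;'
   mysql slurm_acct_db < slurm_acct_db.sql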
This is great to know Kurt. We can't be the only folks running into this.. I
wonder if the mysql update code gets into a deadlock or something. I'm hoping a
slurm dev will chime in ...
Kurt, out of band if need be, I'd be interested in the details of what you
ended up doing.
Best,
Chris
—
Chr
On Wed, Feb 21, 2018 at 11:56:38PM +0000, Christopher Benjamin Coffey wrote:
> Hello,
>
> We have been trying to upgrade slurm on our cluster from 16.05.6 to 17.11.3.
> I'm thinking this should be doable? Past upgrades have been a breeze, and I
> believe during the last one, the db upgrade took