A word of warning about going from 21.08 to 22.05: make sure you have
enough storage on the database host you are doing the work on, and
budget enough time for the upgrade. We just converted our 198 GB
(compressed; 534 GB raw) database this week. The initial attempt failed
after running for 8 hours because we ran out of disk space (part of
the reason we had to compress is that the server we use for our Slurm
master only has 800 GB of SSD). That meant we had to reimport our
DB, which took 8 hours, then drop the job scripts and job envs, which
took another 5 hours, and only then attempt the upgrade again, which
took 2 hours.
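To put a number on "budget a long enough time", here is a small Python tally of the stages described above. The hours are the ones reported in this message; the stage labels are my own wording, and your timings will depend on database size and hardware.

```python
# Hours reported in the message above for each stage of the
# 21.08 -> 22.05 database conversion. Labels are illustrative.
stages = [
    ("first upgrade attempt (ran out of disk space)", 8),
    ("reimport of the database", 8),
    ("dropping job scripts and job envs", 5),
    ("second upgrade attempt", 2),
]

# Sum the hours across all stages.
total_hours = sum(hours for _, hours in stages)

for name, hours in stages:
    print(f"{hours:>3} h  {name}")
print(f"{total_hours:>3} h  total")
```

Most of that total comes from the failed first attempt and the reimport it forced, which is why checking disk space up front matters.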
Moral of the story: make sure you have enough space and budget
sufficient time. You may want to consider nulling out the job scripts
and envs before the upgrade, since 22.05 completely redoes the way
those are stored in the database so that it is more efficient, but
getting from here to there is the trick.
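A rough preflight check for the disk-space side of this could look like the Python sketch below. The function names and the `headroom_factor` default are hypothetical: a factor of 1.0 (room for one extra copy of the raw data) is only an assumption for illustration, not a figure from SchedMD or from our upgrade, so pick a value that suits your own setup.

```python
import shutil

GIB = 1024 ** 3  # bytes per GiB


def free_gb(path):
    """Free space, in GB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / GIB


def enough_space(free, raw_db_gb, headroom_factor=1.0):
    """True if `free` GB covers raw_db_gb * headroom_factor.

    headroom_factor is an assumed safety margin, not a documented
    requirement of the slurmdbd conversion.
    """
    return free >= raw_db_gb * headroom_factor


# With the numbers from this message: an 800 GB SSD holding a 534 GB
# raw database has at most 266 GB left over.
print(enough_space(800 - 534, 534))

# Check the real filesystem instead (the path is an example only):
# print(enough_space(free_gb("/var/lib/mysql"), 534))
```

Run it against the filesystem that actually holds the database files before starting the conversion.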
For details see the bug report we filed:
https://bugs.schedmd.com/show_bug.cgi?id=14514
-Paul Edmon-
On 7/14/2022 2:34 PM, Timony, Mick wrote:
> What I can tell you is that we have never had a problem
> reimporting the data back in that was dumped from older versions
> into a current version database. So the import using sacctmgr
> must do the conversion from the older formats to the newer formats
> and handle the schema changes.
That's the bit of info I was missing, I didn't realise that it
outputs the data in a format that sacctmgr can read.
> I will note that if you are storing job_scripts and envs those can
> eat up a ton of space in 21.08. It looks like they've solved that
> problem in 22.05 but the archive steps on 21.08 took forever due
> to those scripts and envs.
Yes, we are storing job_scripts with:
AccountingStoreFlags=job_script
I think when we made that decision, we decided that also saving
the job_env would take up too much room, as our DB is pretty big at the
moment at approx. 300 GB, with the o2_step_table and the o2_job_table
taking up the most space for obvious reasons:
+----------------------------+-----------+
| Table                      | Size (GB) |
+----------------------------+-----------+
| o2_step_table              |    183.83 |
| o2_job_table               |    128.18 |
+----------------------------+-----------+
That's good advice Paul, much appreciated.
>took forever and actually caused issues with the archive process
I think that should be highlighted for other users!
For those interested, to find the table sizes I did this:
SELECT table_name AS "Table",
       ROUND((data_length + index_length) / 1024 / 1024 / 1024, 2)
           AS "Size (GB)"
FROM information_schema.TABLES
WHERE table_schema = 'slurmdbd'
ORDER BY (data_length + index_length) DESC;
Replace slurmdbd with the name of your database.
Cheers
--Mick