If this were me, I would:
- Stop both the Bareos and pgsql processes on the server.
- If you can, copy the root '/' filesystem to a removable drive for safety.
- Prune any cruft from the root volume to make space: for example, shrink
the journald logs (journalctl --vacuum-size=4M), old syslog files, cached
rpm packages, etc. -- anything that can be recreated or re-downloaded
easily. Use "sudo du -s" to discover the disk space used by parts of the
system; e.g. "sudo du -sm /var/*" will show the total megabytes used by
every directory under /var. Check the tape spool area too, if you are
using it.
If in pruning cruft you can save 1 to 2GB, then try to get pgsql running
again. I don't know pgsql well, but 'big' databases generally don't
release disk space as soon as you delete a row -- they just mark the
space as unused. In MySQL you would run OPTIMIZE TABLE commands to ask
the engine to compact tables and free up what can be freed, but I don't
know the pgsql equivalent.
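For what it's worth, PostgreSQL's equivalent is VACUUM: a plain VACUUM
only marks dead row space as reusable, while VACUUM FULL rewrites tables
and actually returns space to the OS -- but it needs scratch space about
the size of each table being rewritten, which is a problem on an
already-full disk. A sketch, assuming the default Bareos catalog database
name 'bareos':

```shell
# Plain VACUUM: marks dead row space reusable, does NOT shrink files
sudo -u postgres psql bareos -c 'VACUUM VERBOSE;'

# VACUUM FULL: rewrites tables and returns space to the OS, but needs
# scratch space roughly the size of each table -- risky on a full disk
sudo -u postgres psql bareos -c 'VACUUM FULL ANALYZE;'
```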
It might be simpler, assuming pgsql runs, to take an SQL-format backup of
the db, which can then be reloaded into a new pgsql instance and hence
only consume the space actually needed. I would imagine you already have
such an SQL-format backup, but ...
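A sketch of such a dump and reload, assuming the default catalog database
name 'bareos'; the /export/bareos destination is only a guess based on the
storage paths in your log -- use anywhere with free space:

```shell
# Dump the catalog as compressed SQL to a volume with free space
sudo -u postgres pg_dump bareos | gzip > /export/bareos/catalog.sql.gz

# Reload into a fresh database later:
sudo -u postgres createdb bareos
gunzip -c /export/bareos/catalog.sql.gz | sudo -u postgres psql bareos
```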
If you can't prune enough to get pgsql to work properly again, I think
the best option is to move the pgsql data files onto a new drive --
either install a third ssd or use a removable drive. It doesn't have to
be fast or fancy. Once that is online, delete the pgsql files from
/var/.../pgsql and mount the new drive "on top" of the /var/.../pgsql
directory [sorry, can't recall exact dir name]. Having done that you
should be able to bring up pgsql again properly.
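Roughly like this -- using /var/lib/pgsql (the location your own message
names) and /dev/sdc1 as a placeholder for the new drive's partition; the
xfs type in the fstab line is an assumption:

```shell
# Stop everything first
sudo systemctl stop bareos-dir bareos-sd bareos-fd postgresql

# /dev/sdc1 is a placeholder for the new drive's partition
sudo mkdir -p /mnt/newdisk
sudo mount /dev/sdc1 /mnt/newdisk
sudo rsync -aHAX /var/lib/pgsql/ /mnt/newdisk/  # preserve ownership/perms
# Verify the copy before deleting anything!
sudo rm -rf /var/lib/pgsql/*
sudo umount /mnt/newdisk
sudo mount /dev/sdc1 /var/lib/pgsql             # mount "on top"
sudo restorecon -R /var/lib/pgsql               # fix SELinux labels on RHEL
echo '/dev/sdc1 /var/lib/pgsql xfs defaults 0 2' | sudo tee -a /etc/fstab
sudo systemctl start postgresql
```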
Get pgsql to check its database integrity. It looks like pg_checksums
is the tool.
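Something like the following -- note pg_checksums must be run with the
server stopped, and it only verifies anything if the cluster was initdb'd
with checksums enabled; /var/lib/pgsql/data is the RHEL package default:

```shell
# Must run with the server stopped; -D points at the data directory
sudo systemctl stop postgresql
sudo -u postgres pg_checksums --check -D /var/lib/pgsql/data
```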
If that all pans out, start Bareos again and see if it's happy. If so
(it probably is), delete the last job, as it's incomplete, and see if
you can manually restart it. Hopefully you can do so and all will be well.
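In bconsole that would be roughly the following (jobid 93060 is the
failed job from your log; 'rerun' exists in recent Bareos versions --
otherwise start the job again with 'run'):

```shell
# Delete the incomplete job's catalog record, then re-run the job
sudo bconsole <<'EOF'
delete jobid=93060
rerun jobid=93060 yes
EOF
```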
Once initial recovery is complete:
I suggest ordering a couple of 1TB SSDs and using them (mirrored) as a
dedicated pgsql drive, migrating the whole DB to them, and reserving the
380GB drives (mirrored) for system use. Consider a data-checksumming
filesystem such as btrfs or zfs, so you can detect when a drive starts
returning bad data.
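With placeholder device names, a btrfs mirror of the two new SSDs would
look like this:

```shell
# /dev/sdd and /dev/sde are placeholders for the two new SSDs
sudo mkfs.btrfs -m raid1 -d raid1 /dev/sdd /dev/sde
sudo mount /dev/sdd /var/lib/pgsql

# Run a scrub periodically; it verifies every block against its checksum
sudo btrfs scrub start -B /var/lib/pgsql
```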
If you have been using the tape spool function, I strongly suggest using
a _separate_ physical drive (or drives) for that purpose: the spool
writes a lot of data, so SSD wear will be significant. The spool area
can also accumulate spooled data from failed jobs, which can very
quickly fill a system disk.
Hope this all helps,
Ruth
On 20/06/2025 21:02, Riot Nrrrd wrote:
At my work we have a Bareos setup with about 70 or so clients.
It wasn't set up by me originally; I inherited it. It (Bareos 22.1.4)
was set up on a RHEL system with dual SSDs for the root volume and the
total disk space on "/" is around 380 GB. Well, due to 'slow creep'
eventually the Bareos PostgreSQL database filled up the partition to
100% and now backups have stopped.
The last job before they stopped failed with
--
18-Jun 21:06 bareos-sd JobId 93060: Releasing device "Disk2"
(/export/bareos/storage2).
18-Jun 21:06 bareos-sd JobId 93060: Elapsed time=01:06:55, Transfer
rate=12.86 M Bytes/second
18-Jun 21:06 bareos-dir JobId 93060: Insert of attributes batch table
with 475489 entries start
18-Jun 21:07 bareos-dir JobId 93060: Fatal error:
cats/sql_create.cc:815 Fill File table Query failed: INSERT INTO File
(FileIndex, JobId, PathId, Name, LStat, MD5, DeltaSeq, Fhinfo, Fhnode)
SELECT batch.FileIndex, batch.JobId, Path.PathId, batch.Name,
batch.LStat, batch.MD5, batch.DeltaSeq, batch.Fhinfo, batch.Fhnode
FROM batch JOIN Path ON (batch.Path = Path.Path) : ERR=ERROR:
relation "batch" does not exist
LINE 1: ..., batch.DeltaSeq, batch.Fhinfo, batch.Fhnode FROM batch JOIN...
--
I tried using the bconsole 'prune' command to prune back the jobs,
hoping it might result in a database shrinkage. Instead it just kept
getting larger. :-( I tried asking ChatGPT for suggestions and it
just returned a bunch of pgsql commands that I don't really understand
(not that I'd trust ChatGPT anyway).
Does anyone have any 'ELI5' suggestions on what to do?
I suppose I could shut down Bareos and the database and move
/var/lib/pgsql/ to one of the data (backups) volumes and out of the
root partition, but I was hoping I could solve this without having to
move it by getting the database to shrink. Is that a possibility?
--
You received this message because you are subscribed to the Google
Groups "bareos-users" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to [email protected].
To view this discussion visit
https://groups.google.com/d/msgid/bareos-users/7e58bcbc-0913-408c-8909-1a43b210d5bbn%40googlegroups.com