Hi Kern,
If 100 counts as a large number of jobs, I have a relatively small number (23).
I don’t have any “former” clients that I’m no longer backing up. One thing I
*do* know is that I have an absolute ton of tiny files, though I’m pretty sure
that most of them stick around. According to my statistics here,
Bacula has 1,284,029,677 files in its catalog. I can probably afford a little
downtime on my database, so I may take that option if I get above 95%
utilization on the file system.
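In case it’s useful to anyone else hitting this, a quick way to see which
catalog tables take the most space (standard PostgreSQL catalog views; I’m
assuming the catalog database is named "bacula", so adjust to taste):

--------------
su - postgres -c 'psql bacula -c "
  SELECT relname, pg_size_pretty(pg_total_relation_size(relid))
  FROM pg_catalog.pg_statio_user_tables
  ORDER BY pg_total_relation_size(relid) DESC;"'
--------------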
Since my retention periods are quite long, do you have any recommendations for
more typical values?
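For reference, here is where those are set in my configuration (the client and
pool names below are just examples):

--------------
# bacula-dir.conf -- Client resource
Client {
  Name = example-fd
  ...
  AutoPrune = yes
  File Retention = 180 days
  Job Retention = 180 days
}

# bacula-dir.conf -- Pool resource
Pool {
  Name = Default
  ...
  Volume Retention = 365 days
}
--------------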
Thanks,
James
> On Mar 16, 2017, at 10:31 AM, Kern Sibbald <k...@sibbald.com> wrote:
>
> Hello,
>
> I recently took a closer look at my catalog when an upgrade of my backup
> server from Ubuntu 14.04 to 16.04 failed (I have 6 systems where the upgrade
> failed completely and left me with a broken system). I reloaded the Bacula
> catalog from scratch, and in doing so I realized that it contained lots and
> lots of old records from jobs I had run several years ago. This happens when
> you create a job or a client, then stop using that job or client (or even
> remove the client), so that no more jobs for that client run. The important
> point is that Bacula prunes only when a job runs (unless you prune manually);
> if jobs never run, the retention periods never apply, and you end up with
> lots and lots of unused (orphaned) records in the catalog. The only way to
> clean this up is to see what jobs exist in the database and prune/purge those
> which are no longer used -- this is done manually with bconsole.
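>
> For example, a rough sketch of the manual cleanup in bconsole (the client
> name here is just a placeholder):
>
> --------------
> *list clients
> *list jobs
> *prune jobs client=old-client-fd yes
> *purge jobs client=old-client-fd
> --------------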
> The same happens if you have lots and lots of temporary files that get backed
> up. Some mail programs create a temporary file for each email and delete it a
> day or two later. If those files are backed up even once, they create lots of
> filename entries in the database. These can be cleaned up using dbcheck.
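>
> A minimal sketch of a dbcheck run (the config path is a guess; adjust for
> your install):
>
> --------------
> # interactive mode presents a menu of checks (orphaned filename records, etc.)
> dbcheck -c /etc/bacula/bacula-dir.conf
> # batch mode, fixing inconsistencies instead of just reporting them
> dbcheck -b -f -c /etc/bacula/bacula-dir.conf
> --------------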
> Finally, if you can afford a bit of downtime on your database, first back it
> up, then drop it and recreate it from the backup. This produces a nicely
> compacted database. If you regularly run vacuums this is probably not
> necessary, but in extreme cases, such as after deleting hundreds of old
> backup jobs or clients, it can be a quick way to compact the database.
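>
> For PostgreSQL, something along these lines (assuming the catalog database
> is named "bacula" and the Director is stopped first; SQL_ASCII is what
> Bacula's own creation script uses, but check your existing encoding):
>
> --------------
> su - postgres
> pg_dump -Fc bacula -f /tmp/bacula.dump
> dropdb bacula
> createdb -T template0 -E SQL_ASCII bacula
> pg_restore -d bacula /tmp/bacula.dump
> --------------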
>
> Note also that your retention periods are quite long, so if you have lots of
> jobs (more than 100) that run every night, you will need a big database.
> Best regards,
>
> Kern
>
> On 03/16/2017 03:17 PM, James Chamberlain wrote:
>>> On Mar 16, 2017, at 3:29 AM, Mikhail Krasnobaev <mi...@ya.ru> wrote:
>>>
>>>> 15.03.2017, 19:57, "James Chamberlain" <jam...@exa.com>:
>>>>
>>>> Hi all,
>>>>
>>>> I’m getting a touch concerned about the size of my Bacula database, and
>>>> was wondering what I can do to prune it, compress it, or otherwise keep it
>>>> at a manageable size. The database itself currently stands at 324 GB, and
>>>> is using 90% of the file system it’s on. I’m running Bacula 7.4.0 on
>>>> CentOS 6.8, with PostgreSQL 8.4.20 as the database. My file and job
>>>> retention times are set to 180 days, and my volume retention time is set
>>>> to 365 days. Is there any other information I can share which would help
>>>> you help me track this down?
>>>>
>>>> Thanks,
>>>>
>>>> James
>>>
>>> Good day,
>>>
>>> do you run any maintenance jobs on the database?
>>>
>>> Like:
>>> --------------
>>> [root@1c83centos ~]# cat /etc/crontab
>>> SHELL=/bin/bash
>>> PATH=/sbin:/bin:/usr/sbin:/usr/bin
>>> MAILTO=root
>>> HOME=/
>>>
>>> # dump all databases once every 24 hours
>>> 45 4 * * * root nice -n 19 su - postgres -c "pg_dumpall --clean" | gzip -9 > /home/pgbackup/postgres_all.sql.gz
>>>
>>> # vacuum all databases every night (full vacuum on Sunday night, lazy vacuum every other night)
>>> 45 3 * * 0 root nice -n 19 su - postgres -c "vacuumdb --all --full --analyze"
>>> 45 3 * * 1-6 root nice -n 19 su - postgres -c "vacuumdb --all --analyze --quiet"
>>>
>>> # re-index all databases once a week
>>> 0 3 * * 0 root nice -n 19 su - postgres -c 'psql -t -c "select datname from pg_database order by datname;" | xargs -n 1 -I"{}" -- psql -U postgres {} -c "reindex database {};"'
>>> -----------------
>>> vacuumdb is a utility for cleaning a PostgreSQL database. vacuumdb will
>>> also generate internal statistics used by the PostgreSQL query optimizer.
>>
>> I don’t believe I’ve been doing any of this. I’ll read up on the
>> documentation and see about putting these into place.
>>
>> Thanks!
>>
>> James
>>
>>
>