Hi Kern,

If 100 is a large number of jobs, I have a relatively small number (23).  I 
don’t have any “former” clients that I’m not backing up anymore.  One thing I 
*do* know is that I have an absolute ton of tiny little files, though I’m 
pretty sure that most of them stick around.  According to my statistics here, 
Bacula has 1,284,029,677 files in its catalog.  I can probably afford a little 
downtime on my database, so I may take that option if I get above 95% 
utilization on the file system.
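
In case it's useful, this is the sort of check I can run to see which catalog
tables are actually taking the space (the database name "bacula" is just an
assumption about a default install; adjust for the real name):

  # as the postgres superuser, connect to the catalog database
  su - postgres
  psql bacula
  -- inside psql: ten largest tables by total on-disk size (table + indexes)
  SELECT relname, pg_size_pretty(pg_total_relation_size(oid))
    FROM pg_class
   WHERE relkind = 'r'
   ORDER BY pg_total_relation_size(oid) DESC
   LIMIT 10;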

If my retention periods are quite long, do you have any recommendations on what 
would be more typical values?

Thanks,

James


> On Mar 16, 2017, at 10:31 AM, Kern Sibbald <k...@sibbald.com> wrote:
> 
> Hello,
> 
> I recently took a closer look at my catalog when an upgrade of my backup 
> server from 14.04 to 16.04 failed (I have 6 systems where the upgrade 
> totally failed and left me with a broken system).  I reloaded the Bacula 
> catalog from scratch, and in doing so I realized that it contained lots and 
> lots of old records from jobs I had run several years ago.  This happens 
> when you create a job or a client, then stop using that job or client (or 
> even remove the client) so that no more jobs for that client run.  The 
> important point is that Bacula prunes only when a job runs, unless you do 
> it manually; if jobs never run, the retention periods never apply and you 
> end up with lots and lots of unused (orphaned) records in the catalog.  The 
> only way to clean this up is to see what jobs exist in the database and 
> prune/purge those which are no longer used -- this is done manually with 
> bconsole.
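> For example, something along these lines in bconsole should work -- the 
> client name here is just a placeholder, and "help prune" / "help purge" 
> show the exact syntax:
> 
>   list clients
>   list jobs client=oldhost-fd
>   prune jobs client=oldhost-fd yes
>   purge jobs client=oldhost-fd
> 
> Note that purge ignores the retention periods completely, so only use it 
> on clients and jobs you really no longer need.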
> The same happens if you have lots and lots of temporary files that get 
> backed up.  There are mail programs that create a temporary file for each 
> email, then delete it a day or two later.  If these files are backed up 
> even once, they will create lots of filename entries in the database.  This 
> can be cleaned up using dbcheck.
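> A sketch of how that looks (the configuration path is whatever your 
> Director actually uses):
> 
>   # ideally run with the Director stopped so nothing is writing to the catalog
>   dbcheck -c /etc/bacula/bacula-dir.conf
> 
> and pick the checks for orphaned Filename and Path records from its menu. 
> Running it with -f makes it actually delete what it finds instead of just 
> reporting, so take a catalog backup first.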
> Finally, if you can afford a bit of downtime on your database, first back it 
> up, then delete it and recreate it from the backup.  This produces a database 
> that is nicely compacted.  If you regularly run vacuums this is probably not 
> necessary, but in extreme cases, such as after deleting hundreds of old backup 
> jobs or clients, it can be a quick way to compact the database.
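> In rough terms, something like this -- the database name, owner, and 
> encoding below are just what a default PostgreSQL catalog install uses and 
> may differ on your system; stop the Director first:
> 
>   su - postgres
>   pg_dump bacula > /var/tmp/bacula.sql
>   dropdb bacula
>   createdb -T template0 -E SQL_ASCII -O bacula bacula
>   psql -d bacula -f /var/tmp/bacula.sql
> 
> The restore rebuilds every table and index from scratch, which is what 
> gives you the compact on-disk layout.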
> 
> Note also, your retention periods are quite long so if you have lots of jobs 
> (more than 100) that run every night, you will need a big database.  
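> (For reference, those retention periods live in the Client and Pool 
> resources of bacula-dir.conf; the values below are only examples, not 
> recommendations:)
> 
>   Client {
>     Name = somehost-fd
>     ...
>     File Retention = 60 days
>     Job Retention = 6 months
>     AutoPrune = yes
>   }
> 
>   Pool {
>     Name = Default
>     ...
>     Volume Retention = 1 year
>     AutoPrune = yes
>   }
> 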
> Best regards,
> 
> Kern
> 
> On 03/16/2017 03:17 PM, James Chamberlain wrote:
>>> On Mar 16, 2017, at 3:29 AM, Mikhail Krasnobaev <mi...@ya.ru> wrote:
>>>  
>>>> 15.03.2017, 19:57, "James Chamberlain" <jam...@exa.com>:
>>>> 
>>>> Hi all,
>>>> 
>>>> I’m getting a touch concerned about the size of my Bacula database, and 
>>>> was wondering what I can do to prune it, compress it, or otherwise keep it 
>>>> at a manageable size. The database itself currently stands at 324 GB, and 
>>>> is using 90% of the file system it’s on. I’m running Bacula 7.4.0 on 
>>>> CentOS 6.8, with PostgreSQL 8.4.20 as the database. My file and job 
>>>> retention times are set to 180 days, and my volume retention time is set 
>>>> to 365 days. Is there any other information I can share which would help 
>>>> you help me track this down?
>>>> 
>>>> Thanks,
>>>> 
>>>> James
>>> 
>>> Good day,
>>> 
>>> do you run any maintenance jobs on the database?
>>> 
>>> Like:
>>> --------------
>>> [root@1c83centos ~]# cat /etc/crontab
>>> SHELL=/bin/bash
>>> PATH=/sbin:/bin:/usr/sbin:/usr/bin
>>> MAILTO=root
>>> HOME=/
>>>  
>>> # dump all databases once every 24 hours
>>> 45 4 * * * root nice -n 19 su - postgres -c "pg_dumpall --clean" | gzip -9 > /home/pgbackup/postgres_all.sql.gz
>>>  
>>> # vacuum all databases every night (full vacuum on Sunday night, lazy vacuum every other night)
>>> 45 3 * * 0 root nice -n 19 su - postgres -c "vacuumdb --all --full --analyze"
>>> 45 3 * * 1-6 root nice -n 19 su - postgres -c "vacuumdb --all --analyze --quiet"
>>>  
>>> # re-index all databases once a week
>>> 0 3 * * 0 root nice -n 19 su - postgres -c 'psql -t -c "select datname from pg_database order by datname;" | xargs -n 1 -I"{}" -- psql -U postgres {} -c "reindex database {};"'
>>> -----------------
>>> vacuumdb is a utility for cleaning a PostgreSQL database. vacuumdb will 
>>> also generate internal statistics used by the PostgreSQL query optimizer.
>> 
>> I don’t believe I’ve been doing any of this.  I’ll read up on the 
>> documentation and see about putting these into place.
>> 
>> Thanks!
>> 
>> James
>> 
>> 
> 
