I had another idea. What if you tried Bacula's progressive virtual full backups?

I do not know if you have heard of this method. You take a full backup,
then do incremental backups for a period of time, like one month. Then you
consolidate the previous full backup with the incremental backups. This
gives you a new full backup containing the changes from the incrementals
and the history from the original full backup. I have been told that Bacula
Enterprise can do this consolidation purely in the catalog, but that Bacula
Community must read the old backups and write a new full backup.

I am sure you are thinking that this doesn't solve your problem because for
a period of time you would have the old backups and the new backups on disk
at the same time. You would be correct, except for one thing:
What if you did the consolidation for only one FD at a time? This way the
most you could have on disk at any one time would be 6 FD backups + the new
consolidated backup being made for one of the FDs. You would have to use
scripts or commands to ensure that the old source backups were pruned right
away.
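
To make that arithmetic concrete, here is a rough sketch in Python. It is
only an illustration, not anything Bacula provides: the function name is
made up, and it assumes every FD's full backup is about the same size and
ignores the incrementals.

```python
def peak_fulls_on_disk(num_fds: int, consolidating_at_once: int) -> int:
    """Peak count of full-sized backups on disk during consolidation.

    Assumption: each FD already has one full on disk, and every FD being
    consolidated temporarily holds a second (new) full until the old
    source backups are pruned. Incrementals are ignored.
    """
    if not 1 <= consolidating_at_once <= num_fds:
        raise ValueError("consolidating_at_once must be between 1 and num_fds")
    return num_fds + consolidating_at_once


print(peak_fulls_on_disk(6, 1))  # one FD at a time -> 7
print(peak_fulls_on_disk(6, 6))  # all FDs at once  -> 12
```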

I do not know much about progressive virtual full backups beyond what I
just shared here, so I may be missing some important information.
Read more about them here: https://www.bacula.org/whitepapers/PVF.pdf

Even if you did not do progressive virtual full, and instead ran a Full
followed only by incrementals each month, rather than the Full Inc Diff Inc
Diff Inc Diff cycle you currently plan, this could save space because the
Diffs would not be repeating the same data over and over again. You would
have to analyze your data to see if this would make a difference.
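
As a starting point for that analysis, here is a rough model in Python. All
names and numbers are hypothetical. It assumes a constant amount of changed
data per day, that a differential on day N holds about N days of changes
(an upper bound, since a file changed twice is only stored once), and that
nothing is pruned before the cycle ends.

```python
from typing import Optional


def gb_written_after_full(days: int, daily_change_gb: float,
                          diff_every: Optional[int] = None) -> float:
    """Total GB written after the full backup in one cycle.

    Day 0 is the full; days 1..days-1 each run one backup. If diff_every
    is set, every diff_every-th day runs a differential, otherwise the
    day runs an incremental.
    """
    total = 0.0
    for day in range(1, days):
        if diff_every and day % diff_every == 0:
            # Differential: assumed to hold everything changed since the full.
            total += day * daily_change_gb
        else:
            # Incremental: assumed to hold one day of changes.
            total += daily_change_gb
    return total


print(gb_written_after_full(30, 10))                # Full + Inc only  -> 290.0
print(gb_written_after_full(30, 10, diff_every=7))  # weekly Diffs     -> 950.0
```

If your incrementals are pruned as soon as a newer Diff covers them, the
gap is smaller, so plug in your own schedule and retention before drawing
conclusions.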

In the fileset, you can raise the compression level for GZIP and ZSTD.
The default level gives a good compression / performance tradeoff. If you
want maximum compression, you could try the highest level. Look in the
manual for details. Be aware that this may not give you big improvements
and will cost more CPU.

If you aren't using deduplication, progressive virtual full backups, or
simply increasing your storage space to allow for the backups, then I am
not sure what else could be done. The only other alternative is to reduce
what is backed up. Maybe do less frequent backups?

Additionally, if you were considering S3-compatible cloud backups as a
cheaper alternative to expanding your hosted servers, you may be interested
to know that Bacula 15.0.2 now supports encrypting file volumes. The only
thing left unencrypted is the volume header. The SD does the encryption, so
the CPU load falls on the SD. If your SD cannot afford this CPU load, then
you could consider FD encryption instead. In that case, the filenames and
metadata would not be encrypted.

Oh! One thing about S3-compatible cloud backups: you must store the data
collected from the FDs locally for a brief period, at least until it is
uploaded. There is a setting to immediately truncate this cached data after
upload. This could save space because at absolute worst you could have the
complete cache for a full backup for all 6 FDs at once, but this is still
better than 12 full backups at once. I doubt you would have that much cache
on disk at once, especially if you staggered your full backups.

Regarding FileRetention and JobRetention: just set them to be >=
VolumeRetention. Bacula will prune a volume with no jobs on it, but will
not prune a job just because it has no file records. Some admins set a
shorter file retention period to delete catalog records for files in older
backups (to reduce catalog size and improve performance); the data would
still be recoverable by restoring the whole job. Bacula will automatically
remove File and Job records for volumes that are pruned.

Retention periods set in the pool override all other retention periods
specified elsewhere.
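
As a quick sanity check of the ">= VolumeRetention" rule, here is a small
sketch in Python. The function is hypothetical and only compares day
counts that you type in; it does not read Bacula's configuration.

```python
from typing import List


def retention_warnings(file_days: int, job_days: int,
                       volume_days: int) -> List[str]:
    """Return a warning for each retention shorter than VolumeRetention."""
    warnings = []
    if file_days < volume_days:
        warnings.append("FileRetention is shorter than VolumeRetention")
    if job_days < volume_days:
        warnings.append("JobRetention is shorter than VolumeRetention")
    return warnings


print(retention_warnings(40, 40, 40))  # -> []
print(retention_warnings(30, 60, 40))  # -> ['FileRetention is shorter than VolumeRetention']
```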

Regards,
Robert Gerber
402-237-8692
r...@craeon.net


On Thu, Jan 9, 2025 at 11:49 AM Rob Gerber <r...@craeon.net> wrote:

> I think there will be no problem having one set of pools per FD, but it may
> increase complexity for you. Either way, it gives no space savings.
>
> I have a couple ideas.
>
> 1. Bacula has the 'aligned' plugin. If you use the aligned plugin, it
> attempts to better align blocks found on disk in the volumes. It does this
> so deduplication filesystems can work on the bacula volumes. I think that
> your full backups are probably very similar to each other. So maybe
> deduplication could help here?
>
> 2. Bacula has a cloud storage plugin. Maybe you can store the backups in a
> lower cost s3 compatible file storage service?
>
> 3. If your FDs are similar to each other or based on a common image, maybe
> you could back up only essential services, with the intent to apply the
> image, then restore backups to "freshen" the image. Maybe something like
> Relax And Recover?
>
> I think idea 2 is probably easier and/or safer than the others.
>
> Robert Gerber
> 402-237-8692
> r...@craeon.net
>
> On Thu, Jan 9, 2025, 11:15 AM Christophe PEREZ via Bacula-users <
> bacula-users@lists.sourceforge.net> wrote:
>
>> Hi Rob,
>>
>> I understood well since our first exchange that the MaximumVolumes at 1
>> was a heresy. I also understood well that to best manage several
>> volumes, it remained important to separate Full, Diff and Inc in
>> different pools. So, we are returning substantially to what I already
>> have.
>> On the other hand, sincerely, I do not see what it would bring me in
>> terms of occupied space to group the Full of my FDs in a single pool
>> instead of having a pool per FD, the same for Diff and Inc.
>>
>> What I know is that 2 Full Volumes + 3 (or 4) Diff Volumes + 2 Volume
>> of 6 days of Inc, for the 2 FDs that I save on this SD (the 4 other FDs
>> are saved on another SD), it will be very tight with the space that I
>> have. And I cannot unfortunately increase the space because it is a
>> hosted server whose increase in space is too expensive.
>> I would have preferred to avoid getting into something more complicated
>> by forcing the expiration of volumes by admin job or something else.
>> I need to find a concrete solution to save my 30 days of 2 FDs on this
>> SD.
>> I must also admit that I don't know what to put optimally in:
>> File Retention = ?? days
>> Job Retention = ?? days
>> for clients.
>>
>> In any case, thank you for all the efforts you make to help me. Even if
>> I'm not sure I understand everything.
>> I'm still thinking about all this and if anyone else wants to shed some
>> light, don't hesitate.
>>
>> Regarding ZSTD, I confirm that it is already installed, called by many
>> packages. I configured bacula to use it.
>>
>> On Wednesday, 8 January 2025 at 17:27 -0600, Rob Gerber wrote:
>> > Replies below. :)
>> > Regards,
>> > Robert Gerber
>> > 402-237-8692
>> > r...@craeon.net
>> >
>> >
>> > On Wed, Jan 8, 2025 at 4:03 PM Christophe PEREZ via Bacula-users
>> > <bacula-users@lists.sourceforge.net> wrote:
>> > > On Wednesday, 8 January 2025 at 14:46 -0600, Rob Gerber wrote:
>> > > > Christophe,
>> > >
>> > > Hi Rob,
>> > >
>> > >
>> > > Not 2nd AND 5th, but 2nd TO 5th.
>> > >
>> >
>> >  That makes more sense. I understand now.
>> > >
>> > >
>> > > >   MaximumVolumes="2" can only work if you have only 1 FD. Really,
>> > > > for this to work the right way you must have MaximumVolumes=X
>> > > > where X = (total FD x 2).
>> > >
>> > > And save all FD in the same pool ?
>> > >
>> >
>> > I would want to put Full backups in the Full pool, Diff in the diff
>> > pool, etc. Different pool for different backup type and different
>> > retention, but not different pools for different FD unless you have a
>> > very good reason to do it. So, per storage medium, you will have 3
>> > total pools, Full, Diff, Inc. You do not want to mix media types in a
>> > pool, so if you were backing up to cloud, to another SD, or to tape
>> > you would want a Full, Diff, and Inc pool for each storage medium or
>> > device. In my 15.0.2 device, I have local backup pools
>> > Synology-Local-Full, Synology-Local-Diff, and Synology-Local-Inc, and
>> > pools B2-Full, B2-Diff, B2-Inc for copying jobs from the local pools
>> > to the Backblaze B2 cloud service.
>> > > It seems to me that this will not gain me any space.
>> > >
>> > >
>> >
>> > I am trying to say that for you to use the MaximumVolumes=X idea
>> > with MaximumVolumeJobs=1, X must be >= the number of volumes you
>> > expect to use within your retention period. Let us say you have 6 FD,
>> > MaximumVolumes=6, MaximumVolumeJobs=1, VolumeRetention=40days, and
>> > you do 1 full backup each month. In month 1 there is no problem -
>> > bacula does its backups and creates 6 volumes. In month 2, Bacula
>> > cannot make any volumes available without deleting the volumes from
>> > last month BEFORE you have done any new backups! In my understanding
>> > bacula would not violate the retention period, so it would be unable
>> > to make the new backups. So in this example you would want at least
>> > MaximumVolumes=12 with MaximumVolumeJobs=1. You don't want to run out
>> > of volumes before your retention period expires.
>> >
>> > There are other ways to limit bacula's space usage. I'll talk about
>> > them.
>> > >
>> > > So, with Maximum Volume Jobs = 0, the volume size will continue to
>> > > grow?
>> > >
>> >
>> > I don't know what would happen if we set MaximumVolumeJobs=0, but if
>> > we comment it out or remove it, yes, the volume will grow forever
>> > unless restrained in some other way.
>> > Bacula gives us many ways to restrain volume growth. We can set
>> > MaximumVolumeUseDuration, MaximumVolumeJobs, MaximumVolumeBytes,
>> > etc.
>> > See the manual for some more discussion of each.
>> >
>> > https://www.bacula.org/15.0.x-manuals/en/main/Configuring_Director.html#SECTION00231600000000000000000
>> > > So I have to find something else...
>> > > I thought I had made my life easier like that.
>> > >
>> >
>> > I think MaximumVolumeJobs is good for your situation. Maybe
>> > MaximumVolumes is difficult if not set correctly. If you experience
>> > unexpected blockages where a job cannot run because the pool does not
>> > have any more volumes, then you can experiment with removing
>> > MaximumVolumes and instead relying on the VolumeRetention period to
>> > remove volumes.
>> >
>> > As a warning if relying on VolumeRetention alone, it is possible that
>> > bacula will not remove a volume if it does not have to, even if
>> > expired. Bacula's goal is to keep your volumes available as long as
>> > possible, just in case you need them. If that becomes a problem, I
>> > believe you can Schedule a job of Type=Admin and tell it to truncate
>> > expired volumes. I have not set up such a job, so I am not certain
>> > how it is done.
>> >
>> > I think MaximumVolumes could work for you, as long as it is set to be
>> > >= the number of volumes you will need during your retention period.
>> >
>> > Here is the bacula manual entry all about automatic volume recycling.
>> > It is very useful.
>> >
>> > https://www.bacula.org/15.0.x-manuals/en/main/Automatic_Volume_Recycling.html
>> >
>> > One extract from that link above that explains what bacula will do if
>> > not restrained in some way:
>> > " A key point mentioned above, that can be a source of frustration,
>> > is that Bacula will only recycle purged Volumes if there is no other
>> > appendable Volume available, otherwise, it will always write to an
>> > appendable Volume before recycling even if there are Volumes marked
>> > as Purged. This preserves your data as long as possible. So, if you
>> > wish to “force” Bacula to use a purged Volume, you must first ensure
>> > that no other Volume in the Pool is marked Append. If necessary, you
>> > can manually set a volume to Full. The reason for this is that Bacula
>> > wants to preserve the data on your old Volumes (even though purged
>> > from the catalog) as long as absolutely possible before overwriting
>> > it. There are also a number of directives such as Volume Use Duration
>> > that will automatically mark a volume as Used and thus no longer
>> > appendable."
>> >
>> > This behavior is why we must restrain bacula. Your suggestion of
>> > MaximumVolumeJobs=1 and
>> > MaximumVolumes=$expectedNumberOfVolumesNeededDuringRetentionPeriod
>> > accomplishes this, with the warning that you will not be able to make
>> > new backups if you run out of volumes because you added more FDs but
>> > did not increase MaximumVolumes, or something like that.
>> >
>> > I have set a very short retention period on my test system, with a
>> > job scheduled once an hour. I'll examine the behavior I see and
>> > report back.
>> >
>> > > I have bacula 15.0.2.
>> > > What linux tool does this ZSTD compression format depend on?
>> > >
>> >
>> > I don't know what tool it uses. I am also running bacula 15.0.2, on
>> > Rocky Linux 9. For me, the ZSTD feature just worked. I think bacula
>> > made sure the correct libraries are present.
>> > >
>>
>> --
>> Christophe PEREZ
>>
>>
>> _______________________________________________
>> Bacula-users mailing list
>> Bacula-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/bacula-users
>>
>