Hello Samuel,

As before, I wrote my answers below the quoted text blocks and summarized them at the end, to keep things a bit clearer, because this email has become rather lengthy. If you want to reply to this, you can leave everything out except the summary. That will make the email much shorter again and easier for other people to read through. Thank you.
> On Thu, 30 Sep 2021, 02:23, wrote:
> > > I mean that my goal is to save any new file, once and permanently.
> > > If Monday I have file1 on my NAS, I want it to be saved on tape.
> > > Tuesday I add file2: I want it to be saved on tape.
> > > Wednesday, file1 is deleted from the NAS: it's a mistake, and I still want
> > > to keep file1 forever on tape (and be able to restore it).
> > > Every file that has existed once on my NAS must be saved permanently on a
> > > tape.
> >
> > Okay, I understand it like this: you did one full backup at the beginning.
> > After that, you are doing incremental backups every night to save every new
> > file. When a tape is full it gets packed away as an archive and never gets
> > rewritten? Right? Your primary goal is to save your current data and archive
> > "deleted" files forever?
>
> Yes! Exactly.

Okay!

> > I don't use tapes, but I think if you do incremental backups and you want to
> > restore something, you need to insert a large part of the tapes because
> > Bacula needs to read them. (I'm not sure about that.) If Bacula has to do
> > this, you will have a huge problem if you want to restore a file in, let's
> > say, 10 years.
>
> Not really. I can do a restore job, searching by filename. If the tape is not
> in the library, Bacula asks me to put it in... I've tested this procedure a
> few times, it works.

Okay, I trust you on this.

> > And to be honest, I really don't like the idea of doing incremental backups
> > endlessly without differential and full backups in between (I wrote more
> > about that further down).
>
> > > Let me show you my (simplified) configuration:
> > >
> > > I mounted (NFS) my first NAS on, say, /mnt/NAS1/
> > > My fileset is:
> > >
> > > FileSet {
> > >   Name = "NAS1"
> > >   File = /mnt/NAS1
> > > }
> > >
> > > My job is:
> > >
> > > Job {
> > >   Name = "BackupNAS1"
> > >   JobDefs = "DefaultJob"
> > >   Level = Incremental
> > >   FileSet = "NAS1"
> > >   #Accurate = yes   # Not clear what I should do here. Setting it to yes
> > >                     # seemed to add many unwanted files - probably
> > >                     # moved/renamed files?
> > >   Pool = BACKUP1
> > >   Storage = ScalarI3-BACKUP1   # this is my tape library
> > >   Schedule = NAS1Daily         # run every day
> > > }
> > >
> > > with
> > >
> > > JobDefs {
> > >   Name = "DefaultJob"
> > >   Type = Backup
> > >   Level = Incremental
> > >   Client = lto8-fd
> > >   FileSet = "Test File Set"
> > >   Messages = Standard
> > >   SpoolAttributes = yes
> > >   Priority = 10
> > >   Write Bootstrap = "/var/lib/bacula/%c.bsr"
> > > }
> > >
> > > My pool is:
> > >
> > > Pool {
> > >   Name = BACKUP1
> > >   Pool Type = Backup
> > >   Recycle = no
> > >   AutoPrune = no
> > >   Volume Retention = 100 years
> > >   Job Retention = 100 years
> > >   Maximum Volume Bytes = 0
> > >   Maximum Volumes = 1000
> > >   Storage = ScalarI3-BACKUP1
> > >   Next Pool = BACKUP1
> > > }
> >
> > To your .conf:
> > - Under JobDefs "DefaultJob" you declare FileSet = "Test File Set", and in
> >   your Job you declare FileSet = "NAS1". If "NAS1" is your standard fileset,
> >   set it in the JobDefs, or try to omit it there. As it stands it is a
> >   little confusing.
>
> OK
>
> > - You use the "Next Pool" directive in your Pool. The documentation discusses
> >   it under Schedule > Run > Next Pool; either way it describes a
> >   migration-type job. I think that's not what you want to do?
>
> I had tried a "virtual backup", so that all my incremental jobs merge into
> one, periodically. I thought it was only virtual, only dealing with the
> catalog data, but it seems I can do that only by recreating a whole bunch of
> volumes. I have hundreds of terabytes of data and I don't want to do that! So
> I leave the incremental jobs running. Leaving aside my current problem, it's
> convenient for what I need...

Okay, I noted that you tried "virtual backups". As far as I know, a "virtual
full backup" is something where Bacula reads the incremental and differential
backups plus the last full backup and constructs a new full backup out of them,
without sending all of the data over the network again. See:
https://www.baculasystems.com/incremental-backup-software/
That site states: "[...] 'Virtual Full' in Bacula terminology). With this
technique Bacula's software calculates a new full backup from all differential
and incremental backups that followed the initial full backup, without the
requirement of another full data transfer over the network."
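If you ever want to try that route again: a VirtualFull is normally triggered
from a Schedule (or run manually from bconsole), and "Next Pool" tells Bacula
which pool receives the consolidated volumes. Just a rough sketch of what I
mean - the pool name CONSOLIDATED and the timing are made up, the CONSOLIDATED
pool would have to be defined separately, and this assumes your Bacula version
accepts the NextPool override on the Run line (otherwise Next Pool can stay in
the Pool resource). As you say, it does rewrite the consolidated data onto new
volumes, so with hundreds of terabytes it is probably not attractive:

  Schedule {
    Name = "MonthlyConsolidate"
    # Consolidate the previous full + incrementals into a new full once a
    # month; the result is written to the (hypothetical) CONSOLIDATED pool.
    Run = Level=VirtualFull Pool=BACKUP1 NextPool=CONSOLIDATED 1st sun at 03:00
  }

I have not run this against a library like yours, so treat it as a starting
point, not a recipe.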
I also took note that you have a lot of data to manage.

> > If I were in your place, I would do it differently (assuming I got your
> > primary goal right, that you want to keep your current data safe and archive
> > "deleted" files forever):
> > - First I would set NFS user permissions; if NFS or Samba doesn't do the
> >   trick, I would head straight to Nextcloud (also open source, with a
> >   pricing plan for companies).
> >   Why? -> You can set permissions so that your users can't delete their
> >   files and are forced to move them into an archive folder with a good
> >   naming convention when they want to get rid of them (maybe you can
> >   automate it so the files go into an archive folder when your users hit the
> >   delete button, instead of into the bin). Should they make mistakes, it's
> >   up to you to figure out the right file (which might not be that clean).
> >   -> Having a good naming convention and some sort of documentation makes it
> >   a million times easier to find the right file in the future.
>
> We have all this settled, in one way or another. But I still need to give
> full rights to some users, and the problem is more: what if the NAS burns, or
> what if 3 HDDs crash at the same time, etc.? I want a robust and simple
> backup solution for the rare event...
>
> > I think you have two major goals: 1. keeping the production data (the data
> > your users are currently using) safe, and 2. archiving old files your users
> > don't need anymore.
>
> No. Every file could be used at any time. Any file is available.
>
> > To achieve the first goal I would implement a backup strategy with three
> > pools (one incremental, one differential, one full pool) and rotating tapes
> > (rewriting them after a given time).
>
> I have hundreds of terabytes. This would mean doubling, or more, the space I
> need...

Okay, given that, I clearly wouldn't suggest it; but that's up to you and your
decision.
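Just to make the three-pool idea quoted above a bit more concrete, this is
roughly the kind of schedule I mean. It is only a sketch: the pool names
(FullPool, DiffPool, IncPool) and the run times are invented, and the
retention/recycling settings of those pools would have to match your rotation:

  Schedule {
    Name = "ThreeLevelCycle"
    # Each backup level writes to its own pool, so tapes can be rotated
    # on a different cycle per level.
    Run = Level=Full         Pool=FullPool  1st sun at 23:05
    Run = Level=Differential Pool=DiffPool  2nd-5th sun at 23:05
    Run = Level=Incremental  Pool=IncPool   mon-sat at 23:05
  }

The point of separating the pools is that the full and differential tapes can
be reused on a different schedule than the daily incrementals.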
> > Should one of your NASs fail on you, you will be able to restore the files
> > yourself quickly, keeping the offline time short and therefore the downtime
> > costs small.
> >
> > To achieve the second goal I would look for companies that specialize in
> > preserving data for a long time (I've read about them in a book). The first
> > idea I had was using a tape like a "special hard drive": collect the files
> > your users don't need anymore somewhere, write them once to a tape and label
> > it by hand. If something happens to that tape, the data will be gone. I
> > don't like this idea and I wouldn't do it. Probably the best idea would be
> > to call a data-preservation company that does that job for you. Either way,
> > I wouldn't keep the production-data tapes and the archive tapes in the same
> > place (that would be another pro for such a company), because it violates
> > the 3-2-1 backup rule (everything will be gone when disaster strikes: flood,
> > fire, hurricane...). If you don't know the 3-2-1 backup rule, please look it
> > up on the internet; it discusses good backups in more detail.
>
> My idea was: when one volume is full, store it in another place... So it was
> OK, I guess.
>
> > > I'm not sure I fully understand here: you say "since the volume-use
> > > duration is set to short". But I believe it's exactly the contrary here:
> > > my volume-use duration is set to 100 years, isn't it?
> >
> > Yes, it is exactly the contrary. I'm not sure, but that shouldn't be a
> > problem. If you want to write to it indefinitely you can specify it as 0
> > (the default), as described in the documentation (chapter "Configuring the
> > Director").
>
> > > > In your bacula-dir.conf, in the "Messages" resource, there is an option
> > > > called "append". A part of my bacula-dir.conf:
> > > >
> > > > # WARNING! the following will create a file that you must cycle from
> > > > # time to time as it will grow indefinitely. However, it will
> > > > # also keep all your messages if they scroll off the console.
> > > > append = "/var/log/bacula/bacula.log" = all, !skipped
> > > > console = all, !skipped
> > > >
> > > > At the end, "all, !skipped" are the types or classes of messages that go
> > > > into it. They are described in more detail in the "Messages Resource"
> > > > chapter:
> > > > https://www.bacula.org/11.0.x-manuals/en/main/Messages_Resource.html
> > > >
> > > > If I type the "messages" command in bconsole, the output is in my case
> > > > the same in both cases.
> > >
> > > This is regarding logs, right? It doesn't seem to apply to me here. I'm
> > > dealing with big video files being unnecessarily saved 10, 15 or 20 times
> > > on tapes... Or maybe I missed something here?
> >
> > In your last email you asked: "Specifically, how do you go about identifying
> > exactly which volumes / jobids are to be 'deactivated', and how do you do
> > that?"
> > You know the day when everything came to a halt. Knowing this, you can look
> > through your logs to see which jobs ran on that day. For every job there is
> > a longer listing with one field named "Volume name(s)". Under this field the
> > volumes that were used in that job are listed. Sorry for not making it
> > clearer.
>
> I understand very clearly. But this is going to be quite long to check,
> because I also have to see which jobs got "new" files. I was hoping there
> would be a way to "deduplicate" files in jobs, and jobs in incremental
> backups...
> Well, it seems I have to do this by hand?

I don't know a faster way, so yes, doing it by hand is probably the only way.
You could try to write a script, but that also gets very tedious, and if there
is a mistake in the script you probably end up with an even bigger problem. I
wouldn't do that.
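The check itself still has to be done by you, but bconsole can at least list
the relevant catalog information, so you don't have to dig through the log
files by hand. Something along these lines (JobId 1234 is only a placeholder
for one of the jobs from the day in question):

  *list jobs
      -> shows the jobs with JobId and start time, so you can pick the ones
         from the day in question
  *list jobmedia jobid=1234
      -> shows the volumes that job 1234 wrote to (the "Volume name(s)")
  *list files jobid=1234
      -> shows the files that job 1234 actually backed up

That should at least tell you, per job, which volumes were used and whether the
job contained anything new.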
> > Maybe there is someone who has experience with a similar backup job, or with
> > such data-preservation companies, and can help you better.
>
> Anyway, I think my case shows a kind of misconception (or misconfiguration?):
> if an incremental job is delayed for some reason, why should it back up the
> same file many times? How can I avoid that?

Yes, I can help you with this. I would suggest that we first go through your
bacula-dir.conf and search for mistakes, to set up the system the way you
intended in the beginning, even though I clearly don't recommend it. But I
understand the problem you are in, and I'm the one in the easy position of just
talking.

Summary / things I want to point out or have already mentioned:

- "Under JobDefs "DefaultJob" you declare FileSet = "Test File Set", and in
  your Job you declare FileSet = "NAS1". If "NAS1" is your standard fileset,
  set it in the JobDefs, or try to omit it there. As it stands it is a little
  confusing."
- "You use the "Next Pool" directive in your Pool. The documentation discusses
  it under Schedule > Run > Next Pool; either way it describes a migration-type
  job. I think that's not what you want to do?"
- "[...] volume-use duration is set to 100 years [...]" - "If you want to write
  to it indefinitely you can specify it as 0 (the default), as described in the
  documentation (chapter "Configuring the Director")."
- I slightly changed the FileSet to match how it's done in the documentation:

  FileSet {
    Name = "NAS1"
    Include {
      Options {
        signature = SHA1
      }
      File = "/mnt/NAS1"
    }
  # Exclude {
  #   File =
  # }
  }

  Job {
    Name = "BackupNAS1"
    JobDefs = "DefaultJob"
    Level = Incremental
    FileSet = "NAS1"
    #Accurate = yes   # "Not clear what I should do here. Setting it to yes
                      #  seemed to add many unwanted files - probably
                      #  moved/renamed files?"
    Pool = BACKUP1
    Storage = ScalarI3-BACKUP1   # "this is my tape library"
    Schedule = NAS1Daily         # "run every day"
  }

  JobDefs {
    Name = "DefaultJob"
    Type = Backup
    Level = Incremental
    Client = lto8-fd
    # FileSet = "Test File Set"   # try leaving this out
    Messages = Standard
    SpoolAttributes = yes
    Priority = 10
    Write Bootstrap = "/var/lib/bacula/%c.bsr"
  }

  Pool {
    Name = BACKUP1
    Pool Type = Backup
    Recycle = no
    AutoPrune = no
    Volume Retention = 100 years   # set to 0 to disable
    Job Retention = 100 years
    Maximum Volume Bytes = 0
    Maximum Volumes = 1000         # might get you in trouble; set it to 0 to
                                   # permit any number of volumes
    Storage = ScalarI3-BACKUP1
    # Next Pool = BACKUP1          # doesn't belong here
  }

- I would also like to have a look at your Schedule resource, if that's
  possible.
- Is it possible that you accidentally added the fileset multiple times and are
  doing multiple backups of the same files? The documentation states: "Take
  special care not to include a directory twice or Bacula will backup the same
  files two times wasting a lot of space on your archive device. Including a
  directory twice is very easy to do. For example:"

  Include {
    Options { compression=GZIP }
    File = /
    File = /usr
  }

I hope that helps.

Sebastian
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users