[Bacula-devel] Open source Bacula plugin for deduplication
Good morning,

I know Bacula Enterprise provides deduplication plugins, but sadly we can't afford it. No problem: we will try to create an open source deduplication plugin for the Bacula file daemon. I would use rdiff (part of librsync) for delta patching and signature generation.

I would love to create a Bacula plugin that deduplicates content at the fd level. This way, even if the fd sends the backup to the sd encrypted, the deduplication still gets the best results, because it takes place while the files are not yet encrypted. The deduplication would only be applied to files larger than, let's say, 10GB.

If you don't mind, I would like to share my ideas with you, to at least find out whether "all this" is a possible way.

My idea is basically:

- WHEN DOING A BACKUP:

++ Check the backup level we are running. I suppose by asking getBaculaValue() for bVarLevel.

++ In startBackupFile(), get the file size. I suppose it gives me the file size info (or at least the name, and then I'll do a stat() in some manner).

+++ If it's a full level and the file is bigger than 10GB, obtain the file signature and store that new (previously non-existing) signature, written to a file with a known nomenclature based on ORIGINAL_FILE's name, plus the whole ORIGINAL_FILE (the one the signature was generated from) on Bacula tapes. Do I need to tell Bacula to re-read the directory so that it can back up the generated signature files? They did not exist until now, when we generated the file that contains ORIGINAL_FILE's signature.

+++ If it's an inc level and a previous signature of ORIGINAL_FILE exists (I would know because it has a known nomenclature based on ORIGINAL_FILE's name), create a patch from the previous signature plus the new state of the file. Then obtain the signature of the file in its new state. Finally, store that new signature plus the patch on Bacula tapes, and return a bRC_Skip for ORIGINAL_FILE (because we are going to copy a delta patch and a signature). If I return a bRC_Skip here, would the fd skip this file but still see the signatures and delta patches generated before returning the bRC_Skip? Or should I ask the fd, in some manner, to re-read the directory?

As you would assume, in the incremental backups I'm not storing the file under the name it has in the filesystem. It should work more or less the following way:

In a full level backup:

++ BEFORE THE BACKUP:
_BACKED SERVER'S FS <> BACULA "VIRTUAL TAPE" CONTENT_
ORIGINAL_FILE <--->

++ AFTER THE BACKUP:
_BACKED SERVER'S FS <> BACULA "VIRTUAL TAPE" CONTENT_
ORIGINAL_FILE + SIGNATURE FILE <---> ORIGINAL FILE + SIGNATURE FILE

In the next incremental level backup:

++ BEFORE THE BACKUP:
_BACKED SERVER'S FS <> BACULA "VIRTUAL TAPE" CONTENT_
NEW_STATE_ORIGINAL_FILE + SIGNATURE FILE GENERATED THE LAST FULL DAY <---> _FROM THE FULL BACKUP_(ORIGINAL FILE + SIGNATURE FILE)

++ AFTER THE BACKUP:
_BACKED SERVER'S FS <> BACULA "VIRTUAL TAPE" CONTENT_
NEW_STATE_ORIGINAL_FILE + SIGNATURE FILE OF NEW_STATE_ORIGINAL_FILE <---> _FROM THE FULL BACKUP_(ORIGINAL FILE + SIGNATURE FILE) + PATCH FILE + SIGNATURE FILE OF NEW_STATE_ORIGINAL_FILE

- WHEN RESTORING A BACKUP:

If the name of a restored file follows the nomenclature (for example) ORIGINAL_FILE-SIGNATURE- or ORIGINAL_FILE-PATCH, that would mean we have backed up deltas of ORIGINAL_FILE in the incremental backups. I assume I could see this in startRestoreFile(), because the name of the file being restored is accessible there. In that case, write the path to a plain text file so that later, in a post-restore job (or even in the bEventEndBackupJob event of the API?), the patches in that path can be applied to the ORIGINAL_FILE whose name is derived from the names of the patch files. Finally, once the patching is done, remove the signature files and patch files, obviously leaving ORIGINAL_FILE in its last state as of the restored date.

So, at this point, I would be very, very thankful :) :) :) if some experienced developer could give me some ideas, or tell me if something looks wrong or should be achieved in some other manner or with other plugin API functions.

Thank you :) :)

Cheers!!

___
Bacula-devel mailing list
Bacula-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-devel
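The backup-side steps described in this message could be sketched roughly as follows. This is only a minimal sketch of the decision logic, not Bacula plugin code: the function name `plan_backup`, the `-SIGNATURE` and `-PATCH` suffixes, and the level strings are assumptions made for illustration. The only real tool referenced is the `rdiff` command line (`rdiff signature`, `rdiff delta`), and the sketch only builds the command lines without running them.

```python
from typing import List

# Hypothetical threshold from the message: only files larger than 10GB.
SIZE_THRESHOLD = 10 * 1024 ** 3


def plan_backup(original: str, level: str, size: int,
                has_signature: bool) -> List[List[str]]:
    """Return the rdiff commands to run for one file, in order.

    An empty list means the file is backed up normally, with no delta handling.
    """
    if size <= SIZE_THRESHOLD:
        return []

    signature = original + "-SIGNATURE"  # hypothetical naming scheme
    patch = original + "-PATCH"          # hypothetical naming scheme

    if level == "Full" or not has_signature:
        # Full backup (or no earlier signature): store the whole file
        # plus a fresh signature.
        return [["rdiff", "signature", original, signature]]

    # Incremental: build the patch from the OLD signature first, then
    # refresh the signature so the next run diffs against today's state.
    return [
        ["rdiff", "delta", signature, original, patch],
        ["rdiff", "signature", original, signature],
    ]
```

For a 20GB file in an incremental run with an existing signature, this returns the `rdiff delta` command followed by the `rdiff signature` command. The ordering matters: refreshing the signature before computing the delta would produce an empty patch.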
Re: [Bacula-devel] Open source Bacula plugin for deduplication
Hello Radoslaw,

I will answer inline below, just to make it easier to tell apart what each of us has said... :)

On 2022-03-03 12:46, Radosław Korzeniewski wrote:

> Hello,
>
> On Thu, 3 Mar 2022 at 12:09, egoitz--- via Bacula-devel wrote:
>
>> Good morning,
>>
>> I know Bacula Enterprise provides deduplication plugins, but sadly we can't afford it. No problem, we will try to create an open source deduplication plugin for bacula file daemon. I would use rdiff (part of librsync) for delta patching and signature generation.
>
> What signatures rdiff is using?

Basically, it is documented exactly here: https://librsync.github.io/page_formats.html

It's for being able to generate delta patches without needing both the old and the new version of a file, and so avoiding doubling the space used or required for backing up.

>> I would love to create a Bacula plugin for deduplicating content at fd level. This way, even if the backup is sent encrypted by fd to sd, the deduplication could be done obtaining the best results as the deduplication takes place when the files are not encrypted yet.
>
> Yes, for proper encryption you would always get different bits for the same data block making deduplication totally useless. :)

I think so too, yes.

>> The deduplication, would only be applied to files, let's say larger than 10GB.
>
> ???
>
> I designed Bacula deduplication to handle blocks (files) larger than 1k because indexing overhead for such small blocks was too high. The larger the block you use the lower chance to get a good deduplication ratio. So it is a trade-off - small blocks == good deduplication ratio but higher indexing overhead; larger blocks == weak deduplication ratio but lower indexing overhead. So it was handling block levels from 1K up to 64k (the default bacula block size, but could be extended to any size).

I understand what you say, but the problem we are facing is the following. Imagine a machine with a SQL server and 150GB of databases. Our problem is having to copy all of that in each daily incremental. We don't really mind copying 5GB of "wasted" space per day, even when it isn't necessary (just so you understand), but obviously 100GB or 200GB per day is a different matter.

I was thinking of applying this deduplication only to important files. I really hope you can understand me now. :)

>> If you don't mind, I would like to share with you my ideas, in order to at least know, "this all" is a possible way.
>>
>> My idea is basically :
>>
>> - WHEN DOING A BACKUP :
>>
>> ++ Check the backup level we are running. I suppose that asking bVarLevel to getBaculaValue()
>
> Deduplication should be totally transparent to the backup level. You want to deduplicate data, especially for largest full level backups, right?

Well, really, the problem for us is the one I just described. We don't really mind copying a big file once a month, but we want to avoid copying it (at least the whole file) in incremental backups. Besides, when restoring (and not in virtual backups), you restore a full plus incrementals. So this way we would restore the full ORIGINAL_FILE plus the patches, and we would apply them to ORIGINAL_FILE at the end of the restore job.

>> ++ In startBackupFile() I suppose it gives me file size info (or if at least gives me the name and I'll do an stat() in some manner), get the file size.
>
> No. The standard "Bacula command Plugin API" expects that a plugin will return a file stat info to backup.

OK, no problem. If I get the filename and path in some manner, I could always do a stat().

>> +++If it's a full level and bigger than 10GB, obtain the file signature and finally store that new (previously non existing) signature (written in a file with a known nomenclature based on ORIGINAL_FILE's name), plus the whole ORIGINAL_FILE (the one we have generated the signature from) in Bacula tapes. Should I need to say to Bacula, to re-read the directory for being able to backup generated file signatures?. They weren't until now we have generated a file that contains ORIGINAL_FILE signature.
>
> Why d
Re: [Bacula-devel] Open source Bacula plugin for deduplication
Hi Christopher!

Thanks a lot for your time! Answering inline below to make it easier to follow.

On 2022-03-03 15:03, webmaster wrote:

> Hello
>
> I was reading this and had a thought about deduplication,

I meant to refer to delta encoding, sorry, not byte-level dedup in a storage...

> The zfs filesystem has inbuilt deduplication (and compression) support
>
> so you could when creating a new backup volume
> create a virtual zfs pool/filesystem
> Write all backuped files to the zfs pool
> Which automatically does deduplication

We run ZFS as the filesystem of our file storages...

> You then write the virtual zfs file system to your bacula volume
>
> Though Not sure how well this would work in practice, but seems like a "simple" way to implement basic deduplication

Yes, ZFS is nice, but we are looking to transfer and store as little information as possible from what a fd sends us.

> Christopher tyerman

Cheers!!!
Re: [Bacula-devel] Open source Bacula plugin for deduplication
Hi Eric!!

Sorry, what are you referring to with the "aligned driver"?

By the way, I think I asked my question incorrectly: I was looking for delta encoding rather than dedup. Sorry mates :)

Cheers!

On 2022-03-03 15:31, Eric Bollengier via Bacula-devel wrote:

> Hello,
>
> On 03.03.22 15:03, webmaster via Bacula-devel wrote:
>
>> Hello
>>
>> I was reading this and had a thought about deduplication,
>>
>> The zfs filesystem has inbuilt deduplication (and compression) support
>>
>> so you could when creating a new backup volume
>> create a virtual zfs pool/filesystem
>> Write all backuped files to the zfs pool
>> Which automatically does deduplication
>>
>> You then write the virtual zfs file system to your bacula volume
>>
>> Though Not sure how well this would work in practice, but seems like a "simple" way to implement basic deduplication
>
> Thanks, the aligned driver is designed to work in this configuration.
>
> Best Regards,
>
> Eric
Re: [Bacula-devel] Open source Bacula plugin for deduplication
Hi Christopher,

Thank you so much again :)

Well, you know, what I'm really looking for is delta encoding. But anyway, the deduplication you are talking about assumes you are able to run ZFS on the server we are backing up, doesn't it?

Cheers!! :)

On 2022-03-03 15:33, webmaster wrote:

> Sorry might not have been obvious but what i was suggesting was a possible way of getting the file daemon to do deduplication using virtual zfs filesystems
>
> Christopher tyerman
[Bacula-devel] Questions about delta encoding implementation and virtual files
Hi!,

After digging into the plugin building documentation and checking the provided examples, I have some questions I have not been able to answer on my own. I describe them below in case you could give me a hand :). I would be very thankful if you could help me clarify them :)

- I'm working on an open source version of the delta encoding plugin, using the bacula-fd plugin API. While working on it I have seen that Bacula's source is aware of deltas and of delta file sequencing. For instance, there is even a .bvfs command for showing the deltas of a file id. What I have not found is any code in Bacula that generates the delta files (patch generation, signatures, etc.). I assume that the non-fd part of Bacula acts just as a holder of delta files: it keeps the files and stores the patch sequence, and that's it. Bacula keeps records of deltas in the database (and in file storage), but only the fd works with them (probably with a library like librsync in the delta plugin), in the sense of applying patches over an original file or generating deltas during backup. Am I wrong? I'm asking just to understand the nice work already done, and what is already written and free in Bacula's source for this purpose.

- By the way, I have one question about virtual files. It is not very clear to me (perhaps it's my problem, as I don't understand it) how to work with them. I understand the concept, but I have not seen a clear example of how, for instance, you create a virtual file during the backup, how you see it in bvfs and, finally, what you get after restoring. On page 36/146 of the Bacula 11 developers' PDF, you say "This will create a virtual file.", but really you are setting these fields in the structure:

    sp->type = FT_REG;
    sp->statp.st_mode = 0700 | S_IFREG;

FT_REG and S_IFREG are both for regular files, so what exactly causes a virtual file to be created? Perhaps st_size -1? Are virtual files relevant for what I'm trying to do? It seems Bacula handles delta sequencing, so perhaps I shouldn't need "virtual files" for this purpose?

- I'm planning to implement delta encoding by checking the previous day's file signature, generated by librsync. Instead of looking at the filesystem, it would be nice if I could look at that signature in the last backup done (yesterday's backup). Would it be possible in some manner, when I see a file passed in EventHandleBackupFile(), to check whether yesterday's signature exists in yesterday's backup, and then read it from the backup itself? I mean, instead of having to leave the signature on the filesystem of the server being backed up.

- The last one :). For restoring, and from the code I have seen (for instance in insert_missing_delta()), I assume Bacula detects that we are restoring a delta-compressed file. Then I assume that, apart from the initial file itself, Bacula restores the patches needed to arrive at the day we want to restore to. Am I wrong? Perhaps later, in a post-restore job, I could run a shell script that looks for patches still pending to be applied to a parent file. I suppose I could then apply them and the backup would finally be restored. Is there some more elegant way you could advise?

Thank you so much, really,

Best regards,
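The restore idea in the last question (initial file first, then each patch in order) can be illustrated with a small sketch. It rests on assumptions: each restored part is taken to carry a sequence number, with 0 for the full copy, and the names `restore_chain` and `apply_patch` are hypothetical. The toy patch format (a list of offset/bytes replacements) merely stands in for a real rdiff delta so the example stays self-contained.

```python
from typing import List, Tuple, Union

# Toy stand-in for an rdiff delta: (offset, replacement bytes) pairs.
Patch = List[Tuple[int, bytes]]
Part = Tuple[int, Union[bytes, Patch]]  # (sequence number, payload)


def apply_patch(base: bytes, patch: Patch) -> bytes:
    """Overwrite the given ranges of base with the replacement bytes."""
    data = bytearray(base)
    for offset, chunk in patch:
        data[offset:offset + len(chunk)] = chunk
    return bytes(data)


def restore_chain(parts: List[Part]) -> bytes:
    """Rebuild the latest file state from a full copy plus ordered patches.

    Sequence 0 must be the full file; 1..n are patches, each applied on
    top of the result of the previous one. A gap in the chain is an error,
    because every later patch depends on the missing one.
    """
    ordered = sorted(parts, key=lambda part: part[0])
    if not ordered:
        raise ValueError("nothing to restore")
    for expected, (seq, _) in enumerate(ordered):
        if seq != expected:
            raise ValueError(f"missing part with sequence {expected}")

    result = ordered[0][1]
    for _, patch in ordered[1:]:
        result = apply_patch(result, patch)
    return result
```

Parts may arrive in any order; the sketch sorts them before applying. Failing on a gap rather than skipping it is a deliberate choice here, since a patch applied to the wrong base would silently produce a corrupt file.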
Re: [Bacula-devel] Questions about delta encoding implementation and virtual files
No one? 😀

Any help would be very highly appreciated… really!! Some parts are not 100% clear in the docs, and I would be very proud to share my work 🙂🙂👍

Cheers!

Egoitz
Re: [Bacula-devel] Questions about delta encoding implementation and virtual files
Hi Eric!!

Thank you so much for your answer. Please give me a few minutes to read your mail calmly. I'm not just interested in developing plugins!!! I would love to see Bacula triumph over Veeam, and I want to contribute to that!!! Let me read it carefully, and let me emphasize that I'm extremely thankful for your help, really, Eric :) :) I'll answer later :) :)

Cheers!!

On 2022-03-08 09:56, Eric Bollengier wrote:

> Hello,
>
> On 3/7/22 12:44, egoitz--- via Bacula-devel wrote:
>
>> Hi!,
>>
>> After digging in plugin building documentation and checking the provided examples, I have some doubts I have not been able to clarify by my own. I describe them below in case you could give me a hand :). I would be very thankful if you could help me a little clarifying this doubts :)
>
> I am very pleased to see someone interested in developing Bacula plugins, especially around data compression.
>
>> - I'm working in trying to create an open source version of the delta encoding plugin by using the bacula-fd plugin api. When working on it I have seen Bacula's source is aware of delta and delta file sequentiation.
>
> Yes, Bacula can manage "delta" or "patches" for a given file. The first Full backup should take the entire file, the delta_seq will be 0.
>
> During the first Incr or the Differential, the plugin can generate a "patch" (can be done with "diff", or "xdelta", or rsync, or whatever).
>
> This new file is based on the original file, and the delta_seq will be set to 1 automatically (Accurate mode should be turned on). The plugin must set a variable in the save_pkt to indicate that it has saved a patch, and at the restore time, bacula must send back the version of the file included in the Full, then in Diff (if any) and all Incrementals.
>
> On the plugin side, the restore code will be called for each part of the file.
>
>> I have seen for instance, even a .bvfs command exists for showing deltas of a file id. But, what I have not found is that Bacula works on that Delta files generation (patch generation, signatures, etc...). I assume that Bacula in the non-fd part, acts just as a just delta file holder keeping the files and stores the patch sequentiation just that. Bacula keeps records of deltas in database (and file storages) but only fd works with them (with probably a library like librsync in the delta plugin) in the sense of applying patches over an original file or even generating deltas when backup. Am I wrong?. Was just for understanding the nice work done and what's already written and free in Bacula's source for this purpose.
>
> I think that you got the concept, the "delta" in Bacula means that you must restore all parts of a file, and not just the last copy.
>
> With a program such as VMware, CBT helps the backup software to generate "patches" as well for example.
>
>> - By the way, I have one question about virtual files. I have not seen very clear (perhaps my problem as don't understand it) how to work with them. I understand the concept, but have not seen a clear example of how for instance in the backup you create a virtual file, how do you see it in bvfs and finally... what you get after restoring. In page 36/146 of Bacula 11 for developers pdf, you say "This will create a virtual file." but really you are entering in the structure :
>>
>>     sp->type = FT_REG;
>>     sp->statp.st_mode = 0700 | S_IFREG;
>>
>> FT_REG and S_IFREG both are for regular files what exactly causes a virtual file to be created?. Perhaps st_size -1?.
>
> A virtual file is generated by plugins, they don't have to exist on disk. The name can be anything, it can also point to an existing file.
>
> The plugin code will be executed to restore the "virtual file", the result can be a real file on disk, or a virtual machine on Proxmox for example.
>
> In this example, we have a regular file, but it's a virtual file that may or may not exist on the filesystem.
>
>> Are they relevant for what I'm trying to do?. It seems Bacula handles delta sequentiation so... perhaps for this purpose I shouldn't need "virtual files"?.
>
> In your case, it will be virtual files
Re: [Bacula-devel] Questions about delta encoding implementation and virtual files
Hi Eric!! Sorry for answering so late. I answer between lines in color green bold for instance, for better clarification. El 2022-03-08 09:56, Eric Bollengier escribió: > Hello, > > On 3/7/22 12:44, egoitz--- via Bacula-devel wrote: > >> Hi!, >> >> After digging in plugin building documentation and checking the provided >> examples, I have some doubts I have not been able to clarify by my own. >> I describe them below in case you could give me a hand :). I would be >> very thankful if you could help me a little clarifying this doubts :) > > I very pleased to see someone interested to develop Bacula plugins, > specially around data compression. > > I WROTE SOMETIME AGO TOO A VERY LITTLE PATCH FOR BACULA, FOR SPEEDING UP THE > BVFS_CACHE ON POSTGRESQL (IT HAPPENED AT LEAST IN THE POSTGRESQL VERSION WE > WERE USING AT LEAST IN THAT MOMENT). VERY HAPPY AND PROUD OF DOING THAT AND > THIS NEW ONE :) . > > MY PLAN IS TO FIRST WRITE ABOUT DATA COMPRESSION BUT I WOULD LOVE TOO... TO > KNOW BETTER THE BACULA FD PLUGIN API, IN ORDER TO WRITE SOME OTHER CUSTOM > PLUGINS FOR US... THE DELTA ONE, IS IMPORTANT FOR US, BUT IT'S TOO THE FACT > OF LEARNING HOW THE API WORKS. YOU KNOW, KNOWLEDGE ENDS UP BEING FLEXIBILITY > AND SO... AND BECAUSE I WANTED TOO DOING IT TOO REALLY :) :) :) :) LOL > >> - I'm working in trying to create an open source version of the delta >> encoding plugin by using the bacula-fd plugin api. When working on it I >> have seen Bacula's source is aware of delta and delta file >> sequentiation. > > Yes, Bacula can manage "delta" or "patches" for a given file. The first > Full backup should take the entire file, the delta_seq will be 0. > > YEP I SUPPOSED BY THE CODE READ... 
> > I ASSUME NOW IT'S JUST FOR REGISTRATION AND HELPING DEVELOPERS :) :) :) > > BUT BASICALLY PREVIOUSLY WANTED TO KNOW IF YOU WERE JUST STORING THE DELTA > SEQUENCE, BECAUSE DELTA ENCONDING IS USEFUL IF YOU NEED IT OR YOU WERE > STORING THAT DELTA DATA FOR LATER USING IT WITH SOME OTHER CODE THAT I DON'T > HAVE OR WHATEVER > > During the first Incr or the Differential, the plugin can generate a > "patch" (can be done with "diff", or "xdelta", or rsync, or whatever). > > OK, YES I SUPPOSE I WILL USE RDIFF DIRECTLY. THE BINARY BUILT AND PROVIDED > WITH LIBRSYNC BY DEFAULT. IT SEEMS TO RUN PRETTY FINE EVEN ON WINDOWS WITH > MSYS2.ORG PACKAGE SO > > This new files is based on the original file, and the delta_seq will > be set to 1 automatically (Accurate mode should be turned on). > > YEP I HAVE READ BEFORE THIS TOO... > > The plugin > must set a variable in the save_pkt to indicates that it has saved a > patch, > > I ASSUME, YOU ARE TALKING ABOUT THE PRESENCE OF THE FO_DELTA FLAG IN > SP->FLAGS? > > and at the restore time, bacula must send back the version of > the file included in the Full, then in Diff (if any) and all Incrementals. > > YES THIS PART IT'S CLEAR TOO... > > On the plugin side, the restore code will be called for each part of > the file. > OK, SO EACH THE INITIAL FILE COPIED USING DELTA PLUS ALL IT'S PATCHES WILL > GET RESTORED AS TOTALLY NORMAL FILES. FOR INSTANCE, WHEN BACULA IS GOING TO > RESTORE A FILE WHICH IS BACKED UP USING DELTA ENCODING, BACULA WILL DO : > - RESTORE INITIAL FILE > - RESTORE FIRST PATCH > - RESTORE SECOND PATCH > - AND SO ON...? > >> I have seen for instance, even a .bvfs command exists for >> showing deltas of a file id. But, what I have not found is that Bacula >> works on that Delta files generation (patch generation, signatures, >> etc...). I assume that Bacula in the non-fd part, acts just as a just >> delta file holder keeping the files and stores the patch sequentiation >> just that. 
Bacula keeps records of deltas in database (and file >> storages) but only fd works with them (with probably a library like >> librsync in the delta plugin) in the sense of applying patches over an >> original file or even generating deltas when backup. Am I wrong?. Was >> just for understanding the nice work done and what's already written and >> free in Bacula's source for this purpose. > > I think that you got the concept, the "delta" in Bacula means that you > must restore all parts of a file, and not just the last copy. > > With a program such as VMware, CBT helps the backup software to generate > "patches" as well for example. > > YEP I HAVE READ ABOUT CBT TOO... AND EVEN I HAVE DONE SOME WORK FOR BACKING > IP XCP-NG USING DELTAS OF VM :) (THERE IN XCP-NG WITHOUT
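Editor's sketch of the restore ordering described in this message: the Full copy carries delta_seq 0, each later patch carries the next number, and on restore every part has to be handed to the plugin in that order. The struct and function below are hypothetical illustrations only, not Bacula types or API calls; in Bacula itself the ordering comes from the catalog, as explained in the thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record for one stored part of a file; NOT a Bacula type. */
struct delta_part {
    int delta_seq;      /* 0 = full copy, 1..n = patches in backup order */
    const char *label;  /* free-form description, e.g. the job level */
};

/* qsort comparator: ascending delta_seq. */
static int by_delta_seq(const void *a, const void *b)
{
    const struct delta_part *pa = a;
    const struct delta_part *pb = b;
    return (pa->delta_seq > pb->delta_seq) - (pa->delta_seq < pb->delta_seq);
}

/*
 * Sort the parts into restore order and verify the chain is complete:
 * it must start at delta_seq 0 (the full copy) and contain no gaps.
 * Returns 0 when the chain can be replayed, -1 otherwise.
 */
int order_restore_chain(struct delta_part *parts, size_t n)
{
    if (n == 0) {
        return -1;
    }
    qsort(parts, n, sizeof(parts[0]), by_delta_seq);
    for (size_t i = 0; i < n; i++) {
        if (parts[i].delta_seq != (int)i) {
            return -1; /* missing full copy or missing patch */
        }
    }
    return 0;
}
```

A plugin would then apply parts[1], parts[2], ... on top of parts[0], for example with rdiff patch. A chain with a gap is rejected because a patch cannot be applied without the version it was computed against.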
Re: [Bacula-devel] Questions about delta encoding implementation and virtual files
Hello Radoslaw,

I'm extremely busy these days. I promise I will read this email tomorrow and
answer you then. Thank you so much in advance. I'm extremely thankful for your
help, mates :) :) Really, I'll answer tomorrow, once I have been able to read
it carefully :) :) :)

Cheers!!!

On 2022-03-08 11:10, Radosław Korzeniewski wrote:

> WARNING: This email was sent from outside the organization. Do not click
> on links or open attachments unless you recognize the sender and know
> that the content is safe.
>
> Hello,
>
> On Mon, 7 Mar 2022 at 12:44, egoitz--- via Bacula-devel wrote:
>
>> - I'm working on an open source version of the delta encoding plugin,
>> using the bacula-fd plugin API. While working on it I have seen that
>> Bacula's source is aware of deltas and delta file sequencing. I have
>> seen, for instance, that a .bvfs command even exists for showing the
>> deltas of a file id. But what I have not found is where Bacula does the
>> delta file generation (patch generation, signatures, etc...).
>
> "Delta files generation", whatever it is, is solely the responsibility of
> the plugin.
> Bacula provides a delta sequencing mechanism only.
>
>> I assume that Bacula, in the non-fd part, acts just as a delta file
>> holder, keeping the files and storing the patch sequence, just that.
>
> It is stored in the catalog, so during restore the Director knows that it has
> to include all incremental files in sequence instead of the latest one.
>
>> Bacula keeps records of deltas in the database (and file storages) but only
>> the fd works with them (probably with a library like librsync in the delta
>> plugin)
>
> No, the delta sequencing, a.k.a. block level incrementals, is handled by the
> fd and dir, where the fd/plugin is responsible for all diffs and patches.
>
>> in the sense of applying patches over an original file or even generating
>> deltas at backup time. Am I wrong?
>> I was just trying to understand the nice work done and what's already
>> written and free in Bacula's source for this purpose.
>
> Yes, the fd/plugin is responsible for all "the dirty work" and the Bacula
> "framework" helps organize it.
>
>> - By the way, I have one question about virtual files. It is not very
>> clear to me (perhaps it's my problem, as I don't understand it) how to work
>> with them. I understand the concept, but I have not seen a clear example of
>> how, for instance, you create a virtual file during the backup, how you see
>> it in bvfs and finally... what you get after restoring. On page 36/146 of
>> the Bacula 11 for developers PDF, you say "This will create a virtual
>> file." but really you are entering in the structure:
>>
>> sp->type = FT_REG;
>> sp->statp.st_mode = 0700 | S_IFREG;
>>
>> FT_REG and S_IFREG are both for regular files, so what exactly causes a
>> virtual file to be created? Perhaps st_size -1?
>
> In this sense a "virtual" file is a file which does not exist on a backup
> server and is created programmatically with a plugin API. This file is seen
> by Bacula as an ordinary file with a slightly different backup stream id, so
> the fd will know what tool to use to restore it.
> You can fill "struct stat" with whatever acceptable values you want, where
> st_size should match as closely as possible the real size of the saved file.
> A size of -1 will confuse users.
>
>> Are they relevant for what I'm trying to do? It seems Bacula handles delta
>> sequencing so... perhaps for this purpose I shouldn't need "virtual
>> files"?
>
> No, the virtual files are available in the command plugins API only and are
> used mainly for creating backups of different applications, i.e. running
> databases, where a standard file backup is useless, not optimal or simply
> unavailable.
>
>> - I'm planning to implement delta encoding by checking the previous day's
>> file signature, generated by librsync.
>> Instead of looking at the filesystem, it would be nice if I could look at
>> that signature in the last backup done (yesterday's backup).
>
> What "signature" are you thinking of?
> There is an "accurate catalog query API" in Bacula but as far as I know it
> does not handle checksums (md5, sha, etc).
> You can extend this code if you wish.
>
>> Would it be possible in some manner, when I see a file passed in
>> EventHandleBackupFile(), to check whether yesterday's signature exists in
>> yesterday's backup, and then read yesterday's signature from the own &
Re: [Bacula-devel] Questions about delta encoding implementation and virtual files
Hi Heitor!!!

Are you serious??? Really??? :) :) I will learn how the plugin API works
anyway, because I wanted to write some custom plugins for my company, you
know... very customized ones :) :)

But are you serious??? :) :)

Thanks mate!! :)

On 2022-03-08 13:30, Heitor Faria wrote:

> Hello Egoitz,
>
>> Thank you so much for your answer. Please give me some minutes to read your
>> mail calmly. I'm not just interested in developing plugins!!! I would
>> love for Bacula to triumph over Veeam... and I wanted to contribute to
>> that!!!
>
> Unfortunately the Bacula version that competes with Veeam is the Enterprise
> one, even though the Community version is very good and is adequate for a
> lot of customers, depending on their requirements.
>
>> Let me please read carefully, and please let me emphasize that I am
>> extremely thankful for your help, really, Eric :) :)
>
> Regarding your development quest, effort-wise, it would be much
> better if Bacula Systems were convinced to port the Delta Plugin to the
> Community version. And then maybe you can work on something else.
> March 21st is Kern's birthday. He is retired but you could ask for this gift.
> =)
>
> Regards,
> --
>
> MSc Heitor Faria (Miami/USA)
> Bacula LATAM CIO
>
> mobile1: + 1 909 655-8971
> mobile2: + 55 61 98268-4220
>
> [1]
>
> [2]
>
> América Latina
>
> bacula.lat [3] | bacula.com.br [4]

Links:
--
[1] https://www.linkedin.com/in/msc-heitor-faria-5ba51b3
[2] Http://www.bacula.com.br
[3] http://bacula.lat
[4] http://www.bacula.com.br

___
Bacula-devel mailing list
Bacula-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-devel
Re: [Bacula-devel] Fwd: Open source Bacula plugin for deduplication
Hello Bob,

Same as for Radoslaw... I'm extremely busy these days. I promise I will read
this email tomorrow and answer you then. Thank you so much in advance. I'm
extremely thankful for your help, mates :) :) Really, I'll answer tomorrow,
once I have been able to read it carefully :) :) :)

Cheers!!!

On 2022-03-08 16:42, Bob Hetzel wrote:

> Hi,
>
> I'm not a developer but I have a lot of familiarity with Microsoft SQL
> Server. I'm not sure if you meant Microsoft or not. In Microsoft SQL Server
> you normally back up the db using a SQL call. That can be set to
> compress the backup, so a 150gb db file will typically only produce backup
> files of 5gb to 15gb in size. Additionally, SQL Server supports log and
> differential backups, so you would not need to do a full backup every time.
>
> So you could basically just run db engine backups, then skip the in-use db
> files and back up the backup files instead.
>
> I realize this is basically a 2-step backup rather than a simple 1-step one,
> so it definitely has some drawbacks, but I figured I'd mention that as an
> option off-list.
>
> The way enterprise datacenters have changed over the last 20 years, hardly
> any backups now use file system agent based scans. Almost everything is
> a VM, so I would not expect a lot of usage for your enhancement, because
> backups in most cases are of the entire VM, using the change block tracking
> that was mentioned in this thread.
>
> I could go into more detail on any of this if you have questions about any of
> the things I've mentioned.
> > Bob > > Begin forwarded message: > >> FROM: egoitz--- via Bacula-devel >> DATE: March 3, 2022 at 6:33:25 AM CST >> TO: Radosław Korzeniewski >> CC: bacula-devel@lists.sourceforge.net >> SUBJECT: RE: [BACULA-DEVEL] OPEN SOURCE BACULA PLUGIN FOR DEDUPLICATION >> REPLY-TO: ego...@ramattack.net > > Hello Radoslaw, > > I will answer below in green color for instance... just for discerning better > what both have spoke... :) > > El 2022-03-03 12:46, Radosław Korzeniewski escribió: > > ATENCION: Este correo se ha enviado desde fuera de la organización. No pinche > en los enlaces ni abra los adjuntos a no ser que reconozca el remitente y > sepa que el contenido es seguro. > > Hello, > > czw., 3 mar 2022 o 12:09 egoitz--- via Bacula-devel > napisał(a): > > Good morning, > > I know Bacula enterprise provides deduplicacion plugins, but sadly we can't > afford it. No problem, we will try to create an open source deduplication > plugin for bacula file daemon. I would use rdiff (part of librsync) for delta > patching and signature generation. > What signatures rdiff is using? > > BASICALLY HERE IS DOCUMENTED EXACTLY... > HTTPS://LIBRSYNC.GITHUB.IO/PAGE_FORMATS.HTML > > IT'S FOR BEING ABLE TO GENERATE DELTA PATCHES, WITHOUT THE NEED OF HAVING OLD > AND NEW VERSION OF A FILE... AND SO... FOR AVOID DOUBLING THE SPACE USED OR > REQUIRED FOR BACKING UP... > > I would love to create a Bacula plugin for deduplicating content at fd level. > This way, even if the backup is sent crypted by fd to sd, the deduplication > could be done obtaining the best results as the deduplication takes place > when the files are not crypted yet. > Yes, for proper encryption you would always get different bits for the same > data block making deduplication totally useless. :) > > I THINK THAT TOO.. YES... > > The deduplication, would only be applied to files, let's say larger than > 10GB. > ??? 
> > I designed Bacula deduplication to handle blocks (files) larger than 1k > because indexing overhead for such small blocks was too high. The larger the > block you use the lower chance to get a good deduplication ratio. So it is a > trade-off - small blocks == good deduplication ratio but higher indexing > overhead; larger blocks == weak deduplication ratio but lower indexing > overhead. So it was handling block levels from 1K up to 64k (the default > bacula block size, but could be extended to any size). > > I UNDERSTAND WHAT YOU SAY BUT THE PROBLEM WE ARE FACING IS THE FOLLOWING ONE. > IMAGINE, A MACHINE WITH A SQL SERVER AND 150GB OF DATABASES. OUR PROBLEM IS > TO HAVE TO INCREMENTALLY COPY THAT EACH DAY. WE DON'T REALLY MIND COPYING 5GB > OF "WASTED" SPACE PER DAY... EVEN WHEN NON NECESSARY (JUST FOR > UNDERSTANDING) BUT OBVIOUSLY 100GB PER DAY OR 200GB... ARE
Re: [Bacula-devel] Questions about delta encoding implementation and virtual files
Good morning, As said yesterday, thank you so much for your answer. This is very important for me, really. I answer below, in green bold for instance... for being able to be distinguished better my comments... El 2022-03-08 11:10, Radosław Korzeniewski escribió: > ATENCION: Este correo se ha enviado desde fuera de la organización. No pinche > en los enlaces ni abra los adjuntos a no ser que reconozca el remitente y > sepa que el contenido es seguro. > > Hello, > > pon., 7 mar 2022 o 12:44 egoitz--- via Bacula-devel > napisał(a): > >> - I'm working in trying to create an open source version of the delta >> encoding plugin by using the bacula-fd plugin api. When working on it I have >> seen Bacula's source is aware of delta and delta file sequentiation. I have >> seen for instance, even a .bvfs command exists for showing deltas of a file >> id. But, what I have not found is that Bacula works on that Delta files >> generation (patch generation, signatures, etc...). > > "delta files generation" - whatever it is is solely responsible for the > plugin. > Bacula provides a delta sequencing mechanism only. > > THIS IS WHAT I WANTED TO CONFIRM :) :) YEP :) :) > >> I assume that Bacula in the non-fd part, acts just as a just delta file >> holder keeping the files and stores the patch sequentiation just that. > > It is stored in the catalog, so during restore the Director knows that it has > to include all incremental files in sequence instead of the latest one. > > I SEE... IT WAS IMPORTANT FOR ME TO CLARIFY THIS... > >> Bacula keeps records of deltas in database (and file storages) but only fd >> works with them (with probably a library like librsync in the delta plugin) > > No, the delta sequencing - a.k.a. block level incrementals is handled by fd > and dir where a fd/plugin is responsible for all diff + patches. > > I SEE... > >> in the sense of applying patches over an original file or even generating >> deltas when backup. Am I wrong?. 
Was just for understanding the nice work >> done and what's already written and free in Bacula's source for this purpose. > > Yes, the fd/plugin is responsible for all "the dirty work" and Bacula > "framework" helps organize it. > > VERY CLEAR NOW RADOSLAW... I NEEDED TO CONFIRM IT :) :) BECUASE YOU KNOW... > PERHAPS THERE IS A NON-AVAILABLE CODE THAT DOES SOMETHING ELSE WITH THAT INFO > STORED IN CATALOG... AND JUST... FOR HAVING A VISION OF HOW IS BEING USED > THAT STORED DATA FOR LATER WRITTING IN SOME MANNER THE PENDING CODE :) . > THANKS MATE :) > >> - By the way, I have one question about virtual files. I have not seen very >> clear (perhaps my problem as don't understand it) how to work with them. I >> understand the concept, but have not seen a clear example of how for >> instance in the backup you create a virtual file, how do you see it in bvfs >> and finally... what you get after restoring. In page 36/146 of Bacula 11 for >> developers pdf, you say "This will create a virtual file." but really you >> are entering in the structure : >> >> sp->type = FT_REG; >> sp->statp.st_mode = 0700 | S_IFREG; >> >> FT_REG and S_IFREG both are for regular files what exactly causes a >> virtual file to be created?. Perhaps st_size -1?. > > In this sense a "virtual" is a file which does not exist on a backup server > and is created programmatically with a plugin API. This file is seen by a > Bacula as an ordinary file with a slightly different backup stream id, so fd > will know what tool to use to restore it. > You can fill "struct stat" with whatever acceptable values you want, where > st_size should match as close as possible the real size of the saved file. > The size == -1 will confuse users. > > OK, OK SO A VIRTUAL FILE IS WRITTEN IN THE STRUCT THE SAME MANNER AS A > REGULAR FILE. YOU MEAN THEN, THERE'S NOTHING SHOULD BE DONE SPECIALLY IN THE > STRUCTS, FOR BACULA TO KNOW WHETHER IT'S HANDLING A VIRTUAL OR A REGULAR FILE > (AND OBVIOUSLY FOR ACTING ACCORDINGLY?. 
> > BUT THEN... I CAN'T REALLY UNDERSTAND HOW DOES BACULA DO THE CORRECT THING?. > I MEAN... IT HAS SOME SORT OF TABLE, WITH THE FILEID, REFERENCING TO OTHER X > FILEIDS TO BE RESTORED TO DISK OR SIMILAR?. > > PLEASE, LET ME SET AN EXAMPLE : > > I WANTED TO BACKED THE THREE FILES IN A VIRTUAL FILE, I SHOULD DO SOMETHING > AS THIS? > > static const char *files[] = { > "/filea", > "/fileb", > "/filec" > }; > static int nb_files = 3; > STATIC
Re: [Bacula-devel] Fwd: Open source Bacula plugin for deduplication
Hi Bob, Thanks a lot in advance for your time and help. Absolutely appreciated mate!. Answering below, in green bold for instance :) :) ... El 2022-03-08 16:42, Bob Hetzel escribió: > ATENCION: Este correo se ha enviado desde fuera de la organización. No pinche > en los enlaces ni abra los adjuntos a no ser que reconozca el remitente y > sepa que el contenido es seguro. > > Hi, > > I'm not a developer but I have a lot of familiarity with Microsoft SQL > Server. I'm not sure if you meant Microsoft or not. In Microsoft SQL Server > you normally back up the db using using a SQL call. That can be set to > compress the backup so a 150gb db file will typically only produce backup > files of 5gb to 15gb in size. Additionally, SQL Server supports log and > differential backups so you would not need to do a full backup every time. > > I KNOW BUT AS FAR AS I KNOW, IT'S POSSIBLE TO BACK UP SQL SERVER WITH VSS > WITHOUT DOING DUMPS BEFORE. I THINK IT IS... I WOULD DO TOO THE DUMP. YES I > KNEW YOU CAN DO INCREMENTAL BACKUPS AND SO > > WAS JUST... AS WE TRY TO BACKUP THE WHOLE FILESYSTEM FOR SAVING SOME > SPACE IN THE BACKUP > > So you could basically just run db engine backups then skip the in-use db > files and back up the backups files instead. > > I AGREE WITH YOU... > > I realize this is basically a 2-step backup rather than a simple 1-step so > definitely has some drawbacks, but figured I'd mention that as an option > off-list. > > TRUE... > > The way enterprises datacenters have changed over the last 20 years, hardly > any backups are now using file system agent based scans. Most everything is > a VM so I would not expect a lot of usage for your enhancement, because > backups in most cases are of the entire VM using the change block tracking > that was mentioned in this thread. > > WELL YES... YOU COULD USE CBT IN VMWARE OR SOME OTHER WAYS TOO IN > XCP-NG/XENSERVER AT VIRTUAL DISK LEVEL BUT YES... TODAY THIS METHOD HAS > BECOME MORE AND MORE POPULAR. EVEN THAT... 
I WOULD SAY I PREFER DOING THE > BACKUP FROM "INSIDE" THE MACHINE... INSTEAD OF USING VIRTUAL BLOCK DISKS > OR ELSE... USE BOTH... A WHOLE BACKUP FROM VIRTUAL DISK SIDE... BUT ANOTHER > BACKUP TOO AT FILE LEVEL INSIDE THE VM, WITH FOR INSTANCE THE DATABASE DUMPS > ONLY... > > I could go into more detail on any of this if you have questions about any of > the things I've mentioned. > > THANK YOU SO MUCH BOB. I TAKE NOTE AND WOULD TELL YOU IF I NEEDED TO SHARE > WITH YOU AGAIN SOME PLAN... > > CHEERS!! > > Bob > > Begin forwarded message: > >> FROM: egoitz--- via Bacula-devel >> DATE: March 3, 2022 at 6:33:25 AM CST >> TO: Radosław Korzeniewski >> CC: bacula-devel@lists.sourceforge.net >> SUBJECT: RE: [BACULA-DEVEL] OPEN SOURCE BACULA PLUGIN FOR DEDUPLICATION >> REPLY-TO: ego...@ramattack.net > > Hello Radoslaw, > > I will answer below in green color for instance... just for discerning better > what both have spoke... :) > > El 2022-03-03 12:46, Radosław Korzeniewski escribió: > > ATENCION: Este correo se ha enviado desde fuera de la organización. No pinche > en los enlaces ni abra los adjuntos a no ser que reconozca el remitente y > sepa que el contenido es seguro. > > Hello, > > czw., 3 mar 2022 o 12:09 egoitz--- via Bacula-devel > napisał(a): > > Good morning, > > I know Bacula enterprise provides deduplicacion plugins, but sadly we can't > afford it. No problem, we will try to create an open source deduplication > plugin for bacula file daemon. I would use rdiff (part of librsync) for delta > patching and signature generation. > What signatures rdiff is using? > > BASICALLY HERE IS DOCUMENTED EXACTLY... > HTTPS://LIBRSYNC.GITHUB.IO/PAGE_FORMATS.HTML > > IT'S FOR BEING ABLE TO GENERATE DELTA PATCHES, WITHOUT THE NEED OF HAVING OLD > AND NEW VERSION OF A FILE... AND SO... FOR AVOID DOUBLING THE SPACE USED OR > REQUIRED FOR BACKING UP... > > I would love to create a Bacula plugin for deduplicating content at fd level. 
> This way, even if the backup is sent crypted by fd to sd, the deduplication > could be done obtaining the best results as the deduplication takes place > when the files are not crypted yet. > Yes, for proper encryption you would always get different bits for the same > data block making deduplication totally useless. :) > > I THINK THAT TOO.. YES... > > The deduplication, would only be applied to files, let's say larger than > 10GB. > ??? > > I designed Bacula deduplication to handle blocks (files) larger than 1k
[Bacula-devel] A special day for honoring someone who has done an extremely nice job in the open source world
Good morning people :)

Today is a special day. It is the birthday of someone now deservedly retired.
That person is Kern Sibbald. I learned about his birthday through a nice
person on this mailing list. Kern is much more than a good coder who has
written an extremely important tool, one that is the base of most of our
backups. For me, at least, he is one of my mentors, one of the people I would
like to learn from, because he has done a great job with Bacula. For all these
reasons, I wanted to say "thank you so much Kern and have an extremely happy
birthday!!".

I also wanted to make the most of these lines and ask Kern for a little gift
:) :). As some of you know, I'm working on an open source delta encoding
plugin for the Bacula fd. It would be hugely nice :) :) (as someone also told
me on this list) if the official Delta plugin of Bacula could be distributed
with the Community source of Bacula. I'll go on writing my own plugin anyway,
because I want to learn how it works and to be able to customize some of the
backups I do here, but it would be really nice to have the Delta plugin as
part of the Community edition of this nice piece of software. I had to try :)
:) .

Anyway, and independently of what Kern decides about the gift I have asked
for :) :) (I had to try... mainly after someone encouraged me to ask :) :) ),
I wanted to emphasize my recognition of Kern as a person, for all his
contributions to the open source world. Whatever the answer, my opinion of
you would never change.

So, to end this email, I think there are no more appropriate words than the
following: "Kern, thank you so much".

Cheers :)
Re: [Bacula-devel] A special day for honoring someone who has done an extremely nice job in the open source world
Hi Kern I didn't see your email!!!. How could it be!. I'm so sorry for it!!. I'm really busy these days. I answer below, between the lines in bold blue for instance, for a more clear and clean understanding of my answers :) El 2022-03-24 17:55, Kern Sibbald escribió: > Hello Egoitz, > > Thanks for your very kind email with birthday wishes. I am really honored > that you call me a mentor, thanks. I have always done my best to produce a > high end backup product (though it is a bit complicated) that is robust and > reliable with all the features needed by the community as well as small > enterprises. Though it is sometimes hard, I feel that we succeeded in > creating a friendly and helpful email list. > > IT WOULD BE IMPOSSIBLE TO NOT CONSIDER YOU A MENTOR!!. YOU ARE THE CREATOR OF > BACULA KERN!!. YOU REALLY SUCCEEDED A LOT YES :) :) > > Thanks for using Bacula and for your email. > > IT IS THE LESS I CAN DO AND KNOWING IS YOU BIRTHDAY!! > > Concerning your plugin: in general we accept all contributions that are > useful, follow our current design conditions (in the Developers document) and > for which we have a signed CAA (copyright assignment agreement). This > agreement allows Bacula Systems to use the code, but even more important for > me, it protect all Bacula users from someone introducing code then claiming > we are using his proprietary code (this is what Bareos attempted to do to > Bacula). I can remember only on case where we we not able to use the code, > and that was code developed by UKFast (ISP) and released by the developer > without permission. We we never able to get permission to use the code from > UKFast. This code implemented quotas. Instead of integrating unauthorized > code, we wrote our own simpler and more efficient quota code. Note, Bareos > integrated the unauthorized code, so perhaps one day their users may be > exposed to license problems. > > INCREDIBLE... REALLY THE PROBLEMS WITH BAREOS... IT'S A PITTY. > > I SEE... 
I NORMALLY OFFER THE CODE AVAILABLE FOR DOWNLOADING AND WITH BSD > LICENSE... I DON'T MIND PEOPLE TO USE MY CODE IF IT HELPS IMPROVING > SOMETHING... FOR ME IT'S ALL FINE!! > > If you finish your code, and you are willing to submit a CAA, then I > recommend that you submit it to Bacula (Now that I am retired, Eric decides > exactly what is integrated and what is not). In my opinion, it would be a > very nice addition. > > I'M WORKING ON :) :) AND CONVINCING MY BOSS TO GIVE DEVEL HOURS :) :) > > I am very pleased that Bacula Systems has agreed to look after maintaining > the community version and continue adding the Bacula Systems new features. > This means that Bacula will be getting better and better and continue to > adapt to the ever changing IT backup/restore needs. > > THE ONE PLEASED WITH YOUR ANSWER IS ME KERN. IT'S A GREAT HONOR FOR ME... YOU > TO HAVE ANSWERED TO MY EMAIL :) :) REALLY > > Thank you very much for recognizing my contributions, and above all thank you > for using Bacula and developing code for it. I wish you all the best in the > future. > > SAME FOR YOU AND CONGRATULATIONS FOR SUCH A SUCCESSS CAREER!!!. I WISH I > WOULD HAD THE SAME SUCCESS AS THE HALF OF YOU :) :) > > Kind regards, > > BEST REGARDS :) :) > Kern > (Currently in Puerto Rico until May then back to Barcelona). > I'M FROM BILBAO!!! SLIGHTLY FAR FROM HERE ! :) > > On 3/21/2022 4:40 AM, ego...@ramattack.net wrote: > >> Good morning people :) >> >> Today is a special day. Is the birthday of someone now deservedly retired. >> This person is Kern Sibbald. I had the opportunity of knowing about his >> birthday, through one nice person in this mailing lists. Kern is much more >> than a nice codder, who has write a extremely important tool which is the >> base of most of our backups. At least for me is one of my mentors, one of >> the persons from which I would like to learn because it has done a nice job >> with Bacula. 
For all these reasons, I wanted to say "thank you so much Kern >> and have an extremely happy birthday!!". >> >> I wanted too, for making the most of this lines, to ask a little gift to >> Kern :) :). I'm working as some of you know, in building a Bacula pluggin >> for creating an open source delta encoding plugin for the fd. It would be >> hugely nice :) :) (as someone told me too in this list) if official Delta >> plugin of Bacula, could be distributed with the Community source of Bacula. >> I'll go on writting my own plugin anyway, because I wanted to learn how it >> works and for being able to customize some of the backups I do here, but it >> would be really nice to have the Delta plugin as part of the Community >> edition of this nice piece of software. I had to try :) :) . >> >> Anyway and independently of what Kern decides about the gift I have asked :) >> :) (I had to try it... mainly after someone encouraged asking it :) :) ), I >> wanted to emphasi
[Bacula-devel] Statically linking file daemon
Hi!

I need to build the file daemon statically linked, because it will run on
servers that I don't manage. Those servers get upgraded (apt package
upgrades), so the library versions could change, or the libraries could even
be uninstalled.

I run configure as:

./configure --prefix='/opt/bacula-11.0.5-fd' --enable-client-only --enable-static --enable-static-fd --disable-libtool --with-openssl --with-lzo

When I run make I see warnings like:

/opt/bacula-community-Release-11.0.5/bacula/src/lib/priv.c:76: warning: Using 'getgrnam' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/bin/ld: ../lib/libbac.a(bsys.o): in function `get_group_members(char const*, alist*)':

Can the glibc libraries be copied by some configure flag or option, so that
they are available to bacula-fd? I suppose that, when starting bacula-fd, I
would then have to set LD_LIBRARY_PATH and LD_RUN_PATH to point to the
directory holding the glibc libraries used when linking bacula-fd.

What's the recommended way to end up with a static fd without surprises or
problems later?

Best regards,
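Editor's note on the warning quoted in this message: it is emitted by the glibc toolchain rather than by Bacula's build system. getgrnam() resolves group names through glibc's name service switch (NSS), which loads shared modules at run time, so even a binary linked with -static still depends on matching glibc shared libraries for that call. The sketch below is a minimal stand-alone function, not Bacula code; the name lookup_gid is hypothetical. Linking a program that calls it with gcc -static on a glibc system should be expected to produce the same kind of warning.

```c
#include <grp.h>
#include <sys/types.h>

/*
 * Look up a group by name, the same libc call that priv.c uses.
 * Returns the group's gid, or (gid_t)-1 when the group is not found.
 */
gid_t lookup_gid(const char *name)
{
    struct group *gr = getgrnam(name);
    return gr ? gr->gr_gid : (gid_t)-1;
}
```

This reproduces the dependency in isolation, so the effect of any change (for example linking against a different C library) can be checked without rebuilding the whole file daemon.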