Hello,

On 3/7/22 12:44, egoitz--- via Bacula-devel wrote:
Hi!

After digging into the plugin-building documentation and checking the
provided examples, I have some doubts I have not been able to clarify on
my own. I describe them below in case you could give me a hand :). I
would be very thankful if you could help me clarify these doubts :)

I am very pleased to see someone interested in developing Bacula plugins,
especially around data compression.

- I'm trying to create an open source version of the delta encoding
plugin using the bacula-fd plugin API. While working on it I have seen
that Bacula's source is aware of deltas and delta file sequencing.

Yes, Bacula can manage "deltas" or "patches" for a given file. The first
Full backup should take the entire file, and its delta_seq will be 0.

During the first Incremental or Differential, the plugin can generate a
"patch" (with "diff", "xdelta", rsync, or whatever).

This new file is based on the original file, and the delta_seq will
be set to 1 automatically (Accurate mode should be turned on). The plugin
must set a variable in the save_pkt to indicate that it has saved a
patch, and at restore time, Bacula must send back the version of
the file included in the Full, then the one in the Diff (if any), and
all Incrementals.

On the plugin side, the restore code will be called for each part of
the file.

I have seen, for instance, that even a .bvfs command exists for
showing the deltas of a file id. But what I have not found is Bacula
itself working on the generation of those delta files (patch generation,
signatures, etc...). I assume that Bacula, in the non-fd part, acts just
as a delta file holder, keeping the files and storing the patch
sequence, and nothing more. Bacula keeps records of deltas in the
database (and file storages), but only the fd works with them (probably
with a library like librsync in the delta plugin) in the sense of
applying patches over an original file or generating deltas during
backup. Am I wrong? I ask just to understand the nice work done and
what's already written and free in Bacula's source for this purpose.

I think you got the concept: "delta" in Bacula means that you
must restore all parts of a file, and not just the last copy.

With a program such as VMware, for example, CBT (Changed Block Tracking)
also helps the backup software generate "patches".


- By the way, I have one question about virtual files. It is not
very clear to me (perhaps my problem, as I don't understand it) how to
work with them. I understand the concept, but have not seen a clear
example of how, for instance, you create a virtual file in the backup,
how you see it in bvfs and finally... what you get after restoring. On
page 36/146 of the Bacula 11 for developers pdf, you say "This will
create a virtual file." but really you are filling in the structure with:

sp->type = FT_REG;
sp->statp.st_mode = 0700 | S_IFREG;

FT_REG and S_IFREG are both for regular files... what exactly causes a
virtual file to be created? Perhaps st_size -1?


A virtual file is generated by a plugin; it doesn't have to exist on disk.
The name can be anything, and it can also point to an existing file.

The plugin code will be executed to restore the "virtual file"; the result
can be a real file on disk, or a virtual machine on Proxmox for example.

In this example, we have a regular file, but it's a virtual file that may
or may not exist on the filesystem.


Are they relevant for what I'm trying to do? It seems Bacula handles
delta sequencing so... perhaps for this purpose I shouldn't need
"virtual files"?

In your case, they will be virtual files that point to regular files.

- I'm planning to implement delta encoding by checking the previous
day's file signature produced by librsync. Instead of looking at the
filesystem, it would be nice if I could look at that signature in the
last backup done (yesterday's backup). Would it be possible, when I see
a file passed in EventHandleBackupFile(), to check whether yesterday's
signature exists in yesterday's backup, and then read it from the backup
itself? I mean, instead of having to leave the signature on the
filesystem of the server being backed up.

You can store information in the save_pkt structure and the plugin can check
the last version of that information with the accurate mode.

In general, you can store only a couple of bytes with this technique.

I don't think you have enough space to store a file signature, so you will
have to use another way to store it (a local file, a database record, ...)


- The last one :) . For restoring, and from the code I have seen (for
instance in insert_missing_delta()), I assume Bacula detects that we are
restoring a delta-compressed file. Then I assume Bacula restores, apart
from the initial file itself, the patches needed to arrive at the day we
want to restore to. Am I wrong?

This is correct: Bacula will send back to the plugin the data that was
produced. It is up to the plugin to reassemble the data. If one delta
piece is missing, the restore will stop at the last correct one.
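
As an illustration of "stop at the last correct one", here is a small
sketch of how a plugin could decide how far a chain of parts is usable.
The struct and function names are hypothetical, and it assumes the parts
arrive in restore order with delta_seq going 0, 1, 2, ... one step per
patch; check that assumption against what your jobs actually produce.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one stored part of a file; not a Bacula type. */
struct delta_part {
    int delta_seq;   /* 0 = full copy, 1..n = successive patches */
};

/* Return how many leading parts form an unbroken chain 0, 1, 2, ...
 * Anything after the first gap cannot be applied safely. */
size_t usable_chain_length(const struct delta_part *parts, size_t count)
{
    size_t i;

    assert(parts != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (parts[i].delta_seq != (int)i) {
            break;   /* a piece is missing: stop at the last good one */
        }
    }
    return i;
}
```

With parts numbered 0, 1, 3 the function returns 2, so only the full copy
and the first patch would be applied.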

Perhaps later, in a post-restore job, I could run a shell script
that tries to find patches pending to be applied to a parent file. I
suppose I could then apply them and the backup would finally be restored.
Is there a more elegant way you could advise?

Normally, the plugin will receive patches one after the other; you can
re-open the file on disk and apply each patch. It is also possible to
store everything on disk and call a script to do the work at the end,
it depends.
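
A rough sketch of the first approach, re-opening the restored file and
applying one patch at a time, could look like this. The patch format here
is a toy one (overwrite some bytes at an offset) chosen only to keep the
example short; toy_patch and apply_toy_patch() are hypothetical names, and
a real plugin would hand the data to xdelta, librsync or similar instead.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical toy patch: overwrite `len` bytes starting at `offset`. */
struct toy_patch {
    long offset;
    const char *data;
    size_t len;
};

/* Re-open the already restored file and apply one patch in place.
 * Returns 0 on success, -1 if the file cannot be opened or written. */
int apply_toy_patch(const char *path, const struct toy_patch *p)
{
    FILE *f;
    int rc = -1;

    assert(path != NULL && p != NULL);
    f = fopen(path, "r+b");   /* the file must exist: the full copy came first */
    if (f == NULL) {
        return -1;
    }
    if (fseek(f, p->offset, SEEK_SET) == 0 &&
        fwrite(p->data, 1, p->len, f) == p->len) {
        rc = 0;
    }
    if (fclose(f) != 0) {
        rc = -1;              /* a failed flush means the patch is not applied */
    }
    return rc;
}
```

Calling it once per received patch, in the order the patches arrive, keeps
the file on disk at the state of the most recent part applied.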

Good luck,

Best Regards,
Eric


_______________________________________________
Bacula-devel mailing list
Bacula-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-devel
