I've checked in the fix, which makes everything work properly.  I've tested
every combination of encryption, sparse file handling, and compression, for
both backup and restore.  I've also compared the restored contents against
the originals to ensure that there is no corruption of the data.

Any previous backups that were encrypted are no longer readable.  All other
previous backups can still be restored fine.
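The fix described further down in the thread frames each data block with its
length before encryption, and accumulates data on restore until a whole block
is available.  A minimal sketch of that framing idea in Python (purely
illustrative; these function names are not Bacula's):

```python
import struct

def frame(plaintext: bytes) -> bytes:
    # Prefix the data block with a 4-byte big-endian length.
    # (Encryption of the framed block is omitted here.)
    return struct.pack(">I", len(plaintext)) + plaintext

def deframe(buf: bytes):
    # Accumulate bytes until a full block is available, then emit it.
    blocks = []
    while len(buf) >= 4:
        (n,) = struct.unpack(">I", buf[:4])
        if len(buf) < 4 + n:
            break  # partial block -- wait for more data
        blocks.append(buf[4:4 + n])
        buf = buf[4 + n:]
    return blocks, buf  # complete blocks, plus any unconsumed remainder

data = frame(b"hello") + frame(b"world")
blocks, rest = deframe(data)
```

Because the length prefix is encrypted along with the data, the decryption
side can always reconstruct the original block boundaries before handing the
data to decompression.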

-----Original Message-----
From: Kern Sibbald [mailto:[EMAIL PROTECTED] 
Sent: Friday, November 03, 2006 4:23 PM
To: Robert Nelson
Cc: 'Landon Fuller'; 'Michael Brennen'; [EMAIL PROTECTED];
bacula-users@lists.sourceforge.net
Subject: Re: [Bacula-devel] [Bacula-users] Encryption/Compression Conflict
in CVS


> The problem is that currently there are three filters defined:
> compression, encryption, and sparse file handling.  The current
> implementations of compression and sparse file handling both require
> block-boundary preservation.  Even if zlib streaming could handle the
> existing block-based data, sparse file handling would be broken.

It seems to me that it is time to come up with a better way to handle
filters, but it is probably too late in the 1.40 cycle to make any major
changes to the code.

I think the two most important points are:
1. Ensure that old Volumes are readable wherever possible.
2. Fix 1.40 so that it works correctly.

As far as point 2 is concerned, if it is not possible to fix it both easily
and correctly, we could consider disallowing certain combinations of options
-- at least until we can find a better way to handle multiple filters.
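As a thought experiment on what a "better way to handle filters" might
eventually look like, here is one purely illustrative shape for a composable
filter stack (a Python sketch; none of these names exist in Bacula):

```python
import zlib

# Each filter is a symmetric (encode, decode) pair.  Backup applies the
# encoders in order; restore applies the decoders in reverse order, so
# arbitrary combinations compose without per-combination special cases.
compress = (zlib.compress, zlib.decompress)
identity = (lambda b: b, lambda b: b)  # stand-in for e.g. encryption

def backup(block: bytes, filters) -> bytes:
    for encode, _ in filters:
        block = encode(block)
    return block

def restore(block: bytes, filters) -> bytes:
    for _, decode in reversed(filters):
        block = decode(block)
    return block

stack = [compress, identity]  # compression, then the "encryption" stand-in
assert restore(backup(b"data", stack), stack) == b"data"
```

A real design would also have to decide, per filter, whether block boundaries
are preserved or the filter is free to stream -- which is exactly the question
debated below.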

>
> -----Original Message-----
> From: Landon Fuller [mailto:[EMAIL PROTECTED]
> Sent: Thursday, November 02, 2006 11:06 AM
> To: Robert Nelson
> Cc: 'Michael Brennen'; [EMAIL PROTECTED];
> bacula-users@lists.sourceforge.net
> Subject: Re: [Bacula-users] Encryption/Compression Conflict in CVS
>
>
> On Nov 2, 2006, at 08:30, Robert Nelson wrote:
>
>> Landon,
>>
>> I've changed the code so that the encryption code prefixes the data
>> block with a block length prior to encryption.
>>
>> The decryption code accumulates data until a full data block is
>> decrypted before passing it along to the decompression code.
>>
>> The code now works for all four scenarios involving encryption and
>> compression: none, encryption only, compression only, and
>> encryption + compression.  Unfortunately, the code is no longer
>> compatible with previously encrypted backups.
>>
>> I could add some more code to make the encryption only case work like
>> before.  However, since this is a new feature in 1.39 and there
>> shouldn't be a lot of existing backups, I would prefer to invalidate
>> the previous backups and keep the code simpler.
>>
>> Also, I think we should have a design rule that any data filters,
>> such as encryption and compression, must maintain the original
>> buffer boundaries.
>>
>> This will allow us to define arbitrary, dynamically extensible filter
>> stacks in the future.
>>
>> What do you think?
>
> I was thinking about this on the way to work. My original assumption was
> that Bacula used the zlib streaming API to maintain state during file
> compression/decompression, but this is not the case. Reality is something
> more like this:
>
> Backup:
>       - Set up the zlib stream context
>       - For each file block (not each file), compress the block via
> deflate(stream, Z_FINISH), then reinitialize the stream.
>       - After all files (and blocks) are compressed, destroy the stream
> context
>
> Restore:
>       - For each block, call "uncompress()", which does not handle
> streaming.
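[The per-block pattern described above is easy to contrast with true
streaming using Python's stdlib zlib bindings -- a sketch of the effect only,
not Bacula's code:]

```python
import zlib

# 16 identical 1 KiB "file blocks" -- highly redundant across blocks.
blocks = [b"abcdefgh" * 128] * 16

# Per-block compression: a fresh context per block, equivalent in effect
# to deflate(stream, Z_FINISH) followed by reinitializing the stream.
per_block = b"".join(zlib.compress(b) for b in blocks)

# Streaming compression: one context across all blocks, so LZ77
# back-references can reach up to 32 KiB into earlier blocks.
co = zlib.compressobj()
streamed = b"".join(co.compress(b) for b in blocks) + co.flush()

# On redundant data the streamed output is markedly smaller.
print(len(per_block), len(streamed))
```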
>
> This is unfortunate -- reinitializing the stream for each block
> significantly degrades compression efficiency, because: 1) block
> boundaries are dynamic and may be set arbitrarily, 2) the LZ77 algorithm
> may cross block boundaries, referring back up to 32 KiB into previous
> input data (http://www.gzip.org/zlib/rfc-deflate.html#overview), 3) the
> Huffman coding context comprises the entire block, and 4) there's no need
> to limit zlib's block size to Bacula's block size.
>
> The next question is this -- given that we *should* stream the data, does
> it make sense to enforce downstream block boundaries in the upstream
> filter?  I'm siding in favor of requiring streaming support, and thus
> letting each filter implementor worry about their own block buffering,
> since they can far better encapsulate the necessary state and
> implementation -- and most already do.
>
> The one other thing I am unsure of is whether the zlib streaming API
> correctly handles streams that have been written as per the above -- each
> Bacula data block as an independent stream.  If zlib DOES handle this, it
> should be possible to modify the backup and restore implementations to
> use the stream API correctly while maintaining backwards compatibility.
> This would fix the encryption problem AND increase compression
> efficiency.
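[On this open question, Python's stdlib zlib bindings suggest the answer is
yes: after each stream ends, any leftover input is exposed via
`decompressobj.unused_data`, so back-to-back independent streams can be read
in a loop.  A sketch only, not Bacula's restore path:]

```python
import zlib

# Each "Bacula block" written as an independent zlib stream, concatenated
# on the volume -- the layout the current per-block code produces.
blocks = [b"first block", b"second block", b"third block"]
volume = b"".join(zlib.compress(b) for b in blocks)

# Streaming read-back: when a decompressor hits end-of-stream, the bytes
# belonging to the next stream land in unused_data, and we simply start a
# fresh decompressor there.  Old volumes stay readable.
out, rest = [], volume
while rest:
    d = zlib.decompressobj()
    out.append(d.decompress(rest))
    rest = d.unused_data
```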
>
> With my extremely large database backups, I sure wouldn't mind increased
> compression efficiency =)
>
> Some documentation on the zlib API is available here (I had a little
> difficulty googling this):
>
> http://www.freestandards.org/spec/booksets/LSB-Core-generic/LSB-Core-generic/libzman.html
>
> Cheers,
> Landon
>
>
>
> _______________________________________________
> Bacula-devel mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/bacula-devel
>


Best regards, Kern



_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
