Hello,

On Tuesday 04 January 2011 19:45:53 Radosław Korzeniewski wrote:
> 2011/1/1 Kern Sibbald <k...@sibbald.com>
>
> Hello Kern and others,
>
> > The first thing that one must do is specify what problem of
> > deduplication one is trying to resolve:
> >
> > 1. Deduplication by the Bacula Storage daemon
> >
> > 2. Deduplication in the Bacula Client (File daemon)
> >
> > 3. Deduplication by the underlying filesystem where the SD writes data
> > (e.g. ZFS).
>
> 4. Global deduplication performed on the File Daemon but with the
> dictionary maintained on the Bacula Director/Storage Daemon
> - backup of a particular data block isn't performed when the SD already
> has such a data block, no matter which client is the original owner of
> the block
> - reduces data stored on the SD like the item 1 or item 3 approaches AND
> reduces network traffic like the item 2 approach
>
> Use case: A company has one production database (or vm image file) and
> multiple test/development environments, all with backup. In most cases
> the difference between all of those databases (vm images) is less than 1%
> of data blocks. During backup only 1% of data blocks is backed up and
> sent through the network.

Yes. I had considered this to be one of two options of item 2. The dedup
hashes are either kept on the FD or on the Director.

> > (...)
> > Item 1 is probably something that will never be needed due to the fact
> > that there are more and more very good filesystems that already do the
> > job especially if a new (additional) Volume format were to be
> > implemented.
>
> As Howard mentioned earlier, currently there are no serious dedup-enabled
> filesystems at production stage (excluding solaris/zfs which is not
> opensource any more). You can use a dedicated appliance like Data Domain's
> products but it is a different kind of solution.
>
> > I've noticed that a few months after we discussed various features, the
> > same thing was implemented by Zmanda, so I am a bit reluctant to give
> > any details.
>
> Wow, sounds like some kind of conspiracy :)

I hope it was just a coincidence, but it made me a bit more cautious.

> > However, if there are programmers that want to do development, we would
> > be happy to discuss off list. Please keep in mind that we sometimes
> > receive patches that programmers have made without discussing it with
> > us, and often such patches are not appropriate for Bacula for lots of
> > reasons: limited to a particular OS, don't respect coding standards,
> > not scalable, don't fit the Bacula way of doing things, don't use
> > Bacula "infrastructure" (mostly libbac.so), ...
>
> Is it possible to publish those patches somewhere? It could be useful to
> others.

To the best of my knowledge all such patches are sent openly to either or
both of the bacula-users or bacula-devel lists -- so they should be
available for everyone.
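
To make item 4 quoted above a bit more concrete, here is a minimal Python
sketch of the idea, with the block dictionary held on the server side and
shared by all clients. This is not Bacula code and does not reflect any
Bacula design: the names BlockIndex, backup and restore are hypothetical,
and the fixed 4 KiB block size and the SHA-256 hash are assumptions chosen
only to keep the example short.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, for illustration only


class BlockIndex:
    """Hypothetical server-side dictionary: block digest -> block data.

    One instance is shared by every client, which is what makes the
    deduplication "global".
    """

    def __init__(self):
        self._blocks = {}

    def has(self, digest):
        return digest in self._blocks

    def store(self, digest, block):
        self._blocks[digest] = block

    def get(self, digest):
        return self._blocks[digest]


def backup(data, index, block_size=BLOCK_SIZE):
    """Client side: hash each block and send only blocks the index lacks.

    Returns (recipe, bytes_sent). The recipe is the ordered list of
    digests needed to rebuild the data later.
    """
    recipe = []
    bytes_sent = 0
    for offset in range(0, len(data), block_size):
        block = bytes(data[offset:offset + block_size])
        digest = hashlib.sha256(block).hexdigest()
        if not index.has(digest):
            # Only new blocks cross the network and get stored.
            index.store(digest, block)
            bytes_sent += len(block)
        recipe.append(digest)
    return recipe, bytes_sent


def restore(recipe, index):
    """Rebuild the original data from its ordered list of digests."""
    return b"".join(index.get(digest) for digest in recipe)


# Use case from the quoted text: a production image and a test copy
# that differs in 1 block out of 100.
index = BlockIndex()
production = b"".join(
    i.to_bytes(4, "big") * (BLOCK_SIZE // 4) for i in range(100)
)
test_copy = bytearray(production)
test_copy[7 * BLOCK_SIZE] ^= 0xFF  # change one byte in block 7

prod_recipe, prod_sent = backup(production, index)
test_recipe, test_sent = backup(test_copy, index)
print("production sent:", prod_sent, "bytes")
print("test copy sent: ", test_sent, "bytes")
```

In this toy run the first backup has to send every block, while the second
client sends only the single block that differs, even though it never
backed up anything itself. A real implementation would also have to deal
with hash lookups over the network, index size, and concurrent clients,
none of which are shown here.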

Best regards,

Kern

_______________________________________________
Bacula-devel mailing list
Bacula-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-devel