On Thu, Dec 29, 2011 at 9:53 AM, Brad Diggs <brad.di...@oracle.com> wrote:
> Jim,
>
> You are spot on. I was hoping that the writes would be close enough to
> identical that there would be a high ratio of duplicate data, since I use
> the same record size, page size, compression algorithm, etc. However, that
> was not the case. The main thing that I wanted to prove, though, was that
> if the data was the same, the L1 ARC only caches the data that was actually
> written to storage. That is a really cool thing! I am sure there will be
> future study on this topic as it applies to other scenarios.
>
> With regard to directory engineering investing any energy into optimizing
> ODSEE DS to more effectively leverage this caching potential, that won't
> happen. OUD far outperforms ODSEE. That said, OUD may get some focus in
> this area. However, time will tell on that one.

Databases are much less likely to benefit from dedup than virtual machines; indeed, DBs are likely not to benefit from dedup at all. The VM use case benefits for the obvious reason that many VMs will have exactly the same software installed most of the time, using the same filesystems and the same patch/update installation order, so if you keep data out of their root filesystems you can expect enormous deduplicatiousness.

But databases, not so much. The unit of deduplicable data in the VM use case is the guest's preferred block size, while in a DB the unit of deduplicable data might be a variable-sized table row, or even smaller: a single row/column value -- and you have no way to ensure the alignment of individual deduplicable units, nor the ordering of sets of deduplicable units into larger ones.

When it comes to databases, your best bets will be: a) database-level compression or dedup features (e.g., Oracle's column-level compression feature) or b) ZFS compression.

(Dedup makes VM management easier, because the alternative is to patch one master guest VM [per guest type], then re-clone and re-configure all instances of that guest type, possibly losing any customizations in those guests in the process. But even before dedup, the ability to snapshot and clone datasets was an impressive dedup-like tool for the VM use case, just not as convenient as dedup.)

Nico
--
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
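
The alignment point can be illustrated with a toy model. This is only a minimal sketch of generic fixed-size block dedup (hash every block, count the unique hashes), not ZFS's actual implementation. The 4096-byte block size, the helper names, the random "image" and "rows", and the byte shifts are all hypothetical and chosen purely for illustration.

```python
import hashlib
import random

BLOCK = 4096  # hypothetical block size in bytes, for illustration only


def rand_bytes(rng, n):
    """Return n pseudo-random bytes from a seeded generator (n > 0)."""
    return rng.getrandbits(8 * n).to_bytes(n, "little")


def dedup_ratio(streams, block=BLOCK):
    """Total blocks divided by unique blocks, hashing fixed-size blocks."""
    total = 0
    unique = set()
    for data in streams:
        for off in range(0, len(data), block):
            total += 1
            unique.add(hashlib.sha256(data[off:off + block]).digest())
    return total / len(unique)


rng = random.Random(0)

# VM-like case: three guests cloned from one image, identical block for block.
image = rand_bytes(rng, 64 * BLOCK)
vm_ratio = dedup_ratio([image, image, image])

# DB-like case: the same variable-sized rows in every copy, but each copy is
# preceded by a different number of bytes (standing in for a header or an
# earlier insert), so the rows land at different offsets within the blocks.
rows = [rand_bytes(rng, rng.randint(50, 300)) for _ in range(1000)]
payload = b"".join(rows)
db_copies = [bytes(shift) + payload for shift in (0, 7, 131)]
db_ratio = dedup_ratio(db_copies)

print("VM-like dedup ratio:", vm_ratio)
print("DB-like dedup ratio:", db_ratio)
```

In this model the three clones should dedup at a ratio of 3, while the shifted copies should stay at about 1, even though they hold exactly the same rows: a shift of a few bytes changes the contents of every fixed-size block.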