On Mon, Jul 7, 2008 at 9:24 PM, Bob Friesenhahn
<[EMAIL PROTECTED]> wrote:
> On Mon, 7 Jul 2008, Mike Gerdts wrote:
>>  There tend to be organizational walls between those that manage
>>  storage and those that consume it.  As storage is distributed across
>>  a network (NFS, iSCSI, FC) things like delegated datasets and RBAC
>>  are of limited practical use.  Due to these factors and likely
>
> It seems that deduplication on the server does not provide much benefit to
> the client since the client always sees a duplicate.  It does not know that
> it doesn't need to cache or copy a block twice because it is a duplicate.
>  Only the server benefits from the deduplication except that maybe
> server-side caching improves and provides the client with a bit more
> performance.

I want the deduplication to happen where it can be most efficient.
Just like with snapshots and clones, the client will have no idea that
multiple metadata sets point to the same data.  If deduplication makes
it so that each GB of perceived storage is cheaper, clients benefit
because the storage provider is (or should be) charging less.

> While deduplication can obviously save server storage space, it does not
> seem to help much for backups, and it does not really help the user manage
> all of that data.  It does help the user in terms of less raw storage space
> but there is surely a substantial run-time cost associated with the
> deduplication mechanism.  None of the existing applications (based on POSIX
> standards) has any understanding of deduplication so they won't benefit from
> it.  If you use tar, cpio, or 'cp -r', to copy the contents of a directory
> tree, they will transmit just as much data as before and if the destination
> does real-time deduplication, then the copy will be slower.  If the copy is
> to another server, then the copy time will be huge, just like before.

I agree.  Follow-on work needs to happen in the backup and especially
restore areas.  The first phase of work in this area will be complete
when a full restore of all data (including snapshots and clones) takes
the same amount of space as the data occupied when it was backed up.

I suspect that if you take a look at the processor utilization on
most storage devices you will find that the processors are relatively
idle much of the time.  Deduplication can happen in real time when the
processors are not very busy, but dirty block analysis should be
queued during times of high processor utilization.  If the processors
can't keep up with the deduplication workload, it suggests that they
aren't fast or plentiful enough, or that you have deduplication
enabled on inappropriate data sets.  The same goes for I/O induced by
the dedupe process.
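To make that concrete, here is a rough sketch (Python, purely
illustrative; names like analyze_block() are made up and nothing like
this exists in zfs today) of what I mean by deferring dirty block
analysis while the processors are busy:

    import collections
    import os

    BUSY_THRESHOLD = 0.75                  # assumed cutoff for "busy"; would be tunable
    deferred_blocks = collections.deque()  # dirty blocks waiting for dedupe analysis

    def cpu_busy():
        # Crude busyness check: 1-minute load average relative to CPU count.
        return os.getloadavg()[0] / os.cpu_count() >= BUSY_THRESHOLD

    def analyze_block(block):
        # Placeholder for the real work: hash the block, look it up in
        # the dedupe table, and share or write it.  Entirely hypothetical.
        pass

    def on_dirty_block(block):
        # Dedupe in real time when the processors are idle enough, else queue it.
        if cpu_busy():
            deferred_blocks.append(block)
        else:
            analyze_block(block)

    def drain_deferred():
        # Run from an idle loop or a timer to work off the queued blocks.
        while deferred_blocks and not cpu_busy():
            analyze_block(deferred_blocks.popleft())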

In another message it was suggested that the checksum employed by zfs
is so large that maintaining a database of the checksums would be too
costly.  It may be that a multi-level checksum scheme is needed.
That is, perhaps the database of checksums uses a 32-bit or 64-bit
hash of the 256-bit checksum.  If a hash collision occurs, the normal
I/O routines are used to compare the full checksums.  If those are
also the same, then compare the data.  The intermediate comparison may
be more overhead than is needed: one copy of the data is already in
cache, and in the worst case an I/O is required either for the stored
checksum or for the data itself.  Why do two I/Os if only one is
needed?
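Something along these lines (again an illustrative Python sketch, not
anything in zfs; keying the table on a 64-bit slice of a sha-256
checksum is just my assumption of how it could work):

    import hashlib

    # In-memory dedupe table keyed by a 64-bit slice of the 256-bit block
    # checksum.  Each entry keeps the full checksum and (for this sketch)
    # the data itself; a real design would store a block pointer instead.
    dedup_table = {}

    def store_block(data):
        # Returns the block that should actually be referenced: either an
        # existing duplicate or the new data.
        checksum = hashlib.sha256(data).digest()   # stand-in for the zfs block checksum
        short_key = checksum[:8]                   # 64-bit hash kept in the table

        entry = dedup_table.get(short_key)
        if entry is None:
            dedup_table[short_key] = (checksum, data)
            return data                            # unique block, store normally

        full_checksum, existing = entry
        if full_checksum == checksum and existing == data:
            return existing                        # duplicate: share the existing block
        # 64-bit hash collision but different contents: treat as unique.
        # (A real design would need a collision chain here.)
        return data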

> Unless the backup system fully understands and has access to the filesystem
> deduplication mechanism, it will be grossly inefficient just like before.
>  Recovery from a backup stored in a sequential (e.g. tape) format which does
> understand deduplication would be quite interesting indeed.

Right now it is a mess.  Take a look at the situation for restoring
snapshots/clones and you will see that unless you use deduplication
during the restore, you need to go out and buy a lot of storage to do
a restore of highly duplicated data.

> Raw storage space is cheap.  Managing the data is what is expensive.

The systems that make raw storage scale to petabytes of fault-tolerant
capacity are very expensive and sometimes quite complex.
Needing fewer or smaller spindles should mean less energy consumption,
less space, lower MTTR, higher MTTDL, and less complexity in all the
hardware used to string it all together.

>
> Perhaps deduplication is a response to an issue which should be solved
> elsewhere?

Perhaps.  However, I take a look at my backup and restore options for
zfs today and don't think the POSIX API is the right way to go - at
least as I've seen it used so far.  Unless something happens that
makes restores of clones retain their initial space efficiency, or
deduplication hides the problem, clones are useless in most
environments.  If this problem is solved by fixing backups and
restores, deduplication seems even more like the next step to take for
storage efficiency.  If it is solved by adding deduplication then we
get the other benefits of deduplication at the same time.

And after typing this message, deduplication is henceforth known as "d11n".  :)

-- 
Mike Gerdts
http://mgerdts.blogspot.com/