michael schuster wrote:
Roland Rambau wrote:
gang,
actually a simpler version of that idea would be a "zcp":
if I just cp a file, I know that all blocks of the new file
will be duplicates; so the cp could take full advantage of
dedup without any need to check/read/write any actual data
I think they call it 'ln' ;-) and that even works on ufs.
Michael
+1
More and more it sounds like an optimization that will either
A. not add much over dedup
or
B. have value only in specific situations - and completely misbehave in
other situations (or even in the same situations once some time has passed)
Why not just write a special-purpose application (completely user-land)
for it? I know, 'ln' is only remotely akin to this idea, but 'ln' is POSIX
and people know what to expect.
What you'd practically need to do is whip up a vfs layer that exposes
the underlying blocks of a filesystem and possibly name them by their
SHA256 or MD5 hash. Then you'd need (another?) vfs abstraction that
allows 'virtual' files to be assembled from these blocks in multiple
independent chains.
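To make the two layers concrete, here is a rough in-memory sketch in Python. It is not a vfs driver and not how ZFS dedup works internally; BlockStore, split_into_chain, read_virtual_file, zcp and the 4096-byte block size are all names and values I made up for illustration. The first part names blocks by their SHA256 hash, the second assembles 'virtual' files from chains of those names, and a "zcp" then only has to duplicate the chain.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, purely for illustration


class BlockStore:
    """Hypothetical stand-in for the first layer: blocks named by SHA256."""

    def __init__(self):
        self.blocks = {}  # hex digest -> block bytes

    def put(self, data: bytes) -> str:
        name = hashlib.sha256(data).hexdigest()
        # A block that is already present is not stored a second time.
        self.blocks.setdefault(name, data)
        return name

    def get(self, name: str) -> bytes:
        return self.blocks[name]


def split_into_chain(store, data, block_size=BLOCK_SIZE):
    """Store data block by block and return the chain of block names."""
    return [store.put(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


def read_virtual_file(store, chain):
    """Second layer (read-only): reassemble a file from its chain."""
    return b"".join(store.get(name) for name in chain)


def zcp(chain):
    """'Copy' a virtual file by copying only its chain, never the data."""
    return list(chain)
```

Two virtual files built this way stay independent as chains while sharing every block, which is about as far as a read-only Proof-of-Concept needs to go.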
I know there is already a fuse implementation of the first vfs driver
(the name escapes me, but I think it was something like chunkfs[1]), and
one could at least whip up a reasonable read-only Proof-of-Concept of
the second part.
The reason _I_ wouldn't do that is that I'm already happy with e.g.:
mkfifo /var/run/my_part_collector
(while true; do cat /local/data/my_part_* > /var/run/my_part_collector; done) &
wc -l /var/run/my_part_collector
The equivalent of this could be expressed (better) in C, perl or any
language of your choice. I believe this is all POSIX.
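For what it's worth, a rough Python rendering of the same FIFO trick might look like the sketch below. The function names (feed_fifo_once, count_newlines, collect_and_count) are made up for illustration, it serves the parts once instead of looping forever, and it assumes a POSIX system where os.mkfifo is available.

```python
import glob
import os
import shutil
import threading


def feed_fifo_once(fifo_path, pattern):
    """One pass of `cat parts > fifo`: write all matching files into the FIFO."""
    # Opening a FIFO for writing blocks until a reader opens the other end.
    with open(fifo_path, "wb") as fifo:
        for part in sorted(glob.glob(pattern)):
            with open(part, "rb") as src:
                shutil.copyfileobj(src, fifo)


def count_newlines(fifo_path):
    """Like `wc -l`: count newline bytes until the writer closes the FIFO."""
    total = 0
    with open(fifo_path, "rb") as fifo:
        while True:
            chunk = fifo.read(65536)
            if not chunk:
                return total
            total += chunk.count(b"\n")


def collect_and_count(fifo_path, pattern):
    """Create the FIFO if needed, feed it from a thread, count the lines."""
    if not os.path.exists(fifo_path):
        os.mkfifo(fifo_path)
    writer = threading.Thread(target=feed_fifo_once, args=(fifo_path, pattern))
    writer.start()
    try:
        return count_newlines(fifo_path)
    finally:
        writer.join()
```

Calling collect_and_count("/var/run/my_part_collector", "/local/data/my_part_*") would then play the role of the wc -l line above; the reader sees one concatenated stream and never needs to know it came from several parts.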
[1] The reason this exists is obviously for backup and synchronization
implementations: it makes it possible to back up files using rsync
when the encryption key is not available to the backup process (with an
ECB mode crypto algo); it should also make it 'simple' to synchronize one's
large monolithic files with e.g. Amazon S3 cloud storage etc. etc.
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss