Re: [zfs-discuss] Dedup question

2009-11-23 Thread Michael Schuster
Colin Raven wrote: Folks, I've been reading Jeff Bonwick's fascinating dedup post. This is going to sound like either the dumbest or the most obvious question ever asked, but, if you don't know and can't produce meaningful RTFM results, ask... so here goes: Assuming you have a dataset in a

[zfs-discuss] Dedup question

2009-11-23 Thread Colin Raven
Folks, I've been reading Jeff Bonwick's fascinating dedup post. This is going to sound like either the dumbest or the most obvious question ever asked, but, if you don't know and can't produce meaningful RTFM results, ask... so here goes: Assuming you have a dataset in a zfs pool that's been dedu

Re: [zfs-discuss] dedup question

2009-11-03 Thread Jeff Savit
On 11/ 2/09 07:42 PM, Craig S. Bell wrote: I just stumbled across a clever visual representation of deduplication: http://loveallthis.tumblr.com/post/166124704 It's a flowchart of the lyrics to "Hey Jude". =-) Nothing is compressed, so you can still read all of the words. Instead, all of th

Re: [zfs-discuss] dedup question

2009-11-03 Thread Toby Thain
On 2-Nov-09, at 3:16 PM, Nicolas Williams wrote: On Mon, Nov 02, 2009 at 11:01:34AM -0800, Jeremy Kitchen wrote: forgive my ignorance, but what's the advantage of this new dedup over the existing compression option? Wouldn't full-filesystem compression naturally de-dupe? ... There are man

Re: [zfs-discuss] dedup question

2009-11-02 Thread Craig S. Bell
I just stumbled across a clever visual representation of deduplication: http://loveallthis.tumblr.com/post/166124704 It's a flowchart of the lyrics to "Hey Jude". =-) Nothing is compressed, so you can still read all of the words. Instead, all of the duplicates have been folded together. -ch

Re: [zfs-discuss] dedup question

2009-11-02 Thread Mike Gerdts
On Mon, Nov 2, 2009 at 2:16 PM, Nicolas Williams wrote: > On Mon, Nov 02, 2009 at 11:01:34AM -0800, Jeremy Kitchen wrote: >> forgive my ignorance, but what's the advantage of this new dedup over >> the existing compression option?  Wouldn't full-filesystem compression >> naturally de-dupe? > > If

Re: [zfs-discuss] dedup question

2009-11-02 Thread Nicolas Williams
On Mon, Nov 02, 2009 at 11:01:34AM -0800, Jeremy Kitchen wrote: > forgive my ignorance, but what's the advantage of this new dedup over > the existing compression option? Wouldn't full-filesystem compression > naturally de-dupe? If you snapshot/clone as you go, then yes, dedup will do little

Re: [zfs-discuss] dedup question

2009-11-02 Thread Victor Latushkin
Jeremy Kitchen wrote: On Nov 2, 2009, at 9:07 AM, Victor Latushkin wrote: Enda O'Connor wrote: it works at a pool wide level with the ability to exclude at a dataset level, or the converse, if set to off at top level dataset can then set lower level datasets to on, ie one can include and ex

Re: [zfs-discuss] dedup question

2009-11-02 Thread roland
> forgive my ignorance, but what's the advantage of this new dedup over > the existing compression option? it may provide another space-saving advantage. depending on your data, the savings can be very significant. > Wouldn't full-filesystem compression > naturally de-dupe? no. compression doesn't
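roland's point can be illustrated without ZFS at all. The sketch below uses ordinary coreutils (an analogy, not the actual ZFS implementation): compressing two identical files still leaves two files on disk, while hashing their contents, which is the basis of dedup's block matching, shows they could be folded into one stored copy.

```shell
# Analogy only: two files with identical contents.
printf 'the same data\n' > copy1
cp copy1 copy2

# Compression handles each file independently -- still two files on disk.
gzip -c copy1 > copy1.gz
gzip -c copy2 > copy2.gz
ls copy1.gz copy2.gz

# Dedup's view: hash the contents; identical hashes mean one stored block
# would suffice for both copies.
sha256sum copy1 copy2 | awk '{print $1}' | sort -u | wc -l   # prints 1
```

The `wc -l` result of 1 is the point: compression saw two independent inputs, but a content-hash index sees one unique block referenced twice.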

Re: [zfs-discuss] dedup question

2009-11-02 Thread Cyril Plisko
On Mon, Nov 2, 2009 at 9:01 PM, Jeremy Kitchen wrote: > > forgive my ignorance, but what's the advantage of this new dedup over the > existing compression option?  Wouldn't full-filesystem compression naturally > de-dupe? No, compression works at the block level. If there are two identical bl

Re: [zfs-discuss] dedup question

2009-11-02 Thread Jeremy Kitchen
On Nov 2, 2009, at 9:07 AM, Victor Latushkin wrote: Enda O'Connor wrote: it works at a pool wide level with the ability to exclude at a dataset level, or the converse, if set to off at top level dataset can then set lower level datasets to on, ie one can include and exclude depending on t

Re: [zfs-discuss] dedup question

2009-11-02 Thread Victor Latushkin
Enda O'Connor wrote: it works at a pool wide level with the ability to exclude at a dataset level, or the converse, if set to off at top level dataset can then set lower level datasets to on, ie one can include and exclude depending on the datasets contents. so largefile will get deduped in t

Re: [zfs-discuss] dedup question

2009-11-02 Thread Breandan Dezendorf
On Mon, Nov 2, 2009 at 9:41 AM, Enda O'Connor wrote: > it works at a pool wide level with the ability to exclude at a dataset > level, or the converse, if set to off at top level dataset can then set > lower level datasets to on, ie one can include and exclude depending on the > datasets contents.

Re: [zfs-discuss] dedup question

2009-11-02 Thread Enda O'Connor
it works at a pool wide level with the ability to exclude at a dataset level, or the converse, if set to off at top level dataset can then set lower level datasets to on, ie one can include and exclude depending on the datasets contents. so largefile will get deduped in the example below. End
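The converse Enda describes (dedup on at the top-level dataset, with a child excluded) would look roughly like this. A sketch, not verified output: the pool name `tank` and the child `tank/scratch` are hypothetical, and the commands require a dedup-capable ZFS build and an existing pool.

```shell
# Enable dedup via the top-level dataset; children inherit it.
zfs set dedup=on tank
# Exclude one dataset whose contents aren't worth deduplicating.
zfs set dedup=off tank/scratch
# Inspect which datasets have a local value vs. an inherited one.
zfs get -r dedup tank
```

`zfs get -r` distinguishes "local" from "inherited from tank" in its SOURCE column, which is how you confirm the include/exclude pattern took effect.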

[zfs-discuss] dedup question

2009-11-02 Thread Breandan Dezendorf
Does dedup work at the pool level or the filesystem/dataset level? For example, if I were to do this: bash-3.2$ mkfile 100m /tmp/largefile bash-3.2$ zfs set dedup=off tank bash-3.2$ zfs set dedup=on tank/dir1 bash-3.2$ zfs set dedup=on tank/dir2 bash-3.2$ zfs set dedup=on tank/dir3 bash-3.2$ cp /t
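One way to answer Breandan's question empirically, after copying the file into the dedup-enabled datasets, is to check the pool's dedup ratio. A hedged sketch: the copy destinations below are assumptions (the original message is truncated), and it needs a real pool named `tank` on a dedup-capable build.

```shell
# Hypothetical continuation of the experiment above: copy the same
# file into two datasets that have dedup=on, then measure savings.
cp /tmp/largefile /tank/dir1/
cp /tmp/largefile /tank/dir2/
# dedupratio above 1.00x indicates the identical blocks were folded.
zpool get dedupratio tank
```

Because the dedup table is kept per pool, a ratio above 1.00x here would show that blocks written to different datasets are still matched against each other, as long as each dataset has dedup enabled.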