Enda O'Connor wrote:
It works at a pool-wide level with the ability to exclude at the dataset level, or the converse: if dedup is set to off on the top-level dataset, you can then set lower-level datasets to on. In other words, you can include and exclude depending on each dataset's contents.
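
For example (pool and dataset names here are purely illustrative):

bash-3.2# zfs set dedup=on tank           # inherited by every dataset underneath
bash-3.2# zfs set dedup=off tank/media    # ...except the ones you explicitly opt out

or the other way around:

bash-3.2# zfs set dedup=off tank
bash-3.2# zfs set dedup=on tank/build     # only this dataset dedups its new writes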

So largefile will get deduped in the example below.

And you can use 'zdb -S' (which is a lot better now than it was before dedup) to see how much benefit there is (without even turning dedup on):

bash-3.2# zdb -S rpool
Simulated DDT histogram:

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1     625K    9.9G   7.90G   7.90G     625K    9.9G   7.90G   7.90G
     2     9.8K    184M    132M    132M    20.7K    386M    277M    277M
     4    1.21K   16.6M   10.8M   10.8M    5.71K   76.9M   48.6M   48.6M
     8      395    764K    745K    745K    3.75K   6.90M   6.69M   6.69M
    16      125   2.71M    888K    888K    2.60K   54.2M   17.9M   17.9M
    32       56   2.10M    750K    750K    2.33K   85.6M   29.8M   29.8M
    64        9   22.0K   22.0K   22.0K      778   2.04M   2.04M   2.04M
   128        4   6.00K   6.00K   6.00K      594    853K    853K    853K
   256        2      8K      8K      8K      711   2.78M   2.78M   2.78M
   512        2   4.50K   4.50K   4.50K    1.47K   3.52M   3.52M   3.52M
    8K        1    128K    128K    128K    15.9K   1.99G   1.99G   1.99G
   16K        2      8K      8K      8K    50.7K    203M    203M    203M
 Total     637K   10.1G   8.04G   8.04G     730K   12.7G   10.5G   10.5G

dedup = 1.30, compress = 1.22, copies = 1.00, dedup * compress / copies = 1.58

bash-3.2#
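
If I'm reading the summary right, those ratios come straight from the Total line: dedup is referenced space over allocated space (10.5G / 8.04G ≈ 1.30), compress is LSIZE over PSIZE (12.7G / 10.5G, which zdb works out from exact byte counts and prints as 1.22), copies is DSIZE over PSIZE (1.00 here), and the overall saving is dedup * compress / copies ≈ 1.58.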

Be careful - dedup can eat lots of RAM!
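
As a rough rule of thumb (the per-entry size is my assumption and varies between builds): every unique block needs a DDT entry of a few hundred bytes in memory, so the ~637K allocated entries above come to roughly 637K * 320 bytes ≈ 200MB of RAM/L2ARC to keep the table resident, and a pool with hundreds of millions of unique blocks needs tens of GB.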


Many thanks to Jeff and all the team!

Regards,
Victor

Enda

Breandan Dezendorf wrote:
Does dedup work at the pool level or the filesystem/dataset level?
For example, if I were to do this:

bash-3.2$ mkfile 100m /tmp/largefile
bash-3.2$ zfs set dedup=off tank
bash-3.2$ zfs set dedup=on tank/dir1
bash-3.2$ zfs set dedup=on tank/dir2
bash-3.2$ zfs set dedup=on tank/dir3
bash-3.2$ cp /tmp/largefile /tank/dir1/largefile
bash-3.2$ cp /tmp/largefile /tank/dir2/largefile
bash-3.2$ cp /tmp/largefile /tank/dir3/largefile

Would largefile get deduped?  Or would I need to enable dedup for the
pool, and then disable it where it isn't wanted/needed?

Also, will we need to move our data around (send/recv or whatever your
preferred method is) to take advantage of dedup?  I was hoping the
blockpointer rewrite code would allow an admin to simply turn on dedup
and let ZFS process the pool, eliminating excess redundancy as it
went.
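
As far as I understand, dedup only applies to blocks written after the property is enabled, so until bp rewrite arrives existing data has to be rewritten to benefit. Something along these lines should work (dataset names are only an example):

bash-3.2# zfs set dedup=on tank
bash-3.2# zfs snapshot tank/data@before-dedup
bash-3.2# zfs send tank/data@before-dedup | zfs recv tank/data-deduped

The received copy is written with dedup in effect; the original can then be destroyed and the copy renamed into place.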



