Brandon High wrote:
On Fri, Jul 9, 2010 at 5:18 PM, Brandon High <bh...@freaks.com> wrote:
I think that DDT entries are a little bigger than what you're
using. The size seems to range between 150 and 250 bytes depending
on how it's calculated, call it 200b each. Your 128G dataset would
require closer to 200M (+/- 25%) for the DDT if your data was
completely unique. 1TB of unique data would require 600M - 1000M
for the DDT.
Using 376b per entry, it's 376M for 128G of unique data, or just under
3GB for 1TB of unique data.
A 1TB zvol with 8k blocks would require almost 24GB of memory to hold
the DDT. Ouch.
-B
To reduce RAM requirements, consider an offline or idle-time dedupe. I
suggested a variation of this with regard to compression a while ago,
probably on this list.
In either case, the system first writes the data in whichever way is fastest.
If there is enough unused CPU power, run maximum compression; otherwise
use fast compression. If new data-type-specific compression algorithms
are added, attempt compression with those as well (e.g. lossless JPEG
recompression, which can save 20-25% of space). Store the block in
whichever compression format works best.
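As a rough illustration of the "try several codecs and keep the smallest
result" idea, here is a small Python sketch. The codec table and the
cpu_idle flag are stand-ins I made up; a real implementation would use the
pool's compression algorithms (lzjb, gzip-N, and so on), not zlib levels:

    import zlib

    # Stand-in codec table; real candidates would be the pool's enabled
    # compression algorithms, not zlib compression levels.
    CODECS = {
        "fast": lambda data: zlib.compress(data, 1),
        "max":  lambda data: zlib.compress(data, 9),
    }

    def compress_block(block, cpu_idle):
        # Try only the fast codec when the CPU is busy, every codec when
        # it is idle, and keep whichever output is smallest.  Returning
        # None as the codec name means "store the block uncompressed".
        candidates = CODECS if cpu_idle else {"fast": CODECS["fast"]}
        best_name, best_data = None, block
        for name, fn in candidates.items():
            out = fn(block)
            if len(out) < len(best_data):
                best_name, best_data = name, out
        return best_name, best_data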
If there is enough RAM to maintain a live dedupe table, dedupe right away.
If CPU and RAM pressure is too high, defer dedupe and compression to a
periodic scrub (or some other new periodically run command). In the
deferred case, the dedupe table entries could be generated as blocks are
filled or changed and then kept on disk. Periodically that table would be
sorted by hash, so that any duplicates end up next to each other. The
blocks for the duplicates would then be looked up, verified as truly
identical, and rewritten (probably also requiring BP rewrite). Quicksort
parallelizes well, and sorting a multi-gigabyte table is a plausible
operation, even on disk: quicksort 100 MB pieces of it in RAM and merge
the sorted pieces in passes until the whole table ends up sorted.
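To make the deferred path concrete, here is a minimal external-sort sketch
in Python. It assumes the on-disk table can be read as an iterator of
(hash, block_pointer) tuples; the run size, the temp-file spill format,
and the function names are all invented for illustration, not anything
ZFS actually provides:

    import heapq
    import itertools
    import pickle
    import tempfile

    def sort_runs(entries, run_size=1_000_000):
        # Sort the on-disk dedupe table in RAM-sized pieces: read
        # run_size entries at a time (the "quicksort 100 MB pieces in
        # RAM" step), sort them by hash, and spill each sorted run to a
        # temporary file.
        runs = []
        while True:
            chunk = list(itertools.islice(entries, run_size))
            if not chunk:
                break
            chunk.sort()
            run = tempfile.TemporaryFile()
            for entry in chunk:
                pickle.dump(entry, run)
            runs.append(run)
        return runs

    def read_run(run):
        # Stream entries back out of a spilled run, one at a time.
        run.seek(0)
        while True:
            try:
                yield pickle.load(run)
            except EOFError:
                return

    def duplicate_groups(runs):
        # Merge the sorted runs; after sorting, all entries with the
        # same hash sit next to each other, so duplicates fall out of a
        # simple adjacent-group scan.  Each group would still be
        # byte-compared before any block is actually rewritten.
        merged = heapq.merge(*(read_run(r) for r in runs))
        for _hash, group in itertools.groupby(merged, key=lambda e: e[0]):
            group = list(group)
            if len(group) > 1:
                yield group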
The end result of all this idle-time compression and deduping is that
the initially allocated storage space becomes the upper bound on the
storage requirement, and the data ends up packing tighter over time.
The phrasing on bulk packaged items comes to mind: "Contents may have
settled during shipping".
Now a theoretical question about dedupe: what about the interaction
with defragmentation (which also probably needs BP rewrite)? The first
file will be completely defragmented, but a second file that is a
slight variation of the first will have at least two fragments (the
deduped portion and the unique portion). The performance impact will
probably be minor as long as each fragment has a decent minimum size
(multiple MB).
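As a back-of-envelope check on that fragment-size intuition, a tiny
read-time estimate in Python; the 100 MB/s sequential throughput and
10 ms seek time are made-up round numbers for a single spinning disk,
not measurements:

    def read_time_ms(file_mb, fragments, throughput_mb_s=100.0, seek_ms=10.0):
        # Sequential transfer time plus one seek per fragment.
        return file_mb / throughput_mb_s * 1000.0 + fragments * seek_ms

    print(read_time_ms(1024, 1))    # contiguous 1 GB file: ~10250 ms
    print(read_time_ms(1024, 2))    # two large fragments:  ~10260 ms
    print(read_time_ms(1024, 128))  # 8 MB fragments:       ~11520 ms

The per-fragment seek penalty shrinks relative to the transfer time as
the fragments grow, which is the point of the multi-MB minimum.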