Roy Sigurd Karlsbakk wrote:
----- "Haudy Kazemi" <kaze0...@umn.edu> wrote:
In this file system, 2.75 million blocks are allocated. The in-core size
of a DDT entry is approximately 250 bytes. So the math is pretty simple:
    in-core size = 2.63M * 250 = 657.5 MB
If your dedup ratio is 1.0, then this number will scale linearly with size.
If the dedup ratio is > 1.0, then this number will not scale linearly; it
will be less. So you can use the linear scale as a worst-case approximation.
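
As a rough illustration of the arithmetic above, here is a minimal Python sketch. The 250-byte entry size and the 2.63M block count come from the quoted text; the function name, the `referenced_blocks` parameter, and the idea of dividing by the dedup ratio to approximate the number of unique blocks are assumptions made for illustration only, not something zfs or zdb provides.

```python
DDT_ENTRY_BYTES = 250  # approximate in-core size of one DDT entry (from the post)

def ddt_in_core_bytes(referenced_blocks, dedup_ratio=1.0):
    """Estimate in-core DDT size in bytes.

    referenced_blocks: number of blocks written before dedup (hypothetical input).
    dedup_ratio: 1.0 means nothing dedups, which is the worst case.
    """
    if dedup_ratio < 1.0:
        raise ValueError("dedup ratio cannot be below 1.0")
    # Assumption: unique blocks are roughly referenced blocks / dedup ratio.
    unique_blocks = referenced_blocks / dedup_ratio
    return int(unique_blocks * DDT_ENTRY_BYTES)

# Worst case (ratio 1.0) for the 2.63M blocks used in the formula above.
worst_case = ddt_in_core_bytes(2_630_000)
print(worst_case / 1e6, "MB")
```

With a ratio above 1.0 the same call returns a smaller number, which is why the linear figure works as an upper bound.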
How large was this filesystem?
Are there any good ways of planning memory or SSDs for this?
roy
If you mean figuring out how big memory should be BEFORE you write any
data, you need to guesstimate the average block size for the files you
are storing in the zpool, which is highly data-dependent. In general,
consider that zfs will write a file of size X using a block size of Y,
where Y is a power of 2 and the smallest size such that X <= Y, up
to a maximum of Y=128k. So, look at your (potential) data, and
consider how big the files are.
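
The rule of thumb above can be sketched in Python as follows. This only models the approximation described in this message (smallest power of 2 that holds the file, capped at 128k); it is not how zfs itself allocates blocks. The function names and the 512-byte floor are assumptions for illustration, and a file that exactly fills a block is treated as fitting in it.

```python
MAX_BLOCK = 128 * 1024  # 128k cap, from the post
MIN_BLOCK = 512         # assumed floor; not stated in the post

def estimated_block_size(file_size):
    """Smallest power-of-2 block that holds file_size, capped at MAX_BLOCK."""
    size = MIN_BLOCK
    while size < file_size and size < MAX_BLOCK:
        size *= 2
    return size

def estimated_block_count(file_size):
    """Blocks needed for one file; files above the cap span several blocks."""
    block = estimated_block_size(file_size)
    return max(1, -(-file_size // block))  # ceiling division, at least 1 block

for size in (900, 5000, 30_000, 1_000_000):
    print(size, estimated_block_size(size), estimated_block_count(size))
```

Running something like this over a sample of your real files would give a rough average block size to plug into the DDT estimate.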
DDT requirements for RAM/L2ARC would be: 250 bytes * # blocks
So, let's say I'm considering a 1TB pool, where I think I'm going to be
storing 200GB worth of MP3s, 200GB of source code, 200GB of misc Office
docs, and 200GB of various JPEG image files from my 8 megapixel camera.
(That's 800GB total; I don't want the pool more than 80% full!)
Assumed block sizes, and thus number of blocks, for each data type:

Data          Block Size   # Blocks per 200GB
MP3           128k         ~1.6 million
Source Code   1k           ~200 million
Office docs   32k          ~6.5 million
Pictures      4k           ~52 million
Thus, total number of blocks you'll need = ~260 million
DDT tables size = 260 million * 250 bytes = 65GB
Note that the source code takes up 20% of the pool's space, but requires
nearly 80% of the DDT entries.
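
For anyone who wants to rerun this estimate with their own file mix, here is a small Python sketch of the same calculation. The block sizes and the 250-byte entry size are the assumptions from this message; treating "200GB" and "128k" as binary units (GiB, KiB) is my own assumption, so the unrounded totals come out a little higher than the rounded ~260 million blocks and 65GB above.

```python
GIB = 1024 ** 3
KIB = 1024
DDT_ENTRY_BYTES = 250  # approximate in-core DDT entry size (from the thread)

# name: (bytes stored, assumed average block size)
data_mix = {
    "MP3":         (200 * GIB, 128 * KIB),
    "Source code": (200 * GIB, 1 * KIB),
    "Office docs": (200 * GIB, 32 * KIB),
    "Pictures":    (200 * GIB, 4 * KIB),
}

# Worst case: every block is unique, so each one needs a DDT entry.
blocks = {name: size // block for name, (size, block) in data_mix.items()}
total_blocks = sum(blocks.values())
ddt_bytes = total_blocks * DDT_ENTRY_BYTES
source_share = blocks["Source code"] / total_blocks

for name, count in blocks.items():
    print(f"{name:12s} {count / 1e6:8.1f} million blocks")
print(f"total        {total_blocks / 1e6:8.1f} million blocks")
print(f"DDT size     {ddt_bytes / 1e9:8.1f} GB (worst case)")
print(f"source code share of DDT entries: {source_share:.0%}")
```

Changing the entries in `data_mix` is enough to see how strongly small-block data dominates the DDT size.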
Given that the above is the worst case for that file mix (actual
dedup/compression will lower the total block count), I would use it for
the max L2ARC size you want.
RAM sizing depends on the size of your *active* working set of
files; I'd want enough RAM to cache all my writes and my most
commonly read files in RAM at once.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)