On 6/28/2010 12:33 PM, valrh...@gmail.com wrote:
I'm putting together a new server, based on a Dell PowerEdge T410.
I have a simple SAS controller, with six 2 TB Hitachi DeskStar 7200 RPM SATA
drives. The processor is a quad-core 2 GHz Core i7-based Xeon.
I will run the drives as three mirrored pairs striped together, for 6 TB of
usable storage in a single pool.
I'd like to run Dedup, but right now the server has only 4 GB of RAM. It has
been pointed out to me several times that this is far too little. So how much
should I buy? A few considerations:
1. I would like to run dedup on old copies of backups (the dedup ratio for these
filesystems is 3+). Basically I have a few years of backups on tape, and
will consolidate these. I need to have the data there on disk, but I rarely
need to access it (maybe once a month). So those filesystems can be exported,
and effectively shut off. Am I correct in guessing that, if a filesystem has
been exported, its dedup table is not in RAM, and therefore is not relevant to
RAM requirements? I don't mind if it's really slow to do the first and only
copy to the file system, as I can let it run for a week without a problem.
That's correct. An exported pool is effectively ignored by the system,
so it won't contribute to any ARC requirements.
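A minimal sketch of that workflow, assuming the archive data lives in
its own pool (export works at the pool level, not per-filesystem), with
"archive" as a placeholder name:

   # Once the one-time consolidation copy is done, take the archive
   # pool offline; its DDT and cached data drop out of RAM with it.
   zpool export archive

   # For the occasional read, bring it back temporarily, grab what you
   # need, and export it again:
   zpool import archive
   zpool export archive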
2. Are the RAM requirements for ZFS with dedup based on the total available
zpool size (I'm not using thin provisioning), or just on how much data is in
the filesystem being deduped? That is, if I have 500 GB of deduped data but 6
TB of possible storage, which number is relevant for calculating RAM
requirements?
Requirements are based on *current* block usage, after dedup has
occurred. That is, ZFS needs an entry in the DDT for each unique block
actually allocated in the pool. The number of times a block is
referenced won't influence the DDT size, nor will the *potential* size
of the pool matter (other than for capacity planning). Remember that
ZFS uses variable-size blocks, so you need to determine what your
average block size is in order to estimate your DDT usage.
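A rough sketch of how to size that - "tank" is a placeholder pool name,
and the per-entry cost below is a commonly quoted ballpark, not an
exact figure:

   # Recent builds let zdb simulate dedup against existing data and
   # print a DDT histogram with real block counts and sizes:
   zdb -S tank

   # Back-of-envelope: 500 GB of unique data at an average 64 KB block
   # size is roughly 8 million blocks; at a few hundred bytes of ARC
   # per DDT entry (figures of 250-320 bytes get quoted), that works
   # out to something like 2-2.5 GB of RAM just for the table.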
3. What are the RAM requirements for ZFS in the absence of dedup? That is, if I
only have deduped filesystems in an exported state, and all that is active is
non-deduped, is 4 GB enough?
It depends heavily, of course, on your usage pattern and the kind of
files you are serving up. ZFS requires, at a bare minimum, a couple of
dozen MB for its own use; everything above that is caching. Heavy
write I/O will also eat up RAM, as ZFS buffers writes in RAM before
flushing them to backing store in large batches. Look at the amount of
data you expect to be using heavily - your RAM should probably exceed
that amount, plus an additional 1 GB or so for OS/ZFS/kernel use. That
assumes you are doing nothing but file serving on the system.
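If you want to see what the ARC is actually doing on the box, or cap it
so other workloads keep their memory, something along these lines works
(the 3 GB cap is just an example value):

   # Current ARC size, in bytes:
   kstat -p zfs:0:arcstats:size

   # Fuller breakdown via the kernel debugger:
   echo ::arc | mdb -k

   # To cap the ARC at, say, 3 GB, add this to /etc/system and reboot:
   #   set zfs:zfs_arc_max = 0xC0000000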
4. How does the L2ARC come into play? I can afford to buy a fast Intel X25-M G2,
for instance, or one of the newer SandForce-based MLC SSDs, to cache the dedup
table. But does it work that way? It's not really affordable for me to get more
than 16 GB of RAM on this system, because there are only four slots available,
and 8 GB DIMMs are a bit pricey.
L2ARC is "secondary" ARC. ZFS attempts to cache all reads in the ARC
(Adaptive Replacement Cache); should it find that it doesn't have
enough space in the ARC (which is RAM-resident), it will evict some
data over to the L2ARC (which in turn simply dumps its least-recently-
used data when it runs out of space). Remember, however, that every
time something gets written to the L2ARC, a little bit of space is
taken up in the ARC itself (a header pointing at the L2ARC entry has to
be kept in RAM). So it's not possible to have a giant L2ARC and a tiny
ARC. As a rule of thumb, I try not to have my L2ARC exceed my main RAM
by more than 10-15x (with really big-memory machines I'm a bit looser
and allow 20-25x or so, but still...). So, if you are thinking of
getting a 160 GB SSD, it would be wise to go for at minimum 8 GB of
RAM. Note that the amount of ARC space consumed per L2ARC entry is
fixed, independent of the actual block size stored in the L2ARC. The
gist of this is that tiny files eat up a disproportionate amount of
system resources for their size (smaller size = larger % overhead
versus large files).
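To put rough numbers on the small-block penalty (the per-header costs
here are ballpark only):

   # 160 GB of L2ARC full of 8 KB blocks   -> ~20 million headers; at a
   #   couple hundred bytes of ARC each, that's several GB of RAM spent
   #   just tracking the L2ARC.
   # 160 GB of L2ARC full of 128 KB blocks -> ~1.3 million headers, or
   #   only a few hundred MB.
   #
   # On a live system you can watch the real cost:
   kstat -p zfs:0:arcstats:l2_size       # data held in the L2ARC (bytes)
   kstat -p zfs:0:arcstats:l2_hdr_size   # ARC consumed by its headers (bytes)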
5. Could I use one of the PCIe-based SSD cards for this purpose, such as the
brand-new OCZ Revo? That should be somewhere between a SATA-based SSD and RAM.
Thanks in advance for all of your advice and help.
ZFS doesn't care what you use for the L2ARC. Some of us actually use
hard drives, so a PCI-E flash card is entirely possible. The Revo is
possibly the first PCI-E flash card that isn't massively expensive;
otherwise, I don't think they'd be a good option. They're still going
to be more expensive than even an SLC SSD, however. In addition, given
that L2ARC is heavily read-biased, cheap MLC SSDs are hard to beat for
performance/$. The biggest problem with PCI-E cards is that they
require OS-specific drivers, and OpenSolaris doesn't always make the
cut for support.
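Whatever device you settle on, hooking it up is the same one-liner (the
pool and device names below are just examples):

   # Add the SSD as a cache (L2ARC) device on the existing pool "tank".
   # Cache devices can be added and removed at any time without
   # affecting the pool's data:
   zpool add tank cache c8t1d0

   # ...and remove it again if it doesn't pay off:
   zpool remove tank c8t1d0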
In your specific case, I'd consider upgrading to 8GB RAM, and looking at
an 80GB MLC SSD. That's just blind guessing, since I don't know what
your usage (and file) pattern is.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA