Richard Elling wrote:
Buy a large, read-optimized SSD (or several) and add it as a cache
device :-)
-- richard
On Nov 20, 2009, at 8:44 AM, Jesse Stroik wrote:
I'm migrating to ZFS and Solaris for cluster computing storage, and
have some completely static data sets that need to be as fast as
possible. One of the scenarios I'm testing is the addition of vdevs
to a pool.
Starting out, I populated a pool that had 4 vdevs. Then, I added 3
more vdevs and would like to balance this data across the pool for
performance. The data may be in subdirectories like this:
/proxy_data/instrument_X/domain_Y. Because of the access pattern
across the cluster, I need each of these subdirectories spread across
as many disks as possible. Simply putting the data evenly on all vdevs
is suboptimal, because different files within a single domain from a
single instrument are likely to be read by 200 jobs at once.
Because this particular data is 100% static, I cannot count on
reads/writes automatically balancing the pool.
Best,
Jesse Stroik
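(For reference, the kind of pool expansion Jesse describes would look
roughly like the sketch below. The pool name "tank", the device names,
and the raidz layout are placeholders - the exact vdev type depends on
how the pool was built.)

# Hypothetical sketch: appending another top-level vdev to an existing pool.
# Data already in the pool is NOT rewritten; only new writes use the new vdev.
% zpool add tank raidz c4t0d0 c4t1d0 c4t2d0 c4t3d0
% zpool status tank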
OK, maybe I'm missing something here, but ZFS should spread ALL data
across ALL vdevs, with the caveat that very small files (under the
minimum stripe size) will only land on a subset of the vdevs - that is,
such a small file takes up only a portion of a single stripe. The
directory structure is irrelevant to how the data is written.
That is, the only things that determine how the file /foo/bar/baz is
laid out on disk are the actual size of baz itself and the level of
fragmentation of the zpool. For static, write-once data like yours,
fragmentation shouldn't be an issue.
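You can see this for yourself: zpool iostat reports space allocation per
top-level vdev, and right after an expansion the new vdevs will be nearly
empty while the original ones hold all the data (pool name is a placeholder):

# Hypothetical check: the "capacity alloc/free" columns are reported for
# each top-level vdev, so an imbalance between old and new vdevs is obvious.
% zpool iostat -v tank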
In your case, where you had a 4-vdev stripe and then added 3 vdevs, the
existing blocks stay on the original 4 vdevs - ZFS never rebalances data
in place. I would recommend re-copying the existing data so that the
rewritten blocks cover all 7 vdevs.
Thus, I'd do something like:
% cd /proxy_data
% for i in instrument_*
do
    # copy into a temporary directory; the rewritten blocks
    # now stripe across all 7 vdevs
    mkdir "${i}.new"
    rsync -a "$i/" "${i}.new/"
    # swap the rebalanced copy into place
    rm -rf "$i"
    mv "${i}.new" "$i"
done
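Once the loop finishes, the same "zpool iostat -v" check should show the
allocated space spread roughly evenly across all 7 vdevs. Two caveats: you
need enough free space in the pool to hold a second copy of each instrument
directory while it is being duplicated, and if you keep snapshots of the
filesystem, the rm won't actually free the old blocks until those snapshots
are destroyed.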
Richard's suggestion, while tongue-in-cheek, has much merit. If you are
only going to be working on a small portion of your total data set at
once, but hitting that portion heavily, then you want as much of it as
possible in the read cache (ARC/L2ARC). Which means either buy lots of
RAM, or get yourself an SSD. The good news is that you can likely use
one of the "cheaper" SSDs - the Intel X25-M is a good fit here as a
Readzilla.
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss