Thanks for the suggestions thus far,

Erik:

In your case, where you had a 4 vdev stripe, and then added 3 vdevs, I would recommend re-copying the existing data to make sure it now covers all 7 vdevs.


Yes, this was my initial reaction as well, but I am concerned by the fact that I do not know how ZFS populates the vdevs. My naive guess is that it either fills the emptiest vdev first, or (more likely) writes to each vdev at a rate proportional to its free space -- that is, the new devices, having more free space, will receive a disproportionate share of the new data.
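To make that second guess concrete, here is a rough sketch (this is an illustrative model, not ZFS's actual metaslab allocator) of what free-space-proportional allocation would look like for 4 partly-full vdevs plus 3 new empty ones:

```python
import random

random.seed(0)

# Hypothetical layout: four original 1 TB vdevs that are 60% full,
# plus three freshly added, empty 1 TB vdevs.
free = [0.4] * 4 + [1.0] * 3      # free space in TB per vdev
written = [0.0] * 7

# Write ~1 TB of new data in 10 MB blocks, choosing a vdev for each
# block with probability proportional to its remaining free space.
for _ in range(100_000):
    i = random.choices(range(7), weights=free)[0]
    block = 0.00001               # 10 MB expressed in TB
    free[i] -= block
    written[i] += block

share_new = sum(written[4:]) / sum(written)
print(f"share of new data landing on the 3 new vdevs: {share_new:.0%}")
```

Under this model the three new vdevs absorb roughly two-thirds of the incoming writes, so new data ends up unevenly striped -- which is exactly why a re-copy of the existing data may still be worthwhile.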



Richard's suggestion, while tongue-in-cheek, has much merit. If you are only going to be doing work on a small portion of your total data set at once, but heavily hit that section, then you want to read cache (L2ARC) as much of that as possible. Which means, either buy lots of RAM, or get yourself an SSD. Good news is that you can likely use one of the "cheaper" SSDs - the Intel X25-M is a good fit here for a Readzilla.


The problem is that caching the data often would not help: we're storing tens of terabytes of data for some instruments, and we may only need to read each job's worth of data once. So you could cache the data, but it simply would never be read again.

There are, of course, job types where the same set of data is used for multiple jobs, and even a small amount of extra memory seems to be very helpful in that case, since several nodes will be reading the same data at roughly the same time.

Best,
Jesse
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss