On Wed, 26 Sep 2007, Jason P. Warr wrote:

> Hi all,
>
> I have an interesting project that I am working on. It is a large volume file download service that is in need of a new box. Their current systems are not able to handle the load because, for various reasons, they have become very I/O limited. We currently run on Debian Linux with 3ware hardware RAID5. I am not sure of the exact disk config as it is a leased system that I have never seen.
>
> The usage patterns are usually pretty consistent in that out of 200-300 concurrent downloads you will find between 15 and 20 different files being fetched. You can guess that when the machine is trying to push a total of 200-300 Mbit the disks are going crazy, and due to file sizes and only having 4GB of RAM on the system, caching is of little use. The systems will regularly get into an 80-90% I/O wait state. Disk write speed is of almost no concern as there are only a few files added each day.
>
> The system we have come up with is pretty robust: 2 dual-core Opterons, 32GB of RAM, 8 750GB SATA disks. The disks are going to be paired off into 2-disk RAID0 sets, each with a complete copy of the data - essentially a manually replicated 4-way mirror set. The download manager would then use each set in a round-robin fashion. This should substantially reduce the amount of frantic disk head dancing. The second item is to dedicate 50-75% of the RAM to a ramdisk that would be the fetch path for the top 10 and new, hot downloads. This should again reduce the seeking on files that are being downloaded many times concurrently.
>
> My question for this list: with ZFS, is there a way to access the individual mirror sets in a pool if I were to create a 4-way mirrored stripe set with the 8 disks? Even better would be if ZFS would manage the mirror set "load balancing" by intelligently splitting up the reads amongst the 4 sets. Either way would make for a more elegant solution than replicating the sets with cron/rsync.
Your basic requirement is for good/excellent random read performance with many concurrent IOPS (I/O Operations/Sec). The metrics I use for disk IOPS are:

    disk drive        IOPS
    ----------        ----
    7,200 RPM SATA     300
    15k RPM            700

These numbers can be debated/argued - but that is what I use. So, to solve your problem, my first recommendation would be to use 15k RPM SAS drives, rather than SATA drives, for random I/O.

Next, ZFS will automatically load-balance read requests among the members of a multi-way mirror set. So if you were to form a pool like:

    zpool create fastrandom mirror disk1 disk2 disk3 disk4

you'd have a 4-way mirror that would sustain approx 1,200 (4 * 300) reads/sec with SATA disks and 2,800 (4 * 700) reads/sec with 15k SAS disks. In addition, ZFS will make intelligent use of available RAM to cache data.

I would suggest that you use the above config as a starting point and measure the resulting performance running the anticipated workload - without dedicating any system memory to explicit buffering or a RAM disk. Since it's very convenient/fast to configure/reconfigure storage pools using the zfs interface, you can also experiment with 5-way, 6-way, etc. pools - or form one pool using 4 devices for fast random access and assign the remaining disks to a raidz pool to maximize storage space.

PS: If you take a look at genunix.org, you'll see I have some experience with this type of workload. We'll be deploying a ZFS-based storage system there next month - we had to resolve your type of issue when Belenix 0.6 was released, except for one twist: we had to use the existing infrastructure and would not accept any downtime. Feel free to email me offlist if I can help.

Regards,

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
           Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
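[A minimal sketch of the split layout suggested above - 4 disks in a mirror pool for the hot, random-read data and the remaining 4 in a raidz pool for bulk capacity. The c1t0d0..c1t7d0 device names and the "bulk" pool name are placeholders, not taken from the posts above; adjust them to the actual hardware.]

    # 4-way mirror: ZFS spreads reads across all four members
    zpool create fastrandom mirror c1t0d0 c1t1d0 c1t2d0 c1t3d0

    # remaining disks as raidz to maximize usable space
    zpool create bulk raidz c1t4d0 c1t5d0 c1t6d0 c1t7d0

    # watch per-device activity to confirm reads are balanced across the mirror
    zpool iostat -v fastrandom 5

Since pools are quick to create and tear down (zpool destroy fastrandom), it is easy to benchmark this layout against alternatives, such as two 2-way mirrors striped together, before committing to one.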