On Wed, 26 Sep 2007, Jason P. Warr wrote:

> Hi all,
>
> I have an interesting project that I am working on.  It is a large 
> volume file download service that is in need of a new box.  Their 
> current systems are not able to handle the load because, for various 
> reasons, they have become very I/O limited.  We currently run on 
> Debian Linux with 3ware hardware RAID5.  I am not sure of the exact 
> disk config as it is a leased system that I have never seen.
>
> The usage patterns are usually pretty consistent in that out of 
> 200-300 concurrent downloads you will find between 15 and 20 different 
> files being fetched.  You can guess that when the machine is trying 
> to push a total of 200-300 Mbit the disks are going crazy, and due to 
> the file sizes and only having 4GB of RAM on the system, caching is 
> of little use.  The systems will regularly get into an 80-90% I/O 
> wait state.  The disk write speed is of almost no concern as there 
> are only a few files added each day.
>
> The system we have come up with is pretty robust: 2 dual-core 
> Opterons, 32GB of RAM, 8 750GB SATA disks.  The disks are going to 
> be paired off into 2-disk RAID0 sets, each with a complete copy of 
> the data - essentially a manually replicated 4-way mirror set.  The 
> download manager would then use each set in a round-robin fashion. 
> This should substantially reduce the amount of frantic disk head 
> dancing.  The second item is to dedicate 50-75% of the RAM to a 
> ramdisk that would be the fetch path for the top 10 and new, hot 
> downloads.  This should again reduce the seeking on files that are 
> being downloaded many times concurrently.
>
> My question for this list is: with ZFS, is there a way to access the 
> individual mirror sets in a pool if I were to create a 4-way 
> mirrored stripe set with the 8 disks?  Even better would be if ZFS 
> would manage the mirror set "load balancing" by intelligently 
> splitting up the reads amongst the 4 sets.  Either way would make 
> for a more elegant solution than replicating the sets with 
> cron/rsync.
>

Your basic requirement is for good/excellent random read performance 
with many concurrent IOPS (I/O Operations per Second).  The metrics I 
use for disk IOPS are:

disk drive        IOPS
----------        ----
7,200 RPM SATA    300
15k RPM SAS       700

These numbers can be debated/argued - but that is what I use.  So, to 
solve your problem, my first recommendation would be to use 15k RPM 
SAS drives, rather than SATA drives, for random I/O.

Next, ZFS will automatically load balance read requests among the 
members of a multi-way mirror set.  So if you were to form a pool 
like:

zpool create fastrandom mirror disk1 disk2 disk3 disk4

you'd have a 4-way mirror that would sustain approx 1,200 (4 * 300) 
reads/sec with SATA disks, or 2,800 (4 * 700) reads/sec with 15k SAS 
disks.  In addition, ZFS will make intelligent use of available RAM 
to cache data.
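
Once the pool is under load, a quick way to confirm that reads really 
are being spread across all four mirror members is to watch the 
per-device statistics (pool name as in the example above, 5 second 
interval):

zpool iostat -v fastrandom 5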

I would suggest that you use the above config as a starting point and 
measure the resulting performance running the anticipated workload - 
without dedicating any system memory to application-level buffering 
or a RAM disk; let ZFS use the RAM for caching.
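
While the test runs you can watch per-device utilization, and see how 
much RAM ZFS has pulled into its cache.  On recent OpenSolaris/Nevada 
builds the cache size is exposed via the arcstats kstat (the exact 
kstat name may vary by build):

iostat -xn 5
kstat -p zfs:0:arcstats:size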

Since it's very convenient/fast to configure/reconfigure storage pools 
using the zfs interface, you can also experiment with 5-way, 6-way, 
etc. pools - or form one pool using 4 devices for fast random access 
and assign the remaining disks to a raidz pool to maximize storage 
space.
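
For example, a split layout along those lines might look like the 
following (the pool name "bulk" and the c#t#d# device names are just 
placeholders - substitute your own):

zpool create fastrandom mirror c1t0d0 c1t1d0 c1t2d0 c1t3d0
zpool create bulk raidz c1t4d0 c1t5d0 c1t6d0 c1t7d0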

PS: If you take a look at genunix.org, you'll see I have some 
experience with this type of workload.  We'll be deploying a ZFS-based 
storage system there next month - we had to resolve your type of issue 
when Belenix 0.6 was released - except for one twist: we had to use 
the existing infrastructure and could not accept any downtime.

Feel free to email me offlist if I can help.

Regards,

Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
            Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
