One word of caution about random writes. In my experience, they are not nearly as fast as sequential writes (roughly 10 to 20 times slower) unless they are carefully aligned on the same boundary as the file system record size. Otherwise, there is a heavy read penalty, because a partially overwritten record has to be read in before it can be rewritten. You can easily observe this with zpool iostat. So, depending on the workload, it's really a stretch to say random writes can be done at sequential speed.
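To make the alignment point concrete, here is a rough Python sketch. It counts how many file system records a single write touches and how many of those are only partially overwritten, since the partial ones are the ones that need a read first. The function name and parameters are made up for illustration, and the sketch assumes a fixed record size (128 KB here, the usual ZFS default), a file larger than one record, and nothing already cached.

```python
def records_for_write(offset, length, recordsize=128 * 1024):
    """Return (touched, partial) for one write.

    touched: number of records the byte range [offset, offset+length) overlaps.
    partial: number of those records that are only partly overwritten and
             would therefore need to be read before being rewritten
             (assumption: the record is not already in cache).
    """
    if length <= 0:
        return (0, 0)

    end = offset + length
    first = offset // recordsize          # index of first record touched
    last = (end - 1) // recordsize        # index of last record touched
    touched = last - first + 1

    head_partial = offset % recordsize != 0   # write starts mid-record
    tail_partial = end % recordsize != 0      # write ends mid-record

    if touched == 1:
        # Head and tail are the same record; count it once.
        partial = 1 if (head_partial or tail_partial) else 0
    else:
        partial = int(head_partial) + int(tail_partial)

    return (touched, partial)


if __name__ == "__main__":
    # An 8 KB database block written into a 128 KB record: one read needed.
    print(records_for_write(8192, 8192))
    # The same write with recordsize matched to the block size: no read.
    print(records_for_write(8192, 8192, recordsize=8192))
```

With the record size matched to the database block size, and writes aligned to it, the partial count drops to zero, which is the case where the read penalty goes away.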
Chuck

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of James C. McPherson
Sent: Wednesday, May 10, 2006 5:18 PM
To: Boyd Adamson
Cc: ZFS filesystem discussion list
Subject: Re: [zfs-discuss] ZFS and databases

Hi Boyd,

Boyd Adamson wrote:
> One question that has come up a number of times when I've been
> speaking with people (read: evangelizing :) ) about ZFS is about
> database storage. In conventional use storage has separated redo logs
> from table space, on a spindle basis.
> I'm not a database expert but I believe the reasons boil down to a
> combination of:
> - Separation for redundancy

correct

> - Separation for reduction of bottlenecks (most write ops touch both
> the logs and the table)

correct

> - Separation of usage patterns (logs are mostly sequential writes,
> tables are random).

correct

> The question then comes up about whether in a ZFS world this
> separation is still needed.

I don't think it is.

> It seems to me that each of the above reasons is to
> some extent ameliorated by ZFS:
> - Redundancy is performed at the filesystem level, probably on all
> disks in the pool.

more at the pool level iirc, but yes, over all the disks where you have them mirrored or raid/raidZ-ed

> - Dynamic striping and copy-on-write mean that all write ops can be
> striped across vdevs and the log writes can go right next to the table
> writes

Yes. No need to separate metadata (and archive/rollback logs are just that)

> - Copy-on-write also turns almost all writes into sequential writes anyway.

yup.

> So it seems that the old reasoning may no longer apply. Is my thinking
> correct here? Have I missed something? Do we have any information to
> support either the use of a single pool or of separate pools for
> database usage?

To my way of thinking, you can still separate things out if you're not comfortable with having everything all together in the one pool.
My take on that though is that it stems from an inability to appreciate just how different zfs is - a lack of paradigm shifting lets you down.

If I was setting up a db server today and could use ZFS, then I'd be making sure that the DBAs didn't get a say in how the filesystems were laid out. I'd ask them what they want to see in a directory structure and provide that. If they want raw ("don't you know that everything is faster on raw?!?!") then I'd carve a zvol for them. Anything else would be carefully delineated - they stick to the rdbms and don't tell me how to do my job, and vice versa.

cheers,
James C. McPherson
--
Solaris Datapath Engineering
Data Management Group
Sun Microsystems
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss