On Mon, 30 Jun 2008, Richard Elling wrote:
>
> There is a general feeling that COW, as used by ZFS, will cause
> all sorts of badness for database scans.  Alas, there is a dearth of
> real-world data on any impacts (I'm anxiously awaiting...)
It seems like the primary badness from ZFS as it pertains to databases
is that it checksums each block and prefers large blocks.  If the
filesystem block size exactly matches the database block size and the
blocks are properly aligned (database dependent), then performance
should be pretty good.  If 8K of a 128K block needs to be updated, then
the whole 128K block needs to be read, checksummed, updated, checksummed
again, allocated anew (for COW), and then written.  Clearly this cost is
reduced if the amount of data involved is reduced.  But 8K blocks will
increase the cost of any fragmentation for sequential access, since
there may then be an extra seek for every 8K rather than for every 128K.
An extent-based filesystem also incurs update costs and may fragment as
well.  ZFS fragmentation has an upper bound determined by the blocksize,
since each block is written contiguously, so the worst case is one seek
per block.  It seems that load-shared mirrors will suffer least from
fragmentation.

The DTrace Toolkit provides a script called 'iopattern' which is quite
helpful for understanding how much of the I/O is random vs. sequential
and the type/size of the I/Os.  Lots of random I/O while doing a
sequential scan likely indicates fragmentation.  (A rough sketch of the
recordsize tuning and iopattern usage is at the end of this message.)

> In this particular case, it would be cost effective to just buy a
> bunch of RAM and not worry too much about disk I/O during
> scans. In the future, if you significantly outgrow the RAM, then

RAM is definitely your friend.

Bob
======================================
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
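P.S.  Here is roughly what I mean.  The pool/filesystem name (tank/db)
is made up, and I'm assuming the DTrace Toolkit's iopattern script is
on your PATH; adjust for your own setup:

  # Match the ZFS record size to an 8K database block size.  This only
  # affects files created after the property is changed, so existing
  # data files would need to be copied or reloaded to pick it up.
  zfs set recordsize=8k tank/db
  zfs get recordsize tank/db

  # Watch the I/O pattern; each sample summarizes how random vs.
  # sequential the disk events are, plus their sizes.
  iopattern

Dropping the record size avoids the 128K read-modify-write described
above, and iopattern then shows whether the smaller blocks are paying
for it in extra seeks during sequential scans.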