Luke Lonergan wrote:
> ZFS works marvelously well for data warehouse and analytic DBs. For lots of
> small updates scattered across the breadth of the persistent working set,
> it's not going to work well IMO.
>

Actually, it does seem to work quite well when you use a read optimized
SSD for the L2ARC. In that case, "random" read workloads have very
fast access, once the cache is warm.
 -- richard

> Note that we're using ZFS to host databases as large as 10,000 TB - that's
> 10PB (!!). Solaris 10 U5 on X4540. That said - it's on 96 servers running
> Greenplum DB.
>
> With SSD, the randomness won't matter much I expect, though the filesystem
> won't be helping by virtue of this fragmentation effect of COW.
>
> - Luke
>
> ----- Original Message -----
> From: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
> To: zfs-discuss@opensolaris.org <zfs-discuss@opensolaris.org>
> Sent: Sat Nov 22 16:43:53 2008
> Subject: Re: [zfs-discuss] ZFS fragmentation with MySQL databases
>
> Kees Nuyt wrote:
>
>> My explanation would be: Whenever a block within a file
>> changes, zfs has to write it at another location ("copy on
>> write"), so the previous version isn't immediately lost.
>>
>> Zfs will try to keep the new version of the block close to
>> the original one, but after several changes on the same
>> database page, things get pretty messed up and logical
>> sequential I/O becomes pretty much physically random indeed.
>>
>> The original blocks will eventually be added to the freelist
>> and reused, so proximity can be restored, but it will never
>> be 100% sequential again.
>> The effect is larger when many snapshots are kept, because
>> older block versions are not freed, or when the same block
>> is changed very often and freelist updating has to be
>> postponed.
>>
>> That is the trade-off between "always consistent" and
>> "fast".
>>
>>
> Well, does that mean ZFS is not best suited for database engines as
> underlying filesystem? With databases it will always be fragmented,
> hence slow performance?
>
> Because this way it would be best to use it for large file server that
> don't usually change frequently.
>
> Thanks,
> Tamer
>
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
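
The copy-on-write effect Kees Nuyt describes in the quoted text can be illustrated with a toy model. This is only a sketch and not how ZFS actually allocates blocks: it assumes a hypothetical disk made of numbered slots, a file that starts out laid out contiguously, and a "nearest free slot" policy standing in for "try to keep the new version of the block close to the original one". The names simulate_cow, sequential_breaks and keep_snapshot are made up for this illustration.

```python
import random


def sequential_breaks(block_map):
    """Count logically adjacent blocks that are not physically adjacent."""
    return sum(1 for a, b in zip(block_map, block_map[1:]) if b != a + 1)


def simulate_cow(num_blocks=1000, disk_slots=1500, updates=5000,
                 keep_snapshot=False, seed=0):
    """Toy copy-on-write model (an assumption-laden sketch, not ZFS itself).

    Every update rewrites one random logical block into the free slot
    nearest to its old location. With keep_snapshot=True the old slot is
    never returned to the free set, a crude stand-in for snapshots that
    pin older block versions.
    """
    rng = random.Random(seed)
    block_map = list(range(num_blocks))         # logical block -> physical slot
    free = set(range(num_blocks, disk_slots))   # unused physical slots

    for _ in range(updates):
        if not free:
            break                               # disk full, stop rewriting
        logical = rng.randrange(num_blocks)
        old = block_map[logical]
        # Nearest free slot to the old copy; ties go to the lower slot.
        new = min(free, key=lambda slot: (abs(slot - old), slot))
        free.remove(new)
        block_map[logical] = new
        if not keep_snapshot:
            free.add(old)                       # old version can be reused

    return sequential_breaks(block_map)


if __name__ == "__main__":
    for n in (0, 100, 1000, 5000):
        print(n, "updates:", simulate_cow(updates=n), "breaks")
    print("pinned old versions:", simulate_cow(keep_snapshot=True), "breaks")
```

With updates=0 the function returns 0 breaks because nothing has been rewritten. Raising updates, or setting keep_snapshot=True, lets you compare how far the logical order drifts from the physical one under these simplified assumptions.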