> Actually, it does seem to work quite
> well when you use a read optimized
> SSD for the L2ARC. In that case,
> "random" read workloads have very
> fast access, once the cache is warm.

One would expect so, yes. But the usefulness of this is limited to the
cases where the entire working set will fit into an SSD cache.

In other words, for random access across a working set larger (by say X%)
than the SSD-backed L2ARC, the cache is useless. This should asymptotically
approach truth as X grows, and experience shows that X=200% is where it's
about 99% true.

As time passes and SSDs get larger while many OLTP random workloads remain
somewhat constrained in size, this becomes less important.

Modern DB workloads are becoming hybridized, though. A 'mixed workload'
scenario is now common, where there is a mix of updated working sets and
indexed access alongside heavy analytical 'update rarely if ever' workloads.

- Luke

----- Original Message -----
From: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
To: Luke Lonergan
Cc: [EMAIL PROTECTED] <[EMAIL PROTECTED]>; zfs-discuss@opensolaris.org <zfs-discuss@opensolaris.org>
Sent: Sat Nov 22 20:28:54 2008
Subject: Re: [zfs-discuss] ZFS fragmentation with MySQL databases

Luke Lonergan wrote:
> ZFS works marvelously well for data warehouse and analytic DBs. For lots of
> small updates scattered across the breadth of the persistent working set,
> it's not going to work well IMO.
>

Actually, it does seem to work quite well when you use a read optimized
SSD for the L2ARC. In that case, "random" read workloads have very
fast access, once the cache is warm.
 -- richard

> Note that we're using ZFS to host databases as large as 10,000 TB - that's
> 10PB (!!). Solaris 10 U5 on X4540. That said - it's on 96 servers running
> Greenplum DB.
>
> With SSD, the randomness won't matter much I expect, though the filesystem
> won't be helping by virtue of this fragmentation effect of COW.
>
> - Luke
>
> ----- Original Message -----
> From: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
> To: zfs-discuss@opensolaris.org <zfs-discuss@opensolaris.org>
> Sent: Sat Nov 22 16:43:53 2008
> Subject: Re: [zfs-discuss] ZFS fragmentation with MySQL databases
>
> Kees Nuyt wrote:
>
>> My explanation would be: Whenever a block within a file
>> changes, zfs has to write it at another location ("copy on
>> write"), so the previous version isn't immediately lost.
>>
>> Zfs will try to keep the new version of the block close to
>> the original one, but after several changes on the same
>> database page, things get pretty messed up and logical
>> sequential I/O becomes pretty much physically random indeed.
>>
>> The original blocks will eventually be added to the freelist
>> and reused, so proximity can be restored, but it will never
>> be 100% sequential again.
>> The effect is larger when many snapshots are kept, because
>> older block versions are not freed, or when the same block
>> is changed very often and freelist updating has to be
>> postponed.
>>
>> That is the trade-off between "always consistent" and
>> "fast".
>>
>
> Well, does that mean ZFS is not best suited for database engines as
> underlying filesystem? With databases it will always be fragmented, hence
> slow performance?
>
> Because this way it would be best to use it for large file server that
> don't usually change frequently.
>
> Thanks,
> Tamer
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
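
The working-set argument in Luke's reply at the top of this message (a
working set X% larger than the SSD-backed L2ARC) can be put into rough
numbers with a back-of-the-envelope model. This is only a sketch under
loudly stated assumptions: reads are uniformly random over the working set,
the cache is warm and full, and the latencies (0.1 ms for an SSD hit, 8 ms
for a disk miss) are illustrative placeholders, not measurements from this
thread. The function names are hypothetical. It does not model cache
warm-up or eviction, so it is not expected to reproduce the "99% true at
X=200%" figure, which Luke reports from experience.

```python
def hit_ratio(excess_pct: float) -> float:
    """Steady-state hit ratio under uniform random reads.

    excess_pct is X: how much larger the working set is than the cache,
    in percent. With working_set = cache * (1 + X/100), a uniformly random
    read finds its block cached with probability cache / working_set.
    """
    if excess_pct <= 0:
        return 1.0  # the whole working set fits in the cache
    return 1.0 / (1.0 + excess_pct / 100.0)


def mean_read_latency_ms(excess_pct: float,
                         ssd_ms: float = 0.1,
                         disk_ms: float = 8.0) -> float:
    """Expected latency per read: hits served by SSD, misses by disk.

    ssd_ms and disk_ms are assumed, illustrative values.
    """
    h = hit_ratio(excess_pct)
    return h * ssd_ms + (1.0 - h) * disk_ms


def speedup_vs_no_cache(excess_pct: float,
                        ssd_ms: float = 0.1,
                        disk_ms: float = 8.0) -> float:
    """How many times faster the mean read is than with no SSD cache."""
    return disk_ms / mean_read_latency_ms(excess_pct, ssd_ms, disk_ms)


if __name__ == "__main__":
    for x in (0, 50, 100, 200, 400, 900):
        print(f"X={x:4d}%  hit ratio={hit_ratio(x):.3f}  "
              f"mean latency={mean_read_latency_ms(x):.2f} ms  "
              f"speedup={speedup_vs_no_cache(x):.2f}x")
```

In this model the speedup can never exceed 1 / (1 - hit ratio), however
fast the SSD is, so the benefit drops off quickly once the working set
outgrows the cache: at X=200% the hit ratio is 1/3 and the mean read is
less than 1.5 times faster than with no cache at all.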