Tony Galway writes:

> I have a few questions regarding ZFS, and would appreciate if someone
> could enlighten me as I work my way through.
>
> First write cache.
>
We often use "write cache" to designate the cache present at the disk
level. Let's call this "disk write cache". Most filesystems will also
cache information in host memory; let's call this "FS cache". I think
your questions are more about FS cache behavior for different types of
loads.

> If I look at traditional UFS / VxFS type file systems, they normally
> cache metadata to RAM before flushing it to disk. This helps increase
> their perceived write performance (perceived in the sense that if a
> power outage occurs, data loss can occur).

Correct, and an application can influence this behavior with O_DSYNC,
fsync, etc.

> ZFS on the other hand, performs copy-on-write to ensure that the disk
> is always consistent, I see this as sort of being equivalent to using
> a directio option. I understand that the data is written first, then
> the pointers are updated, but if I were to use the directio analogy,
> would this be correct?

As pointed out by Anton, that's a no here. COW ensures that ZFS is
always consistent on disk, but that is not really related to
application consistency (which is the job of O_DSYNC and fsync). So
ZFS caches data on writes like most filesystems.

> If that is the case, then is it true that ZFS really does not use a
> write cache at all? And if it does, then how is it used?

You write to cache, and every 5 seconds all the dirty data is shipped
to disk in a transaction group (txg). When memory is low, we also will
not wait for the 5-second clock to hit and will issue a txg sooner.

The problem you and many others face is the lack of write throttling.
This is being worked on and should, I hope, be fixed soon. The
perception that ZFS is RAM hungry will have to be reevaluated at that
time. See:

6429205 each zpool needs to monitor its throughput and throttle heavy writers

> Read Cache.
>
> Any of us that have started using or benchmarking ZFS have seen its
> voracious appetite for memory, an appetite that is fully shared with
> VxFS for example, as I am not singling out ZFS (I'm rather a fan).
> On reboot of my T2000 test server (32GB RAM) I see that the ARC cache
> max size is set to 30.88GB - a sizeable piece of memory.
>
> Now, is all that cache space only for read cache? (given my assumption
> regarding write cache)
>
> Tuneable Parameters:
>
> I know that the philosophy of ZFS is that you should never have to
> tune your file system, but might I suggest that tuning the FS is not
> always a bad thing. You can't expect a FS to be all things for all
> people. If there are variables that can be modified to provide
> different performance characteristics and profiles, then I would
> contend that it could strengthen ZFS and lead to wider adoption and
> acceptance if you could, for example, limit the amount of memory used
> by items like the cache without messing with c_max / c_min directly in
> the kernel.

Once we have write throttling, we will be better equipped to see
whether the ARC's dynamic adjustment works or not. I believe most
problems will go away and there will be less demand for such a
tunable. On to your next mail...

> -Tony
>
> This message posted from opensolaris.org
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
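P.P.S. If a supported ARC cap of the kind Tony asks for were exposed,
it would presumably be a one-line /etc/system setting rather than
poking c_max/c_min with a kernel debugger. Purely as an illustration
of the shape such a knob might take (the name zfs:zfs_arc_max and the
8GB value are assumptions on my part, not a shipped tunable):

```
* /etc/system: hypothetical cap on ARC size, value in bytes (8 GB)
set zfs:zfs_arc_max = 0x200000000
```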