On Sun, 15 Feb 2009, Colin Raven wrote:
> Pardon me for jumping into this discussion. I invariably lurk and keep my mouth firmly shut. In this case, however, curiosity and a degree of alarm bade me jump in... Could you elaborate on 'fragmentation', since the only context I know it in is Windows? Now surely, ZFS doesn't suffer from the same sickness?
ZFS is "fragmented by design". Even so, it takes steps to minimize both fragmentation and its cost. Files written sequentially at a reasonable rate are usually contiguous on disk as well. A "slab" allocator allocates space in larger units and then dices that space up into ZFS 128K blocks, so that related blocks end up close together on disk. The larger block size (128K by default, versus 4K or 8K) dramatically reduces the amount of disk seeking required for sequential I/O when fragmentation is present. Written data is buffered in RAM for up to 5 seconds before being written, which improves the opportunities for contiguous storage. When the pool has multiple vdevs, ZFS's "load share" can also intelligently allocate file blocks across multiple disks so that head movement is minimal and multiple seeks can take place at once.
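To put rough numbers on the block-size point, here is a small sketch. It assumes the worst case, where no two blocks of the file are adjacent, so every block costs one seek. The 1 GiB file size and the 8 ms seek time are illustrative assumptions, not measurements from any real pool.

```python
KIB = 1024
GIB = 1024 ** 3

def worst_case_seeks(file_bytes, block_bytes):
    """Blocks in the file, i.e. seeks needed if no two blocks are adjacent."""
    return -(-file_bytes // block_bytes)  # ceiling division

def seek_seconds(file_bytes, block_bytes, seek_ms=8.0):
    """Total time spent seeking; seek_ms is an assumed average, not measured."""
    return worst_case_seeks(file_bytes, block_bytes) * seek_ms / 1000.0

for size_kib in (4, 8, 128):
    block = size_kib * KIB
    print(f"{size_kib:>3}K blocks: {worst_case_seeks(GIB, block):>6} seeks, "
          f"about {seek_seconds(GIB, block):.0f} s of seeking")
```

Under these assumptions a fully fragmented 1 GiB file needs 32 times fewer seeks with 128K blocks than with 4K blocks.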
> As a follow-up: is there any ongoing, sensible way to defend against the dreaded fragmentation? A [shudder] "defrag" routine of some kind, perhaps? Forgive the "silly questions" from the sidelines... ignorance knows no bounds, apparently :)
The most important thing is to never operate your pool close to 100% full. Always leave a reserve so that ZFS can use reasonable block allocation policies and is not forced to allocate blocks in a way that incurs an additional performance penalty. Installing more RAM in the system is likely to decrease fragmentation, since ZFS can then defer writes longer and make better choices about where to put the data.
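One way to keep an eye on that reserve is a small check like the sketch below. It assumes the input text has the shape of `zpool list -H -o name,capacity` output (one pool per line, a name and a percentage). The function name and the 80% warning threshold are my own illustrative choices, not a ZFS rule; pick whatever reserve suits your pool.

```python
def pools_over_threshold(listing, threshold_pct=80):
    """Return (name, percent_full) for each pool at or above threshold_pct.

    `listing` is assumed to hold lines like "tank\t91%".
    Lines that do not have exactly two fields are skipped.
    """
    over = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue
        name, cap = fields
        pct = int(cap.rstrip("%"))
        if pct >= threshold_pct:
            over.append((name, pct))
    return over

# Hypothetical sample data; on a live system the text would come from zpool.
sample = "tank\t91%\nbackup\t42%\n"
print(pools_over_threshold(sample))
```

Run from cron, something along these lines would flag a pool well before it gets close to full.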
Updating already written portions of files "in place" will convert a completely contiguous file into a fragmented file due to ZFS's copy-on-write design.
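A toy model shows the effect. This is not ZFS's actual allocator: it simply assumes that each rewritten block is placed at the next free disk address instead of being overwritten where it sits, and then counts the contiguous runs (extents) that a sequential read would have to visit.

```python
def count_extents(addresses):
    """Count runs of consecutive disk addresses, taken in logical file order."""
    if not addresses:
        return 0
    extents = 1
    for prev, cur in zip(addresses, addresses[1:]):
        if cur != prev + 1:
            extents += 1
    return extents

def cow_overwrite(addresses, logical_index, next_free):
    """Relocate one logical block to next_free rather than updating in place."""
    updated = list(addresses)
    updated[logical_index] = next_free
    return updated, next_free + 1

layout = list(range(8))   # an 8-block file written sequentially
next_free = 8
print(layout, count_extents(layout))

for index in (3, 6):      # rewrite two blocks in the middle of the file
    layout, next_free = cow_overwrite(layout, index, next_free)
    print(layout, count_extents(layout))
```

In this model the file starts as a single extent, and each isolated rewrite in the middle splits it further.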
Bob
======================================
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/