Regardless of the merit of the rest of your proposal, I think you have put your finger on the core of the problem. Set aside the apparent reluctance of some of the ZFS developers to believe that any problem exists here at all, and set aside the additional monkey wrench that RAID-Z would introduce (one could argue that files used in this manner are poor candidates for RAID-Z anyway, so there is no need to consider reorganizing RAID-Z files). What remains is that the *only* down-sides (other than a small matter of coding) to defragmenting files in the background in ZFS are the impact on run-time performance, which should be minimal if the defragmentation is performed at lower priority, and the impact on the space consumed by any snapshot that exists while the defragmentation is being done: because ZFS never overwrites blocks in place, every block the defragmenter rewrites leaves its old copy pinned by the snapshot until that snapshot is destroyed.
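To put a rough number on that second cost, consider a back-of-the-envelope example (the figures are illustrative; 128 KB is just the default ZFS recordsize):

    file size:                         1 GB = 8192 x 128 KB blocks
    blocks rewritten by the defrag:    8192 (a full reorganization)
    old copies pinned by the snapshot: 8192 x 128 KB = 1 GB

So until that snapshot goes away, the pool holds both the old and the new copy of every rewritten block - in the worst case, double the space for the file.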
One way to eliminate the latter would be simply not to reorganize while any snapshot (or clone) existed: no worse than the situation today, and better whenever no snapshot or clone is present. That would change the perceived 'expense' of a snapshot, though, since you'd know you were potentially giving up some run-time performance whenever one existed - and it's easy to imagine installations that might otherwise like to run things such that a snapshot was *always* present.

Another approach would be simply to accept any increased snapshot space overhead. So many sequentially-accessed files are written once and read-only thereafter that a lot of installations might not see any increased snapshot overhead at all. Some files are never accessed sequentially (or are accessed sequentially only in situations where performance is unimportant), and if they could be marked "don't reorganize" they wouldn't contribute any increased snapshot overhead either. One could also introduce controls to limit the times when reorganization was done, though my inclination is to suspect that such additional knobs ought to be unnecessary. (A sketch of what these per-file checks might look like is in the P.S. below.)

One way to eliminate almost completely the overhead of the additional disk accesses consumed by background defragmentation would be to fold it into the existing background scrubbing activity, though for actively-updated files one might want to defragment more often than one needs to scrub.

In any event, background defragmentation should be a relatively easy feature to introduce and try out if suitable multi-block contiguous allocation mechanisms already exist to support ZFS's existing batch writes. Using the ZIL to perform opportunistic defragmentation while updated data is still present in the cache might be a bit more complex, but could still be worth investigating.

- bill
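P.S. For concreteness, here is a purely hypothetical sketch (in C, since that's what ZFS is written in) of the per-file policy gate the above amounts to. None of these types, fields, or names exist in the actual ZFS code; they're stand-ins for whatever the real metadata would provide:

    /*
     * Hypothetical policy gate for a background defragmenter:
     * (a) skip datasets with snapshots or clones (old copies of any
     *     rewritten blocks would stay pinned), (b) honor a per-file
     *     "don't reorganize" flag, and (c) rewrite only files whose
     *     blocks have become sufficiently scattered.
     */
    typedef struct file_info {
            int     fi_noreorg;     /* "don't reorganize" flag */
            int     fi_nblocks;     /* blocks allocated to the file */
            int     fi_nextents;    /* contiguous runs those blocks form */
    } file_info_t;

    typedef struct dataset_info {
            int     di_nsnapshots;  /* snapshots and clones present */
    } dataset_info_t;

    /* Reorganize only when the average contiguous run is this short. */
    #define MIN_RUN_LEN     16

    static int
    should_defrag(const dataset_info_t *ds, const file_info_t *f)
    {
            if (ds->di_nsnapshots > 0)      /* would bloat snapshots */
                    return (0);
            if (f->fi_noreorg)              /* never read sequentially */
                    return (0);
            if (f->fi_nextents == 0)        /* empty file, nothing to do */
                    return (0);
            return (f->fi_nblocks / f->fi_nextents < MIN_RUN_LEN);
    }

The point is just that every policy discussed above reduces to a few cheap metadata checks made before any data blocks are touched; the low-priority scheduling (or the piggybacking on scrub) would live in whatever loop walks the files and calls this.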