Peter Cudhea wrote:
> Your point is well taken that ZFS should not duplicate functionality
> that is already or should be available at the device driver level. In
> this case, I think it misses the point of what ZFS should be doing that
> it is not.
>
> ZFS does its own periodic commits to the disk, and it knows if those
> commit points have reached the disk or not, or whether they are getting
> errors. In this particular case, those commits to disk are presumably
> failing, because one of the disks they depend on has been removed from
> the system. (If the writes are not being marked as failures, that
> would definitely be an error in the device driver, as you say.) In this
> case, however, the ZIL log has stopped being updated, but ZFS does
> nothing to announce that this has happened, or to indicate that a remedy
> is required.

I think you have some misconceptions about how the ZIL works.
It doesn't provide journalling like UFS. The following might help:

http://blogs.sun.com/perrin/entry/the_lumberjack

The ZIL isn't used at all unless there's fsync/O_DSYNC activity.

>
> At the very least, it would be extremely helpful if ZFS had a status to
> report that indicates that the ZIL log is out of date, or that there are
> troubles writing to the ZIL log, or something like that.

If the ZIL cannot be written then we force a transaction group (txg)
commit. That is the only recourse to force data to stable storage
before returning to the application.

>
> An additional feature would be to have user-selectable behavior when the
> ZIL log is significantly out of date. For example, if the ZIL log is
> more than X seconds out of date, then new writes to the system should
> pause, or give errors or continue to silently succeed.

Again this doesn't make sense given how the ZIL works. It is only
written when there is fsync/O_DSYNC activity, so there is no
continuously updated log that could fall "out of date".

>
> In an earlier phase of my career when I worked for a database company, I
> was responsible for a similar bug. It caused a major customer to lose
> a major amount of data when a system rebooted when not all good data had
> been successfully committed to disk. The resulting stink caused us to
> add a feature to detect the cases when the writing-to-disk process had
> fallen too far behind, and to pause new writes to the database until the
> situation was resolved.
>
> Peter
>
> Bob Friesenhahn wrote:
>> While I do believe that device drivers, or the fault system, should
>> notify ZFS when a device fails (and ZFS should appropriately react), I
>> don't think that ZFS should be responsible for fault monitoring. ZFS
>> is in a rather poor position for device fault monitoring, and if it
>> attempts to do so then it will be slow and may misbehave in other
>> ways. The software which communicates with the device (i.e. the
>> device driver) is in the best position to monitor the device.
>>
>> The primary goal of ZFS is to be able to correctly read data which was
>> successfully committed to disk. There are programming interfaces
>> (e.g. fsync(), msync()) which may be used to ensure that data is
>> committed to disk, and which should return an error if there is a
>> problem. If you were performing your tests over an NFS mount then the
>> results should be considerably different since NFS requests that its
>> data be committed to disk.
>>
>> Bob
>> ======================================
>> Bob Friesenhahn
>> [EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
>> GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
>>
>> _______________________________________________
>> zfs-discuss mailing list
>> zfs-discuss@opensolaris.org
>> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
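
To make the fsync()/O_DSYNC point above concrete from the application
side, here is a minimal sketch in Python, assuming a POSIX system. The
function name durable_write, its dsync parameter and the file path are
hypothetical names chosen for illustration; nothing here is ZFS-specific
code. It only shows an application asking for data to reach stable
storage and treating an error from fsync() as a failed write.

```python
import os
import tempfile


def durable_write(path, data, dsync=False):
    """Write data to path; return only once the OS reports it is on disk.

    Hypothetical helper for illustration. With dsync=True the file is
    opened with O_DSYNC (where the platform defines it), so each write()
    is synchronous; otherwise a single fsync() is issued at the end.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if dsync:
        # O_DSYNC is not defined on every platform; fall back to 0 there.
        flags |= getattr(os, "O_DSYNC", 0)

    fd = os.open(path, flags, 0o600)
    try:
        offset = 0
        while offset < len(data):
            # os.write may write fewer bytes than requested.
            offset += os.write(fd, data[offset:])
        # os.fsync raises OSError if the flush to stable storage fails,
        # which is the error an application must not ignore.
        os.fsync(fd)
    finally:
        os.close(fd)
    return offset


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "example.dat")
    try:
        durable_write(target, b"committed record\n")
        print("data reported as committed")
    except OSError as err:
        print("write was NOT committed:", err)
```

An application that skips the fsync() call (or does not use O_DSYNC)
gets no such guarantee, and has no point at which a write error would
be reported back to it.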