Re: [zfs-discuss] Another user looses his pool (10TB) in this case and 40 days work

Bob Friesenhahn Sat, 25 Jul 2009 08:39:48 -0700

On Sat, 25 Jul 2009, roland wrote:

>> When that happens, ZFS believes the data is safely written, but a power cut or
>> crash can cause severe problems with the pool.
>
> didn't i read a million times that zfs ensures an "always consistent state" and is self healing, too?
> so, if new blocks are always written at new positions - why can't we just roll back to a point in time (for example last snapshot) which is known to be safe/consistent?

As soon as you have more than one disk in the equation, it is vital that the disks commit their data when requested, since otherwise the data on disk will not be in a consistent state. If the disks simply do whatever they want, then some disks will have written the data while other disks will still have it cached. This blows the "consistent state on disk" even though zfs wrote the data in order and did all the right things. Any uncommitted data in disk cache will be forgotten if the system loses power.
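
To make the multi-disk case concrete, here is a toy simulation. It is only a sketch and not ZFS code: the Disk class, the "root" pointer block and the commit() ordering are hypothetical stand-ins for a copy-on-write transaction in which new data is written first, a flush is requested, and only then is the pointer to the new data written.

```python
class Disk:
    """Toy disk with a volatile write cache (hypothetical model, not ZFS code)."""

    def __init__(self, honors_flush):
        self.honors_flush = honors_flush
        self.cache = []   # writes still sitting in volatile cache
        self.media = {}   # writes that have reached stable storage

    def write(self, block, value):
        self.cache.append((block, value))

    def flush(self):
        # An honest disk persists its whole cache before reporting success.
        # A disk that ignores the request reports success and persists nothing.
        if self.honors_flush:
            for block, value in self.cache:
                self.media[block] = value
            self.cache.clear()

    def power_cut(self):
        # Anything still in the volatile cache is forgotten.
        self.cache.clear()


def commit(data_disk, root_disk, txg):
    """Write new data, request a commit, then publish a pointer to it."""
    data_disk.write(("data", txg), f"payload-{txg}")
    data_disk.flush()
    root_disk.write("root", txg)
    root_disk.flush()


def is_consistent(data_disk, root_disk):
    """The pool is consistent if the published pointer refers to data on media."""
    txg = root_disk.media.get("root")
    return txg is None or ("data", txg) in data_disk.media


def simulate(data_disk_honors_flush):
    data_disk = Disk(honors_flush=data_disk_honors_flush)
    root_disk = Disk(honors_flush=True)
    commit(data_disk, root_disk, txg=1)
    data_disk.power_cut()
    root_disk.power_cut()
    return is_consistent(data_disk, root_disk)


if __name__ == "__main__":
    print("both disks honor flush:", simulate(True))
    print("data disk ignores flush:", simulate(False))
```

In the second run the software issued everything in the right order, yet after the power cut one disk holds a pointer to data that the other disk never wrote.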

There is an additional problem if, when the disks finally get around to writing the cached data, they write it in a different order than requested while ignoring the commit request. It is common for disks to write data in the most efficient order, but they absolutely must commit all of the data when requested so that the checkpoint is valid.
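
The reordering problem can be sketched the same way for a single disk. Again this is a toy model under stated assumptions, not ZFS internals: surviving_state() and root_is_valid() are hypothetical helpers, and the two requested writes stand for "new data" followed by "pointer to the new data". The disk is assumed to have ignored the commit request between them, so both writes sit in its cache and it is free to choose the order in which they land.

```python
def surviving_state(requested_writes, persist_order, writes_done_before_power_cut):
    """Return what is on stable media if power fails after N cached writes landed.

    persist_order lists indexes into requested_writes in the order the disk
    actually chose to write them.
    """
    media = {}
    for index in persist_order[:writes_done_before_power_cut]:
        block, value = requested_writes[index]
        media[block] = value
    return media


def root_is_valid(media):
    """A pointer is valid if it is absent (old state) or its data is on media."""
    txg = media.get("root")
    return txg is None or ("data", txg) in media


# Requested order: data first, then the pointer to it.
requested = [(("data", 1), "payload-1"), ("root", 1)]

if __name__ == "__main__":
    for name, order in (("as requested", [0, 1]), ("reordered", [1, 0])):
        for landed in range(3):
            media = surviving_state(requested, order, landed)
            print(name, "| writes landed:", landed, "| valid:", root_is_valid(media))
```

Only the reordered case with a power cut after the first write leaves an invalid pointer. Reordering by itself is harmless as long as everything issued before the commit request is on media before the request is acknowledged.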


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/