On 30 jun 2010, at 22.46, Garrett D'Amore wrote:

> On Wed, 2010-06-30 at 22:28 +0200, Ragnar Sundblad wrote:
>
>> To be safe, the protocol needs to be able to discover that the devices
>> (host or disk) has been disconnected and reconnected or has been reset
>> and that either parts assumptions about the state of the other has to
>> be invalidated.
>>
>> I don't know enough about either SAS or SATA to say if they guarantee that
>> you will be noticed. But if they don't, they aren't safe for cached writes.
>
> Generally, ZFS will only notice a removed disk when it is trying to
> write to it -- or when it probes. ZFS does not necessarily get notified
> on hot device removal -- certainly not immediately.

That should be fine, as long as it is informed on the next access.

> (I've written some
> code so that *will* notice, even if no write ever goes there... that's
> the topic of another message.)
>
> The other thing is that disk writes are generally idempotent. So, if a
> drive was removed between the time an IO was finished but before the
> time the response was returned to the host, it isn't a problem. When
> the disk is returned, ZFS should automatically retry the I/O. (In fact,
> ZFS automatically retries failed I/O operations several times before
> finally "failing".)

I was referring to the case where ZFS has written data to the drive but
still hasn't issued a cache flush, and the drive is reset before the
cache flush. If ZFS finally issues a cache flush and is not informed
that the drive has been reset, data is lost.

I hope this is not the case on any SCSI-based protocol or SATA.

> The nasty race that occurs is if your system crashes or is powered off
> *after* the log has acknowledged the write, but before the bits get
> shoved to main pool storage. This is a data loss situation.

By "log", do you mean the ZIL (with or without a slog device)?
If so, that should not be an issue and is exactly what the ZIL is
for: it will be replayed at the next filesystem attach and the data
will be pushed to the main pool storage.

Do I misunderstand you?

/ragge

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
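
The cache-flush concern above can be illustrated with a toy model in Python. This is only a sketch under stated assumptions: ToyDrive, DriveWasReset and write_then_flush are hypothetical names invented for the illustration, and nothing here reflects how SAS, SATA or ZFS actually behave. The drive keeps writes in a volatile cache until it is flushed, and a reset empties that cache. The only difference between the two runs is whether the drive reports the reset on the next access.

```python
class DriveWasReset(Exception):
    """Hypothetical signal that the drive was reset since the last command."""


class ToyDrive:
    """Toy drive with a volatile write cache (not a real SAS/SATA model)."""

    def __init__(self, reports_reset):
        self.media = {}            # blocks that survived a flush
        self.cache = {}            # volatile write cache, lost on reset
        self.reports_reset = reports_reset
        self.reset_pending = False

    def _check_reset(self):
        # On the first command after a reset, either tell the host or stay silent.
        if self.reset_pending:
            self.reset_pending = False
            if self.reports_reset:
                raise DriveWasReset()

    def write(self, lba, data):
        self._check_reset()
        self.cache[lba] = data

    def flush(self):
        self._check_reset()
        self.media.update(self.cache)
        self.cache.clear()

    def reset(self):
        self.cache.clear()
        self.reset_pending = True


def write_then_flush(drive, blocks, reset_before_first_flush):
    """Write all blocks, then flush; redo the whole batch if a reset is reported."""
    for attempt in range(3):
        try:
            for lba, data in blocks.items():
                drive.write(lba, data)
            if reset_before_first_flush and attempt == 0:
                drive.reset()      # reset lands between the writes and the flush
            drive.flush()
            return
        except DriveWasReset:
            continue               # writes are idempotent, so reissuing is safe
    raise IOError("drive kept resetting")


def demo(reports_reset):
    blocks = {0: b"a", 1: b"b"}
    drive = ToyDrive(reports_reset)
    write_then_flush(drive, blocks, reset_before_first_flush=True)
    return drive.media


print("reset reported:", demo(True))
print("reset silent:  ", demo(False))
```

In this model the drive that reports the reset ends up with both blocks on media, because the host reissues the writes before flushing again. The silent drive acknowledges the flush with an empty cache, so the host believes the data is stable when nothing was written.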