On 30 jun 2010, at 22.46, Garrett D'Amore wrote:

> On Wed, 2010-06-30 at 22:28 +0200, Ragnar Sundblad wrote:
> 
>> To be safe, the protocol needs to be able to discover that a device
>> (host or disk) has been disconnected and reconnected or has been reset,
>> and that either party's assumptions about the state of the other have to
>> be invalidated.
>> 
>> I don't know enough about either SAS or SATA to say if they guarantee that
>> you will be notified. But if they don't, they aren't safe for cached writes.
> 
> Generally, ZFS will only notice a removed disk when it is trying to
> write to it -- or when it probes.  ZFS does not necessarily get notified
> on hot device removal -- certainly not immediately.

That should be fine, as long as it is informed on the next access.

>  (I've written some
> code so that it *will* notice, even if no write ever goes there... that's
> the topic of another message.)
> 
> The other thing is that disk writes are generally idempotent.  So, if a
> drive was removed between the time an IO was finished but before the
> time the response was returned to the host, it isn't a problem.   When
> the disk is returned, ZFS should automatically retry the I/O.  (In fact,
> ZFS automatically retries failed I/O operations several times before
> finally "failing".)

I was referring to the case where ZFS has written data to the drive but
hasn't yet issued a cache flush, and the drive is reset before the cache
flush. If ZFS then issues the cache flush and isn't informed that the
drive has been reset, data is lost.

I hope this is not the case, on any SCSI-based protocol or SATA.
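
To make the scenario concrete, here is a minimal sketch in Python. It is a
toy model, not real SCSI/SATA or ZFS code: ToyDrive, reset_count and
write_and_flush are hypothetical names, and reset_count merely stands in
for whatever reset notification the protocol may or may not provide.

```python
class ToyDrive:
    """Hypothetical drive: a volatile write cache plus a reset counter."""

    def __init__(self):
        self.media = {}        # data that survives a reset
        self.cache = {}        # volatile write cache, lost on reset
        self.reset_count = 0   # stand-in for a reset notification

    def write(self, lba, data):
        # Writes land in the volatile cache only.
        self.cache[lba] = data

    def flush(self):
        # Move whatever is currently cached to stable media.
        self.media.update(self.cache)
        self.cache.clear()

    def reset(self):
        # A reset silently drops everything still in the cache.
        self.cache.clear()
        self.reset_count += 1


def write_and_flush(drive, blocks, reset_before_flush=False, check_reset=True):
    """Return True if the host may treat the blocks as durable."""
    seen = drive.reset_count
    for lba, data in blocks.items():
        drive.write(lba, data)
    if reset_before_flush:
        drive.reset()
    drive.flush()  # "succeeds" even if the cache was emptied by a reset
    if check_reset and drive.reset_count != seen:
        return False  # host learned of the reset; the writes must be redone
    return True
```

With reset_before_flush=True and check_reset=False the function returns
True while drive.media stays empty, which is the silent loss I am worried
about. With check_reset=True the host sees the reset and can rewrite.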

> The nasty race that occurs is if your system crashes or is powered off
> *after* the log has acknowledged the write, but before the bits get
> shoved to main pool storage.  This is a data loss situation.

By "log", do you mean the ZIL (with or without a slog device)?
If so, that should not be an issue and is exactly what the ZIL
is for - it will be replayed at the next filesystem attach and the
data will be pushed to the main pool storage. Do I misunderstand you?
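
As I understand it, the replay works roughly like the sketch below. This
is a simplified illustration in Python, not the actual ZFS implementation:
ToyPool and all of its methods are hypothetical names, and the log is
assumed to be on stable storage.

```python
class ToyPool:
    """Hypothetical pool with an intent log that survives a crash."""

    def __init__(self):
        self.intent_log = []   # assumed stable: survives a crash
        self.main = {}         # main pool storage
        self.pending = {}      # in-memory state, lost on crash

    def sync_write(self, key, value):
        # Log the write first, then acknowledge it to the caller.
        self.intent_log.append((key, value))
        self.pending[key] = value
        return "ack"

    def commit(self):
        # Normal path: push pending data to the main pool, then drop the log.
        self.main.update(self.pending)
        self.pending.clear()
        self.intent_log.clear()

    def crash(self):
        # Power loss: only the in-memory state disappears.
        self.pending.clear()

    def attach(self):
        # Replay logged writes that never reached the main pool.
        for key, value in self.intent_log:
            self.main[key] = value
        self.intent_log.clear()
```

A write that was acknowledged and then hit by crash() before commit() is
missing from main until attach() runs, after which it is back. It is only
lost if the log device itself loses the record.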

/ragge