Note that the bad disk on the node caused a normal reboot to hang. I also verified that sync from the command line hung. I don't know how ZFS (or Solaris) handles situations involving bad disks...does a bad disk block proper ZFS/OS handling of all IO, even to the other healthy disks?

Is it reasonable to have assumed that after 60 seconds the data would have been on persistent disk even without an explicit sync? I confess I don't know how the underlying layers are implemented. Are there mount options or other config parameters we should tweak to get more reliable behavior in this case?

Hey Peter,

The first thing I would do is see if any I/O is happening ('zpool iostat 1'). If there's none, then perhaps the machine is hung, in which case you'd want to grab a couple of '::threadlist -v 10' outputs from mdb to figure out if there are hung threads.
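For reference, those checks look roughly like this from a root shell on the affected box (a sketch, not verified against your build; these are Solaris admin commands, so run them on the node itself):

```shell
# Watch per-device I/O once a second; all-zero read/write columns
# across several intervals suggest the pool is making no progress.
zpool iostat -v 1

# Dump kernel thread stacks from the live kernel (threads with stacks
# deeper than 10 frames). Capture this a couple of times a minute or
# so apart and compare: threads stuck at the same point are suspects.
echo '::threadlist -v 10' | mdb -k
```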

60 seconds should be plenty of time for the async write(s) to complete. We try to push out a txg (transaction group) every 5 seconds. However, if the system is overloaded, the txgs can take longer.
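For what it's worth, on builds where the interval is exposed as the zfs_txg_timeout tunable (check your build before relying on this), it can be set in /etc/system; a sketch:

```
* /etc/system fragment: txg sync interval in seconds (default is 5).
* Lowering it trades throughput for a smaller window of unsynced data.
set zfs:zfs_txg_timeout = 5
```

This only changes how often ZFS tries to push a txg; it won't help if the pool is wedged behind a bad disk.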

The 'sync' hanging is intriguing. Perhaps the system is just overloaded and the sync command is making it worse. Seeing what 'fsync' does would be interesting.


So far as I've seen, this behavior is reproducible, if someone on the ZFS team wishes to take a closer look at this scenario.

What else is the machine doing?

eric

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
