Hi All,
I noticed some behavior on a ZFS filesystem that was confusing to me and
was hoping someone could shed some light on it. The summary: I created
two files, waited one minute, bounced the node, and found that the files
were gone when the node came back. There was a bad disk in the pool at
the time, which I believe is contributing to the problem. Details below.
thanks,
peter
--
Our platform is a modified x2100 system with 4 disks. We are running
this version of Solaris:
$ more /etc/release
Solaris 10 11/06 s10x_u3wos_05a X86
Copyright 2006 Sun Microsystems, Inc. All Rights Reserved.
Use is subject to license terms.
Assembled 13 September 2006
One of the 4 disks (/dev/dsk/c1t0d0) is flaky and is emitting errors
like these:
Jan 19 00:32:55 somehost scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/pci108e,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd1):
Jan 19 00:32:55 somehost    Error for Command: read(10)    Error Level: Retryable
Jan 19 00:32:55 somehost scsi: [ID 107833 kern.notice]    Requested Block: 23676213    Error Block: 1761607680
Jan 19 00:32:55 somehost scsi: [ID 107833 kern.notice]    Vendor: ATA    Serial Number:
Jan 19 00:32:55 somehost scsi: [ID 107833 kern.notice]    Sense Key: Media Error
Jan 19 00:32:55 somehost scsi: [ID 107833 kern.notice]    ASC: 0x11 (unrecovered read error), ASCQ: 0x0, FRU: 0x0
This disk participates in a pool:
$ zpool list
NAME     SIZE    USED   AVAIL    CAP  HEALTH  ALTROOT
tank    20.5G   1.11G   19.4G     5%  ONLINE  -
$ zpool status
pool: tank
state: ONLINE
scrub: none requested
config:
NAME            STATE     READ WRITE CKSUM
tank            ONLINE       0     0     0
  mirror        ONLINE       0     0     0
    c0t0d0s3    ONLINE       0     0     0
    c0t1d0s3    ONLINE       0     0     0
    c1t0d0s3    ONLINE       0     0     0
    c1t1d0s3    ONLINE       0     0     0
  mirror        ONLINE       0     0     0
    c0t0d0s5    ONLINE       0     0     0
    c0t1d0s5    ONLINE       0     0     0
    c1t0d0s5    ONLINE       0     0     0
    c1t1d0s5    ONLINE       0     0     0
errors: No known data errors
The filesystem is mounted like this:
$ mount
...
/config on tank/config read/write/setuid/devices/exec/atime/dev=2d50003 on Fri Jan 19 00:39:31 2007
...
I created two files and waited 60 seconds, thinking this would be enough
time for the data to sync to disk before bouncing the node.
$ echo hi > /config/file
$ cat /config/file
hi
$ ls -l /config/file
-rw-r--r-- 1 root root 3 Jan 19 00:35 /config/file
$ echo bye > /config/otherfile
$ ls -l /config/otherfile
-rw-r--r-- 1 root root 4 Jan 19 00:35 /config/otherfile
$ more /config/otherfile
bye
$ date
Fri Jan 19 00:36:06 GMT 2007
$ sleep 60
$ date
Fri Jan 19 00:37:13 GMT 2007
$ cat /config/file
hi
$ cat /config/otherfile
bye
I then caused the system to reboot abruptly (using remote power control,
so no sync happened during the reboot). What I noticed is that the files
were not there after the node bounce:
$ Read from remote host somehost: Connection reset by peer
Connection to somehost closed.
$ ssh somehost
Sun Microsystems Inc. SunOS 5.10 Generic January 2005
$ ls -l /config/file
/config/file: No such file or directory
$ ls -l /config/otherfile
/config/otherfile: No such file or directory
$ zpool status
pool: tank
state: ONLINE
scrub: none requested
config:
NAME            STATE     READ WRITE CKSUM
tank            ONLINE       0     0     0
  mirror        ONLINE       0     0     0
    c0t0d0s3    ONLINE       0     0     0
    c0t1d0s3    ONLINE       0     0     0
    c1t0d0s3    ONLINE       0     0     0
    c1t1d0s3    ONLINE       0     0     0
  mirror        ONLINE       0     0     0
    c0t0d0s5    ONLINE       0     0     0
    c0t1d0s5    ONLINE       0     0     0
    c1t0d0s5    ONLINE       0     0     0
    c1t1d0s5    ONLINE       0     0     0
errors: No known data errors
$ zpool list
NAME     SIZE    USED   AVAIL    CAP  HEALTH  ALTROOT
tank    20.5G   1.11G   19.4G     5%  ONLINE  -
Note that the bad disk caused a normal reboot of this node to hang, and I
verified that running sync from the command line also hung. I don't know
how ZFS (or Solaris) handles a failing disk: can one bad disk block I/O
to the whole pool, even I/O destined for the healthy disks?
Was it reasonable to assume that after 60 seconds the data would have
reached persistent storage even without an explicit sync? I confess I
don't know how the underlying layers are implemented. Are there mount
options or other configuration parameters we should tweak to get more
reliable behavior in this case?
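
In the meantime, the workaround I'm considering is having the application
force its own writes to disk rather than relying on a timed wait. Below is
a minimal sketch of what I mean (my own illustration, using the
/config/file path from the test above; I'm assuming that a successful
fsync() means the data is on stable storage and would survive the power
cycle, which is part of what I'd like confirmed):

/* write_synced.c: write a small file and force it to stable storage
 * before returning.  Illustration only. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const char *path = "/config/file";   /* path from the test above */
    const char *data = "hi\n";
    int fd;

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    if (write(fd, data, strlen(data)) != (ssize_t)strlen(data)) {
        perror("write");
        close(fd);
        return 1;
    }
    /* Flush this file's data and metadata out of memory; the assumption
     * is that once this returns 0, the file survives an abrupt power
     * cycle. */
    if (fsync(fd) != 0) {
        perror("fsync");
        close(fd);
        return 1;
    }
    close(fd);
    return 0;
}

Opening the file with O_DSYNC instead would presumably give a similar
per-write guarantee, but either way I'd like to understand what the
filesystem promises when no explicit sync is issued.
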
As far as I've seen, this behavior is reproducible, in case someone on
the ZFS team wishes to take a closer look at this scenario.