Hi All,

I noticed behavior on a ZFS filesystem that confused me, and I'm hoping someone can shed some light on it. The summary is that I created two files, waited one minute, bounced the node, and found the files gone when the node came back. There was a bad disk in the system at the time, which I believe is contributing to the problem. Details below.

thanks,

peter

--



Our platform is a modified x2100 system with 4 disks. We are running this version of Solaris:

$ more /etc/release
                       Solaris 10 11/06 s10x_u3wos_05a X86
           Copyright 2006 Sun Microsystems, Inc.  All Rights Reserved.
                        Use is subject to license terms.
                           Assembled 13 September 2006


One of my 4 disks is a flaky disk (/dev/dsk/c1t0d0) that is emitting these sorts of errors:

Jan 19 00:32:55 somehost scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/pci108e,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd1):
Jan 19 00:32:55 somehost  Error for Command: read(10) Error Level: Retryable
Jan 19 00:32:55 somehost scsi: [ID 107833 kern.notice] Requested Block: 23676213 Error Block: 1761607680
Jan 19 00:32:55 somehost scsi: [ID 107833 kern.notice] Vendor: ATA Serial Number:
Jan 19 00:32:55 somehost scsi: [ID 107833 kern.notice] Sense Key: Media Error
Jan 19 00:32:55 somehost scsi: [ID 107833 kern.notice] ASC: 0x11 (unrecovered read error), ASCQ: 0x0, FRU: 0x0


This disk participates in a pool:

$ zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
tank                   20.5G   1.11G   19.4G     5%  ONLINE     -

$ zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        tank          ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t0d0s3  ONLINE       0     0     0
            c0t1d0s3  ONLINE       0     0     0
            c1t0d0s3  ONLINE       0     0     0
            c1t1d0s3  ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t0d0s5  ONLINE       0     0     0
            c0t1d0s5  ONLINE       0     0     0
            c1t0d0s5  ONLINE       0     0     0
            c1t1d0s5  ONLINE       0     0     0

errors: No known data errors


The filesystem is mounted like this:

$ mount
...
/config on tank/config read/write/setuid/devices/exec/atime/dev=2d50003 on Fri Jan 19 00:39:31 2007
...


I created two files and waited 60 seconds, thinking this would be enough time for the data to sync to disk before bouncing the node.

$ echo hi > /config/file
$ cat /config/file
hi
$ ls -l  /config/file
-rw-r--r--   1 root     root           3 Jan 19 00:35 /config/file
$ echo bye > /config/otherfile
$ ls -l /config/otherfile
-rw-r--r--   1 root     root           4 Jan 19 00:35 /config/otherfile
$ more /config/otherfile
bye
$ date
Fri Jan 19 00:36:06 GMT 2007
$ sleep 60
$ date
Fri Jan 19 00:37:13 GMT 2007
$ cat /config/file
hi
$ cat  /config/otherfile
bye


I caused the system to reboot abruptly (using remote power control, so no sync happened during the reboot). What I noticed is that the files were not there after the node bounce:

$ Read from remote host somehost: Connection reset by peer
Connection to somehost closed.
$ ssh somehost
Sun Microsystems Inc.   SunOS 5.10      Generic January 2005
$ ls -l  /config/file
/config/file: No such file or directory
$ ls -l /config/otherfile
/config/otherfile: No such file or directory
$ zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        tank          ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t0d0s3  ONLINE       0     0     0
            c0t1d0s3  ONLINE       0     0     0
            c1t0d0s3  ONLINE       0     0     0
            c1t1d0s3  ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t0d0s5  ONLINE       0     0     0
            c0t1d0s5  ONLINE       0     0     0
            c1t0d0s5  ONLINE       0     0     0
            c1t1d0s5  ONLINE       0     0     0

errors: No known data errors
$ zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
tank                   20.5G   1.11G   19.4G     5%  ONLINE     -



Note that the bad disk caused a normal reboot to hang on this node, which is why I used remote power control. I also verified that running sync from the command line hung. I don't know how ZFS (or Solaris) handles bad disks: can a single bad disk block ZFS/OS handling of all I/O, even to the healthy disks?

Is it reasonable to have assumed that after 60 seconds the data would have been on persistent disk even without an explicit sync? I confess I don't know how the underlying layers are implemented. Are there mount options or other config parameters we should tweak to get more reliable behavior in this case?

As far as I've seen, this behavior is reproducible, in case someone on the ZFS team wishes to take a closer look at this scenario.
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss