I don't really have an explanation. Perhaps flaky second controller hardware that only works sometimes and can corrupt pools? Have you seen any other strangeness/instability on this computer? Did you use zpool export before moving the disks the first time to the second controller, or did you just move them without exporting?
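
For reference, the export-then-import sequence asked about above could be sketched as follows. This is only an illustration: the pool name vault2 comes from the test below, while the function names and the DRY_RUN switch are made up for this sketch. With DRY_RUN=1 (the default here) the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch: release a pool before moving its disks, import it afterwards.
# DRY_RUN=1 (default) only prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

before_move() {
    # Export the pool cleanly so the system no longer tries to open it
    # under the old device names at the next boot.
    run zpool export "$1"
}

after_move() {
    # With no arguments, 'zpool import' lists pools available for import;
    # the second call imports the named pool.
    run zpool import
    run zpool import "$1"
}
```

Usage would be before_move vault2, then poweroff, move the disks, boot, and after_move vault2.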

If you zero-wipe the disks that made up this test pool with dd, and then recreate the test pool, does it behave the same way the second time?
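
A rough sketch of such a wipe is below. It assumes the disks are addressed as /dev/rdsk/<name>p0 (the whole-disk device on x86 Solaris) and that the names passed in are the ones format reports; the wipe_disk function and the DRY_RUN switch are hypothetical helpers for this example. A full-disk wipe is suggested because, as far as I know, ZFS keeps copies of its label at both the start and the end of each device. With DRY_RUN=1 (the default here) the dd command is only printed.

```shell
#!/bin/sh
# Sketch: zero whole disks with dd. DESTRUCTIVE when DRY_RUN is not 1.
DRY_RUN="${DRY_RUN:-1}"

wipe_disk() {
    # Assumption: p0 is the whole-disk raw device for this disk name.
    dev="/dev/rdsk/${1}p0"
    if [ "$DRY_RUN" = "1" ]; then
        echo "dd if=/dev/zero of=$dev bs=1024k"
    else
        dd if=/dev/zero of="$dev" bs=1024k
    fi
}

# Wipe every disk name given on the command line.
for d in "$@"; do
    wipe_disk "$d"
done
```

For example, saved as a script (any file name will do) and called with the arguments c13t0d0 c13t1d0 c13t2d0 c13t3d0, it prints the four dd commands so the device names can be checked before setting DRY_RUN=0.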



Jan Hellevik wrote:
Ok - this is really strange. I did a test. I wiped my second pool (4 disks like
the other pool), and used the disks to create a pool similar to the one I have
problems with.

Then I powered off, moved the disks and powered on. Same error message as
before. Moved the disks back to the original controller. Pool is ok. Moved the
disks to the new controller. At first it is exactly like my original problem,
but when I did a second zpool import, the pool was imported ok.

Zpool status reports the same as before. I ran the same commands as I did the
first time:
zpool status
zpool import
zpool export
format
cfgadm
zpool status
zpool import ---> now it imports the pool!

How can this be? The only difference (as far as I can tell) is that the
cache/log is on a 2.5" Samsung disk instead of a 2.5" OCZ SSD.

Details follow (it is long - sorry):

Also note below - I did a zpool destroy mpool before poweroff - when I powered
on and did a zpool status it showed the pool as UNAVAIL. It should not be there
at all, if I understand correctly?

----- create the partitions for log and cache

             Total disk size is 30401 cylinders
             Cylinder size is 16065 (512 byte) blocks

                                               Cylinders
      Partition   Status    Type          Start   End   Length    %
      =========   ======    ============  =====   ===   ======   ===
          1                 Solaris2          1   608     608      2
          2                 Solaris2        609  3040    2432      8

format> quit
j...@opensolaris:~# zpool destroy mpool
j...@opensolaris:~# poweroff

Last login: Sun May 16 17:07:15 2010 from macpro.janhelle
Sun Microsystems Inc.   SunOS 5.11      snv_134 February 2010
j...@opensolaris:~$ pfexec bash
j...@opensolaris:~# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c8d0 <DEFAULT cyl 6394 alt 2 hd 255 sec 63>
          /p...@0,0/pci-...@14,1/i...@0/c...@0,0
       1. c10d0 <DEFAULT cyl 606 alt 2 hd 255 sec 63>
          /p...@0,0/pci-...@11/i...@0/c...@0,0
       2. c10d1 <SAMSUNG-S0MUJ1KP98569-0001-465.76GB>
          /p...@0,0/pci-...@11/i...@0/c...@1,0
       3. c11d0 <SAMSUNG-S0MUJ1MP91161-0001-465.76GB>
          /p...@0,0/pci-...@11/i...@1/c...@0,0
       4. c12d0 <SAMSUNG-S0MUJ1MP91161-0001-465.76GB>
          /p...@0,0/pci-...@14,1/i...@1/c...@0,0
       5. c12d1 <SAMSUNG-S0MUJ1KP98569-0001-465.76GB>
          /p...@0,0/pci-...@14,1/i...@1/c...@1,0
Specify disk (enter its number): ^C
j...@opensolaris:~# zpool create vault2 raidz c10d1 c11d0 c12d0 c12d1
j...@opensolaris:~# zpool status

------ this pool is the one I destroyed - why is it here now?

  pool: mpool
 state: UNAVAIL
status: One or more devices could not be opened.  There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        mpool        UNAVAIL      0     0     0  insufficient replicas
          mirror-0   UNAVAIL      0     0     0  insufficient replicas
            c13t2d0  UNAVAIL      0     0     0  cannot open
            c13t0d0  UNAVAIL      0     0     0  cannot open
          mirror-1   UNAVAIL      0     0     0  insufficient replicas
            c13t3d0  UNAVAIL      0     0     0  cannot open
            c13t1d0  UNAVAIL      0     0     0  cannot open

  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c8d0s0    ONLINE       0     0     0

errors: No known data errors

  pool: vault2
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        vault2      ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            c10d1   ONLINE       0     0     0
            c11d0   ONLINE       0     0     0
            c12d0   ONLINE       0     0     0
            c12d1   ONLINE       0     0     0

errors: No known data errors
j...@opensolaris:~# zpool destroy mpool
cannot open 'mpool': I/O error
j...@opensolaris:~# zpool status -x
all pools are healthy
j...@opensolaris:~# j...@opensolaris:~# j...@opensolaris:~# zpool status

------ and now the pool has vanished

  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c8d0s0    ONLINE       0     0     0

errors: No known data errors

  pool: vault2
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        vault2      ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            c10d1   ONLINE       0     0     0
            c11d0   ONLINE       0     0     0
            c12d0   ONLINE       0     0     0
            c12d1   ONLINE       0     0     0

errors: No known data errors
j...@opensolaris:~#
dmesg shows these messages 4 times:
May 16 20:36:19 opensolaris fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-D3, TYPE: Fault, VER: 1, SEVERITY: Major
May 16 20:36:19 opensolaris EVENT-TIME: Sun May 16 20:36:18 CEST 2010
May 16 20:36:19 opensolaris PLATFORM: System-Product-Name, CSN: System-Serial-Number, HOSTNAME: opensolaris
May 16 20:36:19 opensolaris SOURCE: zfs-diagnosis, REV: 1.0
May 16 20:36:19 opensolaris EVENT-ID: f5f9feb6-34e9-6465-a15e-a3f4724c6f25
May 16 20:36:19 opensolaris DESC: A ZFS device failed.  Refer to http://sun.com/msg/ZFS-8000-D3 for more information.
May 16 20:36:19 opensolaris AUTO-RESPONSE: No automated response will occur.
May 16 20:36:19 opensolaris IMPACT: Fault tolerance of the pool may be compromised.
May 16 20:36:19 opensolaris REC-ACTION: Run 'zpool status -x' and replace the bad device.
and then
May 16 20:36:19 opensolaris fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-CS, TYPE: Fault, VER: 1, SEVERITY: Major
May 16 20:36:19 opensolaris EVENT-TIME: Sun May 16 20:36:19 CEST 2010
May 16 20:36:19 opensolaris PLATFORM: System-Product-Name, CSN: System-Serial-Number, HOSTNAME: opensolaris
May 16 20:36:19 opensolaris SOURCE: zfs-diagnosis, REV: 1.0
May 16 20:36:19 opensolaris EVENT-ID: 57db7aa6-658a-ef83-875b-b2af77e4493a
May 16 20:36:19 opensolaris DESC: A ZFS pool failed to open.  Refer to http://sun.com/msg/ZFS-8000-CS for more information.
May 16 20:36:19 opensolaris AUTO-RESPONSE: No automated response will occur.
May 16 20:36:19 opensolaris IMPACT: The pool data is unavailable
May 16 20:36:19 opensolaris REC-ACTION: Run 'zpool status -x' and attach any missing devices, follow
May 16 20:36:19 opensolaris      any provided recovery instructions or restore from backup.

May 16 20:48:48 opensolaris zfs: [ID 249136 kern.info] created version 22 pool vault2 using 22


------ these are the same commands I used before

j...@opensolaris:~# zpool add vault2 log /dev/dsk/c10d0p1
j...@opensolaris:~# zpool add vault2 cache /dev/dsk/c10d0p0
j...@opensolaris:~# zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c8d0s0    ONLINE       0     0     0

errors: No known data errors

  pool: vault2
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        vault2      ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            c10d1   ONLINE       0     0     0
            c11d0   ONLINE       0     0     0
            c12d0   ONLINE       0     0     0
            c12d1   ONLINE       0     0     0
        logs
          c10d0p1   ONLINE       0     0     0
        cache
          c10d0p0   ONLINE       0     0     0

errors: No known data errors
j...@opensolaris:~#
poweroff

moved the 4 disks to the other controller

power on

Last login: Sun May 16 20:37:29 2010 from macpro.janhelle
Sun Microsystems Inc.   SunOS 5.11      snv_134 February 2010
j...@opensolaris:~$ zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c8d0s0    ONLINE       0     0     0

errors: No known data errors

  pool: vault2
 state: UNAVAIL
status: One or more devices could not be opened.  There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        vault2      UNAVAIL      0     0     0  insufficient replicas
          raidz1-0  UNAVAIL      0     0     0  insufficient replicas
            c10d1   UNAVAIL      0     0     0  cannot open
            c11d0   UNAVAIL      0     0     0  cannot open
            c12d0   UNAVAIL      0     0     0  cannot open
            c12d1   UNAVAIL      0     0     0  cannot open
        logs
          c10d0p1   ONLINE       0     0     0
j...@opensolaris:~$ pfexec poweroff

---- moved the disks back to the original controller to see if it is ok to
just move them

j...@opensolaris:~$ zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c8d0s0    ONLINE       0     0     0

errors: No known data errors

  pool: vault2
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        vault2      ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            c10d1   ONLINE       0     0     0
            c11d0   ONLINE       0     0     0
            c12d0   ONLINE       0     0     0
            c12d1   ONLINE       0     0     0
        logs
          c10d0p1   ONLINE       0     0     0
        cache
          c10d0p0   ONLINE       0     0     0

errors: No known data errors
j...@opensolaris:~$
j...@opensolaris:~$ pfexec poweroff

-- move the disks to the new controller again

Sun Microsystems Inc.   SunOS 5.11      snv_134 February 2010
j...@opensolaris:~$ pfexec bash
j...@opensolaris:~# zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c8d0s0    ONLINE       0     0     0

errors: No known data errors

  pool: vault2
 state: UNAVAIL
status: One or more devices could not be opened.  There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        vault2      UNAVAIL      0     0     0  insufficient replicas
          raidz1-0  UNAVAIL      0     0     0  insufficient replicas
            c10d1   UNAVAIL      0     0     0  cannot open
            c11d0   UNAVAIL      0     0     0  cannot open
            c12d0   UNAVAIL      0     0     0  cannot open
            c12d1   UNAVAIL      0     0     0  cannot open
        logs
          c10d0p1   ONLINE       0     0     0
j...@opensolaris:~# zpool import vault2
cannot import 'vault2': a pool with that name is already created/imported,
and no additional pools with that name were found
j...@opensolaris:~# zpool export vault2
cannot open 'vault2': I/O error
j...@opensolaris:~# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c8d0 <DEFAULT cyl 6394 alt 2 hd 255 sec 63>
          /p...@0,0/pci-...@14,1/i...@0/c...@0,0
       1. c10d0 <DEFAULT cyl 606 alt 2 hd 255 sec 63>
          /p...@0,0/pci-...@11/i...@0/c...@0,0
       2. c13t0d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
          /p...@0,0/pci1022,9...@2/pci1000,3...@0/s...@0,0
       3. c13t1d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
          /p...@0,0/pci1022,9...@2/pci1000,3...@0/s...@1,0
       4. c13t2d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
          /p...@0,0/pci1022,9...@2/pci1000,3...@0/s...@2,0
       5. c13t3d0 <ATA-SAMSUNG HD501LJ-0-11-465.76GB>
          /p...@0,0/pci1022,9...@2/pci1000,3...@0/s...@3,0
Specify disk (enter its number): ^C
j...@opensolaris:~# cfgadm
Ap_Id                          Type         Receptacle   Occupant     Condition
c13                            scsi-sas     connected    configured   unknown
usb5/1                         unknown      empty        unconfigured ok
...
j...@opensolaris:~# zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c8d0s0    ONLINE       0     0     0

errors: No known data errors
j...@opensolaris:~# zpool import vault2
j...@opensolaris:~# zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c8d0s0    ONLINE       0     0     0

errors: No known data errors

  pool: vault2
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        vault2       ONLINE       0     0     0
          raidz1-0   ONLINE       0     0     0
            c13t2d0  ONLINE       0     0     0
            c13t1d0  ONLINE       0     0     0
            c13t3d0  ONLINE       0     0     0
            c13t0d0  ONLINE       0     0     0
        logs
          c10d0p1    ONLINE       0     0     0
        cache
          c10d0p0    ONLINE       0     0     0

errors: No known data errors
j...@opensolaris:~#

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
