Below I create zpools isolating one card at a time:
 - with just card #1, it works
 - with just card #2, it fails
 - with just card #3, it works

And then again using the two cards that seem to work:
 - with cards #1 and #3 together, it fails

So at first I thought I had narrowed it down to a single card, but my last test shows that it still fails when the zpool uses two cards that each succeed individually...
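
For anyone wanting to repeat the same matrix of tests, here is a minimal sketch of a helper that prints the "zpool create" command for a given set of controllers. It assumes each card exposes its four disks at targets 0, 4, 1 and 5, as in the commands further down; the function name zpool_create_cmd is made up for illustration and only prints the command rather than running it.

```sh
#!/bin/sh
# Hypothetical helper (not part of any ZFS tooling): print the
# "zpool create" command with one raidz2 vdev per controller.
# Assumption: disks sit at targets 0, 4, 1, 5 on every controller,
# matching the commands shown in this post.
zpool_create_cmd() {
    cmd="zpool create tank"
    for ctrl in "$@"; do
        cmd="$cmd raidz2"
        for tgt in 0 4 1 5; do
            cmd="$cmd ${ctrl}t${tgt}d0"
        done
    done
    echo "$cmd"
}

zpool_create_cmd c3        # card #1 only
zpool_create_cmd c4        # card #2 only
zpool_create_cmd c5        # card #3 only
zpool_create_cmd c3 c5     # cards #1 and #3 together
```

Printing the command instead of executing it makes it easy to check the device list before destroying and recreating the pool.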

The only thing I can think to point out here is that those two cards are on different buses: one is connected to a NECuPD720400 and the other to an AIC-7902, which itself is then connected to the NECuPD720400.

Any ideas?

Thanks,
Kent





OK, doing it again using just card #1 (i.e. "c3") works!

   # zpool destroy tank
   # zpool create tank raidz2 c3t0d0 c3t4d0 c3t1d0 c3t5d0
   # cp -r /usr /tank/usr
   cp: cycle detected: /usr/ccs/lib/link_audit/32
   cp: cannot access /usr/lib/amd64/libdbus-1.so.2


Doing it again using just card #2 (i.e. "c4") still fails:

   # zpool destroy tank
   # zpool create tank raidz2 c4t0d0 c4t4d0 c4t1d0 c4t5d0
   # cp -r /usr /tank/usr
   cp: cycle detected: /usr/ccs/lib/link_audit/32
   cp: cannot access /usr/lib/amd64/libdbus-1.so.2
   WARNING: marvell88sx1: error on port 1:
           ATA UDMA data parity error
   WARNING: marvell88sx1: error on port 1:
           ATA UDMA data parity error
   WARNING: marvell88sx1: error on port 1:
           ATA UDMA data parity error
   WARNING: marvell88sx1: error on port 1:
           ATA UDMA data parity error
   WARNING: marvell88sx1: error on port 1:
           ATA UDMA data parity error
   WARNING: marvell88sx1: error on port 1:
           ATA UDMA data parity error

   SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major
   EVENT-TIME: 0x478f6148.0x376ebd4b (0xbf8f86652d)
   PLATFORM: i86pc, CSN: -, HOSTNAME: san
   SOURCE: SunOS, REV: 5.11 snv_78
   DESC: Errors have been detected that require a reboot to ensure system
   integrity.  See http://www.sun.com/msg/SUNOS-8000-0G for more
   information.
   AUTO-RESPONSE: Solaris will attempt to save and diagnose the error
   telemetry
   IMPACT: The system will sync files, save a crash dump if needed, and
   reboot
   REC-ACTION: Save the error summary below in case telemetry cannot be
   saved


   panic[cpu3]/thread=ffffff000f7bcc80: pcie_pci-0: PCI(-X) Express
   Fatal Error

   ffffff000f7bcbc0 pcie_pci:pepb_err_msi_intr+d2 ()
   ffffff000f7bcc20 unix:av_dispatch_autovect+78 ()
   ffffff000f7bcc60 unix:dispatch_hardint+2f ()
   ffffff000f786ac0 unix:switch_sp_and_call+13 ()
   ffffff000f786b10 unix:do_interrupt+a0 ()
   ffffff000f786b20 unix:cmnint+ba ()
   ffffff000f786c10 unix:mach_cpu_idle+b ()
   ffffff000f786c40 unix:cpu_idle+c8 ()
   ffffff000f786c60 unix:idle+10e ()
   ffffff000f786c70 unix:thread_start+8 ()

   syncing file systems... done
   ereport.io.pciex.rc.fe-msg ena=bf8f828ea700c01 detector=[ version=0
    scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]" ]
    rc-status=800007c source-id=200 source-valid=1

   ereport.io.pciex.rc.mue-msg ena=bf8f828ea700c01 detector=[ version=0
    scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]" ]
    rc-status=800007c

   ereport.io.pci.sec-rserr ena=bf8f828ea700c01 detector=[ version=0
    scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]" ]
    pci-sec-status=6000 pci-bdg-ctrl=3

   ereport.io.pci.sec-ma ena=bf8f828ea700c01 detector=[ version=0
    scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]" ]
    pci-sec-status=6000 pci-bdg-ctrl=3

   ereport.io.pciex.bdg.sec-perr ena=bf8f828ea700c01 detector=[ version=0
    scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1033,[EMAIL PROTECTED]" ]
    sue-status=1800 source-id=200 source-valid=1

   ereport.io.pciex.bdg.sec-serr ena=bf8f828ea700c01 detector=[ version=0
    scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1033,[EMAIL PROTECTED]" ]
    sue-status=1800

   ereport.io.pci.sec-rserr ena=bf8f828ea700c01 detector=[ version=0
    scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1033,[EMAIL PROTECTED]" ]
    pci-sec-status=6420 pci-bdg-ctrl=7

   dumping to /dev/dsk/c2t0d0s1, offset 215547904, content: kernel
   NOTICE: /[EMAIL PROTECTED],0/pci15d9,[EMAIL PROTECTED]:
    port 0: device reset

   100% done:
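
As an aside on reading the pci-sec-status values in the ereports above: the sketch below decodes the error bits of a PCI bridge secondary status register, using the standard bit layout for that register. It assumes the values are printed in hexadecimal; the function name decode_pci_status is made up for illustration, and only the error bits are decoded (capability and timing bits are ignored).

```sh
#!/bin/sh
# Hypothetical helper: list the error bits set in a PCI bridge
# secondary status register value.
# Assumption: the argument is hexadecimal without a 0x prefix,
# as it appears in the ereports (e.g. 6000, 6420).
decode_pci_status() {
    val=$((0x$1))
    out=""
    for entry in \
        8000:detected-parity-error \
        4000:received-system-error \
        2000:received-master-abort \
        1000:received-target-abort \
        0800:signaled-target-abort \
        0100:master-data-parity-error
    do
        mask=${entry%%:*}    # hex mask before the colon
        name=${entry#*:}     # bit name after the colon
        if [ $((val & 0x$mask)) -ne 0 ]; then
            out="$out $name"
        fi
    done
    echo "${out# }"          # strip the leading space
}

decode_pci_status 6000
decode_pci_status 6420
```

Under that assumption, both 6000 and 6420 decode to a received system error plus a received master abort, which lines up with the sec-rserr and sec-ma ereport names; the extra bits in 6420 are not error bits.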


And doing it again using just card #3 (i.e. "c5") works!

   # zpool destroy tank
   cannot open 'tank': no such pool (interesting)
   # zpool create tank raidz2 c5t0d0 c5t4d0 c5t1d0 c5t5d0
   # cp -r /usr /tank/usr




And doing it again using cards #1 and #3 (i.e. "c3" and "c5") fails!

   # zpool destroy tank
   # zpool create tank raidz2 c3t0d0 c3t4d0 c3t1d0 c3t5d0 raidz2 c5t0d0
   c5t4d0 c5t1d0 c5t5d0
   # cp -r /usr /tank/usr
   cp: cycle detected: /usr/ccs/lib/link_audit/32
   cp: cannot access /usr/lib/amd64/libdbus-1.so.2
   WARNING: marvell88sx2: error on port 4:
           ATA UDMA data parity error
   WARNING: marvell88sx2: error on port 4:
           ATA UDMA data parity error
   WARNING: marvell88sx2: error on port 4:
           ATA UDMA data parity error
   WARNING: marvell88sx2: error on port 4:
           ATA UDMA data parity error
   WARNING: marvell88sx2: error on port 4:
           ATA UDMA data parity error
   WARNING: marvell88sx2: error on port 4:

   SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major
   EVENT-TIME: 0x478f6307.0x20c8668b (0x643e118fd4)
   PLATFORM: i86pc, CSN: -, HOSTNAME: san
   SOURCE: SunOS, REV: 5.11 snv_78
   DESC: Errors have been detected that require a reboot to ensure system
   integrity.  See http://www.sun.com/msg/SUNOS-8000-0G for more
   information.
   AUTO-RESPONSE: Solaris will attempt to save and diagnose the error
   telemetry
   IMPACT: The system will sync files, save a crash dump if needed, and
   reboot
   REC-ACTION: Save the error summary below in case telemetry cannot be
   saved


   panic[cpu3]/thread=ffffff000f7c2c80: pcie_pci-0: PCI(-X) Express
   Fatal Error

   ffffff000f7c2bc0 pcie_pci:pepb_err_msi_intr+d2 ()
   ffffff000f7c2c20 unix:av_dispatch_autovect+78 ()
   ffffff000f7c2c60 unix:dispatch_hardint+2f ()
   ffffff000f78cac0 unix:switch_sp_and_call+13 ()
   ffffff000f78cb10 unix:do_interrupt+a0 ()
   ffffff000f78cb20 unix:cmnint+ba ()
   ffffff000f78cc10 unix:mach_cpu_idle+b ()
   ffffff000f78cc40 unix:cpu_idle+c8 ()
   ffffff000f78cc60 unix:idle+10e ()
   ffffff000f78cc70 unix:thread_start+8 ()

   syncing file systems... done
   ereport.io.pciex.rc.fe-msg ena=643e0d446400c01 detector=[ version=0
    scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]" ]
    rc-status=800007c source-id=201 source-valid=1

   ereport.io.pciex.rc.mue-msg ena=643e0d446400c01 detector=[ version=0
    scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]" ]
    rc-status=800007c

   ereport.io.pci.sec-rserr ena=643e0d446400c01 detector=[ version=0
    scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]" ]
    pci-sec-status=6000 pci-bdg-ctrl=3

   ereport.io.pci.sec-ma ena=643e0d446400c01 detector=[ version=0
    scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]" ]
    pci-sec-status=6000 pci-bdg-ctrl=3

   ereport.io.pciex.bdg.sec-perr ena=643e0d446400c01 detector=[ version=0
    scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1033,[EMAIL PROTECTED],1" ]
    sue-status=1800 source-id=201 source-valid=1

   ereport.io.pciex.bdg.sec-serr ena=643e0d446400c01 detector=[ version=0
    scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1033,[EMAIL PROTECTED],1" ]
    sue-status=1800

   ereport.io.pci.sec-rserr ena=643e0d446400c01 detector=[ version=0
    scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1033,[EMAIL PROTECTED],1" ]
    pci-sec-status=6420 pci-bdg-ctrl=7

   dumping to /dev/dsk/c2t0d0s1, offset 215547904, content: kernel
   NOTICE: /[EMAIL PROTECTED],0/pci15d9,[EMAIL PROTECTED]:
    port 0: device reset

   100% done: 178114 pages dumped, compression ratio 2.44, dump succeeded
   rebooting...





_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
