Below I create zpools isolating one card at a time:
- with just card #1 - it works
- with just card #2 - it fails
- with just card #3 - it works
And then again using the two cards that seem to work:
- with cards #1 and #3 - it fails
So at first I thought I had narrowed it down to one card, but my last
test shows that it still fails when the zpool uses two cards that succeed
individually...
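For what it's worth, the isolation test above is easy to script so each controller gets exercised the same way. This is just a hypothetical sketch (the controller names c3/c4/c5 and the disk layout are taken from my transcripts below); it defaults to a dry run that only prints the commands, since "zpool destroy tank" is destructive:

```shell
#!/bin/sh
# Dry-run sketch of the per-controller isolation test.
# DRYRUN=1 (default) prints each command instead of running it;
# set DRYRUN=0 to actually destroy/recreate the pool.
DRYRUN=${DRYRUN:-1}

run() {
    if [ "$DRYRUN" -eq 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

for c in c3 c4 c5; do
    run zpool destroy tank
    run zpool create tank raidz2 "${c}t0d0" "${c}t4d0" "${c}t1d0" "${c}t5d0"
    run cp -r /usr /tank/usr
done
```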
The only thing I can think to point out here is that those two cards are
on different buses - one is connected to a NEC uPD720400 and the other
to an AIC-7902, which is itself connected to the NEC uPD720400.
Any ideas?
Thanks,
Kent
OK, doing it again using just card #1 (i.e. "c3") works!
# zpool destroy tank
# zpool create tank raidz2 c3t0d0 c3t4d0 c3t1d0 c3t5d0
# cp -r /usr /tank/usr
cp: cycle detected: /usr/ccs/lib/link_audit/32
cp: cannot access /usr/lib/amd64/libdbus-1.so.2
Doing it again using just card #2 (i.e. "c4") still fails:
# zpool destroy tank
# zpool create tank raidz2 c4t0d0 c4t4d0 c4t1d0 c4t5d0
# cp -r /usr /tank/usr
cp: cycle detected: /usr/ccs/lib/link_audit/32
cp: cannot access /usr/lib/amd64/libdbus-1.so.2
WARNING: marvell88sx1: error on port 1:
ATA UDMA data parity error
WARNING: marvell88sx1: error on port 1:
ATA UDMA data parity error
WARNING: marvell88sx1: error on port 1:
ATA UDMA data parity error
WARNING: marvell88sx1: error on port 1:
ATA UDMA data parity error
WARNING: marvell88sx1: error on port 1:
ATA UDMA data parity error
WARNING: marvell88sx1: error on port 1:
ATA UDMA data parity error
SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major
EVENT-TIME: 0x478f6148.0x376ebd4b (0xbf8f86652d)
PLATFORM: i86pc, CSN: -, HOSTNAME: san
SOURCE: SunOS, REV: 5.11 snv_78
DESC: Errors have been detected that require a reboot to ensure system
integrity. See http://www.sun.com/msg/SUNOS-8000-0G for more
information.
AUTO-RESPONSE: Solaris will attempt to save and diagnose the error
telemetry
IMPACT: The system will sync files, save a crash dump if needed, and
reboot
REC-ACTION: Save the error summary below in case telemetry cannot be
saved
panic[cpu3]/thread=ffffff000f7bcc80: pcie_pci-0: PCI(-X) Express Fatal Error
ffffff000f7bcbc0 pcie_pci:pepb_err_msi_intr+d2 ()
ffffff000f7bcc20 unix:av_dispatch_autovect+78 ()
ffffff000f7bcc60 unix:dispatch_hardint+2f ()
ffffff000f786ac0 unix:switch_sp_and_call+13 ()
ffffff000f786b10 unix:do_interrupt+a0 ()
ffffff000f786b20 unix:cmnint+ba ()
ffffff000f786c10 unix:mach_cpu_idle+b ()
ffffff000f786c40 unix:cpu_idle+c8 ()
ffffff000f786c60 unix:idle+10e ()
ffffff000f786c70 unix:thread_start+8 ()
syncing file systems... done
ereport.io.pciex.rc.fe-msg ena=bf8f828ea700c01 detector=[ version=0 scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]" ] rc-status=800007c source-id=200 source-valid=1
ereport.io.pciex.rc.mue-msg ena=bf8f828ea700c01 detector=[ version=0 scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]" ] rc-status=800007c
ereport.io.pci.sec-rserr ena=bf8f828ea700c01 detector=[ version=0 scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]" ] pci-sec-status=6000 pci-bdg-ctrl=3
ereport.io.pci.sec-ma ena=bf8f828ea700c01 detector=[ version=0 scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]" ] pci-sec-status=6000 pci-bdg-ctrl=3
ereport.io.pciex.bdg.sec-perr ena=bf8f828ea700c01 detector=[ version=0 scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1033,[EMAIL PROTECTED]" ] sue-status=1800 source-id=200 source-valid=1
ereport.io.pciex.bdg.sec-serr ena=bf8f828ea700c01 detector=[ version=0 scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1033,[EMAIL PROTECTED]" ] sue-status=1800
ereport.io.pci.sec-rserr ena=bf8f828ea700c01 detector=[ version=0 scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1033,[EMAIL PROTECTED]" ] pci-sec-status=6420 pci-bdg-ctrl=7
dumping to /dev/dsk/c2t0d0s1, offset 215547904, content: kernel
NOTICE: /[EMAIL PROTECTED],0/pci15d9,[EMAIL PROTECTED]: port 0: device reset
100% done:
And doing it again using just card #3 (i.e. "c5") works!
# zpool destroy tank
cannot open 'tank': no such pool
(interesting)
# zpool create tank raidz2 c5t0d0 c5t4d0 c5t1d0 c5t5d0
# cp -r /usr /tank/usr
And doing it again using cards #1 and #3 (i.e. "c3" and "c5") fails!
# zpool destroy tank
# zpool create tank raidz2 c3t0d0 c3t4d0 c3t1d0 c3t5d0 raidz2 c5t0d0 c5t4d0 c5t1d0 c5t5d0
# cp -r /usr /tank/usr
cp: cycle detected: /usr/ccs/lib/link_audit/32
cp: cannot access /usr/lib/amd64/libdbus-1.so.2
WARNING: marvell88sx2: error on port 4:
ATA UDMA data parity error
WARNING: marvell88sx2: error on port 4:
ATA UDMA data parity error
WARNING: marvell88sx2: error on port 4:
ATA UDMA data parity error
WARNING: marvell88sx2: error on port 4:
ATA UDMA data parity error
WARNING: marvell88sx2: error on port 4:
ATA UDMA data parity error
WARNING: marvell88sx2: error on port 4:
SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major
EVENT-TIME: 0x478f6307.0x20c8668b (0x643e118fd4)
PLATFORM: i86pc, CSN: -, HOSTNAME: san
SOURCE: SunOS, REV: 5.11 snv_78
DESC: Errors have been detected that require a reboot to ensure system
integrity. See http://www.sun.com/msg/SUNOS-8000-0G for more
information.
AUTO-RESPONSE: Solaris will attempt to save and diagnose the error
telemetry
IMPACT: The system will sync files, save a crash dump if needed, and
reboot
REC-ACTION: Save the error summary below in case telemetry cannot be
saved
panic[cpu3]/thread=ffffff000f7c2c80: pcie_pci-0: PCI(-X) Express Fatal Error
ffffff000f7c2bc0 pcie_pci:pepb_err_msi_intr+d2 ()
ffffff000f7c2c20 unix:av_dispatch_autovect+78 ()
ffffff000f7c2c60 unix:dispatch_hardint+2f ()
ffffff000f78cac0 unix:switch_sp_and_call+13 ()
ffffff000f78cb10 unix:do_interrupt+a0 ()
ffffff000f78cb20 unix:cmnint+ba ()
ffffff000f78cc10 unix:mach_cpu_idle+b ()
ffffff000f78cc40 unix:cpu_idle+c8 ()
ffffff000f78cc60 unix:idle+10e ()
ffffff000f78cc70 unix:thread_start+8 ()
syncing file systems... done
ereport.io.pciex.rc.fe-msg ena=643e0d446400c01 detector=[ version=0 scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]" ] rc-status=800007c source-id=201 source-valid=1
ereport.io.pciex.rc.mue-msg ena=643e0d446400c01 detector=[ version=0 scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]" ] rc-status=800007c
ereport.io.pci.sec-rserr ena=643e0d446400c01 detector=[ version=0 scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]" ] pci-sec-status=6000 pci-bdg-ctrl=3
ereport.io.pci.sec-ma ena=643e0d446400c01 detector=[ version=0 scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]" ] pci-sec-status=6000 pci-bdg-ctrl=3
ereport.io.pciex.bdg.sec-perr ena=643e0d446400c01 detector=[ version=0 scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1033,[EMAIL PROTECTED],1" ] sue-status=1800 source-id=201 source-valid=1
ereport.io.pciex.bdg.sec-serr ena=643e0d446400c01 detector=[ version=0 scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1033,[EMAIL PROTECTED],1" ] sue-status=1800
ereport.io.pci.sec-rserr ena=643e0d446400c01 detector=[ version=0 scheme="dev" device-path="/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/pci1033,[EMAIL PROTECTED],1" ] pci-sec-status=6420 pci-bdg-ctrl=7
dumping to /dev/dsk/c2t0d0s1, offset 215547904, content: kernel
NOTICE: /[EMAIL PROTECTED],0/pci15d9,[EMAIL PROTECTED]: port 0: device reset
100% done: 178114 pages dumped, compression ratio 2.44, dump succeeded
rebooting...
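Since the panic telemetry got saved, the ereports above should also be retrievable after the box comes back up. A sketch using the standard Solaris fault-management tools (guarded with a check, since fmdump only exists on Solaris-derived systems and needs root):

```shell
# Review persisted FMA telemetry after the panic/reboot (Solaris-only sketch).
if command -v fmdump >/dev/null 2>&1; then
    fmdump -eV | head -50   # verbose dump of the logged ereports
    fmadm faulty            # any faults the diagnosis engine has produced
else
    echo "fmdump not available on this system"
fi
```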
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss