Hi folks,

One of the servers I'm responsible for (running OpenBSD 4.0-stable, GENERIC
kernel, fully patched) has had a panic (see title line). I'll confess right
away that I wasn't able to run trace or ps: I was away from the machine at
the time and had to guide a colleague by phone through restarting it in a
hurry. He had an office full of users breathing down his neck...

Briefly: this machine runs an external 3TB RAID array (a Nexsan ATAboy) via
an Adaptec 29160 SCSI card; the RAID array is configured as four logical
drives. Checking the logs, I see a bunch of parity errors a few days before
the panic, and then another bunch immediately prior to it. (The log lines,
and the dmesg, follow my sig.) After restarting, the ATAboy self-diagnostics
reported no errors. (I've run other tests which have reassured me we've lost
no data.) The log shows errors on three of the four drives, which is perhaps
unsurprising if it was the SCSI connection that wobbled.

Are there any known issues with this SCSI card or driver (ahc)? Or do we
just have flaky hardware? I've run memtest86+ ad nauseam, etc., with no
issues at all, so I'm fairly confident about the base machine, but I'm now
unsure about the Adaptec card. The machine has otherwise been running
happily, with no errors or issues, for several months. Perhaps
significantly, a large amount of data was being copied to the RAID array at
the time of the panic, but this had been done many times before without
issue.

All cluebats gratefully received.

Steve
http://www.fivetrees.com

*** Extracts from /var/log/messages:

May 18 04:27:30 hglserver /bsd: sd3(ahc0:4:4): parity error detected in Data-in phase. SEQADDR(0x55) SCSIRATE(0xc2)
May 18 04:27:30 hglserver /bsd:         CRC Value Mismatch
May 18 04:27:30 hglserver /bsd: sd3(ahc0:4:4): parity error detected in Data-in phase. SEQADDR(0x63) SCSIRATE(0xc2)
May 18 04:27:30 hglserver /bsd:         CRC Value Mismatch
May 18 04:27:30 hglserver /bsd: sd3(ahc0:4:4): parity error detected in Data-in phase. SEQADDR(0x63) SCSIRATE(0xc2)
May 18 04:27:30 hglserver /bsd:         CRC Value Mismatch
May 18 04:27:30 hglserver /bsd: sd3(ahc0:4:4): parity error detected in Data-in phase. SEQADDR(0x4e) SCSIRATE(0xc2)
May 18 04:27:30 hglserver /bsd:         CRC Value Mismatch

(note: 4:27 corresponds to a time during which I run a crontab'ed rsync from
another machine for partial offsite backup.)

... <snip> ...

May 23 16:53:56 hglserver /bsd: sd1(ahc0:4:2): parity error detected in Data-in phase. SEQADDR(0x1a7) SCSIRATE(0xc2)
May 23 16:53:56 hglserver /bsd:         CRC Value Mismatch
May 23 16:54:22 hglserver /bsd: sd2(ahc0:4:3): parity error detected in Data-in phase. SEQADDR(0x84) SCSIRATE(0xc2)
May 23 16:54:22 hglserver /bsd:         CRC Value Mismatch
May 23 16:54:25 hglserver /bsd: sd2(ahc0:4:3): parity error detected in Data-in phase. SEQADDR(0x54) SCSIRATE(0xc2)
May 23 16:54:25 hglserver /bsd:         CRC Value Mismatch
May 23 16:54:27 hglserver /bsd: sd2(ahc0:4:3): parity error detected in Data-in phase. SEQADDR(0x54) SCSIRATE(0xc2)
May 23 16:54:27 hglserver /bsd:         CRC Value Mismatch
May 23 16:54:27 hglserver /bsd: sd2(ahc0:4:3): parity error detected in Data-in phase. SEQADDR(0x54) SCSIRATE(0xc2)
May 23 16:54:27 hglserver /bsd:         CRC Value Mismatch
May 23 16:54:38 hglserver /bsd: sd1(ahc0:4:2): parity error detected in Data-in phase. SEQADDR(0x1a7) SCSIRATE(0xc2)
May 23 16:54:38 hglserver /bsd:         CRC Value Mismatch
May 23 18:31:21 hglserver syslogd: restart
May 23 18:31:21 hglserver /bsd: start = 0, len = 9793, fs = /s1
May 23 18:31:21 hglserver /bsd: panic: ffs_alloccg: map corrupted

(note: panic occurred at 16:54; machine restarted at 18:31 after lengthy
fscks...)
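
(For anyone wanting to tally reports like these per device from their own
logs, here is a rough Python sketch. The function name and the regex are
illustrative assumptions based only on the message format shown in the
extracts above; nothing here is taken from the ahc driver itself.)

```python
import re
from collections import Counter

# Pattern for the parity-error report as it appears in the extracts above.
# \s+ between words means mail-wrapped copies of the line still match.
PARITY_RE = re.compile(
    r"(?P<dev>sd\d+)\(ahc\d+:\d+:\d+\):\s+parity\s+error\s+detected\s+in\s+"
    r"(?P<phase>\S+)\s+phase\.\s+SEQADDR\((?P<seqaddr>0x[0-9a-fA-F]+)\)"
)


def tally_parity_errors(text):
    """Count parity-error reports per sd device in a chunk of log text."""
    return Counter(m.group("dev") for m in PARITY_RE.finditer(text))


# Two lines copied from the 23 May extract, used here as sample input.
sample = (
    "May 23 16:53:56 hglserver /bsd: sd1(ahc0:4:2): parity error detected in "
    "Data-in phase. SEQADDR(0x1a7) SCSIRATE(0xc2)\n"
    "May 23 16:54:22 hglserver /bsd: sd2(ahc0:4:3): parity error detected in "
    "Data-in phase. SEQADDR(0x84) SCSIRATE(0xc2)\n"
)

for dev, count in sorted(tally_parity_errors(sample).items()):
    print(f"{dev}: {count}")
```

Passing the contents of /var/log/messages instead of the sample string gives
a per-device count, which makes it easier to see whether the errors cluster
on one logical drive or are spread across the bus.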

*** dmesg:

OpenBSD 4.0-stable (GENERIC) #10: Mon May 14 20:04:41 BST 2007
    [EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: AMD Sempron(tm) 2400+ ("AuthenticAMD" 686-class, 256KB L2 cache) 1.67 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE
real mem  = 1073246208 (1048092K)
avail mem = 971010048 (948252K)
using 4256 buffers containing 53764096 bytes (52504K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(00) BIOS, date 12/08/04, BIOS32 rev. 0 @ 0xfda50, SMBIOS rev. 2.3 @ 0xf0630 (29 entries)
pcibios0 at bios0: rev 2.1 @ 0xf0000/0x10000
pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xf7f00/192 (10 entries)
pcibios0: PCI Interrupt Router at 000:17:0 ("VIA VT8237 ISA" rev 0x00)
pcibios0: PCI bus #1 is the last bus
bios0: ROM list: 0xc0000/0x9000 0xc9000/0x5400
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 0 function 0 "VIA VT8377 PCI" rev 0x80
ppb0 at pci0 dev 1 function 0 "VIA VT8377 AGP" rev 0x00
pci1 at ppb0 bus 1
vga1 at pci1 dev 0 function 0 "Matrox MGA G400/G450 AGP" rev 0x85
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
re0 at pci0 dev 10 function 0 "Realtek 8169" rev 0x10: irq 5, address 00:14:6c:c0:28:60
rgephy0 at re0 phy 7: RTL8169S/8110S PHY, rev. 0
ahc0 at pci0 dev 12 function 0 "Adaptec AHA-29160 U160" rev 0x02: irq 11
scsibus0 at ahc0: 16 targets
sd0 at scsibus0 targ 4 lun 1: <NEXSAN, ATAboy2, 4r41> SCSI2 0/direct fixed
sd0: 858306MB, 53644 cyl, 128 head, 256 sec, 512 bytes/sec, 1757812500 sec total
sd1 at scsibus0 targ 4 lun 2: <NEXSAN, ATAboy2, 4r41> SCSI2 0/direct fixed
sd1: 858306MB, 53644 cyl, 128 head, 256 sec, 512 bytes/sec, 1757812500 sec total
sd2 at scsibus0 targ 4 lun 3: <NEXSAN, ATAboy2, 4r41> SCSI2 0/direct fixed
sd2: 858306MB, 53644 cyl, 128 head, 256 sec, 512 bytes/sec, 1757812500 sec total
sd3 at scsibus0 targ 4 lun 4: <NEXSAN, ATAboy2, 4r41> SCSI2 0/direct fixed
sd3: 286012MB, 35751 cyl, 128 head, 128 sec, 512 bytes/sec, 585753906 sec total
pciide0 at pci0 dev 15 function 0 "VIA VT6420 SATA" rev 0x80: DMA
pciide0: using irq 10 for native-PCI interrupt
pciide1 at pci0 dev 15 function 1 "VIA VT82C571 IDE" rev 0x06: ATA133, channel 0 configured to compatibility, channel 1 configured to compatibility
wd0 at pciide1 channel 0 drive 0: <Maxtor 7L250R0>
wd0: 16-sector PIO, LBA48, 239372MB, 490234752 sectors
wd1 at pciide1 channel 0 drive 1: <Maxtor 7L250R0>
wd1: 16-sector PIO, LBA48, 239372MB, 490234752 sectors
wd0(pciide1:0:0): using PIO mode 4, Ultra-DMA mode 6
wd1(pciide1:0:1): using PIO mode 4, Ultra-DMA mode 6
atapiscsi0 at pciide1 channel 1 drive 0
scsibus1 at atapiscsi0: 2 targets
cd0 at scsibus1 targ 0 lun 0: <SONY, DVD RW DW-D23A, CYS1> SCSI0 5/cdrom removable
cd0(pciide1:1:0): using PIO mode 4, Ultra-DMA mode 4
uhci0 at pci0 dev 16 function 0 "VIA VT83C572 USB" rev 0x81: irq 11
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: VIA UHCI root hub, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1 at pci0 dev 16 function 1 "VIA VT83C572 USB" rev 0x81: irq 11
usb1 at uhci1: USB revision 1.0
uhub1 at usb1
uhub1: VIA UHCI root hub, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2 at pci0 dev 16 function 2 "VIA VT83C572 USB" rev 0x81: irq 10
usb2 at uhci2: USB revision 1.0
uhub2 at usb2
uhub2: VIA UHCI root hub, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhci3 at pci0 dev 16 function 3 "VIA VT83C572 USB" rev 0x81: irq 10
usb3 at uhci3: USB revision 1.0
uhub3 at usb3
uhub3: VIA UHCI root hub, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
ehci0 at pci0 dev 16 function 4 "VIA VT6202 USB" rev 0x86: irq 5
usb4 at ehci0: USB revision 2.0
uhub4 at usb4
uhub4: VIA EHCI root hub, rev 2.00/1.00, addr 1
uhub4: 8 ports with 8 removable, self powered
viapm0 at pci0 dev 17 function 0 "VIA VT8237 ISA" rev 0x00
iic0 at viapm0
auvia0 at pci0 dev 17 function 5 "VIA VT8233 AC97" rev 0x60: irq 5
ac97: codec id 0x434d4983 (C-Media Electronics CMI9761A+)
audio0 at auvia0
vr0 at pci0 dev 18 function 0 "VIA RhineII-2" rev 0x78: irq 11, address 00:0b:6a:c0:3f:14
ukphy0 at vr0 phy 1: Generic IEEE 802.3u media interface, rev. 10: OUI 0x004063, model 0x0032
isa0 at mainbus0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pmsi0 at pckbc0 (aux slot)
pckbc0: using irq 12 for aux slot
wsmouse0 at pmsi0 mux 0
pcppi0 at isa0 port 0x61
midi0 at pcppi0: <PC speaker>
spkr0 at pcppi0
lpt0 at isa0 port 0x378/4 irq 7
lm0 at isa0 port 0x290/8: W83697HF
npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16
pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
fd1 at fdc0 drive 1: density unknown
biomask ef6d netmask ef6d ttymask ffef
pctr: user-level cycle counter enabled
mtrr: Pentium Pro MTRR support
ahc0: target 4 using 16bit transfers
ahc0: target 4 synchronous at 80.0MHz DT, offset = 0x30
dkcsum: sd0 matches BIOS drive 0x82
dkcsum: sd1 matches BIOS drive 0x83
dkcsum: sd2 matches BIOS drive 0x84
dkcsum: sd3 matches BIOS drive 0x85
dkcsum: wd0 matches BIOS drive 0x80
dkcsum: wd1 matches BIOS drive 0x81
root on wd0a
rootdev=0x0 rrootdev=0x300 rawdev=0x302
WARNING: / was not properly unmounted
re0: watchdog timeout
re0: watchdog timeout
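
(As a cross-check on the "four logical drives" figure, the sector counts in
the dmesg above can be summed. This is only a sketch: the dictionary simply
restates the sd0 to sd3 lines, and treating the dmesg "MB" values as bytes
divided by 1024*1024 and rounded down is my assumption, though it does
reproduce the numbers shown.)

```python
SECTOR_BYTES = 512  # "512 bytes/sec" in the sd0..sd3 lines

# Sector totals copied from the dmesg lines for the four logical drives.
sectors = {
    "sd0": 1757812500,
    "sd1": 1757812500,
    "sd2": 1757812500,
    "sd3": 585753906,
}


def size_mib(sector_count):
    """Size in units of 1024*1024 bytes, rounded down (assumed dmesg 'MB')."""
    return sector_count * SECTOR_BYTES // (1024 * 1024)


sizes_mib = [size_mib(n) for n in sectors.values()]
total_bytes = sum(n * SECTOR_BYTES for n in sectors.values())

for dev, n in sectors.items():
    print(f"{dev}: {size_mib(n)}MB")
print(f"total: {total_bytes} bytes ({total_bytes / 1e12:.2f} TB decimal)")
```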
