Hi Jeremy,

I am using smartmontools, and in fact the first drive I replaced did show some errors. The next two were zeroed out and thoroughly tested using the WD tools; neither smartmontools nor the WD tools reported any errors. I was also glad that the shop I bought them from replaced them immediately, no questions asked. They said they have been having a lot of issues with WD drives lately.

I will probably try to get a different brand controller, especially after seeing the relevant PR.

Thanks,

hp# smartctl -a /dev/ad4
smartctl 5.41 2011-06-09 r3365 [FreeBSD 8.2-STABLE amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Blue Serial ATA
Device Model:     WDC WD7500AALX-009BA0
Serial Number:    WD-WCATR5711398
LU WWN Device Id: 5 0014ee 25ad8ccf5
Firmware Version: 15.01H15
User Capacity:    750,156,374,016 bytes [750 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Jun 23 14:46:12 2011 EEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (13260) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 155) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x3037) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   188   178   021    Pre-fail  Always       -       3558
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       10
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       21
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       8
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       7
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       2
194 Temperature_Celsius     0x0022   110   107   000    Old_age   Always       -       37
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
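
For anyone who wants to screen this kind of output automatically, here is a minimal sketch in Python. It assumes the attribute table is printed one attribute per line, as smartctl normally does. The regular expression, the function names and the set of "watched" attribute IDs are assumptions made for this example; they are not part of smartmontools. The sample rows are copied from the output above.

```python
import re

# One row of the "Vendor Specific SMART Attributes" table, for example:
#   5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
# Captured groups: ID, name, VALUE, WORST, THRESH, RAW_VALUE (leading digits).
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+\S+\s+\S+\s+\S+\s+(\d+)"
)

# Attribute IDs whose raw value is often watched for media or cabling trouble.
# This selection is an assumption for the example, not a smartmontools rule.
WATCHED = {5, 196, 197, 198, 199}


def parse_attributes(text):
    """Return {attribute_id: {...}} for every table row found in text."""
    attrs = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            attr_id, name, value, worst, thresh, raw = match.groups()
            attrs[int(attr_id)] = {
                "name": name,
                "value": int(value),
                "worst": int(worst),
                "thresh": int(thresh),
                "raw": int(raw),
            }
    return attrs


def suspicious(attrs):
    """Return the watched attribute IDs that have a non-zero raw value."""
    return sorted(i for i in WATCHED if i in attrs and attrs[i]["raw"] > 0)


SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       21
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
"""

if __name__ == "__main__":
    table = parse_attributes(SAMPLE)
    print("watched attributes with non-zero raw value:", suspicious(table))
```

With the rows above, none of the watched attributes has a non-zero raw value, which matches the "No Errors Logged" result for this disk.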

On Thu, Jun 23, 2011 at 3:58 AM, Jeremy Chadwick <free...@jdc.parodius.com> wrote:

> On Wed, Jun 22, 2011 at 05:52:39PM +0300, George Kontostanos wrote:
> > This is the 3rd disk I have replaced in a 3-disk RAIDZ1 pool and I am
> > really starting to believe that the problem is somewhere else. The disks
> > reside on a Promise PDC40718 SATA300 controller. I have been running this
> > setup since 8.0-RELEASE with no issues until a few months ago, after
> > 8.2-RELEASE; it is now at 8.2-STABLE.
> > Symptoms:
> >
> > Jun 22 17:08:53 hp kernel: ata2: timeout waiting to issue command
> > Jun 22 17:08:53 hp kernel: ata2: error issuing SETFEATURES ENABLE WCACHE command
> > Jun 22 17:09:33 hp kernel: ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
> > Jun 22 17:09:33 hp kernel: ad4: WARNING - WRITE_DMA48 requeued due to channel reset LBA=321558741
> > Jun 22 17:09:34 hp kernel: ata2: SIGNATURE: 00000101
> > Jun 22 17:09:34 hp kernel: ad4: WARNING - WRITE_DMA48 requeued due to channel reset LBA=321558869
> > Jun 22 17:09:34 hp kernel: ata2: FAILURE - already active DMA on this device
> > Jun 22 17:09:34 hp kernel: ata2: setting up DMA failed
> > Jun 22 17:09:34 hp kernel: ata2: FAILURE - already active DMA on this device
> > Jun 22 17:09:34 hp kernel: ata2: setting up DMA failed
> >
> >
> > After a while the disk gets detached from the pool. Always the same disk.
> > Right now I am in the process of resilvering:
> >
> >   pool: tank
> >  state: ONLINE
> > status: One or more devices is currently being resilvered.  The pool will
> >         continue to function, possibly in a degraded state.
> > action: Wait for the resilver to complete.
> >   scan: resilver in progress since Wed Jun 22 17:09:40 2011
> >     189G scanned out of 578G at 88.8M/s, 1h14m to go
> >     62.9G resilvered, 32.63% done
> > config:
> >
> >         NAME              STATE     READ WRITE CKSUM
> >         tank              ONLINE       0     0     0
> >           raidz1-0        ONLINE       0     0     0
> >             label/zdisk1  ONLINE       0     0     0
> >             label/zdisk2  ONLINE       0     0     0
> >             label/zdisk3  ONLINE       0     0     0  (resilvering)
> >
> > But those errors have started to appear again. Again this is the 3rd disk
> > replaced !!! Full dmesg attached
> >
> > --
> > George Kontostanos
> > aisecure.net <http://www.aisecure.net>
>
> > Copyright (c) 1992-2011 The FreeBSD Project.
> > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> >         The Regents of the University of California. All rights reserved.
> > FreeBSD is a registered trademark of The FreeBSD Foundation.
> > FreeBSD 8.2-STABLE #0: Mon Jun 6 19:00:19 EEST 2011
> >     gkon...@hp.aicom.loc:/usr/obj/usr/src/sys/ML110G3 amd64
> > Timecounter "i8254" frequency 1193182 Hz quality 0
> > CPU: Intel(R) Pentium(R) D CPU 3.20GHz (3200.13-MHz K8-class CPU)
> >   Origin = "GenuineIntel" Id = 0xf64 Family = f Model = 6 Stepping = 4
> >   Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
> >   Features2=0xe4bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,CNXT-ID,CX16,xTPR,PDCM>
> >   AMD Features=0x20100800<SYSCALL,NX,LM>
> >   AMD Features2=0x1<LAHF>
> >   TSC: P-state invariant
> > real memory = 4294967296 (4096 MB)
> > avail memory = 4106780672 (3916 MB)
> > ACPI APIC Table: <HP OEMAPIC >
> > FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> > FreeBSD/SMP: 1 package(s) x 2 core(s)
> >  cpu0 (BSP): APIC ID: 0
> >  cpu1 (AP): APIC ID: 1
> > ioapic0: Changing APIC ID to 2
> > ioapic0 <Version 2.0> irqs 0-23 on motherboard
> > kbd1 at kbdmux0
> > acpi0: <HP OEMXSDT> on motherboard
> > acpi0: [ITHREAD]
> > acpi0: Power Button (fixed)
> > Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> > acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
> > cpu0: <ACPI CPU> on acpi0
> > cpu1: <ACPI CPU> on acpi0
> > pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> > pci0: <ACPI PCI bus> on pcib0
> > pcib1: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
> > pci1: <ACPI PCI bus> on pcib1
> > pcib2: <ACPI PCI-PCI bridge> irq 17 at device 28.5 on pci0
> > pci7: <ACPI PCI bus> on pcib2
> > bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x004101> mem 0xfeaf0000-0xfeafffff irq 17 at device 0.0 on pci7
> > bge0: CHIP ID 0x00004101; ASIC REV 0x04; CHIP REV 0x41; PCI-E
> > miibus0: <MII bus> on bge0
> > brgphy0: <BCM5750 10/100/1000baseTX PHY> PHY 1 on miibus0
> > brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > bge0: Ethernet address: 00:13:21:cc:39:35
> > bge0: [ITHREAD]
> > uhci0: <Intel 82801G (ICH7) USB controller USB-A> port 0xdc00-0xdc1f irq 23 at device 29.0 on pci0
> > uhci0: [ITHREAD]
> > uhci0: LegSup = 0x2f00
> > usbus0: <Intel 82801G (ICH7) USB controller USB-A> on uhci0
> > uhci1: <Intel 82801G (ICH7) USB controller USB-B> port 0xd880-0xd89f irq 19 at device 29.1 on pci0
> > uhci1: [ITHREAD]
> > uhci1: LegSup = 0x2f00
> > usbus1: <Intel 82801G (ICH7) USB controller USB-B> on uhci1
> > uhci2: <Intel 82801G (ICH7) USB controller USB-C> port 0xd800-0xd81f irq 18 at device 29.2 on pci0
> > uhci2: [ITHREAD]
> > uhci2: LegSup = 0x2f00
> > usbus2: <Intel 82801G (ICH7) USB controller USB-C> on uhci2
> > ehci0: <Intel 82801GB/R (ICH7) USB 2.0 controller> mem 0xfe9ffc00-0xfe9fffff irq 23 at device 29.7 on pci0
> > ehci0: [ITHREAD]
> > usbus3: EHCI version 1.0
> > usbus3: <Intel 82801GB/R (ICH7) USB 2.0 controller> on ehci0
> > pcib3: <ACPI PCI-PCI bridge> at device 30.0 on pci0
> > pci8: <ACPI PCI bus> on pcib3
> > atapci0: <Promise PDC40718 SATA300 controller> port 0xec00-0xec7f,0xe800-0xe8ff mem 0xfebff000-0xfebfffff,0xfebc0000-0xfebdffff irq 16 at device 0.0 on pci8
> > atapci0: [ITHREAD]
> > atapci0: [ITHREAD]
> > ata2: <ATA channel 0> on atapci0
> > ata2: SIGNATURE: 00000101
> > ata2: [ITHREAD]
> > ata3: <ATA channel 1> on atapci0
> > ata3: SIGNATURE: 00000101
> > ata3: [ITHREAD]
> > ata4: <ATA channel 2> on atapci0
> > ata4: [ITHREAD]
> > ata5: <ATA channel 3> on atapci0
> > ata5: SIGNATURE: 00000101
> > ata5: [ITHREAD]
> > vgapci0: <VGA-compatible display> port 0xe000-0xe0ff mem 0xe8000000-0xefffffff,0xfebb0000-0xfebbffff irq 16 at device 2.0 on pci8
> > isab0: <PCI-ISA bridge> at device 31.0 on pci0
> > isa0: <ISA bus> on isab0
> > atapci1: <Intel ICH7 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 31.1 on pci0
> > ata0: <ATA channel 0> on atapci1
> > ata0: [ITHREAD]
> > ahci0: <Intel ICH7 AHCI SATA controller> port 0xd480-0xd487,0xd400-0xd403,0xd080-0xd087,0xd000-0xd003,0xcc00-0xcc0f mem 0xfe9ff800-0xfe9ffbff irq 19 at device 31.2 on pci0
> > ahci0: [ITHREAD]
> > ahci0: AHCI v1.10 with 4 3Gbps ports, Port Multiplier not supported
> > ahcich0: <AHCI channel> at channel 0 on ahci0
> > ahcich0: [ITHREAD]
> > ahcich1: <AHCI channel> at channel 1 on ahci0
> > ahcich1: [ITHREAD]
> > ahcich2: <AHCI channel> at channel 2 on ahci0
> > ahcich2: [ITHREAD]
> > ahcich3: <AHCI channel> at channel 3 on ahci0
> > ahcich3: [ITHREAD]
> > acpi_button0: <Power Button> on acpi0
> > atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
> > uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
> > uart0: [FILTER]
> > uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
> > uart1: [FILTER]
> > ppc0: <Parallel port> port 0x378-0x37f,0x778-0x77f irq 7 drq 3 on acpi0
> > ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
> > ppc0: FIFO with 16/16/8 bytes threshold
> > ppc0: [ITHREAD]
> > ppbus0: <Parallel port bus> on ppc0
> > plip0: <PLIP network interface> on ppbus0
> > plip0: [ITHREAD]
> > lpt0: <Printer> on ppbus0
> > lpt0: [ITHREAD]
> > lpt0: Interrupt-driven port
> > ppi0: <Parallel I/O> on ppbus0
> > orm0: <ISA Option ROMs> at iomem 0xc0000-0xc8fff,0xc9000-0xcdfff,0xcf800-0xd47ff,0xd4800-0xd57ff on isa0
> > sc0: <System console> at flags 0x100 on isa0
> > sc0: VGA <16 virtual consoles, flags=0x300>
> > vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> > atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
> > atkbd0: <AT Keyboard> irq 1 on atkbdc0
> > kbd0 at atkbd0
> > atkbd0: [GIANT-LOCKED]
> > atkbd0: [ITHREAD]
> > est0: <Enhanced SpeedStep Frequency Control> on cpu0
> > est: CPU supports Enhanced Speedstep, but is not recognized.
> > est: cpu_vendor GenuineIntel, msr 102400001024
> > device_attach: est0 attach returned 6
> > p4tcc0: <CPU Frequency Thermal Control> on cpu0
> > est1: <Enhanced SpeedStep Frequency Control> on cpu1
> > est: CPU supports Enhanced Speedstep, but is not recognized.
> > est: cpu_vendor GenuineIntel, msr 102400001024
> > device_attach: est1 attach returned 6
> > p4tcc1: <CPU Frequency Thermal Control> on cpu1
> > ZFS NOTICE: Prefetch is disabled by default if less than 4GB of RAM is present;
> >             to enable, add "vfs.zfs.prefetch_disable=0" to /boot/loader.conf.
> > ZFS filesystem version 5
> > ZFS storage pool version 28
> > Timecounters tick every 1.000 msec
> > usbus0: 12Mbps Full Speed USB v1.0
> > usbus1: 12Mbps Full Speed USB v1.0
> > usbus2: 12Mbps Full Speed USB v1.0
> > usbus3: 480Mbps High Speed USB v2.0
> > ugen0.1: <Intel> at usbus0
> > uhub0: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
> > ugen1.1: <Intel> at usbus1
> > uhub1: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1
> > ugen2.1: <Intel> at usbus2
> > uhub2: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus2
> > ugen3.1: <Intel> at usbus3
> > uhub3: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus3
> > ad4: 715404MB <WDC WD7500AALX-009BA0 15.01H15> at ata2-master UDMA100 SATA 3Gb/s
> > ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
> > ada0: <ST3250410AS 3.AAA> ATA-7 SATA 1.x device
> > ada0: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)
> > ada0: Command Queueing enabled
> > ada0: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)
> > ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
> > ada1: <ST3250410AS 3.AAA> ATA-7 SATA 1.x device
> > ada1: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)
> > ada1: Command Queueing enabled
> > ada1: 238475MB (488397168 512 byte sectors: 16H 63S/T 16383C)
> > ad6: 610480MB <WDC WD6401AALS-00J7B1 05.00K05> at ata3-master UDMA100 SATA 3Gb/s
> >
> > ad10: 610480MB <WDC WD6401AALS-00J7B1 05.00K05> at ata5-master UDMA100 SATA 3Gb/s
> > SMP: AP CPU #1 Launched!
> > Root mount waiting for: usbus3 usbus2 usbus1 usbus0
> > uhub0: 2 ports with 2 removable, self powered
> > uhub1: 2 ports with 2 removable, self powered
> > uhub2: 2 ports with 2 removable, self powered
> > Root mount waiting for: usbus3
> > uhub3: 6 ports with 6 removable, self powered
> > Root mount waiting for: usbus3
> > ugen3.2: <Seagate> at usbus3
> > umass0: <Seagate FreeAgent Go, class 0/0, rev 2.00/1.38, addr 2> on usbus3
> > umass0: SCSI over Bulk-Only; quirks = 0x0000
> > ugen0.2: <American Power Conversion> at usbus0
> > umass0:4:0:-1: Attached to scbus4
> > Trying to mount root from zfs:zroot
> > da0 at umass-sim0 bus 0 scbus4 target 0 lun 0
> > da0: <Seagate FreeAgent Go 0138> Fixed Direct Access SCSI-4 device
> > da0: 40.000MB/s transfers
> > da0: 610480MB (1250263726 512 byte sectors: 255H 63S/T 77825C)
> > bge0: link state changed to UP
> > S
> > log_sysevent: type 19 is not implemented
> > ata2: SIGNATURE: ffffffff
> > ata2: timeout waiting to issue command
> > ata2: error issuing SETFEATURES SET TRANSFER MODE command
> > ata2: timeout waiting to issue command
> > ata2: error issuing SETFEATURES ENABLE RCACHE command
> > ata2: timeout waiting to issue command
> > ata2: error issuing SETFEATURES ENABLE WCACHE command
> > ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
> > ad4: WARNING - WRITE_DMA48 requeued due to channel reset LBA=321558741
> > ata2: SIGNATURE: 00000101
> > ad4: WARNING - WRITE_DMA48 requeued due to channel reset LBA=321558869
> > ata2: FAILURE - already active DMA on this device
> > ata2: setting up DMA failed
> > ata2: FAILURE - already active DMA on this device
> > ata2: setting up DMA failed
>
> George,
>
> Can you please install ports/sysutils/smartmontools (should be version
> 5.41; if you have an older version please upgrade) and provide output
> from the following command:
>
> smartctl -a /dev/ad4
>
> With this I should be able to rule out weird disk problems. It's always
> good to start there.
>
> For those unable to parse the above topology, the system has two SATA
> controllers (the Promise uses ata(4), while the on-board ICH7 is in AHCI
> mode and is using ahci.ko (AHCI-to-CAM)):
>
> atapci0 = Promise PDC40718 (Promise SATA300 TX4)
>  --> ata2-master = ad4  = WDC WD7500AALX-009BA0
>  --> ata2-slave  = <empty>
>  --> ata3-master = ad6  = WDC WD6401AALS-00J7B1
>  --> ata3-slave  = <empty>
>  --> ata4-master = <empty>
>  --> ata4-slave  = <empty>
>  --> ata5-master = ad10 = WDC WD6401AALS-00J7B1
>  --> ata5-slave  = <empty>
>
> ahci0 = Intel ICH7 on-board in AHCI mode
>  --> ahcich0 = ada0 = ST3250410AS 3.AAA
>  --> ahcich1 = ada1 = ST3250410AS 3.AAA
>  --> ahcich2 = <empty>
>  --> ahcich3 = <empty>
>
> If you can't get this situation solved, I'd recommend spending $40
> (pocket change) to invest in a Silicon Image 3124 card. Your existing
> Promise controller is a PCI card (not PCIe or PCI-X), and I don't know
> if your motherboard has any PCIe or PCI-X slots, so I'm going to assume
> the 133MByte/sec limitation is acceptable to you. As such, that limits
> you to effectively this card:
>
> http://www.newegg.com/Product/Product.aspx?Item=N82E16816132017
>
> You do not have to use the RAID functionality of the card. FreeBSD
> supports this card using siis(4) and it does utilise CAM, so your disks
> would show up as adaX. The driver is actively supported/maintained.
>
> Avoid looking at cards which use the 3112, 3114, or 3512 chips.
>
> Hope this helps, or at least directs you in a path that lets you solve
> the problem through a little bit of money.
>
> --
> | Jeremy Chadwick                                   jdc at parodius.com |
> | Parodius Networking                       http://www.parodius.com/ |
> | UNIX Systems Administrator                   Mountain View, CA, US |
> | Making life hard for others since 1977.               PGP 4BD6C0CB |
>

--
George Kontostanos
aisecure.net <http://www.aisecure.net>
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
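
For readers wondering where the 133MByte/sec figure in the quoted message comes from: it is the theoretical peak of a conventional 32-bit PCI bus clocked at roughly 33.33 MHz. The short Python sketch below works through that arithmetic. The even three-way split is only an assumption for illustration (it supposes ad4, ad6 and ad10 on the Promise card are all busy at once and share the bus equally); real throughput will be lower because of bus overhead.

```python
# Theoretical peak of conventional PCI: 32-bit bus, ~33.33 MHz clock.
BUS_WIDTH_BITS = 32
CLOCK_HZ = 33_333_333

# Disks attached to the Promise card in this thread: ad4, ad6, ad10.
DISKS_ON_CARD = 3


def pci_peak_mb_per_s(width_bits=BUS_WIDTH_BITS, clock_hz=CLOCK_HZ):
    """Peak transfer rate in MB/s (1 MB = 10**6 bytes), ignoring overhead."""
    return (width_bits // 8) * clock_hz / 1e6


def per_disk_share(peak_mb_s, disks=DISKS_ON_CARD):
    """Even split of the bus among disks that are all busy (an assumption)."""
    return peak_mb_s / disks


peak = pci_peak_mb_per_s()
share = per_disk_share(peak)

if __name__ == "__main__":
    print(f"PCI peak:       {peak:.1f} MB/s")
    print(f"Per-disk share: {share:.1f} MB/s")
```

Any controller placed in the same PCI slot is bound by the same ceiling, which is why the quoted advice treats the limit as a given rather than something the replacement card would fix.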