>Number:         157397
>Category:       kern
>Synopsis:       ahci/ada/cam NCQ timeouts on Samsung and non-disable-ability
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun May 29 13:50:09 UTC 2011
>Closed-Date:
>Last-Modified:
>Originator:     Matthias Andree
>Release:        FreeBSD 8.2-STABLE amd64
>Organization:
>Environment:
System: FreeBSD apollo.emma.line.org 8.2-STABLE FreeBSD 8.2-STABLE #67: Fri May 27 21:36:00 CEST 2011 t...@apollo.emma.line.org:/usr/obj/usr/src/sys/GENERIC amd64
>Description:
(please disregard "class" field above)

Greetings,

I seem to have difficulties with the interaction between ahci(4), ada(4)
and the CAM subsystem on RELENG_8. Perhaps my HDD is quirky (Samsung
HD103SI rev. 1AG01118), perhaps ahci or ada is. (dmesg below)

I seem to have two problems:

1. Once in a while, queued ATA commands time out and freeze the computer
   for roughly half a minute to two minutes. It then recovers. This only
   ever affects the Samsung drive; I have never seen it on the Western
   Digital. Neither drive reports any suspicious S.M.A.R.T. data: no
   pending blocks, no past reallocations, no parameters even close to
   Thresh, no self-test failures. Possibly the Samsung also has firmware
   quirks, silent limits and so on. I know there were firmware updates to
   avoid data loss for larger siblings in the same drive series, but
   apparently not for the HD103SI.

2. Trying to reduce the number of tagged openings with
   "camcontrol tags ada1 -N 16" doesn't work because min == max; trying
   to disable tagged queueing through
   "camcontrol negotiate ada1 -T disable" does not have any visible
   effect.

This is my /boot/loader.conf:

ahci_load="YES"
atapicam_load="YES"
cuse4bsd_load="YES"
geom_journal_load="YES"
sem_load="YES"
snd_hda_load="YES"
vboxdrv_load="YES"
hw.ata.wc=0
kern.cam.ada.write_cache=0
kern.ipc.shmall=32768
kern.ipc.shmmax=67108864
kern.maxfiles="25000"
vfs.root.mountfrom="ufs:/dev/ada0s4a"

This is the dmesg output with sensitive and irrelevant information
omitted. At the end, it shows two freeze episodes, each with a handful
of timeouts.

...
FreeBSD 8.2-STABLE #67: Fri May 27 21:36:00 CEST 2011
    t...@apollo.emma.line.org:/usr/obj/usr/src/sys/GENERIC amd64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Phenom(tm) II X4 905e Processor (2508.56-MHz K8-class CPU)
  Origin = "AuthenticAMD" Id = 0x100f42 Family = 10 Model = 4 Stepping = 2
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x802009<SSE3,MON,CX16,POPCNT>
  AMD Features=0xee500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM,3DNow!+,3DNow!>
  AMD Features2=0x37ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS,SKINIT,WDT>
  TSC: P-state invariant
real memory = 4294967296 (4096 MB)
avail memory = 3838828544 (3660 MB)
ACPI APIC Table: <052010 APIC1732>
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
 cpu2 (AP): APIC ID: 2
 cpu3 (AP): APIC ID: 3
ACPI Warning: Optional field Pm2ControlBlock has zero address or length: 0x0000000000000000/0x1 (20101013/tbfadt-655)
ioapic0 <Version 2.1> irqs 0-23 on motherboard
Cuse4BSD v0.1.14 @ /dev/cuse
kbd1 at kbdmux0
acpi0: <052010 XSDT1732> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of fee00000, 1000 (3) failed
acpi0: reservation of ffb80000, 80000 (3) failed
acpi0: reservation of fec10000, 20 (3) failed
acpi0: reservation of 0, a0000 (3) failed
acpi0: reservation of 100000, cfe00000 (3) failed
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
vgapci0: <VGA-compatible display> port 0xc000-0xc0ff mem 0xd0000000-0xdfffffff,0xfbde0000-0xfbdeffff,0xfbc00000-0xfbcfffff irq 18 at device 5.0 on pci1
hdac0: <ATI RS780 High Definition Audio Controller> mem 0xfbdfc000-0xfbdfffff irq 19 at device 5.1 on pci1
hdac0: HDA Driver Revision: 20100226_0142
hdac0: [ITHREAD]
pcib2: <ACPI PCI-PCI bridge> irq 18 at device 6.0 on pci0
pci2: <ACPI PCI bus> on pcib2
ale0: <Atheros AR8121/AR8113/AR8114 PCIe Ethernet> port 0xdc00-0xdc7f mem 0xfbec0000-0xfbefffff irq 18 at device 0.0 on pci2
ale0: 960 Tx FIFO, 1024 Rx FIFO
ale0: Using 1 MSI messages.
ale0: 4GB boundary crossed, switching to 32bit DMA addressing mode.
miibus0: <MII bus> on ale0
atphy0: <Atheros F1 10/100/1000 PHY> PHY 0 on miibus0
atphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, 1000baseT-FDX-master, auto
ale0: Ethernet address: ...
ale0: [FILTER]
pcib3: <ACPI PCI-PCI bridge> irq 19 at device 7.0 on pci0
pci3: <ACPI PCI bus> on pcib3
fwohci0: <1394 Open Host Controller Interface> port 0xe800-0xe8ff mem 0xfbfff800-0xfbffffff irq 19 at device 0.0 on pci3
fwohci0: [ITHREAD]
fwohci0: OHCI version 1.10 (ROM=1)
fwohci0: No. of Isochronous channels is 4.
fwohci0: EUI64 ...
fwohci0: Phy 1394a available S400, 2 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0: <IEEE1394(FireWire) bus> on fwohci0
fwe0: <Ethernet over FireWire> on firewire0
if_fwe0: Fake Ethernet address: ...
fwe0: Ethernet address: ...
fwip0: <IP over FireWire> on firewire0
fwip0: Firewire address: ...
dcons_crom0: <dcons configuration ROM> on firewire0
dcons_crom0: bus_addr 0x1028000
fwohci0: Initiate bus reset
fwohci0: fwohci_intr_core: BUS reset
fwohci0: fwohci_intr_core: node_id=0x00000000, SelfID Count=1, CYCLEMASTER mode
ahci0: <ATI IXP700 AHCI SATA controller> port 0xb000-0xb007,0xa000-0xa003,0x9000-0x9007,0x8000-0x8003,0x7000-0x700f mem 0xfbbffc00-0xfbbfffff irq 22 at device 17.0 on pci0
ahci0: [ITHREAD]
ahci0: AHCI v1.10 with 6 3Gbps ports, Port Multiplier supported
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich0: [ITHREAD]
ahcich1: <AHCI channel> at channel 1 on ahci0
ahcich1: [ITHREAD]
ahcich2: <AHCI channel> at channel 2 on ahci0
ahcich2: [ITHREAD]
ahcich3: <AHCI channel> at channel 3 on ahci0
ahcich3: [ITHREAD]
ahcich4: <AHCI channel> at channel 4 on ahci0
ahcich4: [ITHREAD]
ahcich5: <AHCI channel> at channel 5 on ahci0
ahcich5: [ITHREAD]
ohci0: <OHCI (generic) USB controller> mem 0xfbbfd000-0xfbbfdfff irq 16 at device 18.0 on pci0
ohci0: [ITHREAD]
usbus0: <OHCI (generic) USB controller> on ohci0
ohci1: <OHCI (generic) USB controller> mem 0xfbbfe000-0xfbbfefff irq 16 at device 18.1 on pci0
ohci1: [ITHREAD]
usbus1: <OHCI (generic) USB controller> on ohci1
ehci0: <EHCI (generic) USB 2.0 controller> mem 0xfbbff800-0xfbbff8ff irq 17 at device 18.2 on pci0
ehci0: [ITHREAD]
usbus2: EHCI version 1.0
usbus2: <EHCI (generic) USB 2.0 controller> on ehci0
ohci2: <OHCI (generic) USB controller> mem 0xfbbfb000-0xfbbfbfff irq 18 at device 19.0 on pci0
ohci2: [ITHREAD]
usbus3: <OHCI (generic) USB controller> on ohci2
ohci3: <OHCI (generic) USB controller> mem 0xfbbfc000-0xfbbfcfff irq 18 at device 19.1 on pci0
ohci3: [ITHREAD]
usbus4: <OHCI (generic) USB controller> on ohci3
ehci1: <EHCI (generic) USB 2.0 controller> mem 0xfbbff400-0xfbbff4ff irq 19 at device 19.2 on pci0
ehci1: [ITHREAD]
usbus5: EHCI version 1.0
usbus5: <EHCI (generic) USB 2.0 controller> on ehci1
pci0: <serial bus, SMBus> at device 20.0 (no driver attached)
atapci0: <ATI IXP700/800 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xff00-0xff0f at device 20.1 on pci0
ata0: <ATA channel 0> on atapci0
ata0: [ITHREAD]
hdac1: <ATI SB600 High Definition Audio Controller> mem 0xfbbf4000-0xfbbf7fff irq 16 at device 20.2 on pci0
hdac1: HDA Driver Revision: 20100226_0142
hdac1: [ITHREAD]
isab0: <PCI-ISA bridge> at device 20.3 on pci0
isa0: <ISA bus> on isab0
pcib4: <ACPI PCI-PCI bridge> at device 20.4 on pci0
pci4: <ACPI PCI bus> on pcib4
ohci4: <OHCI (generic) USB controller> mem 0xfbbfa000-0xfbbfafff irq 18 at device 20.5 on pci0
ohci4: [ITHREAD]
usbus6: <OHCI (generic) USB controller> on ohci4
acpi_button0: <Power Button> on acpi0
atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
acpi_hpet1: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
device_attach: acpi_hpet1 attach returned 12
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: [FILTER]
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
ppc0: cannot reserve I/O port range
acpi_throttle0: <ACPI CPU Throttling> on cpu0
hwpstate0: <Cool`n'Quiet 2.0> on cpu0
Timecounters tick every 1.000 msec
vboxdrv: fAsync=0 offMin=0x15a offMax=0x29f
firewire0: 1 nodes, maxhop <= 0 cable IRM irm(0) (me)
firewire0: bus manager 0
hdac0: HDA Codec #0: ATI RS690/780 HDMI
pcm0: <HDA ATI RS690/780 HDMI PCM #0 HDMI> at cad 0 nid 1 on hdac0
hdac1: HDA Codec #0: VIA VT1708S_0
pcm1: <HDA VIA VT1708S_0 PCM #0 Analog> at cad 0 nid 1 on hdac1
pcm2: <HDA VIA VT1708S_0 PCM #1 Digital> at cad 0 nid 1 on hdac1
pcm3: <HDA VIA VT1708S_0 PCM #2 Digital> at cad 0 nid 1 on hdac1
usbus0: 12Mbps Full Speed USB v1.0
usbus1: 12Mbps Full Speed USB v1.0
usbus2: 480Mbps High Speed USB v2.0
usbus3: 12Mbps Full Speed USB v1.0
usbus4: 12Mbps Full Speed USB v1.0
usbus5: 480Mbps High Speed USB v2.0
usbus6: 12Mbps Full Speed USB v1.0
ugen0.1: <ATI> at usbus0
uhub0: <ATI OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
ugen1.1: <ATI> at usbus1
uhub1: <ATI OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1
ugen2.1: <ATI> at usbus2
uhub2: <ATI EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus2
ugen3.1: <ATI> at usbus3
uhub3: <ATI OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus3
ugen4.1: <ATI> at usbus4
uhub4: <ATI OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus4
ugen5.1: <ATI> at usbus5
uhub5: <ATI EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus5
ugen6.1: <ATI> at usbus6
uhub6: <ATI OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus6
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <WDC WD5002ABYS-02B1B0 ...> ATA-8 SATA 2.x device
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 476940MB (976773168 512 byte sectors: 16H 63S/T 16383C)
ada1 at ahcich2 bus 0 scbus2 target 0 lun 0
ada1: <SAMSUNG HD103SI 1AG01118> ATA-7 SATA 2.x device
ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
cd0 at ahcich3 bus 0 scbus3 target 0 lun 0
cd0: <...> Removable CD-ROM SCSI-0 device
cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes)
cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed
SMP: AP CPU #2 Launched!
SMP: AP CPU #3 Launched!
SMP: AP CPU #1 Launched!
GEOM: ada0: partition 2 does not start on a track boundary.
GEOM: ada0: partition 2 does not end on a track boundary.
GEOM: ada0: partition 1 does not start on a track boundary.
GEOM: ada0: partition 1 does not end on a track boundary.
GEOM: ada1: partition 2 does not start on a track boundary.
GEOM: ada1: partition 2 does not end on a track boundary.
uhub6: 2 ports with 2 removable, self powered
uhub0: 3 ports with 3 removable, self powered
uhub1: 3 ports with 3 removable, self powered
uhub3: 3 ports with 3 removable, self powered
uhub4: 3 ports with 3 removable, self powered
GEOM_JOURNAL: Journal 3658629072: ada1s1d contains data.
GEOM_JOURNAL: Journal 3658629072: ada1s1e contains journal.
GEOM_JOURNAL: Journal ada1s1d clean.
Root mount waiting for: GJOURNAL usbus5 usbus2
GEOM: ada1s1d.journal: invalid disklabel.
GEOM: ufsid/4d540ab0e4fef3cb: invalid disklabel.
GEOM: ufs/usrgjournal: invalid disklabel.
Root mount waiting for: usbus5 usbus2
Root mount waiting for: usbus5 usbus2
uhub2: 6 ports with 6 removable, self powered
uhub5: 6 ports with 6 removable, self powered
ugen5.2: <vendor 0x05e3> at usbus5
umass0: <vendor 0x05e3 USB Storage, class 0/0, rev 2.00/97.32, addr 2> on usbus5
umass0: SCSI over Bulk-Only; quirks = 0x0000
Root mount waiting for: usbus5 usbus2
ugen0.2: <Microsoft> at usbus0
ukbd0: <Microsoft Natural Ergonomic Keyboard 4000, class 0/0, rev 2.00/1.73, addr 2> on usbus0
ugen2.2: <vendor 0x05e3> at usbus2
uhub7: <vendor 0x05e3 USB2.0 Hub, class 9/0, rev 2.00/7.02, addr 2> on usbus2
kbd2 at ukbd0
uhid0: <Microsoft Natural Ergonomic Keyboard 4000, class 0/0, rev 2.00/1.73, addr 2> on usbus0
umass0:6:0:-1: Attached to scbus6
uhub7: 4 ports with 4 removable, self powered
(probe0:umass-sim0:0:0:0): TEST UNIT READY. CDB: 0 0 0 0 0 0
(probe0:umass-sim0:0:0:0): CAM status: SCSI Status Error
(probe0:umass-sim0:0:0:0): SCSI status: Check Condition
(probe0:umass-sim0:0:0:0): SCSI sense: NOT READY asc:3a,0 (Medium not present)
da0 at umass-sim0 bus 0 scbus6 target 0 lun 0
da0: <Generic STORAGE DEVICE 9732> Removable Direct Access SCSI-0 device
da0: 40.000MB/s transfers
da0: Attempt to query device size failed: NOT READY, Medium not present
ugen2.3: <EPSON> at usbus2
ulpt0: <EPSON USB Printer, class 0/0, rev 1.10/1.00, addr 3> on usbus2
ulpt0: using bi-directional mode
Root mount waiting for: usbus2
(probe0:umass-sim0:0:0:1): TEST UNIT READY. CDB: 0 20 0 0 0 0
(probe0:umass-sim0:0:0:1): CAM status: SCSI Status Error
(probe0:umass-sim0:0:0:1): SCSI status: Check Condition
(probe0:umass-sim0:0:0:1): SCSI sense: NOT READY asc:3a,0 (Medium not present)
da1 at umass-sim0 bus 0 scbus6 target 0 lun 1
da1: <Generic STORAGE DEVICE 9732> Removable Direct Access SCSI-0 device
da1: 40.000MB/s transfers
da1: Attempt to query device size failed: NOT READY, Medium not present
ugen2.4: ...
ugen2.5: ...
ulpt1: ...
ulpt1: using bi-directional mode
Trying to mount root from ufs:/dev/ada0s4a
ugen0.3: ...
ums0: ...
ums0: 8 buttons and [XYZT] coordinates ID=0
cryptosoft0: <software crypto> on motherboard
GEOM_ELI: Device ada0s4b.eli created.
GEOM_ELI: Encryption: AES-XTS 256
GEOM_ELI: Crypto: software
vboxnet0: Ethernet address: ...
ale0: link state changed to UP
fuse4bsd: version 0.3.9-pre1, FUSE ABI 7.8
drm0: <ATI Radeon 3300 Graphics> on vgapci0
info: [drm] MSI enabled 1 message(s)
info: [drm] Initialized radeon 1.31.0 20080613
info: [drm] Setting GART location based on new memory map
info: [drm] Loading RS780/RS880 Microcode
info: [drm] Resetting GPU
info: [drm] writeback test succeeded in 1 usecs
drm0: [ITHREAD]
info: [drm] Resetting GPU

First failure episode:

ahcich2: Timeout on slot 28
ahcich2: is 00000000 cs 8fffffff ss ffffffff rs ffffffff tfd c0 serr 00000000
ahcich2: Timeout on slot 26
ahcich2: is 00000000 cs e3ffffff ss ffffffff rs ffffffff tfd c0 serr 00000000
ahcich2: Timeout on slot 24
ahcich2: is 00000000 cs f8ffffff ss ffffffff rs ffffffff tfd c0 serr 00000000

Computer has recovered. Several minutes later a second episode:

ahcich2: Timeout on slot 24
ahcich2: is 00000000 cs f80000ff ss ff0000ff rs ff0000ff tfd c0 serr 00000000
ahcich2: Timeout on slot 23
ahcich2: is 00000000 cs fc00003f ss ff80003f rs ff80003f tfd c0 serr 00000000
ahcich2: Timeout on slot 19
ahcich2: is 00000000 cs ffc00007 ss fff80007 rs fff80007 tfd c0 serr 00000000
ahcich2: Timeout on slot 18
ahcich2: is 00000000 cs ffe00007 ss fffc0007 rs fffc0007 tfd c0 serr 00000000
ahcich2: Timeout on slot 18
ahcich2: is 00000000 cs ffe00007 ss fffc0007 rs fffc0007 tfd c0 serr 00000000
ahcich2: Timeout on slot 18
ahcich2: is 00000000 cs ffe00007 ss fffc0007 rs fffc0007 tfd c0 serr 00000000
ahcich2: Timeout on slot 18
ahcich2: is 00000000 cs ffe0000f ss fffc000f rs fffc000f tfd c0 serr 00000000
ahcich2: Timeout on slot 25
ahcich2: is 00000000 cs f1ffffff ss ffffffff rs ffffffff tfd c0 serr 00000000

>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
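
For anyone wanting to read the register dumps in the timeout messages above, here is a minimal Python sketch that splits the hex masks into slot numbers. It is not part of the original report. It assumes, without confirmation from the driver source, that "cs" corresponds to the port's command issue register (PxCI), "ss" to SATA active (PxSACT), "rs" to the driver's own running-slot mask, and that "tfd" carries the ATA status byte (bit 7 BSY, bit 6 DRDY, bit 3 DRQ, bit 0 ERR). The function and key names (decode, slots, in_ss_not_cs) are made up for illustration.

```python
import re

# Matches the register dump line as it appears in the log above.
DUMP_RE = re.compile(
    r"is (?P<is>[0-9a-f]{8}) cs (?P<cs>[0-9a-f]{8}) ss (?P<ss>[0-9a-f]{8}) "
    r"rs (?P<rs>[0-9a-f]{8}) tfd (?P<tfd>[0-9a-f]{2}) serr (?P<serr>[0-9a-f]{8})"
)

# ATA status register bits (assumed to be what "tfd" shows).
ATA_STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x08: "DRQ", 0x01: "ERR"}


def slots(mask):
    """Return the slot numbers (0..31) whose bit is set in a 32-bit mask."""
    return [bit for bit in range(32) if mask & (1 << bit)]


def decode(line):
    """Decode one register dump line; return None if the line is not a dump."""
    match = DUMP_RE.search(line)
    if match is None:
        return None
    regs = {name: int(value, 16) for name, value in match.groupdict().items()}
    return {
        "cs": slots(regs["cs"]),
        "ss": slots(regs["ss"]),
        "rs": slots(regs["rs"]),
        # Slots flagged in ss but no longer in cs. Under the assumption
        # above, these were handed to the drive but have not completed.
        "in_ss_not_cs": slots(regs["ss"] & ~regs["cs"] & 0xFFFFFFFF),
        "status": [name for bit, name in ATA_STATUS_BITS.items()
                   if regs["tfd"] & bit],
        "serr": regs["serr"],
    }


if __name__ == "__main__":
    sample = ("ahcich2: is 00000000 cs 8fffffff ss ffffffff rs ffffffff "
              "tfd c0 serr 00000000")
    result = decode(sample)
    print(result["in_ss_not_cs"], result["status"])
    # -> [28, 29, 30] ['BSY', 'DRDY']
```

Run against the dumps in this report, in_ss_not_cs always comes out as three consecutive slots, and the slot named in the preceding "Timeout on slot" line is the lowest of the three. Whether that pattern says anything about the drive or the driver is not something this sketch can answer.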