On Feb 10, 2006, at 9:03 AM, Wilko Bulte wrote:
On Thu, Feb 09, 2006 at 11:08:24PM +0100, Wilko Bulte wrote..
On Thu, Feb 09, 2006 at 09:24:23PM +0100, Sren Schmidt wrote..
Wilko Bulte wrote:
On Thu, Feb 09, 2006 at 03:45:53PM +0100, Sren Schmidt wrote..
Wilko Bulte wrote:
On Thu, Feb 09, 2006 at 03:37:07PM +0100, Sren Schmidt wrote..
Wilko Bulte wrote:
On Wed, Feb 08, 2006 at 10:44:05PM +0100, Sren Schmidt wrote..
Wilko Bulte wrote:
On Wed, Feb 08, 2006 at 10:02:08PM +0100, Sren Schmidt
wrote..
Wilko Bulte wrote:
Hi Soren,
I just went to 6.1-PRE on my main machine, coming from
6.0-STABLE
of roughly end of december.
And I hit some stuff that really worries me:
- the freshly built kernel keels over with (hand
transcribed):
ata3: reiniting channel SATA connect ...
SATA connected
sata_connect_devices 0x1 <ATA_MASTER>
ad6: req=0xC35ba0c8 SETFEATURES SETTRANSFERMODE
semaphore timeout
!! DANGER Will RObinson !!
(... is where I cannot read my own handwriting, it
scrolled quite
fast on
the screen..)
Boot device is a SATA RAID1 on a Promise 2300.
Hmm, that should not happen. Could you try to backstep
just ATA to
before the MFC, that is 24/1/06 and let me know if that
helps
please ?
First impression is that the problem is gone. None of the
previously reported errors are seen. I am running a level
0 dump
from disk to disk
to see if the box remains stable. Given that this is my
primary
machine
I sure hope it will be :-)
Another snag is that my ad10 disk on 6.0-STABLE suddenly
became
ad12 on
6.1-PRE
Hmm that is because there is only 2 ports on your promise
which is
now correctly identified, before it was errounsly found
as 3 ports.
Ah, OK. I would suggest a note to the Release Note
writers would be
a good
thing, devices changing location after an upgrade in the -
stable
branch
is unnerving ;-)
Well, the good thing is that I can reproduce the error
here, the bad
thing is that it slipped through testing on -current...
Oh, well, I'll look into it ASAP...
Thank you Soren!
OK, had a few this afternoon, could you try this patch and
let me know
if it helps, at least it makes the problem go away on my
testbed..
Is this relative to HEAD or RELENG_6? I cannot / will not go
to HEAD
with this machine (my main production box.. :-)
Doesn't matter, ATA is the same on both...
OK, I was not sure if they were 100% identical.
The patch at first impression seems to have eliminated the problem.
Good seems I'm on the right track at least.
Interestingly enough ad10 remained ad10 with the patch applied?
Yeah, thats intentional, I though we better not break POLA here..
I agree :-)
I'll put some load on to see what happens.
Let me know how that turns out, I'll clean things up a bit and
get it
committed to -current, then get permission to MFC when we are
sure it
fixes the problem...
I ran a 44GB disk-to-disk dump without incidents (source on the
RAID1,
target on the JBOD). No problems whatsoever.
Looks like things behave much better now. Tonight the machine will
run a daily full dump to DLT tape, I'll know how that turns out
tomorrow.
Backup ran without problems.
--
Wilko Bulte [EMAIL PROTECTED]
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-
[EMAIL PROTECTED]"
Soren,
I am still getting this error. I cvsup'd to RELENG_6 today (Feb. 15,
2006 at 6:00 PM CET) and I got the following files from HEAD:
ata-all.c (1.264)
ata-all.h (1.113)
ata-chipset.c (1.158)
ata-pci.c (1.115)
I have then rebuilt the kernel and world and I tested the system.
I have a Promise TX2300 with two 250 GB Maxtor in RAID1 with gmirror,
plus two 9 GB SCSI drives in RAID1 with gmirror where I put the OS
and applications and a 160 GB UATA Seagate (attached to the
motherboard UATA) where I backup sensitive data.
The system was able to rebuild the mirror (I always got the error
with RELENG_6_0), but then I made a backup tar from the two Maxtor's
mirror to the Seagate and I got the following error after having
copied approx 16 GB:
Feb 15 20:56:31 zoe kernel: GEOM_MIRROR: Device data1: provider ad4
disconnected.
Feb 15 20:56:31 zoe kernel: GEOM_MIRROR: Device data1: provider ad4
detected.
Feb 15 20:56:31 zoe kernel: GEOM_MIRROR: Device data1: rebuilding
provider ad4.
...
Feb 15 22:33:15 zoe kernel: GEOM_MIRROR: Device data1: rebuilding
provider ad4 finished.
Feb 15 22:33:15 zoe kernel: GEOM_MIRROR: Device data1: provider ad4
activated.
...
Feb 15 23:04:12 zoe kernel: ad4: req=0xc2422578 SETFEATURES SET
TRANSFER MODE semaphore timeout !! DANGER Will Robinson !!
Feb 15 23:04:28 zoe last message repeated 4 times
Feb 15 23:04:32 zoe kernel: ad4: req=0xc2422578 SETFEATURES SET
TRANSFER MODE semaphore timeout !! DANGER Will Robinson !!
Feb 15 23:05:04 zoe last message repeated 8 times
Feb 15 23:07:08 zoe last message repeated 31 times
Feb 15 23:08:56 zoe last message repeated 27 times
Feb 15 23:09:00 zoe kernel: ad4: req=0xc2422578 SETFEATURES SET
TRANSFER MODE semaphore timeout !! DANGER Will Robinson !!
Feb 15 23:09:32 zoe last message repeated 8 times
Feb 15 23:11:36 zoe last message repeated 31 times
Feb 15 23:21:40 zoe last message repeated 151 times
Feb 15 23:31:40 zoe last message repeated 150 times
Feb 15 23:34:28 zoe last message repeated 42 times
Feb 15 23:34:32 zoe kernel: ad4: req=0xc2422578 SETFEATURES SET
TRANSFER MODE semaphore timeout !! DANGER Will Robinson !!
Feb 15 23:35:04 zoe last message repeated 8 times
Feb 15 23:37:08 zoe last message repeated 31 times
Feb 15 23:40:52 zoe last message repeated 56 times
Feb 15 23:40:53 zoe shutdown: power-down by paolo:
Feb 15 23:40:56 zoe syslogd: exiting on signal 15
I was not able to shutdown the system cleanly.
I also noticed a small problem in the code (probably not related to
this error):
At line 1011 of src/sys/dev/ata/ata-all.c a free() is used but
ata_delayed_attach is not reset to NULL
Actual (1.264):
if (config_intrhook_establish(ata_delayed_attach) != 0) {
printf("ata: config_intrhook_establish failed\n");
free(ata_delayed_attach, M_TEMP);
}
Fix:
if (config_intrhook_establish(ata_delayed_attach) != 0) {
printf("ata: config_intrhook_establish failed\n");
free(ata_delayed_attach, M_TEMP);
+ ata_delayed_attach = NULL;
}
This is my dmesg:
Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 6.1-PRERELEASE #0: Wed Feb 15 20:27:17 CET 2006
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/ZOE_022
ACPI APIC Table: <Intel N440BX >
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel Pentium III (696.41-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x683 Stepping = 3
Features=0x387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,
MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE>
real memory = 536805376 (511 MB)
avail memory = 516100096 (492 MB)
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
cpu0 (BSP): APIC ID: 1
cpu1 (AP): APIC ID: 0
ioapic0 <Version 1.1> irqs 0-23 on motherboard
npx0: [FAST]
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <PTLTD RSDT> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0xc08-0xc0b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
agp0: <Intel 82443GX host to PCI bridge> mem 0xf8000000-0xfbffffff at
device 0.0 on pci0
pcib1: <PCI-PCI bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
pcib2: <PCI-PCI bridge> at device 15.0 on pci1
pci2: <PCI bus> on pcib2
atapci0: <Promise PDC20771 SATA300 controller> port 0x3400-0x347f,
0x3000-0x30ff mem 0xf4220000-0xf4220fff,0xf4200000-0xf421ffff irq 23
at device 7.0 on pci2
ata2: <ATA channel 0> on atapci0
ata3: <ATA channel 1> on atapci0
ata4: <ATA channel 2> on atapci0
ahc0: <Adaptec aic7896/97 Ultra2 SCSI adapter> port 0x2400-0x24ff mem
0xf4101000-0xf4101fff irq 19 at device 12.0 on pci0
ahc0: [GIANT-LOCKED]
aic7896/97: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs
ahc1: <Adaptec aic7896/97 Ultra2 SCSI adapter> port 0x2800-0x28ff mem
0xf4102000-0xf4102fff irq 19 at device 12.1 on pci0
ahc1: [GIANT-LOCKED]
aic7896/97: Ultra2 Wide Channel B, SCSI Id=7, 32/253 SCBs
fxp0: <Intel 82559 Pro/100 Ethernet> port 0x2000-0x203f mem
0xf4100000-0xf4100fff,0xf4000000-0xf40fffff irq 21 at device 14.0 on
pci0
miibus0: <MII bus> on fxp0
inphy0: <i82555 10/100 media interface> on miibus0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp0: Ethernet address: 00:d0:b7:88:6c:c5
isab0: <PCI-ISA bridge> at device 18.0 on pci0
isa0: <ISA bus> on isab0
atapci1: <Intel PIIX4 UDMA33 controller> port
0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x2040-0x204f at device 18.1 on pci0
ata0: <ATA channel 0> on atapci1
ata1: <ATA channel 1> on atapci1
uhci0: <Intel 82371AB/EB (PIIX4) USB controller> port 0x2060-0x207f
irq 21 at device 18.2 on pci0
uhci0: [GIANT-LOCKED]
usb0: <Intel 82371AB/EB (PIIX4) USB controller> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
intpm0: <Intel 82371AB Power management controller> port
0x1040-0x104f irq 9 at device 18.3 on pci0
intpm0: I/O mapped 1040
intpm0: intr IRQ 9 enabled revision 0
intpm0: [GIANT-LOCKED]
intsmb0: <Intel PIIX4 SMBUS Interface> on intpm0
smbus1: <System Management Bus> on intsmb0
smb0: <SMBus generic I/O> on smbus1
intpm0: PM I/O mapped c00
pci0: <display, VGA> at device 20.0 (no driver attached)
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
fdc0: <floppy drive controller> port 0x3f4-0x3f7 irq 6 drq 2 on acpi0
fdc0: No FDOUT register!
device_attach: fdc0 attach returned 6
ppc0: <ECP parallel printer port> port 0x378-0x37f,0x778-0x77f irq 7
drq 3 on acpi0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
ppbus0: <Parallel port bus> on ppc0
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10
on acpi0
sio0: type 16550A
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
fdc0: <floppy drive controller> port 0x3f4-0x3f7 irq 6 drq 2 on acpi0
fdc0: No FDOUT register!
device_attach: fdc0 attach returned 6
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xcdfff,
0xce000-0xcf7ff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on
isa0
ulpt0: EPSON USB Printer, rev 1.10/1.00, addr 2, iclass 7/1
ulpt0: using bi-directional mode
Timecounters tick every 1.000 msec
ad0: 152627MB <WDC WD1600BB-00HTA0 15.05R15> at ata0-master UDMA33
acd0: CDRW <HL-DT-ST GCE-8525B/1.02> at ata1-master UDMA33
ad4: 239372MB <Maxtor 7L250S0 BANC1E00> at ata2-master SATA150
GEOM_MIRROR: Device data1 created (id=618669616).
GEOM_MIRROR: Device data1: provider ad4 detected.
ad6: 239372MB <Maxtor 7L250S0 BANC1E00> at ata3-master SATA150
Waiting 5 seconds for SCSI devices to settle
GEOM_MIRROR: Device data1: provider ad6 detected.
GEOM_MIRROR: Device data1: provider ad6 activated.
GEOM_MIRROR: Device data1: provider ad4 is stale.
GEOM_MIRROR: Device data1: provider mirror/data1 launched.
SMP: AP CPU #1 Launched!
da0 at ahc0 bus 0 target 0 lun 0
da0: <IBM DNES-309170W SAH0> Fixed Direct Access SCSI-3 device
da0: 80.000MB/s transfers (40.000MHz, offset 30, 16bit), Tagged
Queueing Enabled
da0: 8748MB (17916240 512 byte sectors: 255H 63S/T 1115C)
da1 at ahc0 bus 0 target 1 lun 0
da1: <IBM DNES-309170W SAH0> Fixed Direct Access SCSI-3 device
da1: 80.000MB/s transfers (40.000MHz, offset 30, 16bit), Tagged
Queueing Enabled
da1: 8748MB (17916240 512 byte sectors: 255H 63S/T 1115C)
GEOM_MIRROR: Device sgm0 created (id=1390498924).
GEOM_MIRROR: Device sgm0: provider da0 detected.
GEOM_MIRROR: Device sgm0: provider da1 detected.
GEOM_MIRROR: Device sgm0: provider da1 activated.
GEOM_MIRROR: Device sgm0: provider da0 activated.
GEOM_MIRROR: Device sgm0: provider mirror/sgm0 launched.
Trying to mount root from ufs:/dev/mirror/sgm0s1a
Regards
Paolo
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"