On Thu, Feb 09, 2006 at 09:24:23PM +0100, Søren Schmidt wrote..
> Wilko Bulte wrote:
> >On Thu, Feb 09, 2006 at 03:45:53PM +0100, Søren Schmidt wrote..
> >>Wilko Bulte wrote:
> >>>On Thu, Feb 09, 2006 at 03:37:07PM +0100, Søren Schmidt wrote..
> >>>>Wilko Bulte wrote:
> >>>>>On Wed, Feb 08, 2006 at 10:44:05PM +0100, Søren Schmidt wrote..
> >>>>>>Wilko Bulte wrote:
> >>>>>>>On Wed, Feb 08, 2006 at 10:02:08PM +0100, Søren Schmidt wrote..
> >>>>>>>>Wilko Bulte wrote:
> >>>>>>>>>Hi Soren,
> >>>>>>>>>
> >>>>>>>>>I just went to 6.1-PRE on my main machine, coming from 6.0-STABLE
> >>>>>>>>>of roughly end of December.
> >>>>>>>>>
> >>>>>>>>>And I hit some stuff that really worries me:
> >>>>>>>>>
> >>>>>>>>>- the freshly built kernel keels over with (hand transcribed):
> >>>>>>>>>
> >>>>>>>>>ata3: reiniting channel SATA connect ...
> >>>>>>>>>SATA connected
> >>>>>>>>>sata_connect_devices 0x1 <ATA_MASTER>
> >>>>>>>>>
> >>>>>>>>>ad6: req=0xC35ba0c8 SETFEATURES SETTRANSFERMODE semaphore timeout
> >>>>>>>>>!! DANGER Will RObinson !!
> >>>>>>>>>
> >>>>>>>>>(... is where I cannot read my own handwriting, it scrolled quite
> >>>>>>>>>fast on the screen..)
> >>>>>>>>>
> >>>>>>>>>Boot device is a SATA RAID1 on a Promise 2300.
> >>>>>>>>Hmm, that should not happen. Could you try to backstep just ATA to
> >>>>>>>>before the MFC, that is 24/1/06, and let me know if that helps,
> >>>>>>>>please?
> >>>>>>>First impression is that the problem is gone. None of the
> >>>>>>>previously reported errors are seen. I am running a level 0 dump
> >>>>>>>from disk to disk to see if the box remains stable. Given that
> >>>>>>>this is my primary machine I sure hope it will be :-)
> >>>>>>>
> >>>>>>>>>Another snag is that my ad10 disk on 6.0-STABLE suddenly became
> >>>>>>>>>ad12 on 6.1-PRE
> >>>>>>>>Hmm, that is because there are only 2 ports on your Promise, which
> >>>>>>>>is now correctly identified; before it was erroneously found as
> >>>>>>>>3 ports.
> >>>>>>>Ah, OK.
> >>>>>>>I would suggest a note to the Release Note writers would be a good
> >>>>>>>thing, devices changing location after an upgrade in the -stable
> >>>>>>>branch is unnerving ;-)
> >>>>>>Well, the good thing is that I can reproduce the error here, the bad
> >>>>>>thing is that it slipped through testing on -current...
> >>>>>>Oh, well, I'll look into it ASAP...
> >>>>>Thank you Soren!
> >>>>OK, had a few this afternoon, could you try this patch and let me know
> >>>>if it helps, at least it makes the problem go away on my testbed..
> >>>Is this relative to HEAD or RELENG_6? I cannot / will not go to HEAD
> >>>with this machine (my main production box.. :-)
> >>Doesn't matter, ATA is the same on both...
> >
> >OK, I was not sure if they were 100% identical.
> >
> >The patch at first impression seems to have eliminated the problem.
>
> Good, seems I'm on the right track at least.
>
> >Interestingly enough ad10 remained ad10 with the patch applied?
>
> Yeah, that's intentional, I thought we better not break POLA here..
I agree :-)

> >I'll put some load on to see what happens.
>
> Let me know how that turns out, I'll clean things up a bit and get it
> committed to -current, then get permission to MFC when we are sure it
> fixes the problem...

I ran a 44GB disk-to-disk dump without incidents (source on the RAID1,
target on the JBOD). No problems whatsoever. Looks like things behave
much better now.

Tonight the machine will run a daily full dump to DLT tape, I'll know
how that turns out tomorrow.

thanks,
Wilko

--
Wilko Bulte [EMAIL PROTECTED]
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"