Alexander Motin wrote:
> Warner Losh wrote:
>> I don't suppose that your driver could cause the hardware to interrupt after
>> a little time? That would be more resource friendly... Otherwise, 1ms is
>> long enough that a msleep or tsleep would likely work quite nicely.
>
> It's not his driver, it's mine. Actually, unlike AHCI, this hardware
> even has interrupt for ready transition (second, biggest of sleeps). But
> it is not used in present situation.
>
>> On Apr 11, 2011, at 1:43 PM, dieter...@engineer.com wrote:
>>>>> FreeBSD 8.2 amd64 uniprocessor
>>>>>
>>>>> kernel: siisch1: DISCONNECT requested
>>>>> kernel: siisch1: SIIS reset...
>>>>> kernel: siisch1: siis_sata_connect() calling DELAY(1000)
>>>>> last message repeated 59 times
>>>>> kernel: siisch1: SATA connect time=60ms status=00000123
>>>>> kernel: siisch1: SIIS reset done: devices=00000001
>>>>> kernel: siisch1: DISCONNECT requested
>>>>> kernel: siisch1: SIIS reset...
>>>>> kernel: siisch1: siis_sata_connect() calling DELAY(1000)
>>>>> last message repeated 58 times
>>>>> kernel: siisch1: SATA connect time=59ms status=00000123
>>>>> ...
>>>>> kernel: siisch0: siis_wait_ready() calling DELAY(1000)
>>>>> last message repeated 1300 times
>>>>> kernel: siisch0: port is not ready (timeout 10000ms) status = 001f2000
>>>>> Meanwhile, *everything* comes to a screeching halt. Device
>>>>> drivers are locked out, and thus incoming data is lost.
>>>>> Losing incoming data is unacceptable.
>>>>>
>>>>> Need an alternative to DELAY() that does not lock out
>>>>> other device drivers. There must be a way to reset one
>>>>> bit of hardware without locking down the entire machine.
>>> Hans Petter Selasky writes:
>>>> An alternative to DELAY() is the simplest solution. You probably need
>>>> to do some redesign in the SCSI layer to find a better solution.
>>> I keep coming back to the idea that a device driver for one
>>> controller should not have to lock out *all* the hardware.
>>> RS-232 locks out Ethernet. Disk drivers lock out Ethernet.
>>> And so on. Why? Is there some fundamental reason that this
>>> *has* to be? I thought the conversion from spl() to mutex()
>>> was supposed to fix this?
>>>
>>> I'm making progress on my project converting printf(9) calls
>>> to log(9), and fixing some bugs along the way. Eventually I'll
>>> have patches to submit. But this is really a workaround, not
>>> a fix to the underlying problem.
>>>
>>> Redesigning the SCSI layer sounds like a job for someone who took
>>> a lot more CS classes than I did. /dev/brain returns ENOCLUE. :-(
>
> CAM is not completely innocent in this situation indeed. CAM defines
> XPT_RESET_BUS request as synchronous. It is not queued, and called under
> the SIM mutex lock. I don't think lock can be safely dropped in the
> middle there.
>
> Now I think that I could try to move readiness waiting out of the
> siis_reset() to do it asynchronously. I'll think about it.
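To make the "do it asynchronously" idea concrete, here is a minimal userland sketch of a readiness wait driven by timer ticks instead of a blocking loop. It is not the siis(4) code: `reset_ctx`, `reset_start()` and `reset_tick()` are hypothetical names, and the assumption is that something like a callout(9) handler (or the ready-transition interrupt mentioned above) would call `reset_tick()` with the current hardware status.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical states for a reset whose readiness wait is split across ticks. */
enum reset_state { RESET_IDLE, RESET_WAITING, RESET_DONE, RESET_TIMED_OUT };

struct reset_ctx {
	enum reset_state state;
	uint32_t ticks_left;	/* remaining ticks before giving up */
};

/* Begin the wait and return immediately; nothing blocks here. */
static void
reset_start(struct reset_ctx *c, uint32_t timeout_ticks)
{
	assert(timeout_ticks > 0);
	c->state = RESET_WAITING;
	c->ticks_left = timeout_ticks;
}

/*
 * Called once per timer tick with the sampled readiness bit.
 * Returns true while another tick should be scheduled.
 */
static bool
reset_tick(struct reset_ctx *c, bool hw_ready)
{
	if (c->state != RESET_WAITING)
		return (false);
	if (hw_ready) {
		c->state = RESET_DONE;
		return (false);
	}
	if (c->ticks_left == 0) {
		c->state = RESET_TIMED_OUT;
		return (false);
	}
	c->ticks_left--;
	return (true);
}
```

Between ticks the CPU is free for other work, which is the property the DELAY() loop lacks. How the completion would be reported back to CAM is left out, since XPT_RESET_BUS is synchronous as described above.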
I've fixed this problem for ahci(4) in HEAD; there should be no sleeps
longer than 100ms now (typically 1-2ms).

With siis(4) the situation is different. By default there should be no
sleeps longer than 100ms (typically 1-2ms). A longer sleep means that
either the controller is not responding, or it can't establish a link to
the device it sees. I've reduced the waiting timeout from 10s to 1s. That
should improve the situation a bit, but I would look for the original
cause of the problem. Have you done something specific to trigger it?
Are your drive and cables OK?

-- 
Alexander Motin
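For illustration, the effect of the shorter timeout can be sketched with a bounded 1ms polling loop. This is a userland simulation, not the driver source: `fake_port`, `fake_wait_1ms()` and `wait_ready()` are hypothetical names. In a kernel the 1ms step would be DELAY(1000), which busy-waits for 1000 microseconds, or a sleeping primitive such as pause(9) in contexts where sleeping is allowed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simulated port: becomes ready at a fixed simulated time. */
struct fake_port {
	uint32_t ready_after_ms;	/* time at which the port reports ready */
	uint32_t now_ms;		/* simulated clock */
};

typedef void (*wait_1ms_fn)(struct fake_port *);

/* Stand-in for a 1 ms delay; here it only advances the simulated clock. */
static void
fake_wait_1ms(struct fake_port *p)
{
	p->now_ms++;
}

static bool
fake_port_ready(const struct fake_port *p)
{
	return (p->now_ms >= p->ready_after_ms);
}

/*
 * Poll in 1 ms steps until the port is ready or timeout_ms has elapsed.
 * Returns the number of milliseconds waited, or -1 on timeout.
 */
static int
wait_ready(struct fake_port *p, uint32_t timeout_ms, wait_1ms_fn wait_1ms)
{
	uint32_t waited = 0;

	assert(wait_1ms != NULL);
	while (!fake_port_ready(p)) {
		if (waited >= timeout_ms)
			return (-1);
		wait_1ms(p);
		waited++;
	}
	return ((int)waited);
}
```

With a 1000ms limit, a port that never becomes ready costs at most 1000 steps instead of 10000, while a healthy port that is ready after 1-2ms returns just as quickly as before.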