Re: Need an alternative to DELAY()

Alexander Motin Tue, 12 Apr 2011 23:49:31 -0700

Alexander Motin wrote:
> Warner Losh wrote:
>> I don't suppose that your driver could cause the hardware to interrupt after 
>> a little time?  That would be more resource friendly...  Otherwise, 1ms is 
>> long enough that a msleep or tsleep would likely work quite nicely.
> 
> It's not his driver, it's mine. Actually, unlike AHCI, this hardware
> even has interrupt for ready transition (second, biggest of sleeps). But
> it is not used in present situation.
> 
>> On Apr 11, 2011, at 1:43 PM, [email protected] wrote:
>>>>> FreeBSD 8.2  amd64  uniprocessor
>>>>>
>>>>> kernel: siisch1: DISCONNECT requested
>>>>> kernel: siisch1: SIIS reset...
>>>>> kernel: siisch1: siis_sata_connect() calling DELAY(1000)
>>>>> last message repeated 59 times
>>>>> kernel: siisch1: SATA connect time=60ms status=00000123
>>>>> kernel: siisch1: SIIS reset done: devices=00000001
>>>>> kernel: siisch1: DISCONNECT requested
>>>>> kernel: siisch1: SIIS reset...
>>>>> kernel: siisch1: siis_sata_connect() calling DELAY(1000)
>>>>> last message repeated 58 times
>>>>> kernel: siisch1: SATA connect time=59ms status=00000123
>>>>> ...
>>>>> kernel: siisch0: siis_wait_ready() calling DELAY(1000)
>>>>> last message repeated 1300 times
>>>>> kernel: siisch0: port is not ready (timeout 10000ms) status = 001f2000
>>>>> Meanwhile, *everything* comes to a screeching halt.  Device
>>>>> drivers are locked out, and thus incoming data is lost.
>>>>> Losing incoming data is unacceptable.
>>>>>
>>>>> Need an alternative to DELAY() that does not lock out
>>>>> other device drivers.  There must be a way to reset one
>>>>> bit of hardware without locking down the entire machine.
>>> Hans Petter Selasky writes:
>>>> An alternative to DELAY() is the simplest solution. You probably need
>>>> to do some redesign in the SCSI layer to find a better solution.
>>> I keep coming back to the idea that a device driver for one
>>> controller should not have to lock out *all* the hardware.
>>> RS-232 locks out Ethernet.  Disk drivers lock out Ethernet.
>>> And so on.  Why?  Is there some fundamental reason that this
>>> *has* to be?  I thought the conversion from spl() to mutex()
>>> was supposed to fix this?
>>>
>>> I'm making progress on my project converting printf(9) calls
>>> to log(9), and fixing some bugs along the way.  Eventually I'll
>>> have patches to submit.  But this is really a workaround, not
>>> a fix to the underlying problem.
>>>
>>> Redesigning the SCSI layer sounds like a job for someone who took
>>> a lot more CS classes than I did.  /dev/brain returns ENOCLUE.  :-(
> 
> CAM is not completely innocent in this situation indeed. CAM defines
> XPT_RESET_BUS request as synchronous. It is not queued, and called under
> the SIM mutex lock. I don't think lock can be safely dropped in the
> middle there.
> 
> Now I think that I could try to move readiness waiting out of the
> siis_reset() to do it asynchronously. I'll think about it.


I've fixed this problem for ahci(4) in HEAD; there should be no sleeps
longer than 100ms now (typically 1-2ms).

With siis(4) the situation is different. By default there should be no
sleeps longer than 100ms (typically 1-2ms). A longer sleep means that either
the controller is not responding, or it can't establish a link to the device
it sees. I've reduced the wait timeout from 10s to 1s. That should improve
the situation a bit, but I would look for the original cause of the problem.
Have you done something specific to trigger it? Are your drive and cables OK?
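
To make the effect of the timeout concrete, here is a rough userland sketch
of a bounded readiness wait in 1ms steps. It is not the siis(4) code:
wait_ready(), struct fake_port and fake_port_ready() are hypothetical names
made up for illustration, and nanosleep() stands in for the per-step delay.
In the kernel, DELAY() busy-waits, so the total time spent is bounded by the
timeout passed in, which is why a 1s limit hurts less than a 10s one when the
port never becomes ready.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdbool.h>
#include <time.h>

/* Hypothetical callback: returns true once the port reports ready. */
typedef bool (*ready_fn)(void *ctx);

/*
 * Poll ready() every step_ms milliseconds for at most timeout_ms.
 * Returns the number of milliseconds waited on success, -1 on timeout
 * or on an invalid step.  Userland sketch only: nanosleep() gives up
 * the CPU between polls, which a kernel DELAY() loop does not.
 */
static int
wait_ready(ready_fn ready, void *ctx, int step_ms, int timeout_ms)
{
	struct timespec ts;
	int waited;

	if (step_ms <= 0)
		return (-1);
	ts.tv_sec = step_ms / 1000;
	ts.tv_nsec = (long)(step_ms % 1000) * 1000000L;

	for (waited = 0; waited <= timeout_ms; waited += step_ms) {
		if (ready(ctx))
			return (waited);
		nanosleep(&ts, NULL);
	}
	return (-1);
}

/* Simulated port that becomes ready after a fixed number of polls. */
struct fake_port {
	int polls_until_ready;
};

static bool
fake_port_ready(void *ctx)
{
	struct fake_port *p = ctx;

	if (p->polls_until_ready > 0) {
		p->polls_until_ready--;
		return (false);
	}
	return (true);
}
```

Calling wait_ready(fake_port_ready, &port, 1, 1000) on a fake port that needs
a few polls returns after a few milliseconds. A port that never becomes ready
costs the full timeout, so the timeout value sets the worst case.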

-- 
Alexander Motin
_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[email protected]"
