On 07-12-07 01:23, Robert Hancock wrote:
David P. Reed wrote:
After much, much testing (months, off and on, pursuing hypotheses),
I've discovered that the use of "outb al,0x80" instructions to "delay"
after inb and outb instructions causes solid freezes on my HP dv9000z
laptop, when ACPI is enabled.
It takes a fair number of out's to 0x80, but the hard freeze is
reliably reproducible by writing a driver that solely does a loop of
50 outb's to 0x80 and calling it in a loop 1000 times from user
space. !!!
The serious impact is that the /dev/rtc and /dev/nvram devices are
very unreliable - thus "hwclock" freezes very reliably while looping
waiting for a new second value and calling "cat /dev/nvram" in a loop
freezes the machine if done a few times in a row.
This is reproducible, but requires a fair number of outb's to the 0x80
diagnostic port, and seems to require ACPI to be on.
io_64.h is the source of these particular instructions, via the
CMOS_READ and CMOS_WRITE macros, which are defined in mc146818_64.h.
(I wonder if the same problem occurs in 32-bit mode).
I'm happy to complete and test a patch, but I'm curious what the right
approach ought to be. I have to say I have no clue as to what ACPI is
doing on this chipset (nvidia MCP51) that would make port 80 do
this. A raw random guess is that something is logging POST codes, but
if so, not clear what is problematic in ACPI mode.
ANy help/suggestions?
Changing the delay instruction sequence from the outb to short jumps
might be the safe thing. But Linus, et al. may have experience with
that on other architectures like older Pentiums etc.
The fact that these "pausing" calls are needed in the first place seems
rather cheesy. If there's hardware that's unable to respond to IO port
writes as fast as possible, then surely there's a better solution than
trying to stall the IOs by an arbitrary and hardware-dependent amount of
time, like udelay calls, etc. Does any remotely recent hardware even
need this?
The idea is that the delay is not in fact hardware dependent. With in the
the absense of a POST board port 0x80 being sort of guaranteeed to not be
decoded on PCI but forwarded to and left to die on ISA/LPC one should get
the effect that the _next_ write will have survived an ISA/LPC bus address
cycle acknowledgement timeout.
I believe.
And no, I don't believe any remotely recent hardware needs it and have in
fact wondered about it since actual 386 days, having since that time never
found a device that wouldn't in fact take back to back I/O even. Even back
then (ie, legacy only systems, no forwarding from PCI or anything) BIOSes
provided ISA bus wait-state settings which should be involved in getting
insanely stupid and old hardware to behave...
Port 0xed has been suggested as an alternate port. Probably not a great
"fix" but if replacing the out with a simple udelay() isn't that simple
(during early boot I gather) then it might at least be something for you to
try. I'd hope that the 0x80 in include/asm/io.h:native_io_delay() would be
the only one you are running into, so you could change that to 0xed and see
what catches fire.
If there are no sensible fixes, an 0x80/0xed choice could I assume be hung
of DMI or something (if that _is_ parsed soon enough).
Rene.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/