Bruce Evans wrote:
On Sat, 16 Dec 2006, I wrote:
On Thu, 14 Dec 2006, I wrote:
On Wed, 13 Dec 2006, Jung-uk Kim wrote:
On Wednesday 13 December 2006 03:51 pm, Scott Long wrote:
scottl 2006-12-13 20:51:51 UTC
FreeBSD src repository
Modified files:
sys/dev/bge if_bge.c
Log:
Remove a redundant write of the firmware reset magic number. It
...
I am still getting firmware handshake timeouts and/or watchdog
timeouts. Most importantly, it panics or gets witness warnings (lots
of 'memory modified after free'). Panic goes like this (while
kldunload if_bge with dhclient enabled):
brgphy0: detached
miibus0: detached
bge0: firmware handshake timed out, found 0x4b657654
bge0: firmware handshake timed out, found 0x4b657654
I have seen these while debugging the redundant-write problem (not for
detach but for bringing up the interface for the first time). My 5701
just hangs if there is any redundant write (2 where the first one was
in bge_reset(), or 2 separate, or 2 where the second one was). My
5705 survives two separate sets of 256 repeated writes; however, then
the firmware handshake times out; however2, everything works normally
after ignoring the timeout except for printing the message. I
just noticed that this error wasn't ignored until recently -- I noticed
the return statement being removed but not that it was in a critical
area.
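
The "warn on timeout but carry on" behaviour described above can be modelled outside the kernel. This is only a sketch under stated assumptions: it assumes the firmware signals readiness by replacing the magic number with its bitwise complement, which is consistent with the log showing the magic number 0x4b657654 unchanged but is not stated in this thread. Every name prefixed with sketch_ or SKETCH_ is hypothetical, as is the retry budget.

```c
#include <stdint.h>

#define SKETCH_MAGIC 0x4b657654u  /* value printed in the log above */
#define SKETCH_POLLS 1000         /* hypothetical retry budget */

/* Hypothetical register reader, supplied by the caller so that the
 * sketch runs without hardware. */
typedef uint32_t (*sketch_read_fn)(void *ctx);

/* Fake firmware state: negative means "never becomes ready". */
struct sketch_fw {
	int reads_until_ready;
};

/* Returns the magic number until the fake firmware is "ready", then
 * its complement (the assumed handshake value). */
static uint32_t
sketch_fake_read(void *ctx)
{
	struct sketch_fw *fw = ctx;

	if (fw->reads_until_ready < 0)
		return (SKETCH_MAGIC);
	if (fw->reads_until_ready > 0) {
		fw->reads_until_ready--;
		return (SKETCH_MAGIC);
	}
	return (~SKETCH_MAGIC);
}

/* Poll until the complement of the magic number appears. Returns 0 on
 * handshake and -1 on timeout; *found receives the last value read, so
 * the caller can log it and decide whether a timeout is fatal. */
static int
sketch_wait_handshake(sketch_read_fn rd, void *ctx, uint32_t *found)
{
	uint32_t val = 0;
	int i;

	for (i = 0; i < SKETCH_POLLS; i++) {
		val = rd(ctx);
		if (val == (uint32_t)~SKETCH_MAGIC) {
			*found = val;
			return (0);
		}
	}
	*found = val;
	return (-1);
}
```

A caller that treats the timeout as non-fatal would print the value left in found and continue with initialization, which matches the 5705 behaviour reported above.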
The debugging code doesn't seem to have been responsible for this.
Now, without it I almost (?) always get handshake errors on the 5705,
but never (?) on the 5701. Apparently, the 3rd write (the one that
was removed) was the only correctly placed one.
Avoiding the "write_op" part of the changes fixes the handshake errors
on my non-PCIE 5705. write_op is only used to write the reset value and
one other value to BGE_MISC_CFG. bge_writemem_ind() apparently writes
the reset to nowhere, but bge_writereg() still works.
%%%
Index: if_bge.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/bge/if_bge.c,v
retrieving revision 1.165
diff -u -2 -r1.165 if_bge.c
--- if_bge.c 15 Dec 2006 00:27:06 -0000 1.165
+++ if_bge.c 18 Dec 2006 10:44:05 -0000
@@ -2544,4 +2634,7 @@
 	if (sc->bge_flags & BGE_FLAG_PCIE)
 		write_op = bge_writemem_direct;
+	/* XXX bge_writemem_ind is wrong for at least reset of 5705. */
+	else if (sc->bge_asicrev == BGE_ASICREV_BCM5705)
+		write_op = bge_writereg_ind;
 	else
 		write_op = bge_writemem_ind;
%%%
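
The selection order in the patch above can be exercised in user space. This is a minimal sketch, not driver code: the softc layout, flag value, ASIC revision value, and the three write routines are hypothetical stand-ins (all prefixed sketch_ or SKETCH_) that only record which path was taken. It shows that PCIE parts still take the direct path first, a non-PCIE 5705 takes the register-indirect path, and everything else keeps the memory-indirect path.

```c
#include <stdint.h>

/* Placeholder values for illustration; not the real register definitions. */
#define SKETCH_FLAG_PCIE	0x1u
#define SKETCH_ASICREV_5705	0x03u

enum sketch_path {
	SKETCH_NONE,
	SKETCH_MEM_DIRECT,
	SKETCH_MEM_IND,
	SKETCH_REG_IND
};

/* Hypothetical stand-in for the driver's softc. */
struct sketch_softc {
	uint32_t flags;
	uint32_t asicrev;
	enum sketch_path last_path;	/* which routine ran last */
	uint32_t last_val;		/* value it was asked to write */
};

typedef void (*sketch_write_op)(struct sketch_softc *, int, uint32_t);

/* Stub writers: record the path and value instead of touching hardware. */
static void
sketch_writemem_direct(struct sketch_softc *sc, int off, uint32_t val)
{
	(void)off;
	sc->last_path = SKETCH_MEM_DIRECT;
	sc->last_val = val;
}

static void
sketch_writemem_ind(struct sketch_softc *sc, int off, uint32_t val)
{
	(void)off;
	sc->last_path = SKETCH_MEM_IND;
	sc->last_val = val;
}

static void
sketch_writereg_ind(struct sketch_softc *sc, int off, uint32_t val)
{
	(void)off;
	sc->last_path = SKETCH_REG_IND;
	sc->last_val = val;
}

/* Same decision order as the patched if/else-if/else chain. */
static sketch_write_op
sketch_pick_write_op(const struct sketch_softc *sc)
{
	if (sc->flags & SKETCH_FLAG_PCIE)
		return (sketch_writemem_direct);
	if (sc->asicrev == SKETCH_ASICREV_5705)
		return (sketch_writereg_ind);
	return (sketch_writemem_ind);
}
```

Because the PCIE test comes first, a PCIE part with a 5705-class revision would be unaffected by the new branch; only the non-PCIE 5705 changes path.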
The panics might be caused by the change making the reset null. Resetting
might be much more necessary for uninitialization than for initialization.
The bug caused the following behaviour here:
- the problem with taking a long time to start serving nfs requests (with
/usr nfs-mounted) became larger. Normally, nfs tries to start before
the interface is really up and then it takes about a minute to start.
With the bug, it often got portmap errors and sometimes never started.
- after "ifconfig down", it took a reboot to bring the interface back up.
Bruce
OK, this looks like a result of my not understanding a bit of the Linux
code that I read. When doing the reset, the Linux equivalent of
bge_writemem_ind() is specifically avoided.
I'm on vacation for the next 10 days, but I'll try to put together a
patch that addresses this and other problems soon. Ping me after the
first of the year otherwise.
Scott