On Sun, 24 Dec 2006, Oleg Bulyzhin wrote:
On Fri, Dec 22, 2006 at 01:24:45AM +1100, Bruce Evans wrote:
On Wed, 20 Dec 2006, Gleb Smirnoff wrote:
I have a suspicion that this may cause a problem under high load. Imagine
that thread #1 is spinning in bge_start_locked() getting packets out
of interface queue and putting them into TX ring. Some other threads are
putting the packets into interface queue while its lock is temporarily
relinguished be the thread #1. In the same time interrupts happen, some
packets are sent, but the TX ring is never got empty.
The above scenario will cause a fake watchdog event.
bge_start_locked() starts with the bge (sc) lock held and never releases
it as far as I can see. This this problem can't happen (the lock
prevents both txeof and the watchdog from being reached before start
resets the timeout to 5 seconds).
it's quite unusal) and it is not lock related:
1) bge_start_locked() & bge_encap fills tx ring.
2) during next 5 seconds we do not have packets for transmit (i.e. no
bge_start_locked() calls --> no bge_timer refreshing)
3) for any reason (don't ask me how can this happen), chip was unable to
send whole tx ring (only part of it).
4) here we have false watchdog - chip is not wedged but bge_watchdog would
reset it.
Then it is a true watchdog IMO. Something is very wrong if you can't send
512 packets in 5 seconds (or even 1 packet in 5/512 seconds).
I'd say correct solution would be combination of yours and previous bge_txeof()
behaviour:
1) We should cancel watchdog if tx ring is empty (as you did)
2) We should not clear bge_timer inside loop (like it was before), but set it
to 5. So watchdog is active until tx ring is empty and we will not get
watchdog event while tx ring consumer index is moving.
I prefer the overall timeout of 5 seconds (maybe even 1 second, but
the simple algorithm requires at least an average of 1.5 seconds). I
don't want the watchdog to be delayed by 512*5 seconds in the unlikely
event that 1 packet trickles out every 5 seconds, especially since
implementing this takes more code (though the amount is tiny). This
won't happen of course. No interface-start calls in 5 seconds will
also be rare.
Bruce
_______________________________________________
cvs-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/cvs-all
To unsubscribe, send any mail to "[EMAIL PROTECTED]"