On 10/26/11 12:54 AM, Pawel Jakub Dawidek wrote:
On Mon, Oct 24, 2011 at 08:14:22AM -0400, John Baldwin wrote:
On Sunday, October 23, 2011 11:58:28 am Pawel Jakub Dawidek wrote:
On Sun, Oct 23, 2011 at 11:44:45AM +0300, Kostik Belousov wrote:
On Sun, Oct 23, 2011 at 08:10:38AM +0200, Pawel Jakub Dawidek wrote:
My suggestion would be that if we won't be able to fix it before 9.0,
we should turn this assertion off, as the system seems to be able to
recover.
Shipped kernels have all assertions turned off.
Yes, I'm aware of that, but many people compile their production kernels
with INVARIANTS/INVARIANT_SUPPORT to fail early instead of eg.
corrupting data. I'd be fine in moving this under DIAGNOSTIC or changing
it into a printf, so it will be visible.
No, the kernel is corrupting things in other places when this is true, so
if you are running with INVARIANTS, we want to know about it.   Specifically,
in several places in TCP we assume that rcv_adv>= rcv_nxt, and depend on
being able to do 'rcv_adv - rcv_nxt'.

In this case, it looks like the difference is consistently less than one
frame.  I suspect the other end of the connection is sending just beyond the
end of the advertised window (it probably assumes it is better to send a full
frame if it has that much pending data even though part of it is beyond the
window edge vs sending a truncated packet that just fills the window) and that
that frame is accepted ok in the header prediction case and it's ACK is
delayed, but the next packet to arrive then trips over this assumption.

Since 'win' is guaranteed to be non-negative and we explicitly cast
'rcv_adv - rcv_nxt' to (int) in the following line that the assert is checking
for:

        tp->rcv_wnd = imax(win, (int)(tp->rcv_adv - tp->rcv_nxt));

I think we already handle this case ok and perhaps the assertion can just be
removed?  Not sure if others feel that it warrants a comment to note that this
is the case being handled.
I added debug to the places where rcv_adv and rcv_nxt are modified. Here
is what happens before the panic occurs:

tcp_do_segment:1722 negative window: tp 0xfffffe000dab1b70 rcv_nxt 4022361548 
rcv_adv 4022360100 diff -1448
tcp_do_segment:2847 negative window: tp 0xfffffe000dab1b70 rcv_nxt 4022362298 
rcv_adv 4022361548 diff -750
tcp_do_segment:1722 negative window: tp 0xfffffe000dab1b70 rcv_nxt 4022363746 
rcv_adv 4022362298 diff -1448
tcp_do_segment:2847 negative window: tp 0xfffffe000dab1b70 rcv_nxt 4022364836 
rcv_adv 4022363746 diff -1090
tcp_do_segment:1722 negative window: tp 0xfffffe000dab1b70 rcv_nxt 4022366284 
rcv_adv 4022364836 diff -1448
tcp_do_segment:1722 negative window: tp 0xfffffe000dab1b70 rcv_nxt 4022370628 
rcv_adv 4022369690 diff -938
tcp_do_segment:1722 negative window: tp 0xfffffe000dab1b70 rcv_nxt 4022379140 
rcv_adv 4022377692 diff -1448
tcp_do_segment:1722 negative window: tp 0xfffffe000dab1b70 rcv_nxt 4022387792 
rcv_adv 4022386344 diff -1448
tcp_do_segment:2847 negative window: tp 0xfffffe000dab1b70 rcv_nxt 4022388890 
rcv_adv 4022387792 diff -1098
tcp_do_segment:1722 negative window: tp 0xfffffe000dab1b70 rcv_nxt 4022390338 
rcv_adv 4022388890 diff -1448
tcp_do_segment:2847 negative window: tp 0xfffffe000dab1b70 rcv_nxt 4022394563 
rcv_adv 4022394342 diff -221
panic: tcp_input negative window: tp 0xfffffe000dab1b70 rcv_nxt 4022394563 
rcv_adv 4022394342 win=0 diff -221

I can send you the full log if you want, I've plenty of messages where
rcv_adv<  rcv_nxt, not all of them trigger this assertion.

Might be a good place to use the new sifter tool.

_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Reply via email to