[Differential] D6120: tcp/syncache: Set flowid and hash type properly for SYN|ACK

2016-04-28 Thread hiren (hiren panchasara)
hiren accepted this revision. hiren added a comment. This revision has a positive review. I hope you'll write a more descriptive commit log (not just the 'what' but also the 'why') for the change. Thanks a lot for your work! Cheers, Hiren INLINE COMMENTS sys/netinet/tcp_syncache.c:1507 Do

[Differential] D5872: tcp: Don't prematurely drop receiving-only connections

2016-04-20 Thread hiren (hiren panchasara)
hiren added a comment. In https://reviews.freebsd.org/D5872#128555, @lstewart wrote: > I thought that had been fixed ages ago... oops. Fixed? i.e. doing something other than setting cwnd to 1 seg? > It should be calling cc_cong_signal() with a new congestion type. Hum...

[Differential] D5872: tcp: Don't prematurely drop receiving-only connections

2016-04-20 Thread hiren (hiren panchasara)
hiren added a comment. In https://reviews.freebsd.org/D5872#128539, @lstewart wrote: > ... but replace with a macro to check that the rexmit/persist timer is armed if appropriate! Yes, that would be useful! REVISION DETAIL https://reviews.freebsd.org/D5872

[Differential] D5872: tcp: Don't prematurely drop receiving-only connections

2016-04-20 Thread hiren (hiren panchasara)
hiren added a comment. Ack for removing the ENOBUFS case. REVISION DETAIL https://reviews.freebsd.org/D5872

[Differential] [Commented On] D5872: tcp: Don't prematurely drop receiving-only connections

2016-04-16 Thread hiren (hiren panchasara)
hiren added a comment. In https://reviews.freebsd.org/D5872#127345, @jtl wrote: > In https://reviews.freebsd.org/D5872#127343, @mike-karels.net wrote: > > > If we get an ENOBUFS when sending data, we will already be running the retransmit timer. > > > Good point, but see below.

[Differential] [Updated] D5872: tcp: Don't prematurely drop receiving-only connections

2016-04-15 Thread hiren (hiren panchasara)
hiren added a comment. In https://reviews.freebsd.org/D5872#127123, @jtl wrote: > > The key feature that makes the retransmit timer inappropriate for an ACK-only case is that it is only stopped when we receive input; however, in the ACK-only case, we really want to stop it

[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)

2015-02-19 Thread hiren (hiren panchasara)
hiren added a comment. Another panic from an almost *idle* box: Sanitized panic #6 Dump header from device /dev/da0s1b Architecture: amd64 Architecture Version: 2 Dump Length: 6525980672B (6223 MB) Blocksize: 512 Dumptime: Thu Feb 19 06:16:57 2015 Hostname: xx

[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)

2015-02-17 Thread hiren (hiren panchasara)
hiren added a comment. >>! In D1711#96, @rrs wrote: > Hiren: > > You have the wrong structure type. > > In the printf before panic it is giving you the lock that was spinning.. that > would be in the callout_cpu structure I bet.. I mis-told you in email. > > So if you did > > print *(struct ca

[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)

2015-02-17 Thread hiren (hiren panchasara)
hiren added a comment. >>! In D1711#92, @rrs wrote: > Hiren: > > There also should have been a printf before the panic string > printf( "spin lock %p (%s) held by %p (tid %d) too long\n", > m, m->lock_object.lo_name, td, td->td_tid); > > Can we see what that lovely printf has displa

[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)

2015-02-17 Thread hiren (hiren panchasara)
hiren added a comment. >>! In D1711#91, @rrs wrote: > Hiren: > > That's helpful.. as I said this is strange. The callout you posted shows it's > associated with CPU 0, (c_cpu == 0), and yet > the mtx on that (which is what we are spinning on) is free (its owned == 4). > So why would we have crash

[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)

2015-02-17 Thread hiren (hiren panchasara)
hiren added a comment. >>! In D1711#86, @hselasky wrote: > Hi, > > rrs + hiren: > > I think the problem is this: > > In "_callout_stop_safe()" we sometimes exit having "cc_migration_cpu(cc, > direct) = CPUBLOCK;". Now if a second call to "_callout_stop_safe()" happens > before the pending cal

[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)

2015-02-17 Thread hiren (hiren panchasara)
hiren added a comment. >>! In D1711#88, @rrs wrote: > Hans: > > I don't get your call sequence, I sent you an email on it.. > > Hiren: > > Can you go up the call chain and dump the callout structure > c in > 0x80760064 in callout_lock (c=0xf8000d81dc98) at > /usr/src/sys/kern/kern_

[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)

2015-02-16 Thread hiren (hiren panchasara)
hiren added a comment. @hps: cc_cpu[MAXCPU] info as you requested on IRC. Let me know if you need more info. (kgdb) backtrace #0 doadump (textdump=1) at pcpu.h:219 #1 0x80749c17 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:452 #2 0x80749ff4 in pa

[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)

2015-02-16 Thread hiren (hiren panchasara)
hiren added a comment. @rrs: One more Sanitized panic #5 Dump header from device /dev/da0s1b Architecture: amd64 Architecture Version: 2 Dump Length: 1694281728B (1615 MB) Blocksize: 512 Dumptime: Sun Feb 15 18:03:14 2015 Hostname: x Magic: FreeBSD

[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)

2015-02-16 Thread hiren (hiren panchasara)
hiren added a comment. @rrs: Looks like we've come full circle back to the very first crash reported. We are on stable10 with all relevant fixes. Sanitized panic #4 Dump header from device /dev/da0s1b Architecture: amd64 Architecture Version: 2 Dump Length: 6764437504B (6451

[Differential] [Commented On] D1777: Associated fix for arp/nd6 timer usage.

2015-02-08 Thread hiren (hiren panchasara)
hiren added a comment. It all started with: https://lists.freebsd.org/pipermail/freebsd-net/2014-September/039730.html Last (conclusive) email in that thread: https://lists.freebsd.org/pipermail/freebsd-net/2015-January/040895.html That issue was fixed by: https://reviews.freebsd.org/D1438 i.e.

[Differential] [Commented On] D1777: Associated fix for arp/nd6 timer usage.

2015-02-08 Thread hiren (hiren panchasara)
hiren added a comment. >>! In D1777#16, @bz wrote: > Hiren, it only took us 4 years to trigger this? Can people actually > easily/reliably reproduce it? Heh, I am not sure about "people" but we @llnw can see this very reliably. Do you have any other theories/patches that we can try? It'd be he

[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)

2015-02-06 Thread hiren (hiren panchasara)
hiren added a comment. Update from llnw world: Things have been pretty stable here without any panics for 24+ hours with Stable10+D1711+D1777. Thanks a lot, Randall! REVISION DETAIL https://reviews.freebsd.org/D1711

[Differential] [Commented On] D1777: Associated fix for arp/nd6 timer usage.

2015-02-06 Thread hiren (hiren panchasara)
hiren added a comment. Update from llnw world: Things have been pretty stable here without any panics for 24+ hours with Stable10+D1711+D1777. Thanks a lot, Randall! REVISION DETAIL https://reviews.freebsd.org/D1777

[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)

2015-02-04 Thread hiren (hiren panchasara)
hiren added a comment. >>! In D1711#59, @rrs wrote: > Hiren: > > Ok looking at kern_timeout.c that's a call to > class->lc_lock(c_lock, lock_status); > > If my 10.x matches yours. > > And the call inside that kern_rwlock.c:757 > is > > v = rw->rw_lock; > owner = (struct thread *)RW_OWNER(v);

[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)

2015-02-04 Thread hiren (hiren panchasara)
hiren added a comment. >>! In D1711#60, @hiren wrote: >>>! In D1711#58, @rrs wrote: > >> hiren: >> >> This looks interesting to me, it is definitely something I would like to >> look at. I assume you >> are on 10.stable like Sean? > > Yes, it's plain stable10+D1711. > Also, all 3 panics are fr

[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)

2015-02-04 Thread hiren (hiren panchasara)
hiren added a comment. >>! In D1711#61, @hselasky wrote: > Hi, > > There is only one or two likely consumers of callout_init_rw() at the present > moment, and one of them is: > > ./netinet6/nd6.c: canceled = callout_stop(&ln->ln_timer_ch); > ./netinet6/nd6.c: can

[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)

2015-02-04 Thread hiren (hiren panchasara)
hiren added a comment. >>! In D1711#59, @rrs wrote: > Hiren: > > Ok looking at kern_timeout.c that's a call to > class->lc_lock(c_lock, lock_status); > > If my 10.x matches yours. It's not :-( Looks like what we have here is not really stock stable10. I'll check all the details and get back

[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)

2015-02-03 Thread hiren (hiren panchasara)
hiren added a comment. >>! In D1711#58, @rrs wrote: > hiren: > > This looks interesting to me, it is definitely something I would like to look > at. I assume you > are on 10.stable like Sean? Yes, it's plain stable10+D1711. Also, all 3 panics are from the same system. REVISION DETAIL https:

[Differential] [Changed Subscribers] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other

2015-02-03 Thread hiren (hiren panchasara)
hiren added a subscriber: hiren. hiren added a comment. Sanitized panic #3 Dump header from device /dev/da0s1b Architecture: amd64 Architecture Version: 2 Dump Length: 5393809408B (5143 MB) Blocksize: 512 Dumptime: Tue Feb 3 13:21:19 2015 Hostname: xxx