Re: kern/177362: [netinet] [patch] Wrong control used to return TOS
Synopsis: [netinet] [patch] Wrong control used to return TOS

State-Changed-From-To: open->feedback
State-Changed-By: hiren
State-Changed-When: Tue May 7 04:42:42 UTC 2013
State-Changed-Why:
tuexen@ does not see a need for code change. Waiting for submitter feedback.

http://www.freebsd.org/cgi/query-pr.cgi?pr=177362
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
Re: kern/177184: [bge] [patch] enable wake on lan
Synopsis: [bge] [patch] enable wake on lan

Responsible-Changed-From-To: freebsd-net->hiren
Responsible-Changed-By: hiren
Responsible-Changed-When: Tue May 7 04:48:09 UTC 2013
Responsible-Changed-Why:
Grab.

http://www.freebsd.org/cgi/query-pr.cgi?pr=177184
Re: kern/176667: [libalias] [patch] libalias locks on uninitalized data
Synopsis: [libalias] [patch] libalias locks on uninitalized data

State-Changed-From-To: open->patched
State-Changed-By: hiren
State-Changed-When: Tue May 7 05:03:30 UTC 2013
State-Changed-Why:
Gleb committed r248158.

Responsible-Changed-From-To: freebsd-net->glebius
Responsible-Changed-By: hiren
Responsible-Changed-When: Tue May 7 05:03:30 UTC 2013
Responsible-Changed-Why:
Gleb committed r248158.

http://www.freebsd.org/cgi/query-pr.cgi?pr=176667
Re: bin/136994: [patch] ifconfig(8) print carp mac address
Synopsis: [patch] ifconfig(8) print carp mac address

Responsible-Changed-From-To: freebsd-net->hiren
Responsible-Changed-By: hiren
Responsible-Changed-When: Tue May 14 22:22:56 UTC 2013
Responsible-Changed-Why:
Grab.

http://www.freebsd.org/cgi/query-pr.cgi?pr=136994
Re: kern/177184: [bge] [patch] enable wake on lan
Synopsis: [bge] [patch] enable wake on lan

Responsible-Changed-From-To: hiren->freebsd-net
Responsible-Changed-By: hiren
Responsible-Changed-When: Mon Oct 7 16:02:38 UTC 2013
Responsible-Changed-Why:

http://www.freebsd.org/cgi/query-pr.cgi?pr=177184
[Differential] [Changed Subscribers] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other
hiren added a subscriber: hiren.
hiren added a comment.

Sanitized panic #3

Dump header from device /dev/da0s1b
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 5393809408B (5143 MB)
  Blocksize: 512
  Dumptime: Tue Feb 3 13:21:19 2015
  Hostname:
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 10.1-STABLE-D1711 #0: Tue Feb 3 12:19:58 MST 2015
    root@:/usr/obj/usr/src/sys/SIXFOUR
  Panic String: page fault
  Dump Parity: 4197108606
  Bounds: 0
  Dump Status: good

Backtrace:
Reading symbols from /boot/kernel/cc_cubic.ko.symbols...done.
Loaded symbols for /boot/kernel/cc_cubic.ko.symbols
Reading symbols from /boot/kernel/cc_cdg.ko.symbols...done.
Loaded symbols for /boot/kernel/cc_cdg.ko.symbols
Reading symbols from /boot/kernel/h_ertt.ko.symbols...done.
Loaded symbols for /boot/kernel/h_ertt.ko.symbols
Reading symbols from /boot/kernel/if_lagg.ko.symbols...done.
Loaded symbols for /boot/kernel/if_lagg.ko.symbols
#0 doadump (textdump=1) at pcpu.h:219
   in pcpu.h
(kgdb) #0 doadump (textdump=1) at pcpu.h:219
#1 0x8072d0b7 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:452
#2 0x8072d494 in panic (fmt=) at /usr/src/sys/kern/kern_shutdown.c:759
#3 0x80ae703f in trap_fatal (frame=, eva=) at /usr/src/sys/amd64/amd64/trap.c:865
#4 0x80ae7358 in trap_pfault (frame=0xfe1f9e73f7b0, usermode=) at /usr/src/sys/amd64/amd64/trap.c:676
#5 0x80ae69ba in trap (frame=0xfe1f9e73f7b0) at /usr/src/sys/amd64/amd64/trap.c:440
#6 0x80acca22 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236
#7 0x8072b400 in __rw_wlock_hard (c=0xf8030a59aa28, tid=18446735277970927616, file=0x0, line=173648384) at /usr/src/sys/kern/kern_rwlock.c:757
#8 0x80742915 in softclock_call_cc (c=0xf8030a59aa98, cc=0x81342180, direct=0) at /usr/src/sys/kern/kern_timeout.c:637
#9 0x80742db4 in softclock (arg=0x81342180) at /usr/src/sys/kern/kern_timeout.c:801
#10 0x806fde4b in intr_event_execute_handlers (p=, ie=0xf80015214c00) at /usr/src/sys/kern/kern_intr.c:1264
#11 0x806fe7e6 in ithread_loop (arg=0xf800151f6f00) at /usr/src/sys/kern/kern_intr.c:1277
#12 0x806fba6a in fork_exit (callout=0x806fe750 , arg=0xf800151f6f00, frame=0xfe1f9e73fac0) at /usr/src/sys/kern/kern_fork.c:1017
#13 0x80accf5e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:611
#14 0x in ?? ()
Current language: auto; currently minimal
(kgdb)

REVISION DETAIL
  https://reviews.freebsd.org/D1711

To: rrs, gnn, rwatson, lstewart, jhb, kostikbel, hselasky, adrian, imp, sbruno
Cc: hiren, jhb, kostikbel, emaste, delphij, neel, erj, freebsd-net
[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)
hiren added a comment.

>>! In D1711#58, @rrs wrote:
> hiren:
>
> This looks interesting to me, it is definitely something I would like to look at. I assume you
> are on 10.stable like Sean?

Yes, it's plain stable10+D1711.
Also, all 3 panics are from the same system.

REVISION DETAIL
  https://reviews.freebsd.org/D1711

To: rrs, gnn, rwatson, lstewart, jhb, kostikbel, hselasky, adrian, imp, sbruno
Cc: hiren, jhb, kostikbel, emaste, delphij, neel, erj, freebsd-net
[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)
hiren added a comment.

>>! In D1711#59, @rrs wrote:
> Hiren:
>
> Ok looking at kern_timeout.c thats a call to
> class->lc_lock(c_lock, lock_status);
>
> If my 10.x matches yours.

It's not :-(

Looks like what we have here is not stock stable10 really. I'll check all the details and get back first thing in the morning tomorrow.

Thanks for checking and sorry for the trouble.

REVISION DETAIL
  https://reviews.freebsd.org/D1711

To: rrs, gnn, rwatson, lstewart, jhb, kostikbel, hselasky, adrian, imp, sbruno
Cc: hiren, jhb, kostikbel, emaste, delphij, neel, erj, freebsd-net
[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)
hiren added a comment.

>>! In D1711#61, @hselasky wrote:
> Hi,
>
> There is only one or two likely consumers of callout_init_rw() at the present moment, and one of them is:
>
> ./netinet6/nd6.c: canceled = callout_stop(&ln->ln_timer_ch);
> ./netinet6/nd6.c: canceled = callout_reset(&ln->ln_timer_ch, INT_MAX,
> ./netinet6/nd6.c: canceled = callout_reset(&ln->ln_timer_ch, tick,
> ./netinet6/in6.c: callout_init_rw(&lle->base.ln_timer_ch, &lle->base.lle_lock,
>
> hiren: Is this box configured for IPv6 ?

No, not for panic #3. But as I replied to rrs's comment, I need to first make sure what tree we are running and if we are missing critical fixes from stable-10.

> static void
> in_lltable_free(struct lltable *llt, struct llentry *lle)
> {
>     LLE_WUNLOCK(lle);
>     LLE_LOCK_DESTROY(lle);
>     free(lle, M_LLTABLE);
> }
>
> ln_lltable_free() does not drain the callout associated with it and I am not sure if we have a sleeping context for that. Even if the refcount is zero, it doesn't mean that the callback is finished using the RW mutex.
>
> This is another example where we really need a "callout_drain_async_function()".

REVISION DETAIL
  https://reviews.freebsd.org/D1711

To: rrs, gnn, rwatson, lstewart, jhb, kostikbel, hselasky, adrian, imp, sbruno
Cc: hiren, jhb, kostikbel, emaste, delphij, neel, erj, freebsd-net
[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)
hiren added a comment.

>>! In D1711#60, @hiren wrote:
>>>! In D1711#58, @rrs wrote:
>
>> hiren:
>>
>> This looks interesting to me, it is definitely something I would like to look at. I assume you
>> are on 10.stable like Sean?
>
> Yes, it's plain stable10+D1711.
> Also, all 3 panics are from the same system.

>>! In D1711#62, @hiren wrote:
>>>! In D1711#59, @rrs wrote:
>> Hiren:
>>
>> Ok looking at kern_timeout.c thats a call to
>> class->lc_lock(c_lock, lock_status);
>>
>> If my 10.x matches yours.
>
> It's not :-(
>
> Looks like what we have here is not stock stable10 really. I'll check all the details and get back first thing in the morning tomorrow.
>
> Thanks for checking and sorry for the trouble.

My bad. So yes, these machines are indeed stable10+D1711. I'll follow up separately on the original comment.

REVISION DETAIL
  https://reviews.freebsd.org/D1711

To: rrs, gnn, rwatson, lstewart, jhb, kostikbel, hselasky, adrian, imp, sbruno
Cc: hiren, jhb, kostikbel, emaste, delphij, neel, erj, freebsd-net
[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)
hiren added a comment.

>>! In D1711#59, @rrs wrote:
> Hiren:
>
> Ok looking at kern_timeout.c thats a call to
> class->lc_lock(c_lock, lock_status);
>
> If my 10.x matches yours.
>
> And the call inside that kern_rwlock.c:757
> is
>
> v = rw->rw_lock;
> owner = (struct thread *)RW_OWNER(v);
>
> I would imagine v is probably a freed lock or some such.. not sure.
> If you have a vmcore sending the registers would be helpful. And for that
> matter if you have a vmcore if you could get in the frame of kern_timeout
> and tell me what
> c_lock
> c_func
> are that would be helpful. I have not tested this with my test framework for locks
> that pass in a lock.. If the c_func is not some private thing but something in
> BSD I can puzzle out what sub-system is using the callout this way and
> try to reproduce a test that will blow up this way on me as well.
>
> Assuming of course its not the caller that has freed the
> lock ahead of the callout system running...

panic #3 happened on stable-10 plus this patch. I've set up a -head box with this patch to reproduce the problem. In any case, I'll try to get the vmcore and other details tomorrow.

REVISION DETAIL
  https://reviews.freebsd.org/D1711

To: rrs, gnn, rwatson, lstewart, jhb, kostikbel, hselasky, adrian, imp, sbruno
Cc: hiren, jhb, kostikbel, emaste, delphij, neel, erj, freebsd-net
[Differential] [Commented On] D1777: Associated fix for arp/nd6 timer usage.
hiren added a comment.

Update from llnw world:
Things have been pretty stable here without any panics for 24+ hours with Stable10+D1711+D1777.

Thanks a lot, Randall!

REVISION DETAIL
  https://reviews.freebsd.org/D1777

To: rrs, imp, sbruno, gnn, rwatson, lstewart, kostikbel, adrian, bz, jhb
Cc: bz, emaste, hiren, julian, hselasky, freebsd-net
[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)
hiren added a comment.

Update from llnw world:
Things have been pretty stable here without any panics for 24+ hours with Stable10+D1711+D1777.

Thanks a lot, Randall!

REVISION DETAIL
  https://reviews.freebsd.org/D1711

To: rrs, gnn, rwatson, lstewart, jhb, kostikbel, sbruno, imp, adrian, hselasky
Cc: julian, hiren, jhb, kostikbel, emaste, delphij, neel, erj, freebsd-net
[Differential] [Commented On] D1777: Associated fix for arp/nd6 timer usage.
hiren added a comment.

>>! In D1777#16, @bz wrote:
> Hiren, it only took us 4 years to trigger this? Can people actually easily/reliably reproduce it?

Heh, I am not sure about "people" but we @llnw can see this very reliably.
Do you have any other theories/patches that we can try? It'd be helpful to understand your reservations about this patch.

REVISION DETAIL
  https://reviews.freebsd.org/D1777

To: rrs, imp, sbruno, gnn, rwatson, lstewart, kostikbel, adrian, jhb, bz
Cc: bz, emaste, hiren, julian, hselasky, freebsd-net
[Differential] [Commented On] D1777: Associated fix for arp/nd6 timer usage.
hiren added a comment.

It all started with: https://lists.freebsd.org/pipermail/freebsd-net/2014-September/039730.html
Last (conclusive) email in that thread: https://lists.freebsd.org/pipermail/freebsd-net/2015-January/040895.html

That issue was fixed by: https://reviews.freebsd.org/D1438
i.e. https://svnweb.freebsd.org/base?view=revision&revision=277213

That got reverted as it was not entirely correct/complete. And rrs@ started working on a better approach with https://reviews.freebsd.org/D1711

After applying D1711, we started seeing a bunch of other panics:
panic #1 https://reviews.freebsd.org/D1711#54
panic #2 https://reviews.freebsd.org/D1711#55
panic #3 https://reviews.freebsd.org/D1711#56

And finally, after applying the patch from this review (D1777), we do not see any of the panics and the machines seem happy.

REVISION DETAIL
  https://reviews.freebsd.org/D1777

To: rrs, imp, sbruno, gnn, rwatson, lstewart, kostikbel, adrian, jhb, bz
Cc: ae, bz, emaste, hiren, julian, hselasky, freebsd-net
[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)
hiren added a comment.

@rrs: Looks like we've come full circle back to the very first crash reported. We are on stable10 with all relevant fixes.

Sanitized panic #4

Dump header from device /dev/da0s1b
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 6764437504B (6451 MB)
  Blocksize: 512
  Dumptime: Mon Feb 16 02:54:11 2015
  Hostname: xxx
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 10.1-STABLE-llnw12 #0: Fri Feb 13 02:22:48 MST 2015
    jason@x:/usr/obj/usr/src/sys/SIXFOUR
  Panic String: spin lock held too long
  Dump Parity: 1861214463
  Bounds: 0
  Dump Status: good

Backtrace:
Reading symbols from /boot/kernel/cc_cubic.ko.symbols...done.
Loaded symbols for /boot/kernel/cc_cubic.ko.symbols
Reading symbols from /boot/kernel/cc_cdg.ko.symbols...done.
Loaded symbols for /boot/kernel/cc_cdg.ko.symbols
Reading symbols from /boot/kernel/h_ertt.ko.symbols...done.
Loaded symbols for /boot/kernel/h_ertt.ko.symbols
Reading symbols from /boot/kernel/ftcp.ko...done.
Loaded symbols for /boot/kernel/ftcp.ko
#0 doadump (textdump=1) at pcpu.h:219
   in pcpu.h
(kgdb) #0 doadump (textdump=1) at pcpu.h:219
#1 0x80749c17 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:452
#2 0x80749ff4 in panic (fmt=) at /usr/src/sys/kern/kern_shutdown.c:759
#3 0x80735683 in _mtx_lock_spin_cookie (c=, tid=, opts=, file=, line=) at /usr/src/sys/kern/kern_mutex.c:561
#4 0x80760064 in callout_lock (c=0xf80870266e80) at /usr/src/sys/kern/kern_timeout.c:530
#5 0x8075fc62 in callout_reset_sbt_on (c=0xf80870266e80, sbt=, precision=, ftn=0x808bcfe0 , arg=0xf80870266c00, cpu=, flags=) at /usr/src/sys/kern/kern_timeout.c:975
#6 0x808bd807 in tcp_timer_activate (tp=0x0, timer_type=, delta=) at /usr/src/sys/netinet/tcp_timer.c:883
#7 0x808b3ce0 in tcp_output (tp=0xf80870266c00) at /usr/src/sys/netinet/tcp_output.c:1579
#8 0x808bfa41 in tcp_usr_send (so=, flags=, m=, nam=, control=, td=) at /usr/src/sys/netinet/tcp_usrreq.c:887
#9 0x807c2535 in sosend_generic (so=0xf80678731000, addr=0x0, uio=0xfe2021072960, top=, control=, flags=, td=0x143) at /usr/src/sys/kern/uipc_socket.c:1284
#10 0x807a3fc3 in soo_write (fp=, uio=0xfe2021072960, active_cred=, flags=, td=) at /usr/src/sys/kern/sys_socket.c:103
#11 0x8079cb47 in dofilewrite (td=0xf8011c9c7920, fd=323, fp=0xf804c4c17870, auio=0xfe2021072960, offset=, flags=0) at file.h:304
#12 0x8079c878 in kern_writev (td=0xf8011c9c7920, fd=323, auio=0xfe2021072960) at /usr/src/sys/kern/sys_generic.c:481
#13 0x8079c803 in sys_write (td=, uap=) at /usr/src/sys/kern/sys_generic.c:396
#14 0x80b059ca in amd64_syscall (td=0xf8011c9c7920, traced=0) at subr_syscall.c:134
#15 0x80aeae3b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396
#16 0x000801640b8a in ?? ()
Current language: auto; currently minimal
(kgdb)

REVISION DETAIL
  https://reviews.freebsd.org/D1711

To: rrs, gnn, rwatson, lstewart, jhb, kostikbel, sbruno, imp, adrian, hselasky
Cc: julian, hiren, jhb, kostikbel, emaste, delphij, neel, erj, freebsd-net
[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)
hiren added a comment.

@rrs: One more

Sanitized panic #5

Dump header from device /dev/da0s1b
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 1694281728B (1615 MB)
  Blocksize: 512
  Dumptime: Sun Feb 15 18:03:14 2015
  Hostname: x
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 10.1-STABLE-llnw12 #0: Fri Feb 13 02:22:48 MST 2015
    jason@:/usr/obj/usr/src/sys/SIXFOUR
  Panic String: spin lock held too long
  Dump Parity: 4219482370
  Bounds: 0
  Dump Status: good

Backtrace:
Reading symbols from /boot/kernel/cc_cubic.ko.symbols...done.
Loaded symbols for /boot/kernel/cc_cubic.ko.symbols
Reading symbols from /boot/kernel/cc_cdg.ko.symbols...done.
Loaded symbols for /boot/kernel/cc_cdg.ko.symbols
Reading symbols from /boot/kernel/h_ertt.ko.symbols...done.
Loaded symbols for /boot/kernel/h_ertt.ko.symbols
Reading symbols from /boot/kernel/if_gif.ko.symbols...done.
Loaded symbols for /boot/kernel/if_gif.ko.symbols
Reading symbols from /boot/kernel/ftcp.ko...done.
Loaded symbols for /boot/kernel/ftcp.ko
#0 doadump (textdump=1) at pcpu.h:219
   in pcpu.h
(kgdb) #0 doadump (textdump=1) at pcpu.h:219
#1 0x80749c17 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:452
#2 0x80749ff4 in panic (fmt=) at /usr/src/sys/kern/kern_shutdown.c:759
#3 0x80735683 in _mtx_lock_spin_cookie (c=, tid=, opts=, file=, line=) at /usr/src/sys/kern/kern_mutex.c:561
#4 0x80760064 in callout_lock (c=0xfe92f6d0) at /usr/src/sys/kern/kern_timeout.c:530
#5 0x8076019c in _callout_stop_safe (c=0xfe92f6d0, safe=0) at /usr/src/sys/kern/kern_timeout.c:1119
#6 0x80557202 in mpt_scsi_reply_handler (mpt=0xfe90, req=0xfe92f678, reply_desc=0, reply_frame=0x0) at /usr/src/sys/dev/mpt/mpt_cam.c:2599
#7 0x805509a7 in mpt_intr (arg=0xfe90) at /usr/src/sys/dev/mpt/mpt.c:823
#8 0x8055d9d6 in mpt_pci_intr (arg=0xfe90) at /usr/src/sys/dev/mpt/mpt_pci.c:802
#9 0x8071a7fb in intr_event_execute_handlers (p=, ie=0xf80008532e00) at /usr/src/sys/kern/kern_intr.c:1264
#10 0x8071b196 in ithread_loop (arg=0xf8000857bc00) at /usr/src/sys/kern/kern_intr.c:1277
#11 0x8071841a in fork_exit (callout=0x8071b100 , arg=0xf8000857bc00, frame=0xfe064b1edc00) at /usr/src/sys/kern/kern_fork.c:1017
#12 0x80aeb08e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:611
#13 0x in ?? ()
Current language: auto; currently minimal
(kgdb)

REVISION DETAIL
  https://reviews.freebsd.org/D1711

To: rrs, gnn, rwatson, lstewart, jhb, kostikbel, sbruno, imp, adrian, hselasky
Cc: julian, hiren, jhb, kostikbel, emaste, delphij, neel, erj, freebsd-net
[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)
hiren added a comment. @hps: cc_cpu[MAXCPU] info as you requested on IRC. Let me know if you need more info. (kgdb) backtrace #0 doadump (textdump=1) at pcpu.h:219 #1 0x80749c17 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:452 #2 0x80749ff4 in panic (fmt=) at /usr/src/sys/kern/kern_shutdown.c:759 #3 0x80735683 in _mtx_lock_spin_cookie (c=, tid=, opts=, file=, line=) at /usr/src/sys/kern/kern_mutex.c:561 #4 0x80760064 in callout_lock (c=0xf8000d81dc98) at /usr/src/sys/kern/kern_timeout.c:530 #5 0x8075fc62 in callout_reset_sbt_on (c=0xf8000d81dc98, sbt=, precision=, ftn=0x8082a610 , arg=0xf8000d81dc00, cpu=, flags=) at /usr/src/sys/kern/kern_timeout.c:975 #6 0x8082b878 in arpintr (m=) at /usr/src/sys/netinet/if_ether.c:781 #7 0x808189d2 in netisr_dispatch_src (proto=, source=, m=0x0) at /usr/src/sys/net/netisr.c:972 #8 0x80811396 in ether_demux (ifp=, m=0xf8005c1e8000) at /usr/src/sys/net/if_ethersubr.c:851 #9 0x80812029 in ether_nh_input (m=) at /usr/src/sys/net/if_ethersubr.c:646 #10 0x808189d2 in netisr_dispatch_src (proto=, source=, m=0x0) at /usr/src/sys/net/netisr.c:972 #11 0x80425f9b in em_rxeof (count=99) at /usr/src/sys/dev/e1000/if_em.c:4532 #12 0x80426373 in em_msix_rx (arg=0xf8000c53a200) at /usr/src/sys/dev/e1000/if_em.c:1600 #13 0x8071a7fb in intr_event_execute_handlers (p=, ie=0xf8000c4ac300) at /usr/src/sys/kern/kern_intr.c:1264 #14 0x8071b196 in ithread_loop (arg=0xf8000c5166e0) at /usr/src/sys/kern/kern_intr.c:1277 #15 0x8071841a in fork_exit (callout=0x8071b100 , arg=0xf8000c5166e0, frame=0xfe0c23fccc00) at /usr/src/sys/kern/kern_fork.c:1017 #16 0x80aeb08e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:611 #17 0x in ?? 
() (kgdb) p * cc_cpu@8 [1/18608] $2 = {{cc_lock = {lock_object = {lo_name = 0x80d03d28 "callout", lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, cc_exec_entity = {{ cc_curr = 0x0, ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, cc_cancel = false, cc_waiting = false}, {cc_curr = 0x0, ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, cc_cancel = false, cc_waiting = false}}, cc_next = 0x0, cc_callout = 0xfe6a4000, cc_callwheel = 0xfe7c6000, cc_expireq = { tqh_first = 0x0, tqh_last = 0x81364288}, cc_callfree = {slh_first = 0xfe7c5240}, cc_firstevent = 899380454888656, cc_lastscan = 899380454354416, cc_cookie = 0xf8000c34f100, cc_bucket = 31391, cc_ktr_event_name = "callwheel cpu 0\000\000\000\000"}, {cc_lock = { lock_object = {lo_name = 0x80d03d28 "callout", lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, cc_exec_entity = {{cc_curr = 0x0, ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, cc_cancel = false, cc_waiting = false}, {cc_curr = 0x0, ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, cc_cancel = false, cc_waiting = false}}, cc_next = 0x0, cc_callout = 0x0, cc_callwheel = 0xfe85a000, cc_expireq = {tqh_first = 0x0, tqh_last = 0x81364408}, cc_callfree = {slh_first = 0x0}, cc_firstevent = 899620856539720, cc_lastscan = 899620209076539, cc_cookie = 0xf8000c34f080, cc_bucket = 13092, cc_ktr_event_name = "callwheel cpu 1\000\000\000\000"}, {cc_lock = {lock_object = { lo_name = 0x80d03d28 "callout", lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, cc_exec_entity = {{cc_curr = 0x0, ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, cc_cancel = false, cc_waiting = false}, {cc_curr = 0x0, 
ce_migration_func = 0, ce_migration_arg = 0x0, ce_migration_cpu = 64, ce_migration_time = 0, ce_migration_prec = 0, cc_cancel = false, cc_waiting = false}}, cc_next = 0x0, cc_callout = 0x0, cc_callwheel = 0xfe89a000, cc_expireq = {tqh_first = 0x0, tqh_last = 0x81364588}, cc_callfree = {slh_first = 0x0}, cc_firstevent = 899446753609881, cc_lastscan = 899445618670680, cc_cookie = 0xf8000c34f000, cc_bucket = 2680, cc_ktr_event_name = "callwheel cpu 2\000\000\000\000"}, {cc_lock = {lock_object = { lo_name = 0x80d
[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)
hiren added a comment.

>>! In D1711#88, @rrs wrote:
> Hans:
>
> I don't get your call sequence, I sent you an email on it..
>
> Hiren:
>
> Can you go up the call chain and dump the callout structure
> c in
> 0x80760064 in callout_lock (c=0xf8000d81dc98) at /usr/src/sys/kern/kern_timeout.c:530

(kgdb) frame 4
#4 0x80760064 in callout_lock (c=0xf8000d81dc98) at /usr/src/sys/kern/kern_timeout.c:530
530 CC_LOCK(cc);
(kgdb) p *c
$1 = {c_links = {le = {le_next = 0x0, le_prev = 0xfe804db8}, sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0xfe804db8}}, c_time = 903238992575048, c_precision = 241591893750, c_arg = 0xf8000d81dc00, c_func = 0x8082a610 , c_lock = 0x0, c_flags = 22, c_cpu = 0}

> There is something funny here, because the lock's listed (which is what the spin lock panic was on)
> are all mtx_lock=4.. which means they are un-locked.

REVISION DETAIL
  https://reviews.freebsd.org/D1711

To: rrs, gnn, rwatson, lstewart, jhb, kostikbel, sbruno, imp, adrian, hselasky
Cc: julian, hiren, jhb, kostikbel, emaste, delphij, neel, erj, freebsd-net
[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)
hiren added a comment.

>>! In D1711#86, @hselasky wrote:
> Hi,
>
> rrs + hiren:
>
> I think the problem is this:
>
> In "_callout_stop_safe()" we sometimes exit having "cc_migration_cpu(cc, direct) = CPUBLOCK;". Now if a second call to "_callout_stop_safe()" happens before the pending callback has returned, which is using a mutex, we are deadlocked, because "_callout_stop_safe()" is called having the same lock locked which the callback needs to aquire aswell. Because the callout subsystem cannot aquire the mutex during the callback function, it can neither reach the migration code which resets the cc_migration_cpu() variable.
>
> hiren: Can you backtrace all the softclock processes in your dump?

How exactly do I do it? I do not see any explicit mention of softclock in the dump.

REVISION DETAIL
  https://reviews.freebsd.org/D1711

To: rrs, gnn, rwatson, lstewart, jhb, kostikbel, sbruno, imp, adrian, hselasky
Cc: julian, hiren, jhb, kostikbel, emaste, delphij, neel, erj, freebsd-net
[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)
hiren added a comment.

>>! In D1711#91, @rrs wrote:
> Hiren:
>
> Thats helpful.. as I said this is strange. The callout you posted shows its associated with CPU 0, (c_cpu == 0), and yet
> the mtx on that (which is what we are spinning on) is free (its owned == 4). So why would we have crashed
> holding the spin lock too long? Unless just as we decided to panic the owner released it. Hmm there is
> code in there to check that though.. td = mtx_owner() if (td == NULL) return...
>
> The c_flags = 22 which is PENDING/ACTIVE and Return unlocked. That means it is *supposed* to be on the
> callout wheel someplace. The linked list used is then the LLIST.. i.e.
> {le_next = 0x0, le_prev = 0xfe804db8}
>
> Now if le_next is 0, its the end of the list.
>
> Can you look back a the previous.. i.e. walk it back
>
> print *(struct callout *)0xfe804db8
>
> That should print a valid callout as well.. and we should be able to walk back to
> the top of the wheel.. by keeping on moving back.

Hrm, are there only 2 entries here?

(kgdb) print *(struct callout *)0xfe804db8
$4 = {c_links = {le = {le_next = 0xf8000d81dc98, le_prev = 0x0}, sle = {sle_next = 0xf8000d81dc98}, tqe = {tqe_next = 0xf8000d81dc98, tqe_prev = 0x0}}, c_time = 0, c_precision = 0, c_arg = 0x0, c_func = 0, c_lock = 0x0, c_flags = 0, c_cpu = 0}

le_next is back to 0xf8000d81dc98.

Anything else I should look at?

REVISION DETAIL
  https://reviews.freebsd.org/D1711

To: rrs, gnn, rwatson, lstewart, jhb, kostikbel, sbruno, imp, adrian, hselasky
Cc: julian, hiren, jhb, kostikbel, emaste, delphij, neel, erj, freebsd-net
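Editor's note on the "c_flags = 22" reading quoted in the message above: the value can be split into its individual bits. The sketch below is only an illustration. The flag values are the ones I recall from FreeBSD 10's sys/callout.h and should be checked against the tree actually in use, and decode_callout_flags() is a hypothetical helper, not a kernel function.

```c
#include <stddef.h>
#include <string.h>

/*
 * Assumed flag values (from memory of FreeBSD 10 sys/callout.h).
 * Verify against your own source tree before relying on them.
 */
#define CALLOUT_LOCAL_ALLOC    0x0001
#define CALLOUT_ACTIVE         0x0002
#define CALLOUT_PENDING        0x0004
#define CALLOUT_MPSAFE         0x0008
#define CALLOUT_RETURNUNLOCKED 0x0010
#define CALLOUT_SHAREDLOCK     0x0020

struct flag_name {
	int bit;
	const char *name;
};

static const struct flag_name callout_flag_names[] = {
	{ CALLOUT_LOCAL_ALLOC,    "LOCAL_ALLOC" },
	{ CALLOUT_ACTIVE,         "ACTIVE" },
	{ CALLOUT_PENDING,        "PENDING" },
	{ CALLOUT_MPSAFE,         "MPSAFE" },
	{ CALLOUT_RETURNUNLOCKED, "RETURNUNLOCKED" },
	{ CALLOUT_SHAREDLOCK,     "SHAREDLOCK" },
};

/*
 * Hypothetical helper: write the names of the bits set in `flags` into
 * `buf` (at least 1 byte long), separated by '|', and return `buf`.
 */
static char *
decode_callout_flags(int flags, char *buf, size_t len)
{
	size_t i;
	size_t n = sizeof(callout_flag_names) / sizeof(callout_flag_names[0]);

	buf[0] = '\0';
	for (i = 0; i < n; i++) {
		if ((flags & callout_flag_names[i].bit) == 0)
			continue;
		if (buf[0] != '\0')
			strncat(buf, "|", len - strlen(buf) - 1);
		strncat(buf, callout_flag_names[i].name, len - strlen(buf) - 1);
	}
	return (buf);
}
```

Called with 22 and a 128-byte buffer, the helper names the active, pending and return-unlocked bits, which agrees with the reading given in the quoted text under the assumed flag values.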
[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)
hiren added a comment. >>! In D1711#92, @rrs wrote: > Hiren: > > There also should have been a printf before the panic string > printf( "spin lock %p (%s) held by %p (tid %d) too long\n", > m, m->lock_object.lo_name, td, td->td_tid); > > Can we see what that lovely printf has displayed? Ah, my bad for not providing that earlier here: spin lock 0x81364180 (callout) held by 0xf8000dc0e920 (tid 100111) too long panic: spin lock held too long Now, (kgdb) print *(struct callout *)0x81364180 $8 = {c_links = {le = {le_next = 0x80d03d28, le_prev = 0xb}, sle = {sle_next = 0x80d03d28}, tqe = {tqe_next = 0x80d03d28, tqe_prev = 0xb}}, c_time = 0, c_precision = 4, c_arg = 0x0, c_func = 0, c_lock = 0x0, c_flags = 0, c_cpu = 0} if I print it's le_next, that node's le_prev is not pointing to 0x81364180 (kgdb) print *(struct callout *)0x80d03d28 $9 = {c_links = {le = {le_next = 0x74756f6c6c6163, le_prev = 0x6f207265626d754e}, sle = {sle_next = 0x74756f6c6c6163}, tqe = {tqe_next = 0x74756f6c6c6163, tqe_prev = 0x6f207265626d754e}}, c_time = 7307497714779234406, c_precision = 7809632219779637363, c_arg = 0x61206c656568776c, c_func = 0x20657a697320646e, c_lock = 0x6f656d697420666f, c_flags = 690517109, c_cpu = 1701998624} Also, trying to print le_next or le_prev is not working: (kgdb) print *(struct callout *)0x6f207265626d754e Cannot access memory at address 0x6f207265626d754e (kgdb) print *(struct callout *)0x74756f6c6c6163 Cannot access memory at address 0x74756f6c6c6163 Is something wrong here or I am failing to understand this. (The latter has a higher probability) > > In theory the lo_name should be "callout" and the %p should point to > &cc_cpu[0].cc_lock > > Can we validate that these align correctly too? How do I validate it? 
REVISION DETAIL
  https://reviews.freebsd.org/D1711

To: rrs, gnn, rwatson, lstewart, jhb, kostikbel, sbruno, imp, adrian, hselasky
Cc: julian, hiren, jhb, kostikbel, emaste, delphij, neel, erj, freebsd-net
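One possible reading of the $9 output above is that the "pointers" are not pointers at all but ASCII text interpreted as integers, which would also explain why kgdb cannot dereference them. A minimal sketch that renders the hex fields as bytes, assuming a little-endian target (the dump in this thread is from amd64); the helper name as_ascii is made up for illustration:

```python
def as_ascii(word, size=8):
    """Render an integer as the little-endian bytes it was read from.

    Non-printable bytes are shown as '.'.
    """
    raw = word.to_bytes(size, "little")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# Field values copied from the $9 output of the mail above.
fields = [
    ("le_next", 0x74756F6C6C6163),
    ("le_prev", 0x6F207265626D754E),
    ("c_arg", 0x61206C656568776C),
    ("c_func", 0x20657A697320646E),
    ("c_lock", 0x6F656D697420666F),
]
for name, value in fields:
    print(f"{name:8s} {as_ascii(value)!r}")
```

le_next comes out as "callout" followed by a NUL byte, which suggests that 0x80d03d28 holds string data rather than a struct callout, i.e. the address was cast to the wrong type.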
[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)
hiren added a comment.

>>! In D1711#96, @rrs wrote:
> Hiren:
>
> You have the wrong structure type.
>
> In the printf before panic it is giving you the lock that was spinning.. that
> would be in the callout_cpu structure I bet.. I mis-told you in email.
>
> So if you did
>
>     print *(struct callout_cpu *)0x81364180
>
> It should show you our CPU structure .. and I believe the lock should be
> un-held owner = 4
> Either that or 0x81364180 does not equal
>
>     &cc_cpu[0]

Bah, right.

    (kgdb) print *(struct callout_cpu *)0x81364180
    $5 = {cc_lock = {lock_object = {lo_name = 0x80d03d28 "callout",
          lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4},
      cc_exec_entity = {{cc_curr = 0x0, ce_migration_func = 0,
          ce_migration_arg = 0x0, ce_migration_cpu = 64,
          ce_migration_time = 0, ce_migration_prec = 0,
          cc_cancel = false, cc_waiting = false},
        {cc_curr = 0x0, ce_migration_func = 0,
          ce_migration_arg = 0x0, ce_migration_cpu = 64,
          ce_migration_time = 0, ce_migration_prec = 0,
          cc_cancel = false, cc_waiting = false}},
      cc_next = 0x0, cc_callout = 0xfe6a4000, cc_callwheel = 0xfe7c6000,
      cc_expireq = {tqh_first = 0x0, tqh_last = 0x81364288},
      cc_callfree = {slh_first = 0xfe7c5240},
      cc_firstevent = 899380454888656, cc_lastscan = 899380454354416,
      cc_cookie = 0xf8000c34f100, cc_bucket = 31391,
      cc_ktr_event_name = "callwheel cpu 0\000\000\000\000"}

REVISION DETAIL
  https://reviews.freebsd.org/D1711

To: rrs, gnn, rwatson, lstewart, jhb, kostikbel, sbruno, imp, adrian, hselasky
Cc: julian, hiren, jhb, kostikbel, emaste, delphij, neel, erj, freebsd-net
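The "owner = 4" remark and the mtx_lock = 4 in the dump can be tied together with a small decoder. This is a sketch only: the flag constants are assumed from sys/sys/mutex.h of that era and should be checked against the source, and mtx_owner here is a Python stand-in for the kernel macro of the same name:

```python
# Low bits of the lock word are flags; values ASSUMED from sys/sys/mutex.h.
MTX_RECURSED = 0x1
MTX_CONTESTED = 0x2
MTX_UNOWNED = 0x4
MTX_FLAGMASK = MTX_RECURSED | MTX_CONTESTED | MTX_UNOWNED


def mtx_owner(mtx_lock):
    """Return the owning thread pointer, or None if nobody holds the lock."""
    owner = mtx_lock & ~MTX_FLAGMASK
    return owner or None


print(mtx_owner(4))                     # None: the lock word says "unowned"
print(hex(mtx_owner(0xF8000DC0E920)))   # thread pointer from the panic printf
```

Under those assumptions the dump shows a free lock, even though the printf before the panic named thread 0xf8000dc0e920 as the holder. That is consistent with the earlier guess that the owner released the lock just as the panic was decided.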
[Differential] [Commented On] D1711: Changes to the callout code to restore active semantics and also add a test-framework and test to validate the callout code (and potentially for use by other tests)
hiren added a comment.

Another panic from an almost *idle* box:

Sanitized panic #6

Dump header from device /dev/da0s1b
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 6525980672B (6223 MB)
  Blocksize: 512
  Dumptime: Thu Feb 19 06:16:57 2015
  Hostname: xx
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 10.1-STABLE-llnw12 #0: Fri Feb 13 02:22:48 MST 2015
    jason@:/usr/obj/usr/src/sys/SIXFOUR
  Panic String: spin lock held too long
  Dump Parity: 1313546413
  Bounds: 0
  Dump Status: good

Backtrace:

Reading symbols from /boot/kernel/cc_cubic.ko.symbols...done.
Loaded symbols for /boot/kernel/cc_cubic.ko.symbols
Reading symbols from /boot/kernel/cc_cdg.ko.symbols...done.
Loaded symbols for /boot/kernel/cc_cdg.ko.symbols
Reading symbols from /boot/kernel/h_ertt.ko.symbols...done.
Loaded symbols for /boot/kernel/h_ertt.ko.symbols
Reading symbols from /boot/kernel/ftcp.ko...done.
Loaded symbols for /boot/kernel/ftcp.ko
#0 doadump (textdump=1) at pcpu.h:219
 in pcpu.h
(kgdb) #0 doadump (textdump=1) at pcpu.h:219
#1 0x80749c17 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:452
#2 0x80749ff4 in panic (fmt=) at /usr/src/sys/kern/kern_shutdown.c:759
#3 0x80735683 in _mtx_lock_spin_cookie (c=, tid=, opts=, file=, line=)
    at /usr/src/sys/kern/kern_mutex.c:561
#4 0x80af3fc1 in smp_tlb_shootdown (vector=246, pmap=0x81391ae0,
    addr1=18446742009410568192, addr2=18446742009410572288)
    at /usr/src/sys/amd64/amd64/mp_machdep.c:1145
#5 0x80af5d3c in pmap_invalidate_range (pmap=, sva=, eva=)
    at /usr/src/sys/amd64/amd64/pmap.c:1480
#6 0x807d57ef in vfs_vmio_release (bp=0xfe1f298bd000)
    at /usr/src/sys/kern/vfs_bio.c:1861
#7 0x807d622b in getnewbuf (maxsize=, gbflags=)
    at /usr/src/sys/kern/vfs_bio.c:2149
#8 0x807d3791 in getblk (vp=0xf802a158f3b0, blkno=0, size=4096, slpflag=0,
    slptimeo=0, flags=) at /usr/src/sys/kern/vfs_bio.c:3210
#9 0x807d41dd in breadn_flags (vp=0xf802a158f3b0, blkno=0, size=0,
    rablkno=0x0, rabsize=0x0, cnt=0, cred=0xfe2020f6e670, flags=0,
    bpp=0xfe2020f6e670) at /usr/src/sys/kern/vfs_bio.c:1127
#10 0x8095c84a in ffs_blkatoff (vp=0x0, offset=0, res=0x0,
    bpp=0xfe2020f6e7f8) at /usr/src/sys/ufs/ffs/ffs_subr.c:86
#11 0x8096ef92 in ufs_readdir (ap=0xfe2020f6e900)
    at /usr/src/sys/ufs/ufs/ufs_vnops.c:2177
#12 0x80c2af07 in VOP_READDIR_APV (vop=, a=) at vnode_if.c:1821
#13 0x807f9aaa in kern_getdirentries (td=0xf800230a8000, fd=,
    buf=0x8022a9000 , count=, basep=0xfe2020f6e980, residp=0x0)
    at vnode_if.h:758
#14 0x807f9888 in sys_getdirentries (td=0x0, uap=0xfe2020f6ea40)
    at /usr/src/sys/kern/vfs_syscalls.c:4030
#15 0x80b059ca in amd64_syscall (td=0xf800230a8000, traced=0)
    at subr_syscall.c:134
#16 0x80aeae3b in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396
#17 0x000801588efa in ?? ()
Current language: auto; currently minimal
(kgdb)

@rrs This does not have your patch, so we won't get any more interesting data, but I wanted to show that we get the panic on almost idle boxes too. What does that tell us?

REVISION DETAIL
  https://reviews.freebsd.org/D1711

To: rrs, gnn, rwatson, lstewart, jhb, kostikbel, sbruno, imp, adrian, hselasky
Cc: julian, hiren, jhb, kostikbel, emaste, delphij, neel, erj, freebsd-net
[Differential] [Updated] D5872: tcp: Don't prematurely drop receiving-only connections
hiren added a comment.

In https://reviews.freebsd.org/D5872#127123, @jtl wrote:

> The key feature that makes the retransmit timer inappropriate for an
> ACK-only case is that it is only stopped when we receive input; however, in
> the ACK-only case, we really want to stop it as soon as we transmit a
> successful ACK.

Indeed. I guess we want to treat an internal out-of-memory error with the retransmit-timer remedy. One could also ask: do you really want to carry on when you have failed to respond (with the ACK) that many times? Don't you have bigger problems by now?

> Of course, we could just drop the ACK and everything would "just work". But,
> it *probably* is still a good idea to try to re-transmit the ACK.

I am not opposed to the suggested patch, but it's just... weird. (Also, if it's not obvious, I don't have a better solution to present. :-))

REVISION DETAIL
  https://reviews.freebsd.org/D5872

To: sepherosa_gmail.com, network, glebius, lstewart, adrian, delphij, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, freebsd-net-list, transport, jtl, hiren
Cc: jtl
[Differential] [Commented On] D5872: tcp: Don't prematurely drop receiving-only connections
hiren added a comment.

In https://reviews.freebsd.org/D5872#127345, @jtl wrote:

> In https://reviews.freebsd.org/D5872#127343, @mike-karels.net wrote:
>
> > If we get an ENOBUFS when sending data, we will already be running the
> > retransmit timer.
>
> Good point, but see below.
>
> > If we drop an ACK on ENOBUFS, either we will receive more data and
> > attempt another ACK, or the sender will time out and resend data. Either
> > will get the connection started again.
>
> True, but it may take time if we have to wait for a retransmission time out.

That isn't necessarily a bad thing. But setting the retransmission timer on an ACK seems... wrong.

> > I believe lines 1552-1554 should simply be deleted.
>
> On the surface, that seems like a reasonable alternative. However, it's
> interesting to see why this code was originally added. It looks like it was
> added by jlemon in r61179. Here's what the commit message says:
>
> > When attempting to transmit a packet, if the system fails to allocate
> > a mbuf, it may return without setting any timers. If no more data is
> > scheduled to be transmitted (this was a FIN) the system will sit in
> > LAST_ACK state forever.
> >
> > Thus, when mbuf allocation fails, set the retransmit timer if neither
> > the retransmit or persist timer is already pending.
>
> Even back then, the retransmit and/or persist timers should have been
> running by this point in the code. It would be interesting to know if jlemon
> proved this code fixed the problem, or it was just speculated. We fixed a
> similar-sounding problem last year, and the code is different now, so these
> particular lines may truly no longer be necessary. But, I am a little
> hesitant to remove this code without knowing more about its origin.
>
> Just my 2c.

I guess this boils down to how you want to "deal" with this (possibly) transient memory error. I'd like to go with @mike-karels.net's suggestion: by the time we reach this point in the code, I don't see how the timers couldn't have been set already. And I doubt we can get any more info from that old commit :/

But because this is (sigh..) tcp, if we want to be really cautious, I like @jtl's patch of handling this for 'acks' with delack.

(In that patch, I am not sure why we want to drop cwnd down to 1 mss only in the case of data and not for acks. I think we should do that for both?)

(fwiw, other BSDs don't have this special handling, as Mark suggested.)

REVISION DETAIL
  https://reviews.freebsd.org/D5872

To: sepherosa_gmail.com, network, glebius, lstewart, adrian, delphij, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, freebsd-net-list, transport, jtl, hiren
Cc: mike-karels.net, jtl
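To make the disagreement concrete, here is a toy model of the behaviour the quoted commit message describes: on an mbuf allocation failure, arm the retransmit timer if neither it nor the persist timer is pending. Everything here is hypothetical scaffolding (the Conn class, the timer names as strings, the default sizes); it is not the code in tcp_output.c, and the collapse of cwnd to one segment is included only because the thread mentions it:

```python
from dataclasses import dataclass, field


@dataclass
class Conn:
    """Toy stand-in for a TCP connection; NOT the kernel's struct tcpcb."""
    maxseg: int = 1460
    cwnd: int = 14600
    timers: set = field(default_factory=set)


def on_enobufs(conn):
    """Model of the ENOBUFS handling under discussion (an assumption)."""
    # Per the quoted commit message: only arm the retransmit timer when
    # neither the retransmit nor the persist timer is already pending.
    if "rexmt" not in conn.timers and "persist" not in conn.timers:
        conn.timers.add("rexmt")
    # Mentioned in the thread: cwnd is dropped to a single segment.
    conn.cwnd = conn.maxseg


# A receive-only connection has no send-side timer running, so a failed
# ACK transmission arms the retransmit timer.
receiver = Conn()
on_enobufs(receiver)
print(receiver.timers, receiver.cwnd)
```

In this model a connection that only sends ACKs ends up with a retransmit timer that, per the discussion, is stopped only by incoming input. That is the case the review title calls "prematurely drop receiving-only connections".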
[Differential] D5872: tcp: Don't prematurely drop receiving-only connections
hiren added a comment.

Ack for removing ENOBUFs case.

REVISION DETAIL
  https://reviews.freebsd.org/D5872

To: sepherosa_gmail.com, network, glebius, adrian, delphij, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, freebsd-net-list, transport, jtl, hiren, lstewart
Cc: gnn, mike-karels.net, jtl
[Differential] D5872: tcp: Don't prematurely drop receiving-only connections
hiren added a comment.

In https://reviews.freebsd.org/D5872#128539, @lstewart wrote:

> ... but replace with a macro to check that the rexmit/persist timer is armed
> if appropriate!

Yes, that would be useful!

REVISION DETAIL
  https://reviews.freebsd.org/D5872

To: sepherosa_gmail.com, network, glebius, adrian, delphij, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, freebsd-net-list, transport, jtl, hiren, lstewart
Cc: gnn, mike-karels.net, jtl
[Differential] D5872: tcp: Don't prematurely drop receiving-only connections
hiren added a comment.

In https://reviews.freebsd.org/D5872#128555, @lstewart wrote:

> I thought that had been fixed ages ago... oops.

Fixed? i.e. doing something other than setting cwnd to 1 seg?

> It should be calling cc_cong_signal() with a new congestion type.

Hum... tcp_quench() used to be there, which essentially had this one line to set cwnd to 1 seg. Is there any (RFC) guidance for what to do in this situation?

REVISION DETAIL
  https://reviews.freebsd.org/D5872

To: sepherosa_gmail.com, network, glebius, adrian, delphij, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, freebsd-net-list, transport, jtl, hiren, lstewart
Cc: gnn, mike-karels.net, jtl
[Differential] D6120: tcp/syncache: Set flowid and hash type properly for SYN|ACK
hiren accepted this revision.
hiren added a comment.
This revision has a positive review.

I hope you'd write a bit more descriptive commit-log (not just 'what' but also 'why') for the change.

Thanks a lot for your work!

Cheers,
Hiren

INLINE COMMENTS
  sys/netinet/tcp_syncache.c:1507 Do you mind adding a line or two about what this function does in comments to improve readability? Thanks :-)

REVISION DETAIL
  https://reviews.freebsd.org/D6120

To: sepherosa_gmail.com, adrian, rwatson, gnn, lstewart, glebius, delphij, mike-karels.net, jtl, sbruno, hiren, transport, network
Cc: freebsd-net-list
Re: how to find out if an IP address is assigned statically or dynamically?
On Feb 11, 2013 10:44 PM, "h bagade" wrote:
>
> Hi all,
>
> I want to know if there is a way to find out if an interface address is
> assigned by dhcp or statically? For example, any distinctive flag or
> something like that on ifconfig output! or any other way except processing
> dhclient leases files?

As far as I know, no. The only reliable way is to look at the /var/db/dhclient.leases.* files, as you mentioned.

Hiren
Re: request
On Wed, Mar 13, 2013 at 12:02 AM, rathish mudaliar wrote:
> Hi,
>
> I request you to kindly add me to your mailing list.

http://lists.freebsd.org/mailman/listinfo/freebsd-net

This is where you can go and subscribe.

cheers,
Hiren

> Regards
> Rathish.R
Re: bce(4) panics, 9.2rc1
On Wed, Jul 24, 2013 at 2:23 PM, Sean Bruno wrote:
> On Wed, 2013-07-24 at 14:07 -0700, Sean Bruno wrote:
>> Running 9.2 in production load mail servers. We're hitting the
>> "watchdog" message and crashing with the stable/9 version. We're
>> reverting the change from 2 weeks ago and seeing if it still happens.
>> We didn't see this from stable/9 from about a month ago.
>>

pciconf -lvb: http://people.freebsd.org/~hiren/pciconf.txt

Thanks,
Hiren
Re: DTrace network providers
On Thu, Aug 22, 2013 at 8:36 AM, George Neville-Neil wrote: > > On Aug 21, 2013, at 1:00 , Mark Johnston wrote: > > > Hello! > > > > I've ported the ip, tcp and udp DTrace providers to FreeBSD, following > > the Solaris documentation here: > > > > https://wikis.oracle.com/display/DTrace/ip+Provider > > https://wikis.oracle.com/display/DTrace/tcp+Provider > > https://wikis.oracle.com/display/DTrace/udp+Provider > > > > My implementation of these providers makes use of dynamic translators, > > for which FreeBSD support was added in r254468; this patch won't compile > > with earlier revisions. The use of dynamic translators means that > > existing DTrace scripts which use these providers will just work when run > > on FreeBSD - no modifications needed. In particular, all of the examples > > in the links above will work properly on FreeBSD with my diff. > > > > I've collected a bunch of example scripts for these providers and placed > > them here: > > > > http://people.freebsd.org/~markj/dtrace/network-providers/ > > > > To run one you just need to execute "dtrace -s
Re: DTrace network providers
On Thu, Aug 22, 2013 at 3:05 PM, Yuri wrote:
> On 08/20/2013 22:00, Mark Johnston wrote:
>>
>> The patch is here:
>>
>> http://people.freebsd.org/~markj/patches/network-providers/network-providers-1.diff
>>
>> It depends on r254468. To use it, just recompile the kernel (assuming
>
> Your patch fails to apply, see below.

fwiw, I updated my laptop to CURRENT r254665 (from an hour ago) and the patch applied cleanly.

cheers,
hiren

> I use clean r254468 as you suggested.
>
> Yuri
>
> --
> |diff --git a/sys/netinet/ip_output.c b/sys/netinet/ip_output.c
> |index 0a87e7a..15196e0 100644
> |--- a/sys/netinet/ip_output.c
> |+++ b/sys/netinet/ip_output.c
> --
> Patching file a/sys/netinet/ip_output.c using Plan A...
> Hunk #1 succeeded at 34.
> Hunk #2 succeeded at 48.
> Hunk #3 succeeded at 66.
> Hunk #4 failed at 625.
> Hunk #5 succeeded at 660 with fuzz 2.
> 1 out of 5 hunks failed--saving rejects to a/sys/netinet/ip_output.c.rej
>
> --
> |diff --git a/sys/netinet6/nd6.c b/sys/netinet6/nd6.c
> |index 7755da1..9eaa0aa 100644
> |--- a/sys/netinet6/nd6.c
> |+++ b/sys/netinet6/nd6.c
> --
> Patching file a/sys/netinet6/nd6.c using Plan A...
> Hunk #1 succeeded at 34.
> Hunk #2 succeeded at 51.
> Hunk #3 succeeded at 64.
> Hunk #4 failed at 2086.
> 1 out of 4 hunks failed--saving rejects to a/sys/netinet6/nd6.c.rej
> Hmm... The next patch looks like a unified diff to me...
> The text leading up to this was:
Re: DTrace network providers
On Thu, Aug 22, 2013 at 10:25 AM, hiren panchasara < hiren.panchas...@gmail.com> wrote: > > > > On Thu, Aug 22, 2013 at 8:36 AM, George Neville-Neil wrote: > >> >> On Aug 21, 2013, at 1:00 , Mark Johnston wrote: >> >> > Hello! >> > >> > I've ported the ip, tcp and udp DTrace providers to FreeBSD, following >> > the Solaris documentation here: >> > >> > https://wikis.oracle.com/display/DTrace/ip+Provider >> > https://wikis.oracle.com/display/DTrace/tcp+Provider >> > https://wikis.oracle.com/display/DTrace/udp+Provider >> > >> > My implementation of these providers makes use of dynamic translators, >> > for which FreeBSD support was added in r254468; this patch won't compile >> > with earlier revisions. The use of dynamic translators means that >> > existing DTrace scripts which use these providers will just work when >> run >> > on FreeBSD - no modifications needed. In particular, all of the examples >> > in the links above will work properly on FreeBSD with my diff. >> > >> > I've collected a bunch of example scripts for these providers and placed >> > them here: >> > >> > http://people.freebsd.org/~markj/dtrace/network-providers/ >> > >> > To run one you just need to execute "dtrace -s
Re: mbuf autotuning changes
On Fri, Sep 6, 2013 at 12:14 PM, Alfred Perlstein wrote:
> On 9/6/13 12:10 PM, hiren panchasara wrote:
>
>> tunable_mbinit() in kern_mbuf.c looks like this:
>>
>> 119 /*
>> 120  * The default limit for all mbuf related memory is 1/2 of all
>> 121  * available kernel memory (physical or kmem).
>> 122  * At most it can be 3/4 of available kernel memory.
>> 123  */
>> 124 realmem = qmin((quad_t)physmem * PAGE_SIZE,
>> 125     vm_map_max(kmem_map) - vm_map_min(kmem_map));
>> 126 maxmbufmem = realmem / 2;
>> 127 TUNABLE_QUAD_FETCH("kern.ipc.maxmbufmem", &maxmbufmem);
>> 128 if (maxmbufmem > realmem / 4 * 3)
>> 129     maxmbufmem = realmem / 4 * 3;
>>
>> If I am reading the code correctly, we lose the value on line 126 when we
>> do FETCH on line 127.
>>
>> And after line 127, if we haven't specified kern.ipc.maxmbufmem (in
>> loader.conf - I guess...), we set that value to 0.
>>
>> And because of that the if condition on line 128 is almost always false?
>>
>> What am I missing here?
>>
>> Thanks,
>> Hiren
>
> I think TUNABLE_*_FETCH will only write to the variable if it is explicitly
> set.
>
> Meaning, unless the user actually sets a value in loader.conf then 127 is
> a no-op.

Thanks Navdeep and Alfred.

That's correct. It's not touching the variable if the tunable is not set.

I guess the other TUNABLE_INT_FETCHs later in the function, which check for variable == 0, confused me, e.g. nmbclusters:

131 TUNABLE_INT_FETCH("kern.ipc.nmbclusters", &nmbclusters);
132 if (nmbclusters == 0)
133     nmbclusters = maxmbufmem / MCLBYTES / 4;

But those are global variables, so here we are just checking whether they have been explicitly set or not. If not, we set them.

For maxmbufmem, we set it to 1/2 of realmem, and if the user sets it explicitly then we make sure it's not more than 3/4 of realmem.

Thanks again.
Hiren
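The semantics settled on above (the fetch leaves the variable alone when the tunable is unset, and the 3/4 cap only bites for explicit settings) can be modelled in a few lines. This is a sketch under stated assumptions: env is a plain dict standing in for the kernel environment, tunable_fetch and tune_maxmbufmem are hypothetical names, and the suffix parsing that the real getenv_quad() does is left out:

```python
def tunable_fetch(env, name, current):
    """Return the tunable's value if it is set, else keep the current value."""
    if name in env:
        return int(env[name])
    return current


def tune_maxmbufmem(realmem, env):
    """Model of lines 126-129 of the quoted tunable_mbinit() snippet."""
    maxmbufmem = realmem // 2                      # default: half of realmem
    maxmbufmem = tunable_fetch(env, "kern.ipc.maxmbufmem", maxmbufmem)
    cap = realmem // 4 * 3                         # never more than 3/4
    if maxmbufmem > cap:
        maxmbufmem = cap
    return maxmbufmem


print(tune_maxmbufmem(1000, {}))                                # 500
print(tune_maxmbufmem(1000, {"kern.ipc.maxmbufmem": "900"}))    # 750
```

With nothing set, the default of half survives the fetch, so the comparison on line 128 is false because 1/2 is below 3/4, not because the value was zeroed.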
mbuf autotuning changes
tunable_mbinit() in kern_mbuf.c looks like this:

119 /*
120  * The default limit for all mbuf related memory is 1/2 of all
121  * available kernel memory (physical or kmem).
122  * At most it can be 3/4 of available kernel memory.
123  */
124 realmem = qmin((quad_t)physmem * PAGE_SIZE,
125     vm_map_max(kmem_map) - vm_map_min(kmem_map));
126 maxmbufmem = realmem / 2;
127 TUNABLE_QUAD_FETCH("kern.ipc.maxmbufmem", &maxmbufmem);
128 if (maxmbufmem > realmem / 4 * 3)
129     maxmbufmem = realmem / 4 * 3;

If I am reading the code correctly, we lose the value on line 126 when we do FETCH on line 127.

And after line 127, if we haven't specified kern.ipc.maxmbufmem (in loader.conf - I guess...), we set that value to 0.

And because of that the if condition on line 128 is almost always false?

What am I missing here?

Thanks,
Hiren
Re: mbuf autotuning changes
On Fri, Sep 6, 2013 at 12:38 PM, Alfred Perlstein wrote:
> On 9/6/13 12:36 PM, hiren panchasara wrote:
>
>> Thanks Navdeep and Alfred.
>>
>> That's correct. It's not touching the var if it's not set.
>>
>> I guess the other TUNABLE_INT_FETCHs later in the function checking for
>> variable == 0 confused me, i.e. nmbclusters.
>>
>> 131 TUNABLE_INT_FETCH("kern.ipc.nmbclusters", &nmbclusters);
>> 132 if (nmbclusters == 0)
>> 133     nmbclusters = maxmbufmem / MCLBYTES / 4;
>>
>> But those are global variables, so here we are just checking if they are
>> explicitly set or not. If not, we will set them.
>>
>> For maxmbufmem, we will set it to 1/2 the realmem, and if the user sets it
>> explicitly then we will make sure it's not more than 3/4 of the realmem.
>
> Yes. It's somewhat confusing.
>
> I'm all for adding comments to this effect if you have the time and
> inclination.

I guess it's verbose enough in kern_mbuf.c. I just had to *actually* read getenv_quad() to see that it's not setting the variable to 0; the 0 was just the return value.

We can probably do:

[hirenp@wholecorner ~/commit_head/sys/kern]$ svn diff
Index: kern_environment.c
===================================================================
--- kern_environment.c (revision 255320)
+++ kern_environment.c (working copy)
@@ -530,7 +530,8 @@
 }
 
 /*
- * Return a quad_t value from an environment variable.
+ * Return a quad_t value from an environment variable inside "data".
+ * If the environment variable is not set, "data" will be unchanged.
  */
 int
 getenv_quad(const char *name, quad_t *data)

cheers,
Hiren
mbuf autotuning effect
We are seeing an interesting thing on a mips board with 32MB of RAM.

We run out of mbufs very easily, and looking at the numbers it seems we are only getting 6MB of maxmbufmem.

# sysctl -a | grep hw | grep mem
hw.physmem: 33554432
hw.usermem: 21774336
hw.realmem: 33554432
#
# sysctl -a | grep maxmbuf
kern.ipc.maxmbufmem: 6291456

I believe that number is very low for a board with 32MB of RAM.

Looking at the code:

sys/kern/kern_mbuf.c : tunable_mbinit()

    realmem = qmin((quad_t)physmem * PAGE_SIZE, vm_kmem_size);
    maxmbufmem = realmem / 2;
    TUNABLE_QUAD_FETCH("kern.ipc.maxmbufmem", &maxmbufmem);
    if (maxmbufmem > realmem / 4 * 3)
        maxmbufmem = realmem / 4 * 3;

So realmem plays an important role in determining maxmbufmem.

physmem = 32MB
PAGE_SIZE = 4096

vm_kmem_size is calculated inside sys/kern/kern_malloc.c : kmeminit()

    vm_kmem_size = VM_KMEM_SIZE + nmbclusters * PAGE_SIZE;
    mem_size = cnt.v_page_count;

    #if defined(VM_KMEM_SIZE_SCALE)
    vm_kmem_size_scale = VM_KMEM_SIZE_SCALE;
    #endif
    TUNABLE_INT_FETCH("vm.kmem_size_scale", &vm_kmem_size_scale);
    if (vm_kmem_size_scale > 0 &&
        (mem_size / vm_kmem_size_scale) > (vm_kmem_size / PAGE_SIZE))
        vm_kmem_size = (mem_size / vm_kmem_size_scale) * PAGE_SIZE;

Here,
VM_KMEM_SIZE = 12*1024*1024
nmbclusters = 0 (initially)
PAGE_SIZE = 4096

# sysctl -a | grep v_page_count
vm.stats.vm.v_page_count: 7035

and VM_KMEM_SIZE_SCALE = 3 for mips.

So, vm_kmem_size = 12MB.

Going back to tunable_mbinit(), we get realmem = 12MB and maxmbufmem = 6MB.

I wanted to check whether I am following the code correctly, and how autotuning should work here.

cheers,
Hiren
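The walk-through above can be replayed with the numbers from the sysctl output. A minimal sketch that models only the two quoted snippets (the real kmeminit() has further clamps that are not shown in the mail, so treat this as an approximation); the function names are illustrative:

```python
PAGE_SIZE = 4096
VM_KMEM_SIZE = 12 * 1024 * 1024


def kmem_size(page_count, scale, nmbclusters=0):
    """Model of the quoted kmeminit() lines: scale up only if it is larger."""
    size = VM_KMEM_SIZE + nmbclusters * PAGE_SIZE
    if scale > 0 and page_count // scale > size // PAGE_SIZE:
        size = (page_count // scale) * PAGE_SIZE
    return size


def default_maxmbufmem(physmem_bytes, kmem):
    """Model of tunable_mbinit() with no loader.conf override."""
    realmem = min(physmem_bytes, kmem)
    return realmem // 2


# Values from the mail: 7035 pages, scale 3 (mips), hw.physmem 33554432.
kmem = kmem_size(page_count=7035, scale=3)
print(kmem, default_maxmbufmem(33554432, kmem))   # 12582912 6291456
```

7035 / 3 = 2345 pages is less than the 3072 pages in 12MB, so the scale branch is not taken and vm_kmem_size stays at VM_KMEM_SIZE. Half of that is 6291456, which matches the kern.ipc.maxmbufmem reported by sysctl.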
Re: mbuf autotuning effect
On Sep 6, 2013 8:26 PM, "Warner Losh" wrote:
>
> On Sep 6, 2013, at 7:11 PM, Adrian Chadd wrote:
>
> > Yeah, why is VM_KMEM_SIZE only 12mbyte for MIPS? That's a little low for a
> > platform that has a direct map that's slightly larger than 12mb :)
> >
> > Warner? Juli?
>
> All architectures have it at 12MB, except sparc64 where it is 16MB. This can
> be changed with the options VM_KMEM_SIZE=x in the config file.

Right. Does that mean that, on any platform, if we do not have nmbclusters pre-set in kmeminit(), then we will always have a pretty low value of vm_kmem_size? And because of that, if maxmbufmem is not pre-set (via loader.conf) inside tunable_mbinit(), we will have a very low value for maxmbufmem too.

I hope (and partially believe) that my understanding is not entirely correct, because if it is, we are depending on loader.conf instead of actually auto-tuning.

Thanks,
Hiren

> So my guess as to why this is the case: cut and paste worked, so nobody
> changed it after that.
>
> # Still need to read hiren's email to comprehend it...
>
> Warner
Re: mbuf autotuning effect
On Sat, Sep 7, 2013 at 12:56 PM, Ian Lepore wrote:

> On Sat, 2013-09-07 at 12:21 -0700, hiren panchasara wrote:
> > On Sep 6, 2013 8:26 PM, "Warner Losh" wrote:
> > >
> > > On Sep 6, 2013, at 7:11 PM, Adrian Chadd wrote:
> > >
> > > > Yeah, why is VM_KMEM_SIZE only 12mbyte for MIPS? That's a little low for a
> > > > platform that has a direct map that's slightly larger than 12mb :)
> > > >
> > > > Warner? Juli?
> > >
> > > All architectures have it at 12MB, except sparc64 where it is 16MB. This
> > > can be changed with the options VM_KMEM_SIZE=x in the config file.
> >
> > Right. Does that mean that, for any platform, if we do not have nmbclusters
> > pre-set in kmeminit(), then we will always have a pretty low value of
> > vm_kmem_size? And because of that, if maxmbufmem is not pre-set (via
> > loader.conf) inside tunable_mbinit(), we will have a very low value for
> > maxmbufmem too.
> >
> > I hope (partially believe) that my understanding is not entirely correct.
> > Because if it's correct, we are depending on loader.conf instead of
> > actually auto tuning.
>
> I think the part of this that strikes me as strange is calling 20% of
> physical memory used for network buffers a "very low value". It seems
> outrageously high to me. I'd be pissed if that much memory got wasted
> on network buffers on one of our $work platforms with so little memory.

Interesting.

So here is how it looks on my laptop running amd64 GENERIC (without any special loader.conf settings):

flymockour-l7% uname -a
FreeBSD flymockour-l7.corp.yahoo.com 10.0-CURRENT FreeBSD 10.0-CURRENT #1 r253512M: Sat Jul 20 23:00:51 PDT 2013 hir...@flymockour-l7.corp.yahoo.com:/usr/obj/usr/home/hirenp/head/sys/GENERIC amd64

flymockour-l7% sysctl -a | grep hw| grep mem
hw.physmem: 8496877568
hw.usermem: 3538432000
hw.realmem: 9093251072

flymockour-l7% sysctl kern.ipc.maxmbufmem
kern.ipc.maxmbufmem: 4132540416

flymockour-l7% sysctl -a | grep vm.kmem_
vm.kmem_size: 8265080832
vm.kmem_size_min: 0
vm.kmem_size_max: 329853485875
vm.kmem_size_scale: 1
vm.kmem_map_size: 1380515840
vm.kmem_map_free: 5796265984

VM_KMEM_SIZE_SCALE is 1 for amd64 but 3 for mips, which might be one reason.

> So the fact that you think it's crazy-low and I think it's crazy-high
> may be a sign that it's auto-tuned to a reasonable compromise, and in
> both our cases the right fix would be to use the available knobs to tune
> things for our particular uses.

I am pretty ignorant about what the value _should_ be. I will try to find out more.

cheers,
Hiren
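As a quick sanity check on the amd64 numbers above, the ratios can be worked out in a few lines of Python. This is only arithmetic on the sysctl values posted in this mail; it does not claim to reproduce how the kernel derives them, and the helper name `fraction` is made up for the illustration.

```python
# Values copied from the sysctl output above (all in bytes).
physmem = 8496877568      # hw.physmem
kmem_size = 8265080832    # vm.kmem_size
maxmbufmem = 4132540416   # kern.ipc.maxmbufmem


def fraction(part: int, whole: int) -> float:
    """Return part/whole as a float (hypothetical helper for this sketch)."""
    return part / whole


# On this machine maxmbufmem comes out at half of vm.kmem_size.
print(f"maxmbufmem / vm.kmem_size = {fraction(maxmbufmem, kmem_size):.3f}")
# ...and a bit under half of physical memory.
print(f"maxmbufmem / hw.physmem   = {fraction(maxmbufmem, physmem):.3f}")
```

So on this particular box the mbuf limit tracks vm.kmem_size, which is consistent with the kmem scale factor being the thing to look at.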
Re: mbuf autotuning effect
On Sat, Sep 7, 2013 at 1:39 PM, Adrian Chadd wrote:

> On 7 September 2013 12:56, Ian Lepore wrote:
>
>> I think the part of this that strikes me as strange is calling 20% of
>> physical memory used for network buffers a "very low value". It seems
>> outrageously high to me. I'd be pissed if that much memory got wasted
>> on network buffers on one of our $work platforms with so little memory.
>>
>> So the fact that you think it's crazy-low and I think it's crazy-high
>> may be a sign that it's auto-tuned to a reasonable compromise, and in
>> both our cases the right fix would be to use the available knobs to tune
>> things for our particular uses.
>
> Well, which limit is actually being hit here? 20% of 32mb is still a lot
> of memory buffers..
>
> Now, for sizing up the needed buffers for wifi:
>
> assuming 512 tx, 512 rx buffers for two ath NICs.
>
> another 512+512 buffers for each arge NICs.
>
> So, 4096 mbufs here, 2k each, so ~ 8mb of RAM.

And we are only getting 6mb of maxmbufmem with the current setup.

Index: mips/include/vmparam.h
===================================================================
--- mips/include/vmparam.h (revision 255320)
+++ mips/include/vmparam.h (working copy)
@@ -119,7 +119,7 @@
  * is the total KVA space allocated for kmem_map.
  */
 #ifndef VM_KMEM_SIZE_SCALE
-#define VM_KMEM_SIZE_SCALE (3)
+#define VM_KMEM_SIZE_SCALE (1)
 #endif

 /*

As I mentioned in another reply in the same thread, VM_KMEM_SIZE_SCALE is 1 for amd64. If I do the same for mips as above, we get:

# sysctl -a | grep maxmbuf
kern.ipc.maxmbufmem: 14407680

Now, whether we want this much RAM assigned to mbufs is another question.

cheers,
Hiren
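The buffer estimate quoted above can be spelled out as a small Python sketch. It assumes two arge NICs (the quoted text only says "each arge NICs", but two are needed to reach the 4096 figure) and takes "2k each" as 2048 bytes; the function name `mbufs_needed` is invented for the example.

```python
MBUF_CLUSTER_BYTES = 2048  # "2k each" from the estimate above


def mbufs_needed(nics: int, tx: int = 512, rx: int = 512) -> int:
    """Buffers needed for `nics` interfaces with tx+rx rings each."""
    return nics * (tx + rx)


ath_mbufs = mbufs_needed(2)    # two ath NICs, as stated
arge_mbufs = mbufs_needed(2)   # assumption: two arge NICs
total_mbufs = ath_mbufs + arge_mbufs
needed_bytes = total_mbufs * MBUF_CLUSTER_BYTES

# kern.ipc.maxmbufmem reported with VM_KMEM_SIZE_SCALE set to 1
maxmbufmem_scale1 = 14407680

print(f"mbufs needed: {total_mbufs}")
print(f"memory needed: {needed_bytes / 2**20:.1f} MiB")
print(f"maxmbufmem (scale 1): {maxmbufmem_scale1 / 2**20:.1f} MiB")
print(f"fits: {maxmbufmem_scale1 >= needed_bytes}")
```

Under these assumptions the ring buffers alone want about 8 MiB, which does not fit in roughly 6 MB but does fit in the value seen with the scale set to 1.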
Exposing sysctls for ixgbe
$ sysctl hw.igb
hw.igb.rxd: 4096
hw.igb.txd: 4096
hw.igb.enable_aim: 1
hw.igb.enable_msix: 1
hw.igb.max_interrupt_rate: 8000
hw.igb.buf_ring_size: 4096
hw.igb.header_split: 0
hw.igb.num_queues: 1
hw.igb.rx_process_limit: 100

$ sysctl hw.ix
sysctl: unknown oid 'hw.ix': No such file or directory

I thought it would be nice to have these things exposed for ixgbe as well, so I copied them from igb:
http://people.freebsd.org/~hiren/ixgbe_sysctls.txt

The changes to if_igb.c expose the correct auto-tuned value of "hw.igb.num_queues" on a running system, which is not the case right now.

Thanks to markj@ for help/pointers.

Please let me know if the diffs look okay.

cheers,
Hiren
netmap: understanding pkg-gen.c
I am new to netmap, so I thought I would confirm my understanding.

I was trying to associate an ixgbe interface with netmap and blast packets to it using tools/tools/netmap/pkt-gen.c.

My setup looks like this:

FreeBSD 10.0-ALPHA1 #2: Sun Sep 22 18:08:18 UTC 2013

-bash-4.2$ sysctl hw.physmem
hw.physmem: 51505954816
-bash-4.2$ sysctl hw.ncpu
hw.ncpu: 24
-bash-4.2$ sysctl hw.ix
hw.ix.enable_aim: 1
hw.ix.max_interrupt_rate: 31250
hw.ix.rx_process_limit: 256
hw.ix.tx_process_limit: 256
hw.ix.enable_msix: 1
hw.ix.num_queues: 8
hw.ix.txd: 2048
hw.ix.rxd: 2048
-bash-4.2$

On this box, I have 2 interfaces, igb0 (10.73.149.28) and ix1 (10.73.149.17), and this is how I am using the binary:

-bash-4.2$ sudo ./pkt-gen -i ix1 -f tx -n 1 -c 8 -p 8 -d 10.73.149.17 -s 10.73.149.28
extract_ip_range [143] extract IP range from 10.73.149.28
extract_ip_range [178] range is 10.73.149.28 0 to 10.73.149.28 0
extract_ip_range [143] extract IP range from 10.73.149.17
extract_ip_range [178] range is 10.73.149.17 0 to 10.73.149.17 0
extract_mac_range [184] extract MAC range from 90:e2:ba:30:68:c5
extract_mac_range [199] 90:e2:ba:30:68:c5 starts at 90:e2:ba:30:68:c5
extract_mac_range [184] extract MAC range from ff:ff:ff:ff:ff:ff
extract_mac_range [199] ff:ff:ff:ff:ff:ff starts at ff:ff:ff:ff:ff:ff
main [1530] map size is 334980 Kb
main [1552] mapping 334980 Kbytes
Sending on ix1: 8 queues, 8 threads and 8 cpus.
10.73.149.28 -> 10.73.149.17 (90:e2:ba:30:68:c5 -> ff:ff:ff:ff:ff:ff)
main [1622] Sending 512 packets every 0.0 ns
main [1624] Wait 2 secs for phy reset
main [1626] Ready...
sender_body [775] start
sender_body [775] start
sender_body [775] start
sender_body [775] start
sender_body [775] start
sender_body [775] start
sender_body [775] start
sender_body [775] start
sender_body [848] drop copy
sender_body [848] drop copy
sender_body [848] drop copy
sender_body [848] drop copy
sender_body [848] drop copy
sender_body [848] drop copy
sender_body [848] drop copy
sender_body [848] drop copy
main_thread [1192] 15024157 pps (15050104 pkts in 1001727 usec)
main_thread [1192] 14882290 pps (14900223 pkts in 1001205 usec)
main_thread [1192] 14879515 pps (14903798 pkts in 1001632 usec)
main_thread [1192] 14880924 pps (15795952 pkts in 1061490 usec)
main_thread [1192] 14881411 pps (15821633 pkts in 1063181 usec)
main_thread [1192] 14880095 pps (15427549 pkts in 1036791 usec)
main_thread [1192] 7986707 pps (8100741 pkts in 1014278 usec)
Sent 1 packets, 60 bytes each, in 6.71 seconds.
Speed: 14.90 Mpps Bandwidth: 7.15 Gbps (raw 10.01 Gbps)

$ top -H shows:

25853 root   24   0  348M 35692K select 16  0:01  7.28% pkt-gen{pkt-gen}
25853 root   25   0  348M 35692K select 21  0:01  7.28% pkt-gen{pkt-gen}
25853 root   25   0  348M 35692K select 15  0:01  7.28% pkt-gen{pkt-gen}
25853 root   23   0  348M 35692K select 12  0:01  6.40% pkt-gen{pkt-gen}
25853 root   24   0  348M 35692K select 22  0:01  6.30% pkt-gen{pkt-gen}
25853 root   23   0  348M 35692K select  9  0:01  5.66% pkt-gen{pkt-gen}
25853 root   23   0  348M 35692K select  7  0:01  5.57% pkt-gen{pkt-gen}
25853 root   23   0  348M 35692K CPU0    0  0:01  5.27% pkt-gen{pkt-gen}
   12 root  -92   -    0K  1184K WAIT    6  0:10  5.08% intr{irq290: ix1:que }
   12 root  -92   -    0K  1184K WAIT    7  0:09  4.98% intr{irq291: ix1:que }
   12 root  -92   -    0K  1184K WAIT    5  0:08  4.79% intr{irq289: ix1:que }
   12 root  -92   -    0K  1184K WAIT    2  0:15  4.69% intr{irq286: ix1:que }
   12 root  -92   -    0K  1184K WAIT    3  0:08  4.69% intr{irq287: ix1:que }
   12 root  -92   -    0K  1184K WAIT    4  0:14  4.59% intr{irq288: ix1:que }
   12 root  -92   -    0K  1184K WAIT    1  0:06  3.96% intr{irq285: ix1:que }
   12 root  -92   -    0K  1184K WAIT    0  0:05  0.98% intr{irq284: ix1:que }

I can only specify -p (threads) up to 8 because it cannot be more than hw.ix.num_queues=8. Is that correct?

Cheers,
Hiren
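For reference, the roughly 14.88 Mpps figure in the pkt-gen output is what 10 Gbps line rate works out to for 60-byte frames. The sketch below is plain arithmetic in Python; it assumes standard Ethernet per-frame overhead (4 bytes CRC, 8 bytes preamble/SFD, 12 bytes inter-frame gap), which is not stated anywhere in the output itself, and the function name `line_rate_pps` is made up for the example.

```python
# Assumed standard Ethernet overhead per frame, in bytes:
# 4 (CRC) + 8 (preamble + SFD) + 12 (inter-frame gap)
ETH_OVERHEAD_BYTES = 4 + 8 + 12


def line_rate_pps(link_bps: float, frame_bytes: int) -> float:
    """Maximum packets per second for frames of `frame_bytes` (without CRC)."""
    bits_on_wire = (frame_bytes + ETH_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


pps = line_rate_pps(10e9, 60)
payload_gbps = pps * 60 * 8 / 1e9

print(f"line rate: {pps / 1e6:.2f} Mpps")
print(f"payload bandwidth: {payload_gbps:.2f} Gbps")
```

Under that assumption the per-second rates reported by main_thread sit right at the theoretical limit, and the gap between the "Bandwidth" and "raw" figures is the 24 bytes of per-frame overhead.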
Re: Exposing sysctls for ixgbe
Any comments?

cheers,
Hiren

On Sun, Sep 22, 2013 at 12:01 PM, hiren panchasara wrote:

> $ sysctl hw.igb
> hw.igb.rxd: 4096
> hw.igb.txd: 4096
> hw.igb.enable_aim: 1
> hw.igb.enable_msix: 1
> hw.igb.max_interrupt_rate: 8000
> hw.igb.buf_ring_size: 4096
> hw.igb.header_split: 0
> hw.igb.num_queues: 1
> hw.igb.rx_process_limit: 100
> $ sysctl hw.ix
> sysctl: unknown oid 'hw.ix': No such file or directory
>
> I thought it would be nice to have these things exposed. So I copied them
> from igb:
> http://people.freebsd.org/~hiren/ixgbe_sysctls.txt
>
> Changes for if_igb.c is to expose correct auto-tuned value for a running
> system for "hw.igb.num_queues", which is not the case right now.
>
> Thanks to markj@ for help/pointers.
>
> Please let me know if the diffs look okay.
>
> cheers,
> Hiren
netmap: traffic distribution
I am providing line-rate traffic (via pkt-gen) to my 10gig ix interface.

Now, on the receiving side, is there a way to sub-divide the traffic into multiple workloads using netmap?

For example, can I get two 5G flows from 10Gbps traffic?

cheers,
Hiren
Re: netmap: traffic distribution
On Wed, Sep 25, 2013 at 1:22 AM, Luigi Rizzo wrote:

> On Wed, Sep 25, 2013 at 10:07 AM, hiren panchasara wrote:
> >
> > I am providing line rate traffic (via pkg-gen.c) to my 10gig ix interface.
> >
> > Now on receiving side, is there a way to sub-divide the traffic into
> > multiple workloads using netmap?
> >
> > For example, can I get two 5G flows from 10Gbps traffic?
>
> not directly. You'd need to send packets with different addresses that match
> the way the filters on the NIC (RSS or similar) are programmed.

Thanks for the quick response, Luigi.

So, does FreeBSD need a PF_RING-like thing? Is there any other way we can do it?

Someone pointed me to multiqueue bpf too. I am not sure if I can use that to achieve what I am looking for.

cheers,
Hiren
Re: netmap: traffic distribution
On Wed, Sep 25, 2013 at 2:05 AM, Luigi Rizzo wrote:

> On Wed, Sep 25, 2013 at 10:53 AM, hiren panchasara wrote:
> >
> > On Wed, Sep 25, 2013 at 1:22 AM, Luigi Rizzo wrote:
> >>
> >> On Wed, Sep 25, 2013 at 10:07 AM, hiren panchasara wrote:
> >> >
> >> > I am providing line rate traffic (via pkg-gen.c) to my 10gig ix
> >> > interface.
> >> >
> >> > Now on receiving side, is there a way to sub-divide the traffic into
> >> > multiple workloads using netmap?
> >> >
> >> > For example, can I get two 5G flows from 10Gbps traffic?
> >>
> >> not directly. You'd need to send packets with different addresses that
> >> match the way the filters on the NIC (RSS or similar) are programmed.
> >
> > Thanks for quick responses, Liugi.
> >
> > So, FreeBSD needs PF_RING like thing? Any other way we can do it?
>
> no,
> PF_RING does nothing more than netmap.

Okay.

> the partitioning of traffic into queues is done by the NIC's hardware,
> through some filters that i mentioned and are NIC specific.
> They are often named RSS (receive side scaling), RFS (receive flow steering),
> Flow Director, and so on. Some NICs compute a hash of various header fields
> and use the result to direct packets to specific queues. Others have
> "exact match" filters where you can map certain mac headers to
> specific queues, and so on.

Alright. I will investigate more about RSS/RFS for ixgbe.

Thanks a bunch,
Hiren

> A software demultiplexer that sits on top of a netmap ring
> may certainly be useful, but i have not yet designed it.
>
> cheers
> luigi
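To make the "hash of various header fields" idea above concrete, here is a rough Python sketch of hash-based queue selection. It is only an illustration: real NICs typically use a Toeplitz hash plus an indirection table, whereas this uses CRC32 as a stand-in, and the function name `pick_queue` is invented for the example.

```python
import zlib
from collections import Counter


def pick_queue(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
               proto: str, num_queues: int) -> int:
    """Map a flow's 5-tuple to a queue index (CRC32 stand-in, not Toeplitz)."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}-{proto}".encode()
    return zlib.crc32(key) % num_queues


NUM_QUEUES = 8

# Identical headers on every packet: everything lands on a single queue.
same_flow = Counter(
    pick_queue("10.73.149.28", 1234, "10.73.149.17", 80, "udp", NUM_QUEUES)
    for _ in range(1000)
)

# Varying the source port gives the hash something to spread on.
many_flows = Counter(
    pick_queue("10.73.149.28", port, "10.73.149.17", 80, "udp", NUM_QUEUES)
    for port in range(1024, 2024)
)

print("queues used, single flow:", sorted(same_flow))
print("queues used, 1000 flows: ", sorted(many_flows))
```

The point of the sketch is the one Luigi makes: a sender that never varies the header fields being hashed cannot be spread across queues by this kind of filter.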
Re: netmap: traffic distribution
On Thu, Sep 26, 2013 at 12:58 AM, Michio Honda wrote:

> Hi,
>
> The handiest way to try flexible flow distribution is using Flow Director.
> I've confirmed that the patch posted to this list two years ago works with
> netmap/ixgbe.
>
> http://freebsd.1045724.n5.nabble.com/Adding-Flow-Director-sysctls-to-ixgbe-4-td4769489.html

Thanks a lot for the link, Michio!

It seems this work is not yet committed?!?

cheers,
Hiren

> Cheers,
> - Michio
Re: netmap: traffic distribution
On Thu, Sep 26, 2013 at 2:27 PM, chintu hetam wrote:

> Hiren,
>
> https://www.kernel.org/doc/Documentation/networking/scaling.txt must read
> to understand nuances of each of this features. None of this techniques are
> used for mostly none other than performance reasons.

Thanks for the link.
So, is RFS (Receive Flow Steering) equivalent to the "flow director" mentioned in FreeBSD's ixgbe driver?

> Michio, personally i am interested to know performance results in netmap
> mode with RFS patch you just mentioned.

Takuya/Luigi might have some numbers.

Thanks,
Hiren
Adding Flow Director sysctls to ixgbe(4) (was: netmap: traffic distribution)
On Thu, Sep 26, 2013 at 10:38 AM, hiren panchasara <hiren.panchas...@gmail.com> wrote:

> On Thu, Sep 26, 2013 at 12:58 AM, Michio Honda wrote:
>
>> Hi,
>>
>> The handiest way to try flexible flow distribution is using Flow Director.
>> I've confirmed that the patch posted to this list two years ago works
>> with netmap/ixgbe.
>>
>> http://freebsd.1045724.n5.nabble.com/Adding-Flow-Director-sysctls-to-ixgbe-4-td4769489.html
>
> Thanks a lot for the link, Michio!
>
> It seems this work is yet not committed?!?

Takuya,

I see a lot of responses/comments on the proposed changes. Was anything decided at the end of it? As far as I can tell, it's still not committed to the tree.

Thanks,
Hiren
Re: netmap: traffic distribution
I think you meant "reply-all" :-)

On Fri, Sep 27, 2013 at 7:53 AM, chintu hetam wrote:

> As far as i know, flow director is Intel terminology it addresses both RSS
> and RFS. I think FreeBSD implementation is RFS.
>
> Luigi, you touched upon SW de-multiplexer, i would like to know why it's
> necessary.
> let say i have 82599 ixgbe driver (RSS enabled)configured with 5 tuple
> hash. My application reads from netmap queue 0-7(1-8), i know for sure that
> each hash will be filtered to specific hw queue(0-7), is it safe to assume
> netmap will provide packets in same order.
>
> Michio, reason i asked for performance values
> http://arxiv.org/ftp/arxiv/papers/1106/1106.0443.pdf
> I would like to test the accuracy of RFS,RSS and others in netmap mode..
>
> Thanks
> Hardik
>
> On Fri, Sep 27, 2013 at 2:59 AM, hiren panchasara <hiren.panchas...@gmail.com> wrote:
>
>> On Thu, Sep 26, 2013 at 2:27 PM, chintu hetam wrote:
>>
>>> Hiren,
>>>
>>> https://www.kernel.org/doc/Documentation/networking/scaling.txt must
>>> read to understand nuances of each of this features. None of this
>>> techniques are used for mostly none other than performance reasons.
>>
>> Thanks for the link.
>> So, RFS (Receive Flow Steering) is equivalent to "flow director"
>> mentioned in FreeBSD's ixgbe drivers?
>>
>>> Michio, personally i am interested to know performance results in netmap
>>> mode with RFS patch you just mentioned.
>>
>> Takuya/Luigi might have some numbers.
>>
>> Thanks,
>> Hiren
Re: Adding Flow Director sysctls to ixgbe(4) (was: netmap: traffic distribution)
On Fri, Sep 27, 2013 at 1:58 AM, Takuya ASADA wrote:

> 2013/9/27 Adrian Chadd
>
>> On 27 September 2013 00:43, hiren panchasara wrote:
>>
>>> Takuya,
>>>
>>> I see a lot of responses/comments on proposed changes. Was anything
>>> decided at the end of it? As far as I can tell, its still not committed
>>> to the tree.
>>
>> I'd rather see an ioctl API for that chipset and then have a separate
>> tool program it for now.
>
> Ah, like cxgbetool and cxgbe? (it has device specific tool and ioctls)
> http://svnweb.freebsd.org/base/head/tools/tools/cxgbetool/

Something like this for ixgbe would be nice to start with, imo.

Cheers,
Hiren

> http://svnweb.freebsd.org/base/head/sys/dev/cxgb/
>
>> So, how bout we hack that up? :)
>
> Sound's interesting ;-)
> Could you tell me more detail about your idea?
Re: Adding Flow Director sysctls to ixgbe(4) (was: netmap: traffic distribution)
On Sep 30, 2013 5:08 AM, "Takuya ASADA" wrote:
>
> Hi,
>
> This is updated version of the patch.

I will give this a try today.

Cheers
Hiren

> Signature filter list feature is added.
>
> Here're usage of ixgbetool:
> - add a filter
> ixgbetool ix0 add_sig_filter tcpv4 10.1.0.1 34763 10.1.0.2 22 3
> - show filters
> ixgbetool ix0 show_sig_filter
> - del a filter
> ixgbetool ix0 del_sig_filter 1
>
> 2013/9/30 Takuya ASADA
>>
>> Hi,
>>
>> I just implemented device specific ioctl with device specific configuration tool.
>> It still doesn't support some important features such as:
>> - FDIR enable / disable via sysctl or tunable params
>> - ATR enable / disable via sysctl or tunable params
>> - IPv6 support on signature filter
>> - signature filter list
>> - support perfect filter
>> But, at least it can configure signature filter manually.
>>
>> Usage is as follows:
>> Usage: ixgbetool [operation]
>> add_sig_filter
>> del_sig_filter
>>
>> 2013/9/28 hiren panchasara
>>>
>>> On Fri, Sep 27, 2013 at 1:58 AM, Takuya ASADA wrote:
>>>>
>>>> 2013/9/27 Adrian Chadd
>>>>>
>>>>> On 27 September 2013 00:43, hiren panchasara <hiren.panchas...@gmail.com> wrote:
>>>>>
>>>>>> Takuya,
>>>>>>
>>>>>> I see a lot of responses/comments on proposed changes. Was anything decided
>>>>>> at the end of it? As far as I can tell, its still not committed to the
>>>>>> tree.
>>>>>
>>>>> I'd rather see an ioctl API for that chipset and then have a separate tool program it for now.
>>>>
>>>> Ah, like cxgbetool and cxgbe? (it has device specific tool and ioctls)
>>>> http://svnweb.freebsd.org/base/head/tools/tools/cxgbetool/
>>>
>>> Something like this for ixgbe would be nice to start with, imo.
>>>
>>> Cheers,
>>> Hiren
>>>>
>>>> http://svnweb.freebsd.org/base/head/sys/dev/cxgb/
>>>>
>>>>> So, how bout we hack that up? :)
>>>>
>>>> Sound's interesting ;-)
>>>> Could you tell me more detail about your idea?
Re: Exposing sysctls for ixgbe
+ jfv

On Wed, Sep 25, 2013 at 5:35 PM, Adrian Chadd wrote:

> please cc jfv and get them into his driver. :)
>
> -a
>
> On 24 September 2013 23:53, hiren panchasara wrote:
>
>> Any comments?
>>
>> cheers,
>> Hiren
>>
>> On Sun, Sep 22, 2013 at 12:01 PM, hiren panchasara wrote:
>>
>> > $ sysctl hw.igb
>> > hw.igb.rxd: 4096
>> > hw.igb.txd: 4096
>> > hw.igb.enable_aim: 1
>> > hw.igb.enable_msix: 1
>> > hw.igb.max_interrupt_rate: 8000
>> > hw.igb.buf_ring_size: 4096
>> > hw.igb.header_split: 0
>> > hw.igb.num_queues: 1
>> > hw.igb.rx_process_limit: 100
>> > $ sysctl hw.ix
>> > sysctl: unknown oid 'hw.ix': No such file or directory
>> >
>> > I thought it would be nice to have these things exposed. So I copied them
>> > from igb:
>> > http://people.freebsd.org/~hiren/ixgbe_sysctls.txt
>> >
>> > Changes for if_igb.c is to expose correct auto-tuned value for a running
>> > system for "hw.igb.num_queues", which is not the case right now.
>> >
>> > Thanks to markj@ for help/pointers.
>> >
>> > Please let me know if the diffs look okay.
>> >
>> > cheers,
>> > Hiren
Re: Flow Director statistics for ixgbe(4)
On Mon, Sep 30, 2013 at 12:02 PM, Takuya ASADA wrote:

> Hi,
>
> This is originally part of "ixgbetool" patch, but I think it can be discuss
> separately:
>
> http://freebsd.1045724.n5.nabble.com/Adding-Flow-Director-sysctls-to-ixgbe-4-was-netmap-traffic-distribution-tp5847066p5847789.html
>
> I implemented sysctls to expose Flow Director statistics.
> It works like this:
> $ sysctl dev.ix.0.mac_stats|grep fdir
> dev.ix.0.mac_stats.fdirfree_free: 8192
> dev.ix.0.mac_stats.fdirfree_coll: 0
> dev.ix.0.mac_stats.fdirustat_add: 0
> dev.ix.0.mac_stats.fdirustat_remove: 0
> dev.ix.0.mac_stats.fdirfstat_fadd: 0
> dev.ix.0.mac_stats.fdirfstat_fremove: 0
> dev.ix.0.mac_stats.fdirmatch: 0
> dev.ix.0.mac_stats.fdirmiss: 23

I am running this with your ixgbetool patch, and the only issue I've seen so far is that fdirustat_add increments on adding a filter but fdirustat_remove does not increment on deleting one (if that's how it's supposed to work).

Can you please populate the "description" for all the sysctls you are adding? :-)

Cheers,
Hiren
Re: Flow Director statistics for ixgbe(4)
On Mon, Sep 30, 2013 at 1:48 PM, Takuya ASADA wrote:

> Hi,
>
> descriptions are added.

Great. My minor suggestions (feel free to ignore them :-)):

from
"Number of filters addition events that do not change the number of free"
to
"Number of failed filter addition events"
(I believe the "do not change the number of free" part is implied when the addition fails.)

from
"Number of packets that missed matched any flow director filter"
to
"Number of packets that didn't match any flow director filter"

> 2013/10/1 hiren panchasara
>
>> On Mon, Sep 30, 2013 at 12:02 PM, Takuya ASADA wrote:
>>
>>> Hi,
>>>
>>> This is originally part of "ixgbetool" patch, but I think it can be
>>> discuss separately:
>>>
>>> http://freebsd.1045724.n5.nabble.com/Adding-Flow-Director-sysctls-to-ixgbe-4-was-netmap-traffic-distribution-tp5847066p5847789.html
>>>
>>> I implemented sysctls to expose Flow Director statistics.
>>> It works like this:
>>> $ sysctl dev.ix.0.mac_stats|grep fdir
>>> dev.ix.0.mac_stats.fdirfree_free: 8192
>>> dev.ix.0.mac_stats.fdirfree_coll: 0
>>> dev.ix.0.mac_stats.fdirustat_add: 0
>>> dev.ix.0.mac_stats.fdirustat_remove: 0
>>> dev.ix.0.mac_stats.fdirfstat_fadd: 0
>>> dev.ix.0.mac_stats.fdirfstat_fremove: 0
>>> dev.ix.0.mac_stats.fdirmatch: 0
>>> dev.ix.0.mac_stats.fdirmiss: 23
>>
>> I am running this with the ixgbetool patch you have and only 1 issue I've
>> seen so far: fdirustat_add increments on adding a filter but
>> fdirustat_remove does not on deleting one (if thats how its supposed to
>> work)

As we discussed in another thread, yes, the problem is just that the counter is not getting updated. The functionality is fine.

I appreciate your awesome work.

Cheers,
Hiren
Re: netmap: understanding pkg-gen.c
Thanks Luigi.

Coming back to this thread to actually understand what's going on.

On Tue, Sep 24, 2013 at 8:37 PM, Luigi Rizzo wrote:

> > On this box, I have 2 interfaces igb0 (10.73.149.28) and ix1 (10.73.149.17)
> > and this is how I am using this binary:
> >
> > -bash-4.2$ sudo ./pkt-gen -i ix1 -f tx -n 1 -c 8 -p 8 -d
> > 10.73.149.17 -s 10.73.149.28

So, my intention is to *send* 10gbps of data to ix1 and see the card use all of its 8 queues.

Is the above command the correct one?

I kldunloaded/loaded ixgbe to clear out all the stats and tested it again:

-bash-4.2$ sudo ./pkt-gen -i ix1 -f tx -n 1 -c 8 -p 8 -d 10.73.149.17 -s 10.73.149.28
extract_ip_range [143] extract IP range from 10.73.149.28
extract_ip_range [178] range is 10.73.149.28 0 to 10.73.149.28 0
extract_ip_range [143] extract IP range from 10.73.149.17
extract_ip_range [178] range is 10.73.149.17 0 to 10.73.149.17 0
extract_mac_range [184] extract MAC range from 90:e2:ba:30:68:c5
extract_mac_range [199] 90:e2:ba:30:68:c5 starts at 90:e2:ba:30:68:c5
extract_mac_range [184] extract MAC range from ff:ff:ff:ff:ff:ff
extract_mac_range [199] ff:ff:ff:ff:ff:ff starts at ff:ff:ff:ff:ff:ff
main [1530] map size is 334980 Kb
main [1552] mapping 334980 Kbytes
Sending on ix1: 8 queues, 8 threads and 8 cpus.
10.73.149.28 -> 10.73.149.17 (90:e2:ba:30:68:c5 -> ff:ff:ff:ff:ff:ff)
main [1622] Sending 512 packets every 0.0 ns
main [1624] Wait 2 secs for phy reset
main [1626] Ready...
sender_body [775] start
sender_body [775] start
sender_body [775] start
sender_body [775] start
sender_body [775] start
sender_body [775] start
sender_body [775] start
sender_body [848] drop copy
sender_body [775] start
sender_body [848] drop copy
sender_body [848] drop copy
sender_body [848] drop copy
sender_body [848] drop copy
sender_body [848] drop copy
sender_body [848] drop copy
main_thread [1192] 15122963 pps (15130434 pkts in 1000494 usec)
main_thread [1192] 14881444 pps (14896266 pkts in 1000996 usec)
sender_body [841] poll error/timeout on queue 1
main_thread [1192] 14880708 pps (15659371 pkts in 1052327 usec)
main_thread [1192] 14878611 pps (14888684 pkts in 1000677 usec)
main_thread [1192] 14882655 pps (14897538 pkts in 1001000 usec)
main_thread [1192] 11900044 pps (12029754 pkts in 1010900 usec)
main_thread [1212] ouch, thread 1 exited with error
Sent 87502047 packets, 60 bytes each, in 5.86 seconds.
Speed: 14.92 Mpps Bandwidth: 7.16 Gbps (raw 10.03 Gbps)
-bash-4.2$

But looking at the queue stats:

-bash-4.2$ sysctl -a | grep ix.1 | grep queue | grep rx_packets
dev.ix.1.queue0.rx_packets: 171
dev.ix.1.queue1.rx_packets: 0
dev.ix.1.queue2.rx_packets: 0
dev.ix.1.queue3.rx_packets: 0
dev.ix.1.queue4.rx_packets: 0
dev.ix.1.queue5.rx_packets: 0
dev.ix.1.queue6.rx_packets: 0
dev.ix.1.queue7.rx_packets: 0

And after a few seconds:

-bash-4.2$ sysctl -a | grep ix.1 | grep queue | grep rx_packets
dev.ix.1.queue0.rx_packets: 310
dev.ix.1.queue1.rx_packets: 0
dev.ix.1.queue2.rx_packets: 0
dev.ix.1.queue3.rx_packets: 0
dev.ix.1.queue4.rx_packets: 8
dev.ix.1.queue5.rx_packets: 0
dev.ix.1.queue6.rx_packets: 0
dev.ix.1.queue7.rx_packets: 0
-bash-4.2$

What is going on here? Should I be seeing more pkts in rx_packets? Should I see more queues being used?

I am using stock ixgbe at this point. I believe RSS is enabled by default?

I apologize if I am asking obvious/answered questions here.

cheers,
Hiren
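In case it helps anyone comparing snapshots like the ones above, here is a small Python sketch that turns the per-queue rx_packets lines into numbers. The sample text is copied from the second snapshot in this mail; the function name `rx_packets_per_queue` and the regular expression are just one assumed way to parse this output, not part of any FreeBSD tool.

```python
import re

# Second snapshot from the sysctl output above.
SAMPLE = """\
dev.ix.1.queue0.rx_packets: 310
dev.ix.1.queue1.rx_packets: 0
dev.ix.1.queue2.rx_packets: 0
dev.ix.1.queue3.rx_packets: 0
dev.ix.1.queue4.rx_packets: 8
dev.ix.1.queue5.rx_packets: 0
dev.ix.1.queue6.rx_packets: 0
dev.ix.1.queue7.rx_packets: 0
"""

LINE_RE = re.compile(r"dev\.ix\.\d+\.queue(\d+)\.rx_packets:\s*(\d+)")


def rx_packets_per_queue(text: str) -> dict:
    """Return {queue_index: rx_packets} for every matching sysctl line."""
    counts = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counts[int(match.group(1))] = int(match.group(2))
    return counts


counts = rx_packets_per_queue(SAMPLE)
active_queues = sorted(q for q, n in counts.items() if n > 0)

print("total rx_packets:", sum(counts.values()))
print("queues with traffic:", active_queues)
```

Run against both snapshots, this makes it easy to see that only a few hundred packets were counted on the receive side while tens of millions were transmitted.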
Re: netmap: understanding pkg-gen.c
On Wed, Oct 2, 2013 at 1:18 PM, hiren panchasara wrote:

> [...]
>
> What is going on here? Should I be seeing more pkts in rx_packets? Should
> I see more queues being used?
>
> I am using stock ixgbe at this point. I believe RSS is enabled by default?
fdir (flow director)/ATR is also enabled by default. I tried to turn that off with the following change in /sys/modules/ixgbe/Makefile:

-CFLAGS+= -I${.CURDIR}/../../dev/ixgbe -DSMP -DIXGBE_FDIR
+CFLAGS+= -I${.CURDIR}/../../dev/ixgbe -DSMP

But now the interface will not attach to netmap:

sudo ./pkt-gen -i ix1 -f tx -n 1 -c 8 -p 8 -d 10.73.149.17 -s 10.73.149.28
extract_ip_range [143] extract IP range from 10.73.149.28
extract_ip_range [178] range is 10.73.149.28 0 to 10.73.149.28 0
extract_ip_range [143] extract IP range from 10.73.149.17
extract_ip_range [178] range is 10.73.149.17 0 to 10.73.149.17 0
extract_mac_range [184] extract MAC range from 90:e2:ba:30:68:c5
extract_mac_range [199] 90:e2:ba:30:68:c5 starts at 90:e2:ba:30:68:c5
extract_mac_range [184] extract MAC range from ff:ff:ff:ff:ff:ff
extract_mac_range [199] ff:ff:ff:ff:ff:ff starts at ff:ff:ff:ff:ff:ff
main [1530] map size is 334980 Kb
main [1536] Unable to get if info for ix1
main [1543] bad nthreads 8, have 0 queues
main [1552] mapping 0 Kbytes
main [1558] Unable to mmap
Re: netmap: understanding pkg-gen.c
On Wed, Oct 2, 2013 at 4:39 PM, hiren panchasara wrote:
> On Wed, Oct 2, 2013 at 4:15 PM, hiren panchasara <hiren.panchas...@gmail.com> wrote:
>> On Wed, Oct 2, 2013 at 3:11 PM, hiren panchasara <hiren.panchas...@gmail.com> wrote:
>>> On Wed, Oct 2, 2013 at 1:18 PM, hiren panchasara <hiren.panchas...@gmail.com> wrote:
>>>> Thanks Luigi.
>>>>
>>>> Coming back to this thread to actually understand what's going on.
>>>>
>>>> On Tue, Sep 24, 2013 at 8:37 PM, Luigi Rizzo wrote:
>>>>> > On this box, I have 2 interfaces igb0 (10.73.149.28) and ix1 (10.73.149.17)
>>>>> > and this is how I am using this binary:
>>>>> >
>>>>> > -bash-4.2$ sudo ./pkt-gen -i ix1 -f tx -n 10000 -c 8 -p 8 -d
>>>>> > 10.73.149.17 -s 10.73.149.28

Thanks to Juli, I realized that I was doing it wrong.

Trying to _send_ pkts from ix1 via netmap and looking at ix1's _receive_ buffers.

I am setting up another machine with ixgbe to have 2 different machines for proper testing.

Apologies for the noise.

cheers,
Hiren
Re: netmap: understanding pkg-gen.c
On Wed, Oct 2, 2013 at 11:05 PM, Luigi Rizzo wrote:
> On Thu, Oct 3, 2013 at 6:34 AM, hiren panchasara <hiren.panchas...@gmail.com> wrote:
>> Thanks Luigi.
>>>>>>
>>>>>> Coming back to this thread to actually understand what's going on.
>>>>>>
>>>>>> On Tue, Sep 24, 2013 at 8:37 PM, Luigi Rizzo wrote:
>>>>>>> > On this box, I have 2 interfaces igb0 (10.73.149.28) and ix1 (10.73.149.17)
>>>>>>> > and this is how I am using this binary:
>>>>>>> >
>>>>>>> > -bash-4.2$ sudo ./pkt-gen -i ix1 -f tx -n 1 -c 8 -p 8 -d
>>>>>>> > 10.73.149.17 -s 10.73.149.28
>>
>> Thanks to Juli, I realized that I was doing it wrong.
>>
>> Trying to _send_ pkts from ix1 via netmap and looking at ix1's _receive_
>> buffers.
>>
>> I am setting up another machine with ixgbe to have 2 different machines
>> for proper testing.
>
> since you have a dual port card, you can actually run the experiment
> on a single machine, connecting the two ports with a
> cross cable and run the sender on ix1 and the receiver on ix0.

Thanks but for now I have 2 machines with ixgbe to play with.

> But you will still see traffic only to one queue, because
> pkt-gen by default uses the same DST-MAC address so the
> way you configure RSS is irrelevant.

What/where is the exact logic/code of how card determines what traffic goes to what queue? Is it based on DST-MAC always?

> What you could do is, when prefilling the buffers, use
> different mac or ip addresses. There is some unfinished
> code in pkt-gen.c to implement that.

I see we are not playing with g.dst_mac.name much. Let me know if you have unfinished code sitting somewhere I can look.

Thanks,
Hiren
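For readers following the queue-selection question, here is a minimal user-space sketch of an RSS-style Toeplitz hash over the IPv4 4-tuple. It is an illustration under assumptions, not the ixgbe driver's code: the 40-byte key is arbitrary, `rss_queue()` is a hypothetical helper, and the `hash % nqueues` mapping stands in for the indirection table that real hardware typically indexes with low-order hash bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Arbitrary 40-byte hash key, used here for illustration only. */
const uint8_t rss_key[40] = {
    0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
    0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
    0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
    0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
    0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa,
};

/*
 * Toeplitz hash: for every set bit of the input (most significant bit
 * first), XOR in the 32-bit window of the key that starts at that bit.
 */
uint32_t
toeplitz_hash(const uint8_t *key, size_t keylen,
    const uint8_t *data, size_t datalen)
{
    uint32_t hash = 0;
    uint32_t window;

    assert(keylen >= datalen + 4);
    window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
        ((uint32_t)key[2] << 8) | (uint32_t)key[3];
    for (size_t i = 0; i < datalen; i++) {
        for (int bit = 7; bit >= 0; bit--) {
            if (data[i] & (1u << bit))
                hash ^= window;
            /* Slide the window left by one bit, pulling in the next key bit. */
            window <<= 1;
            if (key[i + 4] & (1u << bit))
                window |= 1;
        }
    }
    return hash;
}

/* Hypothetical helper: map a 4-tuple (host byte order) to a queue index. */
unsigned
rss_queue(uint32_t src_ip, uint32_t dst_ip, uint16_t src_port,
    uint16_t dst_port, unsigned nqueues)
{
    uint8_t in[12];

    assert(nqueues > 0);
    /* Serialize in network byte order: src addr, dst addr, src port, dst port. */
    in[0] = src_ip >> 24; in[1] = src_ip >> 16; in[2] = src_ip >> 8; in[3] = src_ip;
    in[4] = dst_ip >> 24; in[5] = dst_ip >> 16; in[6] = dst_ip >> 8; in[7] = dst_ip;
    in[8] = src_port >> 8; in[9] = src_port;
    in[10] = dst_port >> 8; in[11] = dst_port;
    return toeplitz_hash(rss_key, sizeof(rss_key), in, sizeof(in)) % nqueues;
}
```

Under these assumptions, a sender that repeats one fixed tuple always gets the same queue index, which is consistent with seeing a single busy queue; varying the source address or port while prefilling buffers changes the hash input and so can change the queue.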
Re: Exposing sysctls for ixgbe
I am going to commit this change over the weekend. Let me know if there are any objections.

Thanks,
Hiren

On Mon, Sep 30, 2013 at 8:51 AM, hiren panchasara wrote:
> + jfv
>
> On Wed, Sep 25, 2013 at 5:35 PM, Adrian Chadd wrote:
>> please cc jfv and get them into his driver. :)
>>
>> -a
>>
>> On 24 September 2013 23:53, hiren panchasara wrote:
>>> Any comments?
>>>
>>> cheers,
>>> Hiren
>>>
>>> On Sun, Sep 22, 2013 at 12:01 PM, hiren panchasara wrote:
>>> > $ sysctl hw.igb
>>> > hw.igb.rxd: 4096
>>> > hw.igb.txd: 4096
>>> > hw.igb.enable_aim: 1
>>> > hw.igb.enable_msix: 1
>>> > hw.igb.max_interrupt_rate: 8000
>>> > hw.igb.buf_ring_size: 4096
>>> > hw.igb.header_split: 0
>>> > hw.igb.num_queues: 1
>>> > hw.igb.rx_process_limit: 100
>>> > $ sysctl hw.ix
>>> > sysctl: unknown oid 'hw.ix': No such file or directory
>>> >
>>> > I thought it would be nice to have these things exposed. So I copied them
>>> > from igb:
>>> > http://people.freebsd.org/~hiren/ixgbe_sysctls.txt
>>> >
>>> > Changes for if_igb.c is to expose correct auto-tuned value for a running
>>> > system for "hw.igb.num_queues", which is not the case right now.
>>> >
>>> > Thanks to markj@ for help/pointers.
>>> >
>>> > Please let me know if the diffs look okay.
>>> >
>>> > cheers,
>>> > Hiren
Re: Exposing sysctls for ixgbe
Thanks. Committed via:
http://svnweb.freebsd.org/base?view=revision&revision=256069

Cheers,
Hiren
Re: Adding Flow Director sysctls to ixgbe(4) (was: netmap: traffic distribution)
On Mon, Oct 7, 2013 at 12:01 AM, Takuya ASADA wrote:
> Hi,
>
> This is updated version of "ixgbetool" patch.

I will try to give this a try tomorrow.

Cheers,
Hiren

> Here's improved feature list:
> - signature filter list feature available
> - user-defined filter can be use with an ATR.
>   To enable it, add "hw.ixgbe.cooperative_atr=1" on /boot/loader.conf
>
> Usage is as follows:
> ixgbetool [operation]
> add_sig_filter
> show_sig_filter
> del_sig_filter
>
> 2013/9/30 Takuya ASADA
>> Hi,
>>
>> I just implemented device specific ioctl with device specific
>> configuration tool.
>> It still doesn't support some important features such as:
>> - FDIR enable / disable via sysctl or tunable params
>> - ATR enable / disable via sysctl or tunable params
>> - IPv6 support on signature filter
>> - signature filter list
>> - support perfect filter
>> But, at least it can configure signature filter manually.
>>
>> Usage is as follows:
>> Usage: ixgbetool [operation]
>> add_sig_filter
>> del_sig_filter
>>
>> 2013/9/28 hiren panchasara
>>> On Fri, Sep 27, 2013 at 1:58 AM, Takuya ASADA wrote:
>>>> 2013/9/27 Adrian Chadd
>>>>> On 27 September 2013 00:43, hiren panchasara <hiren.panchas...@gmail.com> wrote:
>>>>>> Takuya,
>>>>>>
>>>>>> I see a lot of responses/comments on proposed changes. Was anything
>>>>>> decided at the end of it? As far as I can tell, its still not committed
>>>>>> to the tree.
>>>>>
>>>>> I'd rather see an ioctl API for that chipset and then have a separate
>>>>> tool program it for now.
>>>>
>>>> Ah, like cxgbetool and cxgbe? (it has device specific tool and ioctls)
>>>> http://svnweb.freebsd.org/base/head/tools/tools/cxgbetool/
>>>
>>> Something like this for ixgbe would be nice to start with, imo.
>>>
>>> Cheers,
>>> Hiren
>>>
>>>> http://svnweb.freebsd.org/base/head/sys/dev/cxgb/
>>>>
>>>>> So, how bout we hack that up? :)
>>>>
>>>> Sound's interesting ;-)
>>>> Could you tell me more detail about your idea?
Re: igb(4) panic: already enqueue
Jack,

I am also seeing similar panics at $work on a couple weeks old STABLE-9. Can you please look into this issue?

cheers,
Hiren

1) HP DL360e Gen8, 2 x Xeon E5-2430 2.20GHz

panic: buf=0xfe002810d700 already enqueue at 995 prod=997 cons=995
cpuid = 17
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a/frame 0xff868637b030
kdb_backtrace() at kdb_backtrace+0x37/frame 0xff868637b0f0
panic() at panic+0x1d8/frame 0xff868637b1f0
igb_mq_start() at igb_mq_start+0x1cb/frame 0xff868637b240
ether_output_frame() at ether_output_frame+0x33/frame 0xff868637b260
ether_output() at ether_output+0x52d/frame 0xff868637b2f0
ip_output() at ip_output+0xe38/frame 0xff868637b3e0
tcp_output() at tcp_output+0x122c/frame 0xff868637b5a0
tcp_do_segment() at tcp_do_segment+0x306c/frame 0xff868637b6c0
tcp_input() at tcp_input+0x909/frame 0xff868637b7f0
ip_input() at ip_input+0xbd/frame 0xff868637b840
netisr_dispatch_src() at netisr_dispatch_src+0x152/frame 0xff868637b8a0
ether_demux() at ether_demux+0x17d/frame 0xff868637b8d0
ether_nh_input() at ether_nh_input+0x208/frame 0xff868637b910
netisr_dispatch_src() at netisr_dispatch_src+0x152/frame 0xff868637b970
igb_rxeof() at igb_rxeof+0x394/frame 0xff868637b9e0
igb_handle_que() at igb_handle_que+0x9b/frame 0xff868637ba20
taskqueue_run_locked() at taskqueue_run_locked+0x93/frame 0xff868637ba80
taskqueue_thread_loop() at taskqueue_thread_loop+0x3e/frame 0xff868637baa0
fork_exit() at fork_exit+0x135/frame 0xff868637baf0
fork_trampoline() at fork_trampoline+0xe/frame 0xff868637baf0

2) HP DL160 G6, 2 x Xeon E5620 2.40GHz

panic: buf=0xfe01b6334700 already enqueue at 42 prod=43 cons=42
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a/frame 0xff800037a620
kdb_backtrace() at kdb_backtrace+0x37/frame 0xff800037a6e0
panic() at panic+0x1d8/frame 0xff800037a7e0
igb_mq_start() at igb_mq_start+0x1cb/frame 0xff800037a830
ether_output_frame() at ether_output_frame+0x33/frame 0xff800037a850
ether_output() at ether_output+0x52d/frame 0xff800037a8e0
ip_output() at ip_output+0xe38/frame 0xff800037a9d0
syncache_respond() at syncache_respond+0x462/frame 0xff800037aa90
syncache_timer() at syncache_timer+0xdf/frame 0xff800037aac0
softclock() at softclock+0x2c6/frame 0xff800037ab60
intr_event_execute_handlers() at intr_event_execute_handlers+0x6a/frame 0xff800037ab90
ithread_loop() at ithread_loop+0xac/frame 0xff800037abe0
fork_exit() at fork_exit+0x135/frame 0xff800037ac30
fork_trampoline() at fork_trampoline+0xe/frame 0xff800037ac30
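The panic string above has the shape of a debug check that walks a ring from the consumer index to the producer index looking for a buffer that is already queued. A minimal user-space sketch of that kind of check follows, assuming a power-of-two ring; `struct ring`, `ring_find_dup()` and `ring_enqueue()` are hypothetical names for illustration and are not the kernel's buf_ring API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u                 /* must be a power of two */
#define RING_MASK (RING_SIZE - 1u)

struct ring {
    uint32_t prod;                   /* next slot to fill */
    uint32_t cons;                   /* next slot to drain */
    void *slot[RING_SIZE];
};

/* Return the index that already holds buf between cons and prod, or -1. */
int
ring_find_dup(const struct ring *r, const void *buf)
{
    for (uint32_t i = r->cons; i != r->prod; i = (i + 1) & RING_MASK)
        if (r->slot[i] == buf)
            return (int)i;
    return -1;
}

/* Enqueue buf: 0 on success, -1 if the ring is full, -2 if buf is already queued. */
int
ring_enqueue(struct ring *r, void *buf)
{
    uint32_t next;

    assert(r != NULL && buf != NULL);
    if (ring_find_dup(r, buf) >= 0)
        return -2;                   /* a debug kernel would panic here instead */
    next = (r->prod + 1) & RING_MASK;
    if (next == r->cons)
        return -1;                   /* one slot is kept free to tell full from empty */
    r->slot[r->prod] = buf;
    r->prod = next;
    return 0;
}
```

If the real check works along these lines, "already enqueue at 995 prod=997 cons=995" would mean the buffer being handed to the ring was found still sitting at the consumer index, that is, the same buffer was submitted a second time before being drained.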
Re: TCP Initial Window 10 MFC
On Wed, Aug 14, 2013 at 8:46 PM, Lawrence Stewart wrote: > On 08/15/13 02:44, Andre Oppermann wrote: >> On 14.08.2013 04:36, Lawrence Stewart wrote: >>> Hi Andre, >>> >>> [RE team is BCCed so they're aware of this discussion] >>> >>> On 07/06/13 00:58, Andre Oppermann wrote: >>>> Author: andre >>>> Date: Fri Jul 5 14:58:24 2013 >>>> New Revision: 252789 >>>> URL: http://svnweb.freebsd.org/changeset/base/252789 >>>> >>>> Log: >>>>MFC r242266: >>>> >>>> Increase the initial CWND to 10 segments as defined in IETF TCPM >>>> draft-ietf-tcpm-initcwnd-05. It explains why the increased initial >>>> window improves the overall performance of many web services without >>>> risking congestion collapse. >>>> >>>> As long as it remains a draft it is placed under a sysctl marking it >>>> as experimental: >>>> net.inet.tcp.experimental.initcwnd10 = 1 >>>> When it becomes an official RFC soon the sysctl will be changed to >>>> the RFC number and moved to net.inet.tcp. >>>> >>>> This implementation differs from the RFC draft in that it is a bit >>>> more conservative in the case of packet loss on SYN or SYN|ACK >>>> because >>>> we haven't reduced the default RTO to 1 second yet. Also the >>>> restart >>>> window isn't yet increased as allowed. Both will be adjusted with >>>> upcoming changes. >>>> >>>> Is is enabled by default. In Linux it is enabled since kernel 3.0. >>> >>> I haven't been fully alert to FreeBSD happenings this year so apologies >>> for bringing this up so long after the MFC. >>> >>> I don't think this change should have been MFCed, at least not in its >>> current form. Enabling the switch to IW=10 on a stable branch is >>> inappropriate IMO. I also think the "net.inet.tcp.experimental" sysctl >>> branch is poorly named as per the important discussion we had back in >>> February [1]. I would really prefer we didn't get stuck having to keep >>> it around by making a stable release with it being present. 
>>> >>> I think this commit should be backed out of stable/9 and more >>> importantly, 9.2-RELEASE. >> >> Backing out the patch isn't really necessary, just flip the switch to >> off having it revert to the RFC5681 defaults. Those who want it anyway >> can simply enable it again. > > That doesn't address the sysctl tree naming concern or mechanism issue - > please refer back to the Feb discussion; specifically the proposal to > rename the experimental branch to "net.inet.tcp.nonstandard" and add an > "allowed" leaf which takes a list of non-standard behaviours to allow > tweaking in the stack. > > Leaving the sysctl branch named "experimental" conveys that the things > which live under the branch are being evaluated in some way for becoming > a default, which is very different to "nonstandard" which conveys that > the user is twiddling things in a way which normally shouldn't be. IW=10 > may become a FreeBSD default at some point, but the mechanism for > enabling it should be to specify the initial window as a value in > segments, and as such by allowing any non-standard value (IW=7, IW=50), > I strongly argue in favour for changing the branch name from > "experimental" to "nonstandard". > > In order to continue this discussion in the context of what we started > in Feb, I still request that this change be backed out of releng/9.2 so > that 9.2-RELEASE doesn't ship with it. We can continue discussion for > it's future in stable/9 and head after the backout so that 9.2 isn't > held up. > >> IW10 has become RFC6928 (experimental) in April 2013. > > Great for the draft authors, but irrelevant for this discussion. > >>> As an aside, I am intending to follow up to the Feb discussion with a >>> patch that implements the basic infrastructure I proposed so that we can >>> continue that discussion. >> >> Again I'm deeply concerned and opposed to giving end users direct control >> over the IW value. 
I've had and seen too many cases of totally bogus "tuning"
>> by cranking up random sysctls to insane values and then complaining about
>> FreeBSD being slow compared to Linux (and then ditching FreeBSD).
>
> Sorry, but referring to unspecified cases of stupidity resulting in loss
> of unquantified numbers of users as a reason against providing a
> controlled mechanism to change a default system parameter in a
> potentially harmful way is not a rational argument.

I do not subscribe to the idea of "Let's not make life of 98% of users better because 2% may do something stupid".

I am revisiting this thread because at $work we need to tweak initcwnd (to values other than 10) to see how it behaves, but there is no easy way to tune it. (Or am I missing something?)

Can we please make initcwnd a sysctl tunable?

cheers,
Hiren
Re: TCP Initial Window 10 MFC
On Tue, Mar 4, 2014 at 7:38 PM, Lawrence Stewart wrote:
> I lost the battle of wills on this topic and 10.0 shipped with IW10
> enabled by default :(
>
> As for having it configurable, it is a trivial patch which perhaps,
> Hiren, you might be willing to take a stab at? I obviously did not
> manage to carve out the time last year to push forward with the agenda I
> proposed in this thread, but I will get back to it at some point.

Hi Lawrence,

Let's fix it the right way if possible. Below is a rough/untested quick patch I came up with. Is this how you were planning to have "nonstandard" sysctl knob designed?

Index: sys/netinet/tcp_input.c
===================================================================
--- sys/netinet/tcp_input.c (revision 260833)
+++ sys/netinet/tcp_input.c (working copy)
@@ -164,6 +164,19 @@
     &VNET_NAME(tcp_do_initcwnd10), 0,
     "Enable RFC 6928 (Increasing initial CWND to 10)");
 
+SYSCTL_NODE(_net_inet_tcp, OID_AUTO, nonstandard, CTLFLAG_RW, 0,
+    "Nonstandard TCP extensions");
+
+VNET_DEFINE(int, tcp_nonstandard_allowed) = 0;
+SYSCTL_VNET_INT(_net_inet_tcp_nonstandard, OID_AUTO, allowed, CTLFLAG_RW,
+    &VNET_NAME(tcp_nonstandard_allowed), 0,
+    "Allow nonstandard TCP extensions");
+
+VNET_DEFINE(int, tcp_nonstandard_initcwnd) = 0;
+SYSCTL_VNET_INT(_net_inet_tcp_nonstandard, OID_AUTO, initcwnd, CTLFLAG_RW,
+    &VNET_NAME(tcp_nonstandard_initcwnd), 0,
+    "Slow-start flight size (initial congestion window)");
+
 VNET_DEFINE(int, tcp_do_rfc3465) = 1;
 SYSCTL_VNET_INT(_net_inet_tcp, OID_AUTO, rfc3465, CTLFLAG_RW,
     &VNET_NAME(tcp_do_rfc3465), 0,
@@ -368,6 +381,8 @@
 	 */
 	if (tp->snd_cwnd == 1)
 		tp->snd_cwnd = tp->t_maxseg;	/* SYN(-ACK) lost */
+	else if (V_tcp_nonstandard_allowed && V_tcp_nonstandard_initcwnd)
+		tp->snd_cwnd = V_tcp_nonstandard_initcwnd * tp->t_maxseg;
 	else if (V_tcp_do_initcwnd10)
 		tp->snd_cwnd = min(10 * tp->t_maxseg,
 		    max(2 * tp->t_maxseg, 14600));
Index: sys/netinet/tcp_var.h
===================================================================
--- sys/netinet/tcp_var.h (revision 260833)
+++ sys/netinet/tcp_var.h (working copy)
@@ -610,6 +610,7 @@
 VNET_DECLARE(int, tcp_delack_enabled);
 VNET_DECLARE(int, tcp_do_rfc3390);
 VNET_DECLARE(int, tcp_do_initcwnd10);
+VNET_DECLARE(int, tcp_nonstandard_allowed);
+VNET_DECLARE(int, tcp_nonstandard_initcwnd);
 VNET_DECLARE(int, tcp_sendspace);
 VNET_DECLARE(int, tcp_recvspace);
 VNET_DECLARE(int, path_mtu_discovery);
@@ -622,6 +623,7 @@
 #define	V_tcp_delack_enabled	VNET(tcp_delack_enabled)
 #define	V_tcp_do_rfc3390	VNET(tcp_do_rfc3390)
 #define	V_tcp_do_initcwnd10	VNET(tcp_do_initcwnd10)
+#define	V_tcp_nonstandard_allowed	VNET(tcp_nonstandard_allowed)
+#define	V_tcp_nonstandard_initcwnd	VNET(tcp_nonstandard_initcwnd)
 #define	V_tcp_sendspace	VNET(tcp_sendspace)
 #define	V_tcp_recvspace	VNET(tcp_recvspace)
 #define	V_path_mtu_discovery	VNET(path_mtu_discovery)
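To see what the proposed branch order computes, here is a small user-space sketch of the window selection from the tcp_input.c hunk above. `initial_cwnd()` and its parameters are hypothetical names standing in for the kernel variables, and branches the patch does not show are deliberately not modeled.

```c
#include <assert.h>
#include <stdint.h>

uint32_t
u32_min(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

uint32_t
u32_max(uint32_t a, uint32_t b)
{
    return a > b ? a : b;
}

/*
 * Mirror of the branch order in the patch. Returns the initial congestion
 * window in bytes, or 0 when a branch outside the patch would apply.
 */
uint32_t
initial_cwnd(uint32_t maxseg, int syn_lost, int nonstandard_allowed,
    uint32_t nonstandard_initcwnd, int do_initcwnd10)
{
    assert(maxseg > 0);
    if (syn_lost)                    /* SYN(-ACK) lost: fall back to one segment */
        return maxseg;
    if (nonstandard_allowed && nonstandard_initcwnd != 0)
        return nonstandard_initcwnd * maxseg;
    if (do_initcwnd10)               /* RFC 6928 formula, as in the context lines */
        return u32_min(10 * maxseg, u32_max(2 * maxseg, 14600));
    return 0;                        /* other branches are not modeled in this sketch */
}
```

With a 1460-byte MSS the IW10 branch yields 14600 bytes, while allowing the override with a value of 7 yields 10220 bytes. If the patch were applied as posted, the corresponding knobs would presumably appear as net.inet.tcp.nonstandard.allowed and net.inet.tcp.nonstandard.initcwnd.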
Include port number in "Listen queue overflow" messages
I am thinking of committing the following change, which includes the port number in "Listen queue overflow" messages.

New message would look something like:

sonewconn: pcb 0xf8001b155760: Listen queue overflow on port 13120: 1 already in queue awaiting acceptance (454 occurrences)

I recently ran into a situation at $work where I could not catch the culprit application via "netstat -A" and had to dive into kgdb to find, from the pcb, the port this application was listening on.

IMO, this change will make debugging easier.

cheers,
Hiren

Index: sys/kern/uipc_socket.c
===================================================================
--- sys/kern/uipc_socket.c (revision 262861)
+++ sys/kern/uipc_socket.c (working copy)
@@ -136,6 +136,7 @@
 #include
 #include
 #include
+#include
 
 #include
 
@@ -491,8 +492,11 @@
 	static int overcount;
 
 	struct socket *so;
+	struct inpcb *inp;
 	int over;
 
+	inp = sotoinpcb(head);
+
 	ACCEPT_LOCK();
 	over = (head->so_qlen > 3 * head->so_qlimit / 2);
 	ACCEPT_UNLOCK();
@@ -504,10 +508,12 @@
 		overcount++;
 
 		if (ratecheck(&lastover, &overinterval)) {
-			log(LOG_DEBUG, "%s: pcb %p: Listen queue overflow: "
-			    "%i already in queue awaiting acceptance "
+			log(LOG_DEBUG, "%s: pcb %p: Listen queue overflow on "
+			    "port %d: %i already in queue awaiting acceptance "
 			    "(%d occurrences)\n",
-			    __func__, head->so_pcb, head->so_qlen, overcount);
+			    __func__, head->so_pcb,
+			    ntohs(inp->inp_inc.inc_lport), head->so_qlen,
+			    overcount);
 
 			overcount = 0;
 		}
Re: Include port number in "Listen queue overflow" messages
On Wed, Mar 19, 2014 at 4:35 PM, Navdeep Parhar wrote: > On Tue, Mar 18, 2014 at 11:05:23PM -0700, Julian Elischer wrote: >> On 3/18/14, 8:33 PM, George Neville-Neil wrote: >> >On Mar 7, 2014, at 1:23 , hiren panchasara >> >wrote: >> > >> >>I am thinking of committing following change that includes port number >> >>in "Listen queue overflow" messages. >> I think it's a good idea. There is even more information available >> but this is probably enough. >> >> >> >I like it. >> > >> >Best, >> >George >> > > > I think the suggested change isn't correct as is assumes every socket's pcb is > an inpcb. You are right. I'd need to think a bit more about a possible solution. Thanks for your help. cheers, Hiren > > Navdeep > >> >>New message would look something like: >> >>sonewconn: pcb 0xf8001b155760: Listen queue overflow on port >> >>13120: 1 already in queue awaiting acceptance (454 occurrences) >> >> >> >>I've recently ran into a situation at $work where I could not catch >> >>the culprit application via "netstat -A" and had to dive into kgdb to >> >>find the port from pcb where this application was listening to. >> >> >> >>IMO, this change will make debugging easier. 
>> >> >> >>cheers, >> >>Hiren >> >> >> >>Index: sys/kern/uipc_socket.c >> >>=== >> >>--- sys/kern/uipc_socket.c (revision 262861) >> >>+++ sys/kern/uipc_socket.c (working copy) >> >>@@ -136,6 +136,7 @@ >> >>#include >> >>#include >> >>#include >> >>+#include >> >> >> >>#include >> >> >> >>@@ -491,8 +492,11 @@ >> >>static int overcount; >> >> >> >>struct socket *so; >> >>+ struct inpcb *inp; >> >>int over; >> >> >> >>+ inp = sotoinpcb(head); >> >>+ >> >>ACCEPT_LOCK(); >> >>over = (head->so_qlen > 3 * head->so_qlimit / 2); >> >>ACCEPT_UNLOCK(); >> >>@@ -504,10 +508,12 @@ >> >>overcount++; >> >> >> >>if (ratecheck(&lastover, &overinterval)) { >> >>- log(LOG_DEBUG, "%s: pcb %p: Listen queue overflow: >> >>" >> >>- "%i already in queue awaiting acceptance " >> >>+ log(LOG_DEBUG, "%s: pcb %p: Listen queue overflow >> >>on " >> >>+ "port %d: %i already in queue awaiting >> >>acceptance " >> >>"(%d occurrences)\n", >> >>- __func__, head->so_pcb, head->so_qlen, >> >>overcount); >> >>+ __func__, head->so_pcb, >> >>+ ntohs(inp->inp_inc.inc_lport), head->so_qlen, >> >>+ overcount); >> >> >> >>overcount = 0; >> >>} >> >>___ >> >>freebsd-net@freebsd.org mailing list >> >>http://lists.freebsd.org/mailman/listinfo/freebsd-net >> >>To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org" >> >___ >> >freebsd-net@freebsd.org mailing list >> >http://lists.freebsd.org/mailman/listinfo/freebsd-net >> >To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org" >> > >> >> ___ >> freebsd-net@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-net >> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org" ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
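Navdeep's objection is that a listening socket's pcb is only an inpcb for Internet-domain sockets, so the port can only be read safely after checking the socket's family. One possible shape of such a guard, as a user-space sketch: `struct listen_sock` and `format_overflow()` are hypothetical stand-ins for illustration, not kernel structures or the fix that was eventually chosen.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical stand-in for a listening socket; not the kernel's struct socket. */
struct listen_sock {
    int family;                      /* AF_INET, AF_INET6, AF_UNIX, ... */
    uint16_t lport_net;              /* local port, network byte order; inet only */
    int qlen;                        /* connections awaiting acceptance */
};

/* Format the overflow message, naming the port only for inet sockets. */
int
format_overflow(char *buf, size_t len, const struct listen_sock *so,
    int occurrences)
{
    assert(buf != NULL && so != NULL);
    if (so->family == AF_INET || so->family == AF_INET6)
        return snprintf(buf, len,
            "Listen queue overflow on port %d: %i already in queue "
            "awaiting acceptance (%d occurrences)",
            ntohs(so->lport_net), so->qlen, occurrences);
    /* Other domains (for example local sockets) have no port to report. */
    return snprintf(buf, len,
        "Listen queue overflow: %i already in queue "
        "awaiting acceptance (%d occurrences)",
        so->qlen, occurrences);
}
```

The design choice here is simply to keep the original message for non-inet sockets instead of dereferencing a pcb of unknown type.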
hostcache's effect on initcwnd calculation
Host A:1 talks to Host B:2 (1 and 2 being ports) and the connection gets logged in TCP's hostcache. Now if a new connection is set up between A:10001 and B:2, the initcwnd (initial congestion window) would not look at the CWND from the hostcache but would always use what we've set as tp->snd_cwnd in cc_conn_init() of tcp_input.c. Is this a correct assumption?

What I want to confirm is whether the hostcache has any effect on the initial congestion window of a new connection being set up.

From my reading of the code, cc_conn_init() always happens before we call tcp_hc_get() to get data out of the hostcache to update other parameters of a connection. I do not see hc_entry->rmx_cwnd actually being used anywhere in tcp_input.c during connection bring-up.

Am I missing anything?

cheers,
Hiren
Re: questions about (system) dhcp
On Mon, Mar 31, 2014 at 6:31 AM, Robert Huff wrote:
> Hello:
> Is this the correct place to ask?

Yes.

Cheers,
Hiren
Re: DCTCP implementation
On Sun, Mar 30, 2014 at 10:37 PM, Midori Kato wrote:
> Hi FreeBSD developers,
>
> I'm Midori Kato. I'm working on the DCTCP implementation in the FreeBSD with
> Lars Eggert. I mail you because I would like to ask you a code review and
> testing. The attached patch is not good enough to test our code. Please give
> me your message. I will send an ECN marking implementation in dummynet and
> test scripts personally to you.

Hi Midori,

Thank you for your work. I'd like to test this. Please let me know how you set up the dummynet testing cluster and I'll try it.

cheers,
Hiren

ccing gnn@ as he was also asking for the same (testing details).
Re: DCTCP implementation
On Sun, Mar 30, 2014 at 10:37 PM, Midori Kato wrote:
> Hi FreeBSD developers,
>
> I'm Midori Kato. I'm working on the DCTCP implementation in the FreeBSD with
> Lars Eggert. I mail you because I would like to ask you a code review and
> testing. The attached patch is not good enough to test our code. Please give
> me your message. I will send an ECN marking implementation in dummynet and
> test scripts personally to you.

Midori,

The first thing I noticed in your dctcp.patch is that you are dropping r256920, which adds a new argument tlen to the DELAY_ACK() macro. I also see you removed and re-added that macro definition without any changes to it.

Is that intentional? If not, can you please fix that and resend the patch?

It is usually a good idea to work on -HEAD for such things so patching/committing is easier.

cheers,
Hiren
Re: DCTCP implementation
On Thu, Apr 3, 2014 at 4:27 AM, Midori Kato wrote:
> Hi Hiren,
>
> Yes, I intentionally replace the location of DELAY_ACK() macro.
> A DCTCP receiver sends an immediate ACK when incoming packets sets CE and
> non-CE bit by turns. To implement this processing, I prepare the
> cc_ecnpkt_handler() function. This function calls the DELAY_ACK() macro to
> check if this packet is in delayed ack or not. This is why I replace the
> macro position.

That is fine, but the macro definition has also changed. This is how it looks on -HEAD right now:

#define DELAY_ACK(tp, tlen) \
        ((!tcp_timer_active(tp, TT_DELACK) && \
        (tp->t_flags & TF_RXWIN0SENT) == 0) && \
        (tlen <= tp->t_maxopd) && \
        (V_tcp_delack_enabled || (tp->t_flags & TF_NEEDSYN)))

We need to pass this new argument "tlen" when calling DELAY_ACK() from the cc_ecnpkt_handler() function. We can probably pass it to cc_ecnpkt_handler() when calling it from tcp_do_segment().

I'll make that change and see how it goes.

cheers,
Hiren

>
> Regards,
> -- Midori
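A rough userland model of the change discussed in this thread: the segment length is handed down to the ECN handler so that the delayed-ACK test can use it. Everything prefixed mock_ or MOCK_ is hypothetical (the struct, the flag values, the handler signature); only the four conditions mirror the DELAY_ACK(tp, tlen) macro quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values and struct; not the kernel's definitions. */
#define MOCK_TF_RXWIN0SENT	0x01
#define MOCK_TF_NEEDSYN		0x02

struct mock_tcpcb {
	unsigned t_flags;	/* connection flags */
	int	 t_maxopd;	/* maximum segment payload, options included */
	bool	 delack_armed;	/* stands in for tcp_timer_active(tp, TT_DELACK) */
};

/* Same four conditions as the DELAY_ACK(tp, tlen) macro. */
static bool
mock_delay_ack(const struct mock_tcpcb *tp, int tlen, bool delack_enabled)
{
	assert(tp != NULL);
	return (!tp->delack_armed &&
	    (tp->t_flags & MOCK_TF_RXWIN0SENT) == 0 &&
	    tlen <= tp->t_maxopd &&
	    (delack_enabled || (tp->t_flags & MOCK_TF_NEEDSYN) != 0));
}

/*
 * Hypothetical ECN packet handler. The caller passes tlen down.
 * Returns true if the ACK may be delayed, false if it should go out now.
 */
static bool
mock_ecnpkt_handler(const struct mock_tcpcb *tp, int tlen, bool ce_changed,
    bool delack_enabled)
{
	if (ce_changed)
		return (false);	/* CE state flipped: ACK immediately */
	return (mock_delay_ack(tp, tlen, delack_enabled));
}
```

The point of the sketch is only the extra parameter: without tlen the handler cannot evaluate the third condition of the macro.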
Re: netisr 0 : %100 and other netisr threads are waiting
On Thu, Apr 3, 2014 at 9:54 PM, Özkan KIRIK wrote:
> Hi,
>
> I am trying to use suricata on FreeBSD 10 amd64.
> FreeBSD behaves as a VLAN router and NAT box.
>
> Traffic is about 400Mbps.
> When I diverted traffic to suricata, the swi: netisr 0 thread gets 100% cpu.
> Other netisr threads are at 0%. And even if I remove the divert rule, netisr
> still eats 100% cpu. I think that something is looping :)

To be clear, this happens only *after* you divert traffic to suricata, right?

> And after 1-2 minutes, one of igb0 and igb1 stops working.
> Only a reboot solves the problem.
>
> Hardware has 8 cores, 24GB RAM
>
> My loader.conf:
>
> hw.igb.txd="4096"
> hw.igb.rxd="4096"
> hw.igb.rx_process_limit=1024
> hw.igb.num_queues=3
> net.isr.maxthreads=3
> net.isr.bindthreads=1
> net.isr.defaultqlimit=4096
> net.isr.maxqlimit=20480
> net.link.ifqmaxlen=10240
>
> How can I debug this situation?
> Any suggestions?

I am not an expert here, but please upload the output of "sysctl net.isr" and "sysctl dev.igb". Those show error counters and should give some idea of why igb0 or igb1 stops working: whether we are running out of some resource or something else is going on.

cheers,
Hiren
Re: ECN marking implenetation for dummynet
On Tue, Apr 8, 2014 at 8:46 PM, Adrian Chadd wrote:
> Hi! Cool! can you file a FreeBSD PR with this?

I'm testing this patch right now.

I will make sure it doesn't get lost. :-)

cheers,
Hiren

>
> -a
>
> On 2 April 2014 04:48, Midori Kato wrote:
>> Hi FreeBSD developers,
>>
>> I'm Midori Kato. I was working with Lars Eggert about DCTCP.
>> I would like to share our patch for an ECN marking mechanism on
>> dummynet, which I used for DCTCP testing.
>>
>> My implementation allows to set ECN with RED as an AQM scheme. The
>> following command is an example:
>> $ ipfw pipe config red 1/10/10/0.0 ecn
>>
>> Our implementation includes both DCTCP and RFC 3168 ECN marking methodology.
>>
>> If you are interested in our ECN implementation, I'm very happy to receive
>> your review! (I have already submitted my patch to Luigi and hope he
>> will merge ours in near future.)
>>
>> Regards,
>> -- Midori
netisr observations
48943960
   3   3 ip         0    67 59306618        0        0 203888563 263168729
   4   4 ip         0     0 77025108        0        0         0  77025108
   5   5 ip         0     0 58537310        0        0         0  58537310
   6   6 ip         0     0 81896427        0        0         0  81896427
   7   7 ip         0     0 69535857        0        0         0  69535857

So it looks like only cpu3 is doing all the queuing. But it doesn't look like it's getting hammered or anything:

last pid: 75181;  load averages: 27.81, 27.08, 26.93  up 0+06:12:37  19:04:33
508 processes: 23 running, 476 sleeping, 1 waiting, 8 lock
CPU 0:  71.8% user,  0.0% nice, 13.7% system, 14.5% interrupt,  0.0% idle
CPU 1:  80.9% user,  0.0% nice, 14.5% system,  4.6% interrupt,  0.0% idle
CPU 2:  77.1% user,  0.0% nice, 17.6% system,  5.3% interrupt,  0.0% idle
CPU 3:  88.5% user,  0.0% nice,  9.2% system,  2.3% interrupt,  0.0% idle
CPU 4:  80.2% user,  0.0% nice, 14.5% system,  5.3% interrupt,  0.0% idle
CPU 5:  79.4% user,  0.0% nice, 16.8% system,  3.1% interrupt,  0.8% idle
CPU 6:  83.2% user,  0.0% nice, 11.5% system,  4.6% interrupt,  0.8% idle
CPU 7:  68.7% user,  0.0% nice, 18.3% system, 13.0% interrupt,  0.0% idle
CPU 8:  88.5% user,  0.0% nice, 11.5% system,  0.0% interrupt,  0.0% idle
CPU 9:  87.8% user,  0.0% nice, 10.7% system,  0.0% interrupt,  1.5% idle
CPU 10: 87.0% user,  0.0% nice, 10.7% system,  2.3% interrupt,  0.0% idle
CPU 11: 80.9% user,  0.0% nice, 16.8% system,  2.3% interrupt,  0.0% idle
CPU 12: 86.3% user,  0.0% nice, 11.5% system,  2.3% interrupt,  0.0% idle
CPU 13: 84.7% user,  0.0% nice, 14.5% system,  0.8% interrupt,  0.0% idle
CPU 14: 87.0% user,  0.0% nice, 12.2% system,  0.8% interrupt,  0.0% idle
CPU 15: 87.8% user,  0.0% nice,  9.9% system,  2.3% interrupt,  0.0% idle
Mem: 17G Active, 47G Inact, 3712M Wired, 674M Cache, 1655M Buf, 1300M Free
Swap: 8192M Total, 638M Used, 7554M Free, 7% Inuse, 4K In

My conclusion, after looking at it a number of times, is that all CPUs are doing an equal share of the work (if we believe the top -P stats).

Finally, the questions: why is cpu3 doing all the queuing, and what does that actually mean? Can I improve performance or reduce cpu load in any other way? Should I change anything in my netisr settings?

cheers,
Hiren
Re: Patches for RFC6937 and draft-ietf-tcpm-newcwv-00
On Fri, Apr 11, 2014 at 4:15 AM, Eggert, Lars wrote:
> Hi,
>
> since folks are playing with Midori's DCTCP patch, I wanted to make sure that
> you were also aware of the patches that Aris did for PRR and NewCWV...

Lars,

There are no actual patches attached here. (Or the mailing list dropped them.)

cheers,
Hiren
Re: Patches for RFC6937 and draft-ietf-tcpm-newcwv-00
On Fri, Apr 11, 2014 at 4:16 PM, hiren panchasara wrote:
> On Fri, Apr 11, 2014 at 4:15 AM, Eggert, Lars wrote:
>> Hi,
>>
>> since folks are playing with Midori's DCTCP patch, I wanted to make sure
>> that you were also aware of the patches that Aris did for PRR and NewCWV...
>
> Lars,
>
> There are no actual patches attached here. (Or the mailing list dropped them.)

Ah, my bad. I think you are referring to the patches in the original email. I can see them.

cheers,
Hiren
Re: netisr observations
On Fri, Apr 11, 2014 at 11:30 AM, Patrick Kelsey wrote:
>
> The output of netstat -Q shows IP dispatch is set to default, which is
> direct (NETISR_DISPATCH_DIRECT). That means each IP packet will be
> processed on the same CPU that the Ethernet processing for that packet was
> performed on, so CPU selection for IP packets will not be based on flowid.
> The output of netstat -Q shows Ethernet dispatch is set to direct
> (NETISR_DISPATCH_DIRECT if you wind up reading the code), so the Ethernet
> processing for each packet will take place on the same CPU that the driver
> receives that packet on.
>
> For the igb driver with queues autoconfigured and msix enabled, as the
> sysctl output shows you have, the driver will create a number of queues
> subject to device limitations, msix message limitations, and the number of
> CPUs in the system, establish a separate interrupt handler for each one, and
> bind each of those interrupt handlers to a separate CPU. It also creates a
> separate single-threaded taskqueue for each queue. Each queue interrupt
> handler sends work to its associated taskqueue when the interrupt fires.
> Those taskqueues are where the Ethernet packets are received and processed
> by the driver. The question is where those taskqueue threads will be run.
> I don't see anything in the driver that makes an attempt to bind those
> taskqueue threads to specific CPUs, so really the location of all of the
> packet processing is up to the scheduler (i.e., arbitrary).
>
> The summary is:
>
> 1. the hardware schedules each received packet to one of its queues and
>    raises the interrupt for that queue
> 2. that queue interrupt is serviced on the same CPU all the time, which is
>    different from the CPUs for all other queues on that interface
> 3. the interrupt handler notifies the corresponding task queue, which runs
>    its task in a thread on whatever CPU the scheduler chooses
> 4. that task dispatches the packet for Ethernet processing via netisr, which
>    processes it on whatever the current CPU is
> 5. Ethernet processing dispatches that packet for IP processing via netisr,
>    which processes it on whatever the current CPU is

I really appreciate you taking the time to explain this. Thank you.

I am especially confused by the ip "Queued" column from netstat -Q showing 203888563 only for cpu3. Does this mean that cpu3 queues everything and then distributes it among the other cpus? At which of the 5 stages you mentioned above does this queuing on cpu3 happen?

This value gets populated in the snwp->snw_queued field for each cpu inside sysctl_netisr_work().

>
> You might want to try changing the default netisr dispatch policy to
> 'deferred' (sysctl net.isr.dispatch=deferred). If you do that, the Ethernet
> processing will still happen on an arbitrary CPU chosen by the scheduler,
> but the IP processing should then get mapped to a CPU based on the flowid
> assigned by the driver. Since igb assigns flowids based on received queue
> number, all IP (and above) processing for that packet should then be
> performed on the same CPU the queue interrupt was bound to.

I will give this a try and see how things behave.

I was also thinking about net.isr.bindthreads. netisr_start_swi() does intr_event_bind() if we have bindthreads set to 1. What would that gain me, if anything?

Would it stop intr{swi1: netisr 3} from moving to different cpus (as I am seeing in the 'top' output) and bind it to a single cpu?

I've come across a thread discussing some side effects of this, though:
http://lists.freebsd.org/pipermail/freebsd-hackers/2012-January/037597.html

Thanks a ton, again.

cheers,
Hiren
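The difference between the two dispatch policies described in the quoted explanation can be modelled in a few lines. This is a simplified sketch, not netisr code: the enum, the function name, and the modulo mapping from flowid to a netisr thread are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical names; the kernel uses its own NETISR_DISPATCH_* constants. */
enum mock_dispatch {
	MOCK_DIRECT,	/* handle the packet on the CPU we are already on */
	MOCK_DEFERRED	/* queue the packet to a netisr thread */
};

/*
 * Pick the CPU that will run protocol processing for one packet.
 * Direct dispatch ignores the flow ID; deferred dispatch maps the
 * flow ID onto one of nthreads netisr threads.
 */
static unsigned
mock_select_cpu(enum mock_dispatch policy, unsigned curcpu, unsigned flowid,
    unsigned nthreads)
{
	assert(nthreads > 0);
	if (policy == MOCK_DIRECT)
		return (curcpu);
	return (flowid % nthreads);
}
```

Under this model, with direct dispatch the result follows wherever the scheduler ran the taskqueue thread, while with deferred dispatch every packet of a flow lands on the same netisr thread regardless of the current CPU.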
Re: kern/184311: [bge] [panic] kernel panic with bge(4) on SunFire X2100
The following reply was made to PR kern/184311; it has been noted by GNATS.

From: hiren panchasara
To: bug-follo...@freebsd.org, benjamin.st...@ub.uni-tuebingen.de
Cc:
Subject: Re: kern/184311: [bge] [panic] kernel panic with bge(4) on SunFire X2100
Date: Mon, 19 May 2014 13:08:14 -0700

I just hit this on FreeBSD10.

vgapci0: mem 0xc900-0xc9ff,0xd000-0xd7ff irq 16 at device 5.0 on pci1
vgapci0: Boot video device
pcib2: at device 12.0 on pci0
pci2: on pcib2
bge0: mem 0xca00-0xca00 irq 16 at device 0.0 on pci2
bge0: CHIP ID 0x4101; ASIC REV 0x04; CHIP REV 0x41; PCI-E
miibus0: on bge0
brgphy0: PHY 1 on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge0: Ethernet address: 00:17:08:92:b6:e9
pcib3: at device 13.0 on pci0
pci3: on pcib3
bge1: mem 0xca10-0xca10 irq 19 at device 0.0 on pci3
bge1: CHIP ID 0x4101; ASIC REV 0x04; CHIP REV 0x41; PCI-E
miibus1: on bge1
brgphy1: PHY 1 on miibus1
brgphy1: no media present
ifmedia_set: no match for 0x0/0xfff
panic: ifmedia_set
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0x81a3f1b0
kdb_backtrace() at kdb_backtrace+0x39/frame 0x81a3f260
vpanic() at vpanic+0x126/frame 0x81a3f2a0
panic() at panic+0x43/frame 0x81a3f300
ifmedia_set() at ifmedia_set+0x5a/frame 0x81a3f310
brgphy_attach() at brgphy_attach+0x3a4/frame 0x81a3f350
device_attach() at device_attach+0x3a2/frame 0x81a3f3b0
bus_generic_attach() at bus_generic_attach+0x4a/frame 0x81a3f3d0
miibus_attach() at miibus_attach+0xbd/frame 0x81a3f410
device_attach() at device_attach+0x3a2/frame 0x81a3f470
bus_generic_attach() at bus_generic_attach+0x4a/frame 0x81a3f490
mii_attach() at mii_attach+0x435/frame 0x81a3f520
bge_attach() at bge_attach+0x4151/frame 0x81a3f600
device_attach() at device_attach+0x3a2/frame 0x81a3f660
bus_generic_attach() at bus_generic_attach+0x4a/frame 0x81a3f680
acpi_pci_attach() at acpi_pci_attach+0x15f/frame 0x81a3f6d0
device_attach() at device_attach+0x3a2/frame 0x81a3f730
bus_generic_attach() at bus_generic_attach+0x4a/frame 0x81a3f750
acpi_pcib_attach() at acpi_pcib_attach+0x23d/frame 0x81a3f7a0
acpi_pcib_pci_attach() at acpi_pcib_pci_attach+0x9f/frame 0x81a3f7e0
device_attach() at device_attach+0x3a2/frame 0x81a3f840
bus_generic_attach() at bus_generic_attach+0x4a/frame 0x81a3f860
acpi_pci_attach() at acpi_pci_attach+0x15f/frame 0x81a3f8b0
device_attach() at device_attach+0x3a2/frame 0x81a3f910
bus_generic_attach() at bus_generic_attach+0x4a/frame 0x81a3f930
acpi_pcib_attach() at acpi_pcib_attach+0x23d/frame 0x81a3f980
acpi_pcib_acpi_attach() at acpi_pcib_acpi_attach+0x2a9/frame 0x81a3f9d0
device_attach() at device_attach+0x3a2/frame 0x81a3fa30
bus_generic_attach() at bus_generic_attach+0x4a/frame 0x81a3fa50
acpi_attach() at acpi_attach+0xdd4/frame 0x81a3fb10
device_attach() at device_attach+0x3a2/frame 0x81a3fb70
bus_generic_attach() at bus_generic_attach+0x4a/frame 0x81a3fb90
nexus_acpi_attach() at nexus_acpi_attach+0x76/frame 0x81a3fbc0
device_attach() at device_attach+0x3a2/frame 0x81a3fc20
bus_generic_new_pass() at bus_generic_new_pass+0x116/frame 0x81a3fc50
bus_set_pass() at bus_set_pass+0x8f/frame 0x81a3fc80
configure() at configure+0xa/frame 0x81a3fc90
mi_startup() at mi_startup+0x118/frame 0x81a3fcb0
btext() at btext+0x2c
Uptime: 1s
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...
Re: kern/190102: [tcp] net.inet.tcp.drop_synfin=1 no longer works on FreeBSD 10+ [regression]
On Wed, May 28, 2014 at 10:46 PM, Eygene Ryabinkin wrote:
> I assume that your pf(4) is enabled during these tests, you have
> "scrub" statements in the ruleset and removing "scrub" will restore
> the expected behaviour on 10.x?

I can confirm that I see exactly what you are saying on a stable/10 box.

cheers,
Hiren
Re: kern/190102: [tcp] net.inet.tcp.drop_synfin=1 no longer works on FreeBSD 10+ [regression]
The following reply was made to PR kern/190102; it has been noted by GNATS. From: hiren panchasara To: Eygene Ryabinkin Cc: FreeBSD GNATS followup , "freebsd-net@freebsd.org" Subject: Re: kern/190102: [tcp] net.inet.tcp.drop_synfin=1 no longer works on FreeBSD 10+ [regression] Date: Wed, 28 May 2014 23:52:51 -0700 On Wed, May 28, 2014 at 10:46 PM, Eygene Ryabinkin wrote: > I assume that your pf(4) is enabled during these tests, you have > "scrub" statements in the ruleset and removing "scrub" will restore > the expected behaviour on 10.x? I can confirm that I see exactly what you are saying on a stable/10 box. cheers, Hiren ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
Re: kern/190102: [tcp] net.inet.tcp.drop_synfin=1 no longer works on FreeBSD 10+ [regression]
- bugs (as this is not related to it)

On Wed, May 28, 2014 at 10:46 PM, Eygene Ryabinkin wrote:
> clearing FIN bit for SYN packets was
> the standard behaviour of pf since approximately at least 10 years,
>
> http://svnweb.freebsd.org/base/vendor-sys/pf/dist/sys/contrib/pf/net/pf_norm.c?view=markup&pathrev=126258#l1242

I am curious, what's the rationale for this behavior? Why does PF, being a firewall, clear the FIN bit on such a packet?

Cheers,
Hiren
Re: ECN marking implenetation for dummynet
On Tue, Apr 8, 2014 at 9:14 PM, hiren panchasara wrote:
> On Tue, Apr 8, 2014 at 8:46 PM, Adrian Chadd wrote:
>> Hi! Cool! can you file a FreeBSD PR with this?
>
> I'm testing this patch right now.
>
> I will make sure it doesn't get lost. :-)

Committed as r266941.

cheers,
Hiren
Re: ECN marking implenetation for dummynet
On Sun, Jun 1, 2014 at 2:08 AM, Luigi Rizzo wrote:
> On Sunday, June 1, 2014, hiren panchasara wrote:
>> On Tue, Apr 8, 2014 at 9:14 PM, hiren panchasara wrote:
>> > On Tue, Apr 8, 2014 at 8:46 PM, Adrian Chadd wrote:
>> >> Hi! Cool! can you file a FreeBSD PR with this?
>> >
>> > I'm testing this patch right now.
>> >
>> > I will make sure it doesn't get lost. :-)
>>
>> Committed as r266941.
>
> I don't think we need the DNOLD_IS_ECN flag and translation. That stuff is
> meant for old binaries

Thanks for pointing it out. Committed as r266955.

cheers,
Hiren
Re: ECN marking implenetation for dummynet
On Sun, Jun 1, 2014 at 12:31 AM, hiren panchasara wrote:
> On Tue, Apr 8, 2014 at 9:14 PM, hiren panchasara wrote:
>> On Tue, Apr 8, 2014 at 8:46 PM, Adrian Chadd wrote:
>>> Hi! Cool! can you file a FreeBSD PR with this?
>>
>> I'm testing this patch right now.
>>
>> I will make sure it doesn't get lost. :-)
>
> Committed as r266941.

I wish to MFC this change, and the actual dctcp changes that follow (which I'll commit to -head in the coming weeks), to 10.

Let me know if there is any objection to it.

cheers,
Hiren
Re: ECN marking implenetation for dummynet
On Mon, Jun 9, 2014 at 1:31 AM, Lawrence Stewart wrote:
> On 6/8/2014 9:07 PM, hiren panchasara wrote:
>> I wish to MFC this change and following actual dctcp changes (which
>> I'll commit to -head in coming weeks) to 10.
>>
>> Let me know if there is any objection to it.
>
> Please don't MFC the DCTCP stuff until I've had a chance to consider the KPI
> changes. Prodding me to do so is likely a good idea.

I am in no rush to MFC. But I didn't quite get what you are talking about wrt KPI changes.

cheers,
Hiren
Re: LEDBAT (RFC-6817) in FreeBSD as mod_cc(9)?
On Mon, Jun 16, 2014 at 6:25 AM, Lev Serebryakov wrote:
> Hello, Freebsd-net.
>
> It looks like some TCP connections could benefit from the LEDBAT
> (RFC-6817) congestion control algorithm (not all, of course, it should not
> be default).
>
> Also, it looks like Apple implements one
> (http://www.opensource.apple.com/source/xnu/xnu-1699.32.7/bsd/netinet/tcp_ledbat.c),
> but it uses much more "callbacks" from TCP/SCTP core to CC module, that
> FreeBSD has.
>
> Does somebody evaluate, is it possible to bring LEDBAT to FreeBSD?

I'd guess there is nothing wrong with having this as a cc module. Someone has to do the necessary legwork :-)

cheers,
Hiren
getpeername returning ENOTCONN for a connected socket
Reviving an old thread where Steve found this problem: a call to getpeername on a connected tcp socket returns ENOTCONN with no prior errors being reported by previous socket calls.

Please look at http://lists.freebsd.org/pipermail/freebsd-net/2011-January/027647.html for more details.

Here is a proposed patch derived from $src/sys/netsmb/smb_trantcp.c:nbssn_recv()'s way of handling a similar situation:

Index: sys/kern/uipc_syscalls.c
===
--- sys/kern/uipc_syscalls.c    (revision 267693)
+++ sys/kern/uipc_syscalls.c    (working copy)
@@ -1755,6 +1755,12 @@
        if (error != 0)
                return (error);
        so = fp->f_data;
+       if ((so->so_state & (SS_ISDISCONNECTED|SS_ISDISCONNECTING)) ||
+           (so->so_rcv.sb_state & SBS_CANTRCVMORE)) {
+               error = ECONNRESET;
+               goto done;
+       }
        if ((so->so_state & (SS_ISCONNECTED|SS_ISCONFIRMING)) == 0) {
                error = ENOTCONN;
                goto done;

Does this look correct?

cheers,
Hiren
Re: getpeername returning ENOTCONN for a connected socket
On Sat, Jun 21, 2014 at 9:00 AM, Sean Bruno wrote:
> On Fri, 2014-06-20 at 16:21 -0700, hiren panchasara wrote:
>> Reviving an old thread where Steve found this problem: A call to
>> getpeername on a connected tcp socket returns ENOTCONN with no prior
>> errors being reported by previous socket calls.
>>
>> Please look at
>> http://lists.freebsd.org/pipermail/freebsd-net/2011-January/027647.html
>> for more details.
>>
>> Here is a proposed patch derived from
>> $src/sys/netsmb/smb_trantcp.c:nbssn_recv()'s way of handling a similar
>> situation:
>>
>> Index: sys/kern/uipc_syscalls.c
>> ===
>> --- sys/kern/uipc_syscalls.c    (revision 267693)
>> +++ sys/kern/uipc_syscalls.c    (working copy)
>> @@ -1755,6 +1755,12 @@
>>         if (error != 0)
>>                 return (error);
>>         so = fp->f_data;
>> +       if ((so->so_state & (SS_ISDISCONNECTED|SS_ISDISCONNECTING)) ||
>> +           (so->so_rcv.sb_state & SBS_CANTRCVMORE)) {
>> +               error = ECONNRESET;
>> +               goto done;
>> +       }
>>         if ((so->so_state & (SS_ISCONNECTED|SS_ISCONFIRMING)) == 0) {
>>                 error = ENOTCONN;
>>                 goto done;
>>
>> Does this look correct?
>>
>> cheers,
>> Hiren
>
> Has this been tested in "anger" anywhere?

No. This patch comes from code inspection after looking at the problem.

I should at least write up a small module to do local testing, as Steve did in the original report. I'll do that and get back.

I'd appreciate it if someone could point me to a better way of testing this (especially in "anger" ;-)).

cheers,
Hiren
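As a starting point for the local testing mentioned above, a small userland helper that reports what getpeername() returns for a descriptor. This is only a baseline under stated assumptions: peer_errno() is a hypothetical name, and the helper does not reproduce the reported race; it just makes it easy to log the errno seen at different points in a connection's life.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Return 0 if getpeername() succeeds on fd, otherwise the errno it set.
 * A sockaddr_storage buffer is large enough for any address family.
 */
static int
peer_errno(int fd)
{
	struct sockaddr_storage ss;
	socklen_t len = sizeof(ss);

	assert(fd >= 0);
	if (getpeername(fd, (struct sockaddr *)&ss, &len) == 0)
		return (0);
	return (errno);
}
```

A test program could call peer_errno() right after accept(), and again after the peer resets the connection, to see whether the result is ENOTCONN (current behaviour per the report) or ECONNRESET (behaviour with the proposed patch).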
Re: Add netbw option to systat
On Wed, Jul 2, 2014 at 4:50 PM, Bryan Venteicher wrote:
> Awhile back, DragonFlyBSD added a netbw option to systat that I've ported
> to FreeBSD and found handy at various times:
>
>    netbw  Display aggregate and per-connection TCP receive and transmit
>           rates. Only active TCP connections are shown.
>
> Leading to output such as:
>
> tcp acceptsconnects rcv 1.192G snd 15.77K rexmit
>
> 192.168.10.80:22 192.168.10.20:23103 rcvsnd 415.7 [ NTSX ]
> 192.168.10.80:22 192.168.10.20:46560 rcv 19.80M snd 14.47K [ NTSX ]
> 192.168.10.80:22 192.168.10.20:60699 rcvsnd 886.3 [ NTSX ]
> 192.168.10.81:5201192.168.10.51:60844 rcv 293.2M snd[R TSX ]
> 192.168.10.81:5201192.168.10.51:60845 rcv 293.5M snd[R TSX ]
> 192.168.10.81:5201192.168.10.51:60846 rcv 293.2M snd[R TSX ]
> 192.168.10.81:5201192.168.10.51:60847 rcv 292.9M snd[R TSX ]
>
> It uses the sequence numbers from the 'struct tcpcb' to derive the rates,
> which is usually good but certainly not perfect (i.e., don't set the
> interval too long).
>
> I'd like to commit this if anybody else thinks they'd find it useful.
>
> http://people.freebsd.org/~bryanv/patches/systat-netbw.patch

I like the idea. A few things about the patch:

1) You may want to remove the code hidden behind "#if 0" in 2 places.

2) I am not entirely clear on why/if we need the last column with flags, but if we keep it (for compatibility or any other reason), it would be nice to have those flags explained in the manpage:

+ mvwprintw(wnd, LINES-2, 0,
+     "Rate/sec, "
+     "R=rxpend T=txpend N=nodelay T=tstmp "
+     "S=sack X=winscale F=fastrec");

3) I feel that the header line of the output (especially the 'tcp accepts and connects' terminology) can be improved, but I do not have a better suggestion :-)

It looks okay to me otherwise, and thanks for your work.

cheers,
Hiren
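On the "don't set the interval too long" caveat in the quoted description: a rate derived from two samples of a 32-bit sequence number is only right while fewer than 2^32 bytes move between samples. A minimal sketch of that calculation follows; seq_rate() is a hypothetical helper and is not taken from the systat patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes per second between two samples of a 32-bit TCP sequence number.
 * Unsigned subtraction absorbs a single wraparound, so the result is
 * correct only if fewer than 2^32 bytes were transferred in the interval.
 */
static double
seq_rate(uint32_t prev, uint32_t cur, double interval_sec)
{
	assert(interval_sec > 0.0);
	return ((double)(uint32_t)(cur - prev) / interval_sec);
}
```

At roughly 290 MB/s, as in the sample output, the 4 GiB sequence space wraps in under 15 seconds, which gives a feel for how long an interval is "too long".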
Re: r256920 missing in stable/9 and releng/9.3
+ freebsd-net@,

On Mon, Jul 7, 2014 at 4:37 AM, Harald Schmalzbauer wrote:
> Regarding Jan Mikkelsen's message of 24.06.2014 04:49 (localtime):
>> Hi,
>>
>> I’m bringing 9.3-RC1 into our local Perforce depot and moving our local
>> patches to 9.2 forward.
>>
>> I noticed that r256920 (changing sys/netinet/tcp_input.c) has not been
>> MFC’d. It was listed as “MFC after 3 days” back in October 2013.
>>
>> Is this patch missing for a reason?
>
> I'm wondering too if there's any good reason not to MFC?

I also don't see any obvious reason.

If nobody objects on -net@, I can do it.

cheers,
Hiren
Re: r256920 missing in stable/9 and releng/9.3
On Thu, Jul 10, 2014 at 10:36 AM, John Baldwin wrote:
> On Monday, July 07, 2014 12:58:53 pm hiren panchasara wrote:
>> + freebsd-net@,
>>
>> On Mon, Jul 7, 2014 at 4:37 AM, Harald Schmalzbauer wrote:
>> > Regarding Jan Mikkelsen's message of 24.06.2014 04:49 (localtime):
>> >> Hi,
>> >>
>> >> I’m bringing 9.3-RC1 into our local Perforce depot and moving our local
>> >> patches to 9.2 forward.
>> >>
>> >> I noticed that r256920 (changing sys/netinet/tcp_input.c) has not been
>> >> MFC’d. It was listed as “MFC after 3 days” back in October 2013.
>> >>
>> >> Is this patch missing for a reason?
>> >
>> > I'm wondering too if there's any good reason not to MFC?
>>
>> I also don't see any obvious reason.
>>
>> If nobody objects on -net@, I can do it.
>
> I think this looks fine to merge.

Thanks John for confirming. Committed as r268506 to stable/9.

It's a bit too late for 9.3-R.

cheers,
Hiren
Re: Why is r250764 not in 9.3?
+ Alexander

On Tue, Jul 15, 2014 at 2:32 AM, Kajetan Staszkiewicz wrote:
> The time has come to upgrade my routers to FreeBSD 9.3.
>
> While going through list of patches I had on 9.1, I've noticed that r248070
> got into 9.3 but r250764 did not. Why is that?

Probably just missed it.

cheers,
Hiren
UDP sendto() returning ENOBUFS - "No buffer space available"
The RETURN VALUES section of the sendto() manpage says:

[ENOBUFS] The system was unable to allocate an internal buffer. The operation may succeed when buffers become available.

[ENOBUFS] The output queue for a network interface was full. This generally indicates that the interface has stopped sending, but may be caused by transient congestion.

If I hit the first condition, it should show up as failures in "netstat -m". Is that a correct assumption?

I want to understand what happens when/if we hit the second condition, and how to prevent that from happening. Is it just the application's job to rate-limit the data it sends to the network interface card so that it doesn't saturate?

Does the kernel do any sort of queuing in the case of ENOBUFS, or does the message just get dropped?

For an application that sends a lot of UDP data and gets ENOBUFS back, which udp and other tunables should I tweak? I can only think of:
- number of tx ring descriptors: increasing this will get us more txds.
- kern.ipc.maxsockbuf: increasing this will increase the buffer size allocated for sockets.

What else?

Any comments/suggestions/corrections?

cheers,
Hiren
Re: UDP sendto() returning ENOBUFS - "No buffer space available"
On Wed, Jul 16, 2014 at 11:00 AM, Adrian Chadd wrote:
> Hi!
>
> So the UDP transmit path is udp_usrreqs->pru_send() == udp_send() ->
> udp_output() -> ip_output()
>
> udp_output() does do a M_PREPEND() which can return ENOBUFS. ip_output
> can also return ENOBUFS.
>
> it doesn't look like the socket code (eg sosend_dgram()) is doing any
> buffering - it's just copying the frame and stuffing it up to the
> driver. No queuing involved before the NIC.

Right. Thanks for confirming.

>
> So a _well behaved_ driver will return ENOBUFS _and_ not queue the
> frame. However, it's entirely plausible that the driver isn't well
> behaved - the intel drivers screwed up here and there with transmit
> queue and failure to queue vs failure to transmit.
>
> So yeah, try tweaking the tx ring descriptor for the driver you're
> using and see how big a bufring it's allocating.

Yes, so I am dealing with Broadcom BCM5706/BCM5708 Gigabit Ethernet, i.e. bce(4).

I bumped up tx_pages from 2 (default) to 8, where each page is 255 buffer descriptors.

I am seeing quite a nice improvement on stable/10, where I can send *more* stuff :-)

cheers,
Hiren
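Since the thread concludes that nothing queues the datagram when sendto() fails with ENOBUFS, any retry has to come from the application. A minimal sketch of that, assuming a blocking UDP socket: sendto_retry_enobufs() is a hypothetical helper, and the linear 1 ms backoff capped at 100 ms is an arbitrary choice for illustration, not a recommendation from the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <time.h>

/*
 * Call sendto() and retry only while it fails with ENOBUFS, sleeping a
 * little longer before each retry. Any other error is returned at once.
 * Returns the byte count from sendto(), or -1 with errno set by the
 * last attempt.
 */
static ssize_t
sendto_retry_enobufs(int fd, const void *buf, size_t len,
    const struct sockaddr *to, socklen_t tolen, int max_retries)
{
	struct timespec ts;
	ssize_t n;
	long ms;
	int attempt;

	assert(max_retries >= 0);
	for (attempt = 0; ; attempt++) {
		n = sendto(fd, buf, len, 0, to, tolen);
		if (n >= 0 || errno != ENOBUFS || attempt >= max_retries)
			return (n);
		/* Linear backoff: 1 ms, 2 ms, ... capped at 100 ms. */
		ms = attempt + 1;
		if (ms > 100)
			ms = 100;
		ts.tv_sec = 0;
		ts.tv_nsec = ms * 1000000L;
		(void)nanosleep(&ts, NULL);
	}
}
```

Retrying only helps with transient congestion; if the interface queue stays full, pacing the sender or enlarging the tx ring, as discussed above, is what actually reduces the errors.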