[Differential] [Request, 117 lines] D5185: tcp/lro: Allow network drivers to set the limit for TCP ACK/data segment aggregation limit
sepherosa_gmail.com created this revision.
sepherosa_gmail.com added reviewers: network, adrian, delphij, royger, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, gallatin, hselasky, np.
sepherosa_gmail.com added subscribers: freebsd-net-list, freebsd-virtualization-list.
Herald added a reviewer: transport.

REVISION SUMMARY
  The limits are based on append_cnt. Unless the network driver sets these two limits, this change is a no-op.

  For hn(4):
  - Set the TCP ACK append limit to 1, i.e. aggregate 2 ACKs at most. Aggregating more than 2 ACKs hurts TCP sending performance in Hyper-V. This significantly improves TCP sending performance when the number of concurrent connections is low (2~8), and greatly stabilizes TCP sending performance in other cases.
  - Set the TCP data segment append limit to 25. Without this limit, hn(4) could aggregate ~45 TCP data segments for each connection (even at 64 or more connections) before dispatching them to the socket code; large aggregation slows down ACK sending and eventually hurts/destabilizes TCP reception performance. This setting significantly stabilizes and improves TCP reception performance for >4 concurrent connections.

  Make them sysctls so they can be adjusted.
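The flush rule described in the summary can be sketched in isolation as follows. This is only an illustration, not the patch itself: struct lro_limits and lro_should_flush() are hypothetical names, and append_cnt is assumed to already include the segment that was just appended to the entry.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two limits a driver may set. */
struct lro_limits {
	unsigned short ack_append_lim;	/* 0 means "no limit" */
	unsigned short data_append_lim;	/* 0 means "no limit" */
};

/*
 * Decide whether an LRO entry should be flushed right after a segment
 * has been appended to it.  append_cnt is assumed to already count the
 * segment that was just appended.
 */
static bool
lro_should_flush(const struct lro_limits *lim, unsigned append_cnt,
    unsigned tcp_data_len, unsigned p_len, unsigned mtu)
{
	if (tcp_data_len == 0) {
		/* Pure ACK: flush once the ACK append limit is reached. */
		return (lim->ack_append_lim != 0 &&
		    append_cnt >= lim->ack_append_lim);
	}
	/* Data: flush if one more full-sized segment could overflow. */
	if (p_len + mtu > 65535)
		return (true);
	/* Otherwise flush once the data append limit is reached. */
	return (lim->data_append_lim != 0 &&
	    append_cnt >= lim->data_append_lim);
}
```

With both limits left at 0 only the existing overflow check can trigger a flush, which matches the "no-op unless the driver sets the limits" behaviour described above.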
REVISION DETAIL
  https://reviews.freebsd.org/D5185

AFFECTED FILES
  sys/dev/hyperv/netvsc/hv_net_vsc.h
  sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c
  sys/netinet/tcp_lro.c
  sys/netinet/tcp_lro.h

EMAIL PREFERENCES
  https://reviews.freebsd.org/settings/panel/emailpreferences/

To: sepherosa_gmail.com, network, transport, adrian, delphij, royger, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, gallatin, hselasky, np
Cc: freebsd-virtualization-list, freebsd-net-list

diff --git a/sys/netinet/tcp_lro.h b/sys/netinet/tcp_lro.h
--- a/sys/netinet/tcp_lro.h
+++ b/sys/netinet/tcp_lro.h
@@ -91,6 +91,8 @@
 	unsigned	lro_cnt;
 	unsigned	lro_mbuf_count;
 	unsigned	lro_mbuf_max;
+	unsigned short	lro_ack_append_lim;
+	unsigned short	lro_data_append_lim;

 	struct lro_head	lro_active;
 	struct lro_head	lro_free;
diff --git a/sys/netinet/tcp_lro.c b/sys/netinet/tcp_lro.c
--- a/sys/netinet/tcp_lro.c
+++ b/sys/netinet/tcp_lro.c
@@ -88,6 +88,8 @@
 	lc->lro_mbuf_max = lro_mbufs;
 	lc->lro_cnt = lro_entries;
 	lc->ifp = ifp;
+	lc->lro_ack_append_lim = 0;
+	lc->lro_data_append_lim = 0;
 	SLIST_INIT(&lc->lro_free);
 	SLIST_INIT(&lc->lro_active);
@@ -646,6 +648,16 @@
 		if (tcp_data_len == 0) {
 			m_freem(m);
+			/*
+			 * Flush this LRO entry, if this ACK should
+			 * not be further delayed.
+			 */
+			if (lc->lro_ack_append_lim &&
+			    le->append_cnt >= lc->lro_ack_append_lim) {
+				SLIST_REMOVE(&lc->lro_active, le, lro_entry,
+				    next);
+				tcp_lro_flush(lc, le);
+			}
 			return (0);
 		}
@@ -664,9 +676,12 @@
 		/*
 		 * If a possible next full length packet would cause an
-		 * overflow, pro-actively flush now.
+		 * overflow, pro-actively flush now.  And if we are asked
+		 * to limit the data aggregate, flush this LRO entry now.
 		 */
-		if (le->p_len > (65535 - lc->ifp->if_mtu)) {
+		if (le->p_len > (65535 - lc->ifp->if_mtu) ||
+		    (lc->lro_data_append_lim &&
+		     le->append_cnt >= lc->lro_data_append_lim)) {
 			SLIST_REMOVE(&lc->lro_active, le, lro_entry, next);
 			tcp_lro_flush(lc, le);
 		} else
diff --git a/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c b/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c
--- a/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c
+++ b/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c
@@ -176,14 +176,8 @@
 #define HN_CSUM_ASSIST_WIN8	(CSUM_TCP)
 #define HN_CSUM_ASSIST		(CSUM_IP | CSUM_UDP | CSUM_TCP)
-/* XXX move to netinet/tcp_lro.h */
-#define HN_LRO_HIWAT_MAX		65535
-#define HN_LRO_HIWAT_DEF		HN_LRO_HIWAT_MAX
-/* YYY 2*MTU is a bit rough, but should be good enough. */
-#define HN_LRO_HIWAT_MTULIM(ifp)	(2 * (ifp)->if_mtu)
-#define HN_LRO_HIWAT_ISVALID(sc, hiwat)	\
-    ((hiwat) >= HN_LRO_HIWAT_MTULIM((sc)->hn_ifp) ||	\
-     (hiwat) <= HN_LRO_HIWAT_MAX)
+#define HN_LRO_ACK_APPEND_LIM	1
+#define HN_LRO_DATA_APPEND_LIM	25
 /*
  * Be aware that this sleepable mutex will exhibit WITNESS errors when
@@ -253,27 +247,16 @@
 static void hn_start_txeof(struct ifnet *ifp);
 static int hn_ifmedia_upd(struct ifnet *ifp);
 static void hn_ifmedia_sts(struct ifnet *ifp, struct ifmediareq *ifmr);
-#ifdef HN_LRO_HIWAT
-static int hn_lro_hiwat_sysctl(SYSCTL_HANDLER_ARGS);
-#endif
 static int hn_trust_hcsum_sysctl(SYSCTL_HANDLER_ARGS);
 static int hn_tx_chimney_size_sysctl(SYSCTL_HANDLER_ARGS);
+static int hn_lro_append_lim_sysctl(SYSCTL_HANDLER_ARGS);
 static int hn_check_iplen(const struct mbuf *, int);
 static int hn_create_tx_ring(struct hn_softc *sc);
 static void hn_destroy_tx_ring(struct hn_softc *sc);
 static void hn_start_taskfunc(void *xsc, int pending);
 static void hn_txeof_taskfunc(void *xsc, int pending);
 static int hn_encap(struct hn_softc *, struct hn_txdesc *, struct mbuf **);
-static __inline void
-hn_se
[Differential] [Updated] D5185: tcp/lro: Allow network drivers to set the limit for TCP ACK/data segment aggregation limit
sepherosa_gmail.com updated the summary for this revision. REVISION DETAIL https://reviews.freebsd.org/D5185 EMAIL PREFERENCES https://reviews.freebsd.org/settings/panel/emailpreferences/ To: sepherosa_gmail.com, network, adrian, delphij, royger, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, gallatin, hselasky, np, transport Cc: freebsd-virtualization-list, freebsd-net-list ___ freebsd-net@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
Re: ixgbe: Network performance tuning (#TCP connections)
On 02/03/16 14:37, Meyer, Wolfgang wrote:

Hello,

we are evaluating network performance on a DELL server (PowerEdge R930 with 4 sockets, hw.model: Intel(R) Xeon(R) CPU E7-8891 v3 @ 2.80GHz) with 10 GbE cards. We use programs that, on the server side, accept connections on an IP address + port from the client side; after the connection is established, data is sent in turns between server and client in a predefined pattern (the server side sends more data than the client side), with sleeps in between the send phases. The test set-up is chosen in such a way that every client process initiates 500 connections handled in threads, and on the server side each process representing an IP/port pair also handles 500 connections in threads.

The number of connections is then increased and the overall network throughput is observed using nload. On FreeBSD (on the server side), errors begin to occur at roughly 50,000 connections and the overall throughput won't increase further with more connections. With Linux on the server side it is possible to establish more than 120,000 connections, and at 50,000 connections the overall throughput is double that of FreeBSD with the same sending pattern. Furthermore, system load on FreeBSD is much higher, with 50 % system usage on each core and 80 % interrupt usage on the 8 cores handling the interrupt queues for the NIC. In comparison, Linux has <10 % system usage, <10 % user usage and about 15 % interrupt usage on the 16 cores handling the network interrupts for 50,000 connections.

Varying the number of NIC interrupt queues doesn't improve the performance (it rather worsens the situation). Disabling Hyperthreading (utilising 40 cores) degrades the performance. Increasing MAXCPU to utilise all 80 cores brings no improvement compared to 64 cores; atkbd and uart had to be disabled to avoid kernel panics with the increased MAXCPU (thanks to Andre Oppermann for investigating this). Initially the tests were made on 10.2 Release; later I switched to 10 Stable (later with ixgbe driver version 3.1.0), but that didn't change the numbers.

Some sysctl configurables were modified following the network performance guidelines found on the net (e.g. https://calomel.org/freebsd_network_tuning.html, https://www.freebsd.org/doc/handbook/configtuning-kernel-limits.html, https://pleiades.ucsc.edu/hyades/FreeBSD_Network_Tuning), but most of them didn't have any measurable impact. For the final sysctl.conf and loader.conf settings, see below. Actually, the only tunables that provided any improvement were hw.ix.txd and hw.ix.rxd, which were reduced (!) to the minimum value of 64, and hw.ix.tx_process_limit and hw.ix.rx_process_limit, which were set to -1.

Any ideas what tunables might be changed to get a higher number of TCP connections (it's not a question of the overall throughput, as changing the sending pattern allows me to fully utilise the 10Gb bandwidth)? How can I determine where the kernel is spending its time that causes the high CPU load? Any pointers are highly appreciated; I can't believe that there is such a blatant difference in network performance compared to Linux.

Regards,
Wolfgang

[SNIP]

Hi Wolfgang,

hwpmc is your friend here if you need to investigate where your processors are wasting their time. Either you will find them contending for the network stack (probably the pcb hash table), or they are fighting each other over the scheduler's lock(s), trying to steal jobs from working ones. Also check QPI link activity, which may reveal interesting facts about the geography of the PCI root complexes versus process locations and migration.

You have two options here: either you persist in using a 4x10 core machine, and you will spend a long time rearranging the stickiness of processes and interrupts to specific cores/packages (driver, then isr rings, then userland) and policing the whole thing (read: peacekeeping the riot), or you go for the much simpler solution, which is a 1 (yes, one) socket machine with the fastest available processor with a low core count (E5-1630v2/3 or 1650) that can handle 10G links hands down out of the box. Also note that there are specific and interesting optimizations in the L2 generation on -head that you may want to try if the problem is stack-centered.

You may also have a threading problem (userland threads). In the domain of counting instructions per packet (you can practice that with netmap as a wonderful means of really 'sensing' what 40Gbps is), threading is bad (and Hyperthreading is evil).

Thanks.
[Bug 203630] [Hyper-V] [nat] [tcp] 10.2 NAT bug in TCP stack or hyperv netsvc driver
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203630

--- Comment #28 from Franco Fichtner ---
Hello,

We've run into this too over at OPNsense. This is a harsh regression from 10.1 to 10.2. It needs an errata for 10.2.

Thank you,
Franco
[Bug 203630] [Hyper-V] [nat] [tcp] 10.2 NAT bug in TCP stack or hyperv netsvc driver
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203630

--- Comment #29 from Eddy ---
Hello everybody,

The issue was fixed with patch r291156. I tested it on a clean FreeBSD install by recompiling the kernel in a test environment, and it worked. It was merged to the STABLE 10 branch (Fri Dec 18 14:56:49 UTC 2015). I assumed that the latest builds include the fix; however, I'm running 10.2-RELEASE-p12 on my production server and the problem still occurs.
RE: ixgbe: Network performance tuning (#TCP connections)
Did you enable LRO on the FreeBSD side (check the 'ifconfig' output)? Linux enables GRO by default (see the output of 'ethtool -k eth0').

Thanks
Hongjiang Zhang

-----Original Message-----
From: owner-freebsd-...@freebsd.org [mailto:owner-freebsd-...@freebsd.org] On Behalf Of Meyer, Wolfgang
Sent: Wednesday, February 3, 2016 9:37 PM
To: 'freebsd-net@FreeBSD.org'
Cc: 'freebsd-performa...@freebsd.org'
Subject: ixgbe: Network performance tuning (#TCP connections)

[SNIP]

:
cc_htcp_load="YES"
hw.ix.txd="64"
hw.ix.rxd="64"
hw.ix.tx_process_limit="-1"
hw.ix.rx_process_limit="-1"
hw.ix.num_queues="8"
#hw.ix.enable_aim="0"
#hw.ix.max_interrupt_rate="31250"
#net.isr.maxthreads="16"

:
kern.ipc.soacceptqueue=1024
kern.ipc.maxsockbuf=16777216
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.tso=0
net.inet.tcp.mssdflt=1460
net.inet.tcp.minmss=1300
net.inet.tcp.nolocaltimewait=1
net.inet.tcp.syncache.rexmtlimit=0
#net.inet.tcp.syncookies=0
net.inet.tcp.drop_synfin=1
net.inet.tcp.fast_finwait2_recycle=1
net.inet.tcp.icmp_may_rst=0
net.inet.tcp.msl=5000
net.inet.tcp.path_mtu_discovery=0
net.inet.tcp.blackhole=1
net.inet.udp.bl
RE: ixgbe: Network performance tuning (#TCP connections)
Please check whether LRO is enabled on your FreeBSD server with "ifconfig". Linux default enables GRO (see the output of 'ethtool -k eth0'), which covers LRO optimization.

Thanks
Hongjiang Zhang

-----Original Message-----
From: owner-freebsd-...@freebsd.org [mailto:owner-freebsd-...@freebsd.org] On Behalf Of Meyer, Wolfgang
Sent: Wednesday, February 3, 2016 9:37 PM
To: 'freebsd-net@FreeBSD.org'
Cc: 'freebsd-performa...@freebsd.org'
Subject: ixgbe: Network performance tuning (#TCP connections)

[SNIP]
dev.netmap.buf_size and packett size from host
hello,

I have a netmap application which has a host mode bridge/fwd. With the default settings I quite often get the following error:

884.260394 [2950] netmap_transmit igb1 from_host, drop packet size 2962 > 2048

The only application which relies on host mode is bird, so those packets are probably from the bird daemon. When I get those errors, bird sessions fail and restart.

I raised dev.netmap.buf_size to 5000 and it was adjusted to 5120. Things got better, but I still get logs like:

netmap_transmit igb1 from_host, drop packet size 5858 > 5120

Now the main question is: when dev.netmap.buf_size is 2048 the application uses 1.3G of RAM, but when I raise it to 5120 it uses 3G of RAM.

So I need to understand: is this packet size really related to the packets my application receives from the host via netmap? If so, can I allow bigger sizes, like 16k (the lo0 MTU), without pre-allocating so much more RAM?

thank you

E. Meyer
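As a rough check of the numbers in this message, the sketch below assumes that the application's memory use is dominated by netmap packet buffers, that the number of buffers stays the same, and that memory therefore scales linearly with dev.netmap.buf_size. None of this is confirmed in the thread, and netmap_mem_estimate_gib() is a hypothetical helper, not part of netmap.

```c
/*
 * Estimate memory use after changing dev.netmap.buf_size, assuming the
 * buffer count is unchanged and memory scales linearly with buffer size.
 */
static double
netmap_mem_estimate_gib(double base_gib, unsigned base_buf_size,
    unsigned new_buf_size)
{
	return (base_gib * (double)new_buf_size / (double)base_buf_size);
}
```

Under these assumptions, going from 2048 to 5120 takes 1.3G to about 3.25G, which is in the same range as the reported 3G, and a 16384-byte buffer would need roughly 10.4G.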
Re: dev.netmap.buf_size and packett size from host
Make sure you disable TSO on the interface used in netmap mode, and then check that you use an MTU of 1500 on that interface. You should not receive frames larger than MTU coming from the host in these conditions.

cheers
luigi

On Thu, Feb 4, 2016 at 3:26 PM, Eduardo Meyer wrote:
[SNIP]

--
-+---
Prof. Luigi RIZZO, ri...@iet.unipi.it . Dip. di Ing. dell'Informazione
http://www.iet.unipi.it/~luigi/ . Universita` di Pisa
TEL +39-050-2217533 . via Diotisalvi 2
Mobile +39-338-6809875 . 56122 PISA (Italy)
-+---
Re: dev.netmap.buf_size and packett size from host
disable all hardware accelerations when using netmap.

cheers
luigi

On Thu, Feb 4, 2016 at 3:34 PM, Eduardo Meyer wrote:
> mtu is good, TSO was on, thank you will retest right now.
>
> which other port features should I disable? I only disabled txcsum and
> rxcsum before, now tso on the list, anything else in netmap mode?

[SNIP]
Re: dev.netmap.buf_size and packett size from host
The MTU is fine, but TSO was on. Thank you, I will retest right now.

Which other port features should I disable? I had only disabled txcsum and rxcsum before; now TSO is on the list. Anything else in netmap mode?

On Thu, Feb 4, 2016 at 12:29 PM, Luigi Rizzo wrote:
> Make sure you disable TSO on the interface used in netmap
> mode, and then check that you use an MTU of 1500 on that
> interface.
> You should not receive frames larger than MTU coming from
> the host in these conditions.
>
> cheers
> luigi

[SNIP]

--
===
Eduardo Meyer
pessoal: dudu.me...@gmail.com
profissional: ddm.farmac...@saude.gov.br
[Differential] [Updated] D5185: tcp/lro: Allow network drivers to set the limit for TCP ACK/data segment aggregation limit
gallatin added a comment.

It might be nice to make these general tunables that could be set centrally and apply to all drivers, but that's probably outside the scope of the review.

INLINE COMMENTS
  sys/netinet/tcp_lro.c:655 Can you just initialize ack_append_limit to the max value for whatever type it is and eliminate the check for a 0 ack_append_limit? That would eliminate one clause from this conditional.
  sys/netinet/tcp_lro.c:684 Rather than adding more clauses to this condition, how would you feel about setting an append limit in bytes, and replacing the hard-coded 65535 with this new limit? The default lro init would initialize the new limit to 65535, and hn(4) would initialize it in terms of multiples of its MTU.

REVISION DETAIL
  https://reviews.freebsd.org/D5185

EMAIL PREFERENCES
  https://reviews.freebsd.org/settings/panel/emailpreferences/

To: sepherosa_gmail.com, network, adrian, delphij, royger, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, hselasky, np, transport, gallatin
Cc: freebsd-virtualization-list, freebsd-net-list
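The two inline suggestions above could look roughly like the sketch below. It is one possible reading of the review comments, not code from the revision: struct lro_byte_limits, len_lim and all function names are hypothetical. Note that a byte limit is not exactly equivalent to an append count, since segments may be smaller than the MTU.

```c
#include <limits.h>
#include <stdbool.h>

#define LRO_LEN_MAX	65535u	/* the constant hard-coded in the current check */

/* Hypothetical replacement for the two append-count limits. */
struct lro_byte_limits {
	unsigned	len_lim;	/* flush threshold in bytes */
	unsigned short	ack_append_lim;	/* flush threshold in appended ACKs */
};

/* Default init: limits that behave like the code before the patch. */
static void
lro_limits_init(struct lro_byte_limits *lim)
{
	lim->len_lim = LRO_LEN_MAX;
	/* Max value of the type, so no separate "is it zero" test is needed. */
	lim->ack_append_lim = USHRT_MAX;
}

/* Driver hook: express the data limit as a multiple of the MTU. */
static void
lro_limits_set_data_segs(struct lro_byte_limits *lim, unsigned nsegs,
    unsigned mtu)
{
	unsigned long long bytes = (unsigned long long)nsegs * mtu;

	lim->len_lim = bytes < LRO_LEN_MAX ? (unsigned)bytes : LRO_LEN_MAX;
}

/* Single-clause data check: would one more full segment exceed the limit? */
static bool
lro_data_should_flush(const struct lro_byte_limits *lim, unsigned p_len,
    unsigned mtu)
{
	return (p_len + mtu > lim->len_lim);
}

/* Single-clause ACK check, relying on the max-value default. */
static bool
lro_ack_should_flush(const struct lro_byte_limits *lim, unsigned append_cnt)
{
	return (append_cnt >= lim->ack_append_lim);
}
```

With the defaults, the data check reduces to the existing p_len > 65535 - MTU test; a driver such as hn(4) would call the hypothetical lro_limits_set_data_segs() with its own multiple of the MTU.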
[Differential] [Commented On] D5185: tcp/lro: Allow network drivers to set the limit for TCP ACK/data segment aggregation limit
adrian added inline comments.

INLINE COMMENTS
  sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c:455 this should be a separate commit

REVISION DETAIL
  https://reviews.freebsd.org/D5185

EMAIL PREFERENCES
  https://reviews.freebsd.org/settings/panel/emailpreferences/

To: sepherosa_gmail.com, network, adrian, delphij, royger, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, hselasky, np, transport, gallatin
Cc: freebsd-virtualization-list, freebsd-net-list
Re: 10.2-RELEASE-p12 pf+GRE crashing
On 2/3/2016 6:47 PM, Matthew Grooms wrote:

This turned out to be another issue that was patched in head but not back ported to stable. I can't explain why it didn't get tripped when GRE tunnels were disabled. With the patch applied, I can reload my rule sets again without crashing ...

https://svnweb.freebsd.org/base?view=revision&revision=264689

I wanted to clarify in case another user runs into this issue and searches the mailing list history for a solution: The patch I applied to fix this particular kernel crash wasn't 264689, it was ...

https://svnweb.freebsd.org/base?view=revision&revision=264915

Sorry for the misinformation. I cut and pasted the wrong link.

-Matthew

(kgdb) bt
#0 doadump (textdump=<value optimized out>) at pcpu.h:219
#1 0x807c81f2 in kern_reboot (howto=260) at ../../../kern/kern_shutdown.c:451
#2 0x807c85d5 in vpanic (fmt=<value optimized out>, ap=<value optimized out>) at ../../../kern/kern_shutdown.c:758
#3 0x807c8463 in panic (fmt=0x0) at ../../../kern/kern_shutdown.c:687
#4 0x80bdc10b in trap_fatal (frame=<value optimized out>, eva=<value optimized out>) at ../../../amd64/amd64/trap.c:851
#5 0x80bdc40d in trap_pfault (frame=0xfe233a80, usermode=<value optimized out>) at ../../../amd64/amd64/trap.c:674
#6 0x80bdbaaa in trap (frame=0xfe233a80) at ../../../amd64/amd64/trap.c:440
#7 0x80bc1fa2 in calltrap () at ../../../amd64/amd64/exception.S:236
#8 0x809c07f4 in pfr_detach_table (kt=0x0) at ../../../netpfil/pf/pf_table.c:2047
#9 0x809a91f4 in pf_empty_pool (poola=0x813c3d68) at ../../../netpfil/pf/pf_ioctl.c:354
#10 0x809ab3e5 in pfioctl (dev=<value optimized out>, cmd=<value optimized out>, addr=0xf8005eaf6800 "", flags=<value optimized out>, td=<value optimized out>) at ../../../netpfil/pf/pf_ioctl.c:2189
#11 0x806b5659 in devfs_ioctl_f (fp=0xf8000a2927d0, com=3295691827, data=0xf8005eaf6800, cred=<value optimized out>, td=0xf8000a25f000) at ../../../fs/devfs/devfs_vnops.c:785
#12 0x8081b805 in kern_ioctl (td=0xf8000a25f000, fd=<value optimized out>, com=2) at file.h:320
#13 0x8081b500 in sys_ioctl (td=0xf8000a25f000, uap=0xfe234b40) at ../../../kern/sys_generic.c:718
#14 0x80bdca27 in amd64_syscall (td=0xf8000a25f000, traced=0) at subr_syscall.c:134
#15 0x80bc228b in Xfast_syscall () at ../../../amd64/amd64/exception.S:396
#16 0x000800dd9fda in ?? ()
Previous frame inner to this frame (corrupt stack?)
Current language: auto; currently minimal

-Matthew
___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
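Frame #8 of the backtrace shows pfr_detach_table() being entered with kt=0x0 from pf_empty_pool(), which is what leads to the page fault. As an illustration only, here is a minimal user-space sketch of that class of bug and one defensive pattern for it. This is not the change made in r264915; the types and function names (struct table, struct pool_addr, table_detach, pool_empty) are hypothetical stand-ins, not pf's real structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a pf table; not the kernel's struct pfr_ktable. */
struct table {
    int refcnt;              /* number of pool entries referencing the table */
};

/* Hypothetical pool entry that may or may not be backed by a table. */
struct pool_addr {
    struct table *tbl;       /* NULL when the entry does not use a table */
};

/* Drop one reference. The caller must pass a non-NULL table. */
static void table_detach(struct table *t)
{
    assert(t != NULL);
    assert(t->refcnt > 0);
    t->refcnt--;
}

/*
 * Release every entry in a pool and return how many tables were detached.
 * Entries without a table are skipped; calling table_detach() on them
 * would dereference NULL, which is the failure mode seen in frame #8.
 */
static int pool_empty(struct pool_addr *entries, size_t n)
{
    int detached = 0;

    for (size_t i = 0; i < n; i++) {
        if (entries[i].tbl == NULL)
            continue;
        table_detach(entries[i].tbl);
        entries[i].tbl = NULL;
        detached++;
    }
    return detached;
}
```

Whether the real fix guards the caller, the callee, or removes the stale reference earlier is not stated in the thread; the sketch only shows why a NULL table pointer reaching a detach routine is fatal.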
Re: dev.netmap.buf_size and packet size from host
When I disabled LRO, it broke communication from the port to the network (although traffic from the host was OK). Everything else looks good, and so far I have had no problems with big packets coming from the host, so -tso did it. Thank you!

On Thu, Feb 4, 2016 at 12:35 PM, Luigi Rizzo wrote:

> disable all hardware accelerations when using netmap.
>
> cheers
> luigi
>
> On Thu, Feb 4, 2016 at 3:34 PM, Eduardo Meyer wrote:
> > mtu is good, TSO was on, thank you will retest right now.
> >
> > which other port features should I disable? I only disabled txcsum and
> > rxcsum before, now tso on the list, anything else in netmap mode?
> >
> > On Thu, Feb 4, 2016 at 12:29 PM, Luigi Rizzo wrote:
> >>
> >> Make sure you disable TSO on the interface used in netmap
> >> mode, and then check that you use an MTU of 1500 on that
> >> interface.
> >> You should not receive frames larger than MTU coming from
> >> the host in these conditions.
> >>
> >> cheers
> >> luigi
> >>
> >> On Thu, Feb 4, 2016 at 3:26 PM, Eduardo Meyer wrote:
> >> > hello,
> >> >
> >> > I have a netmap application which has host mode bridge/fwd. With
> >> > default settings I get the following error quite often:
> >> >
> >> > 884.260394 [2950] netmap_transmit igb1 from_host, drop packet size 2962 > 2048
> >> >
> >> > The only application which relies on host mode is bird, so those
> >> > packets are probably from the bird daemon. When I get those errors,
> >> > bird sessions fail and restart.
> >> >
> >> > I raised dev.netmap.buf_size to 5000 and it adjusted to 5120. Things
> >> > got better, but I still get logs:
> >> >
> >> > netmap_transmit igb1 from_host, drop packet size 5858 > 5120
> >> >
> >> > Now the main question is: when dev.netmap.buf_size is 2048 the
> >> > application uses 1.3G of RAM, but when I raise it to 5120 it uses
> >> > 3G of RAM.
> >> >
> >> > So I need to understand: is this packet size really related to the
> >> > packets coming from the host to netmap? If so, can I allow for
> >> > bigger sizes, like 16k (lo0 mtu), without pre-allocating so much
> >> > more RAM?
> >> >
> >> > thank you
> >> >
> >> > E. Meyer
> >>
> >> --
> >> Prof. Luigi RIZZO, ri...@iet.unipi.it . Dip. di Ing. dell'Informazione
> >> http://www.iet.unipi.it/~luigi/. Universita` di Pisa
> >> TEL +39-050-2217533 . via Diotisalvi 2
> >> Mobile +39-338-6809875 . 56122 PISA (Italy)

--
Eduardo Meyer
personal: dudu.me...@gmail.com
professional: ddm.farmac...@saude.gov.br
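The numbers reported in this thread (1.3G of RAM at a buf_size of 2048, about 3G at 5120) are roughly what you would expect if the buffer pool is preallocated with a fixed number of buffers, so memory grows linearly with the buffer size. The following is a minimal sketch of that arithmetic, assuming linear scaling; the helper names (scaled_mib, would_drop) are made up for illustration and are not part of netmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Estimate memory use at a new buffer size by scaling a measured figure
 * linearly. Assumes the number of preallocated buffers stays the same,
 * which is an assumption and not something the thread states.
 */
static uint64_t scaled_mib(uint64_t measured_mib, uint64_t old_size,
                           uint64_t new_size)
{
    assert(old_size > 0);
    return measured_mib * new_size / old_size;
}

/*
 * Model of the check behind the "drop packet size X > Y" log line:
 * a packet coming from the host is dropped if it does not fit in one buffer.
 */
static bool would_drop(uint32_t pkt_len, uint32_t buf_size)
{
    return pkt_len > buf_size;
}
```

Under that assumption, scaled_mib(1300, 2048, 5120) gives 3250 MiB, close to the observed 3G, and a 16k buffer would need about 8 times the original figure. This is why disabling TSO, so that the host never hands netmap frames larger than the MTU, is cheaper than raising the buffer size.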
Fwd: swapping ring slots between NIC ring and Host ring does not always succeed
Hi Luigi,

Thanks for your explanation.

I used three machines for this experiment. They are directly connected:

[(machine1) eth1]---[eth2 (machine2) eth3]---[eth4 (machine3)]

First, I ran bridge.c on machine2 with the command *bridge -i netmap:eth2 -i netmap:eth3*. (Neither the sender, the receiver, nor XYZ was running on machine1 or machine3.)

As I understand it, in this setup machine2 is transparent to machine1 and machine3, since it forwards packets from eth2 to eth3 and vice versa without modifying them.

I tried to ping machine3 from machine1 with a command like *ping 10.11.10.3*. However, it still does not succeed. Before machine1 sends the ping to machine3, it first sends an ARP request to get the MAC address of machine3. machine3 gets that ARP request and sends the reply back (I used tcpdump to verify that machine3 gets the ARP request and sends out the ARP reply). However, machine1 does not get the ARP reply.

I found that the bridge can only forward packets in one direction at a time: it gets the ARP request but doesn't see the ARP reply (*pkt_queued* always returns 0 for one NIC...).

This behavior looks very weird to me. Do you think there is a compatibility issue between netmap and the OS I am using? Is there a verified Linux distribution (and version) that works well with netmap?

The OS I use is 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt11-1 (2015-05-24) x86_64 GNU/Linux.
Linux kernel version is *3.16.0-4-amd64*

Thanks!
Xiaoye

On Wed, Feb 3, 2016 at 2:12 AM, Luigi Rizzo wrote:

> On Tue, Feb 2, 2016 at 10:48 PM, Xiaoye Sun wrote:
> > On Mon, Feb 1, 2016 at 11:34 PM, Luigi Rizzo wrote:
> >> On Tue, Feb 2, 2016 at 6:23 AM, Xiaoye Sun wrote:
> >> > Hi Luigi,
> >> >
> >> > I have to clarify the *jumping issue* with the slot indexes.
> >> > In the bridge.c program, the slot index never jumps and it increases
> >> > sequentially.
> >> > In the receiver.c program, the udp packet seq jumps and I showed the
> >> > slot index that each udp packet uses. So the slot index jumps
> >> > together with the udp seq (at the receiver program only).
> >>
> >> So let me understand, is the "slot" some information written
> >> in the packet by bridge.c (referring to the rx or tx slot,
> >> I am not sure) and then read and printed by receiver.c
> >> (which gets the packet through recvfrom so there isn't
> >> really any slot index) ?
> >>
> > It works the other way:
> > The bridge.c checks the seq numbers of the udp packets in netmap slots
> > (in nic rx ring) before the swap; then it records the seq number, slot
> > number (both rx and tx (tx indexes were not shown in the previous email
> > since they all look correct)) and buf_idx (rx and tx). The bridge.c
> > does not change anything in the buffer and it knows the slot and
> > buf_idx that a packet uses. Please refer to the added code in the
> > *process_rings* function
> > http://www.owlnet.rice.edu/~xs6/bridge.c
> > The receiver.c checks the seq numbers only and prints out the seq
> > numbers it receives sequentially.
> > With this information, I manually match the seq number I got from
> > receiver.c and the seq number I got from bridge.c. So we know what is
> > the seq order the receiver sees and which slot a packet uses when
> > bridge.c swaps the buf_idxs.
> >
> >> Do you see any ordering inversion when the receiver
> >> gets packets through the NETMAP API (e.g. using bridge.c
> >> instead of receiver.c) ?
> >>
> > There is no ordering inversion seen by bridge.c (As I said in the
> > previous paragraph, the bridge.c checks the seq number and I did not
> > see any order inversion in THIS simple experiment (In my multicast
> > protocol (mentioned in the first email), there is ordering inversion.
> > But let us solve the simple bridge.c's problem first. I think they are
> > two relatively independent issues.)).
>
> Sorry there was a misunderstanding.
> I wanted you to check the following setup:
>
> [1: send.c] ->- [2: bridge.c] ->- [3: XYZ]
>
> where in XYZ you replace your receiver.c with some
> netmap-based receiver (it could be pkt-gen in rx mode,
> or possibly even another instance of bridge.c where
> you connect the output port to a vale switch so
> traffic is dropped), and then in XYZ print the content
> of the packets.
>
> From your previous report we know that node 2: sees packets
> in order, and node 3: sees packets out of order.
> However, if the problem were due to bridge.c sending
> the old buffer and not the new one, you'd see not only
> reordering but also replication of packets.
>
> The fact that you see only the reordering in 3: makes
> me think that the problem is in that node, and it could
> be the network stack in 3: that does something strange.
> So if you can run something netmap based in 3: and make
> sure there is only one queue to read from, we could
> at least figure out what is going on.
>
> cheers
> luigi
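Luigi's argument rests on telling two symptoms apart in the receiver's sequence log: pure reordering (each seq number arrives once, but late) versus replication (a stale buffer is sent again, so a seq number shows up twice). A minimal sketch of such a check is below. It is a hypothetical helper, not code from bridge.c or receiver.c, and it assumes sequence numbers are small enough to index a caller-provided "seen" array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct seq_report {
    size_t inversions;   /* first-time packets arriving after a higher seq */
    size_t duplicates;   /* packets whose seq number was already seen */
};

/*
 * Classify a log of received sequence numbers.
 * `seen` must point to max_seq + 1 entries, all initialised to false.
 */
static struct seq_report classify(const uint32_t *seq, size_t n,
                                  bool *seen, uint32_t max_seq)
{
    struct seq_report r = { 0, 0 };
    uint32_t highest = 0;
    bool any = false;

    for (size_t i = 0; i < n; i++) {
        assert(seq[i] <= max_seq);
        if (seen[seq[i]])
            r.duplicates++;          /* replication: same packet twice */
        else if (any && seq[i] < highest)
            r.inversions++;          /* reordering: late first arrival */
        seen[seq[i]] = true;
        if (!any || seq[i] > highest)
            highest = seq[i];
        any = true;
    }
    return r;
}
```

A log such as 1, 2, 4, 3, 5 yields one inversion and no duplicates, which matches the "reordering only" pattern that points away from a stale-buffer bug in the bridge; a log such as 1, 2, 2, 3 yields a duplicate instead.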
[Bug 206932] Realtek 8111 card stops responding under high load in netmap mode
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=206932

Mark Linimon changed:

What     | Removed                  | Added
Assignee | freebsd-b...@freebsd.org | freebsd-net@FreeBSD.org

--
You are receiving this mail because:
You are the assignee for the bug.
[Bug 206904] tailq crash/nd inet6
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=206904

Mark Linimon changed:

What     | Removed                  | Added
Assignee | freebsd-b...@freebsd.org | freebsd-net@FreeBSD.org
Re: swapping ring slots between NIC ring and Host ring does not always succeed
Are both interfaces up? For example, ifconfig... up

I had the same problem and solved it with the command above.
Re: swapping ring slots between NIC ring and Host ring does not always succeed
Yes, all the interfaces are up. Are you able to get ARP requests when the interfaces are down?
Re: swapping ring slots between NIC ring and Host ring does not always succeed
I'm sorry, I made a mistake. To work around this, try `ip link set $IFACE promisc on`
[Bug 206904] tailq crash/nd inet6
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=206904

--- Comment #2 from Larry Rosenman ---
Created attachment 166582
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=166582&action=edit
another one
[Bug 206904] tailq crash/nd inet6
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=206904

--- Comment #3 from Larry Rosenman ---
Created attachment 166583
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=166583&action=edit
and a 3rd
[Bug 206904] tailq crash/nd inet6
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=206904

--- Comment #4 from Larry Rosenman ---
The vmcores are ALL available, and I can give a @FreeBSD.org developer access.
[Differential] [Commented On] D5185: tcp/lro: Allow network drivers to set the limit for TCP ACK/data segment aggregation limit
sepherosa_gmail.com added inline comments.

INLINE COMMENTS
  sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c:455 OK, I will split it out.

REVISION DETAIL
  https://reviews.freebsd.org/D5185

EMAIL PREFERENCES
  https://reviews.freebsd.org/settings/panel/emailpreferences/

To: sepherosa_gmail.com, network, adrian, delphij, royger, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, hselasky, np, transport, gallatin
Cc: freebsd-virtualization-list, freebsd-net-list
[Differential] [Commented On] D5185: tcp/lro: Allow network drivers to set the limit for TCP ACK/data segment aggregation limit
sepherosa_gmail.com added a comment.

  I will adjust the patch accordingly.

INLINE COMMENTS
  sys/netinet/tcp_lro.c:655 Sure :)
  sys/netinet/tcp_lro.c:684 Sounds fine to me. I did the byte limit before (https://reviews.freebsd.org/D4825). But it turns out the ACKs need a separate limit (append count based). To make them consistent, I changed the original patch to use append count too.

REVISION DETAIL
  https://reviews.freebsd.org/D5185

EMAIL PREFERENCES
  https://reviews.freebsd.org/settings/panel/emailpreferences/

To: sepherosa_gmail.com, network, adrian, delphij, royger, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, hselasky, np, transport, gallatin
Cc: freebsd-virtualization-list, freebsd-net-list
[Bug 206932] Realtek 8111 card stops responding under high load in netmap mode
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=206932

Olivier - interfaSys sàrl changed:

What    | Removed     | Added
Version | 10.2-STABLE | 11.0-CURRENT

--- Comment #1 from Olivier - interfaSys sàrl ---
I've just tested on 11-CURRENT and got the same results.
[Differential] [Updated, 114 lines] D5185: tcp/lro: Allow network drivers to set the limit for TCP ACK/data segment aggregation limit
sepherosa_gmail.com updated the summary for this revision. sepherosa_gmail.com updated this revision to Diff 13028. sepherosa_gmail.com added a comment. Address gallatin and adrian's concern. CHANGES SINCE LAST UPDATE https://reviews.freebsd.org/D5185?vs=12995&id=13028 REVISION DETAIL https://reviews.freebsd.org/D5185 AFFECTED FILES sys/dev/hyperv/netvsc/hv_net_vsc.h sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c sys/netinet/tcp_lro.c sys/netinet/tcp_lro.h EMAIL PREFERENCES https://reviews.freebsd.org/settings/panel/emailpreferences/ To: sepherosa_gmail.com, network, adrian, delphij, royger, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, hselasky, np, gallatin, transport Cc: freebsd-virtualization-list, freebsd-net-list diff --git a/sys/netinet/tcp_lro.h b/sys/netinet/tcp_lro.h --- a/sys/netinet/tcp_lro.h +++ b/sys/netinet/tcp_lro.h @@ -91,11 +91,16 @@ unsigned lro_cnt; unsigned lro_mbuf_count; unsigned lro_mbuf_max; + unsigned short lro_ackcnt_lim; /* max # of aggregated ACKs */ + unsigned short lro_length_lim; /* max len of aggregated data */ struct lro_head lro_active; struct lro_head lro_free; }; +#define TCP_LRO_LENGTH_MAX 65535 +#define TCP_LRO_ACKCNT_MAX 65535 /* unlimited */ + int tcp_lro_init(struct lro_ctrl *); int tcp_lro_init_args(struct lro_ctrl *, struct ifnet *, unsigned, unsigned); void tcp_lro_free(struct lro_ctrl *); diff --git a/sys/netinet/tcp_lro.c b/sys/netinet/tcp_lro.c --- a/sys/netinet/tcp_lro.c +++ b/sys/netinet/tcp_lro.c @@ -87,6 +87,8 @@ lc->lro_mbuf_count = 0; lc->lro_mbuf_max = lro_mbufs; lc->lro_cnt = lro_entries; + lc->lro_ackcnt_lim = TCP_LRO_ACKCNT_MAX; + lc->lro_length_lim = TCP_LRO_LENGTH_MAX; lc->ifp = ifp; SLIST_INIT(&lc->lro_free); SLIST_INIT(&lc->lro_active); @@ -608,7 +610,7 @@ } /* Flush now if appending will result in overflow. 
*/ - if (le->p_len > (65535 - tcp_data_len)) { + if (le->p_len > (lc->lro_length_lim - tcp_data_len)) { SLIST_REMOVE(&lc->lro_active, le, lro_entry, next); tcp_lro_flush(lc, le); break; @@ -646,6 +648,15 @@ if (tcp_data_len == 0) { m_freem(m); + /* + * Flush this LRO entry, if this ACK should not + * be further delayed. + */ + if (le->append_cnt >= lc->lro_ackcnt_lim) { +SLIST_REMOVE(&lc->lro_active, le, lro_entry, +next); +tcp_lro_flush(lc, le); + } return (0); } @@ -666,7 +677,7 @@ * If a possible next full length packet would cause an * overflow, pro-actively flush now. */ - if (le->p_len > (65535 - lc->ifp->if_mtu)) { + if (le->p_len > (lc->lro_length_lim - lc->ifp->if_mtu)) { SLIST_REMOVE(&lc->lro_active, le, lro_entry, next); tcp_lro_flush(lc, le); } else diff --git a/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c b/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c --- a/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c +++ b/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c @@ -176,14 +176,11 @@ #define HN_CSUM_ASSIST_WIN8 (CSUM_TCP) #define HN_CSUM_ASSIST (CSUM_IP | CSUM_UDP | CSUM_TCP) -/* XXX move to netinet/tcp_lro.h */ -#define HN_LRO_HIWAT_MAX65535 -#define HN_LRO_HIWAT_DEFHN_LRO_HIWAT_MAX +#define HN_LRO_LENLIM_DEF (25 * ETHERMTU) /* YYY 2*MTU is a bit rough, but should be good enough. 
*/ -#define HN_LRO_HIWAT_MTULIM(ifp) (2 * (ifp)->if_mtu) -#define HN_LRO_HIWAT_ISVALID(sc, hiwat) \ -((hiwat) >= HN_LRO_HIWAT_MTULIM((sc)->hn_ifp) || \ - (hiwat) <= HN_LRO_HIWAT_MAX) +#define HN_LRO_LENLIM_MIN(ifp) (2 * (ifp)->if_mtu) + +#define HN_LRO_ACKCNT_DEF 1 /* * Be aware that this sleepable mutex will exhibit WITNESS errors when @@ -253,9 +250,8 @@ static void hn_start_txeof(struct ifnet *ifp); static int hn_ifmedia_upd(struct ifnet *ifp); static void hn_ifmedia_sts(struct ifnet *ifp, struct ifmediareq *ifmr); -#ifdef HN_LRO_HIWAT -static int hn_lro_hiwat_sysctl(SYSCTL_HANDLER_ARGS); -#endif +static int hn_lro_lenlim_sysctl(SYSCTL_HANDLER_ARGS); +static int hn_lro_ackcnt_sysctl(SYSCTL_HANDLER_ARGS); static int hn_trust_hcsum_sysctl(SYSCTL_HANDLER_ARGS); static int hn_tx_chimney_size_sysctl(SYSCTL_HANDLER_ARGS); static int hn_check_iplen(const struct mbuf *, int); @@ -265,15 +261,6 @@ static void hn_txeof_taskfunc(void *xsc, int pending); static int hn_encap(struct hn_softc *, struct hn_txdesc *, struct mbuf **); -static __inline void -hn_set_lro_hiwat(struct hn_softc *sc, int hiwat) -{ - sc->hn_lro_hiwat = hiwat; -#ifdef HN_LRO_HIWAT - sc->hn_lro.lro_hiwat = sc->hn_lro_hiwat; -#endif -} - static int hn_ifmedia_upd(struct ifnet *ifp __unused) { @@ -358,7 +345,6 @@ bzero(sc, sizeof(hn_softc_t)); sc->hn_unit = unit; sc->hn_dev = dev; - sc->hn_lro_hiwat = HN_LRO_HIWAT_DEF; sc->hn_direct_tx_size = hn_direct_tx_size; if (hn_trust_hosttcp) sc->hn_trust_hcsum |= HN_TRUST_HCSUM_TCP; @@ -442,9 +428,8 @@ /* Driver private LRO settings */ sc->hn_lro.ifp = ifp; #endif -#ifdef HN_LRO_HIWAT - sc->hn_lro.lro_hiwat = sc->hn_lro_hiwat; -
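The flush conditions that this revision parameterizes can be modelled outside the kernel. What follows is only a userland sketch, not the code in tcp_lro.c: `struct ex_lro_limits` and the `ex_*` helpers are hypothetical names, while the two field names, the `TCP_LRO_*_MAX` constants and the comparisons are taken from the diff above. The test values assume ETHERMTU is 1500, so `25 * 1500` mirrors HN_LRO_LENLIM_DEF.

```c
#include <assert.h>
#include <stdbool.h>

#define TCP_LRO_LENGTH_MAX	65535
#define TCP_LRO_ACKCNT_MAX	65535	/* unlimited */

/* Hypothetical stand-in for the two fields added to struct lro_ctrl. */
struct ex_lro_limits {
	unsigned short lro_ackcnt_lim;	/* max # of aggregated ACKs */
	unsigned short lro_length_lim;	/* max len of aggregated data */
};

/* Defaults as in tcp_lro_init_args(): same behaviour as before the change. */
static void
ex_lro_limits_init(struct ex_lro_limits *lim)
{
	lim->lro_ackcnt_lim = TCP_LRO_ACKCNT_MAX;
	lim->lro_length_lim = TCP_LRO_LENGTH_MAX;
}

/* Flush first if appending tcp_data_len bytes would exceed the limit. */
static bool
ex_flush_before_append(const struct ex_lro_limits *lim, unsigned p_len,
    unsigned tcp_data_len)
{
	assert(tcp_data_len <= lim->lro_length_lim);
	return (p_len > lim->lro_length_lim - tcp_data_len);
}

/* Flush pro-actively if one more full-MTU segment would not fit. */
static bool
ex_flush_after_append(const struct ex_lro_limits *lim, unsigned p_len,
    unsigned mtu)
{
	assert(mtu <= lim->lro_length_lim);
	return (p_len > lim->lro_length_lim - mtu);
}

/* Flush a pure-ACK entry once enough ACKs have been aggregated. */
static bool
ex_flush_after_ack(const struct ex_lro_limits *lim, unsigned append_cnt)
{
	return (append_cnt >= lim->lro_ackcnt_lim);
}
```

With the defaults nothing changes for existing drivers; a driver such as hn(4) would lower both fields after initialization, which is why the asserts guard against a length limit smaller than one segment.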
[Differential] [Closed] D5085: hyperv/hn: Avoid duplicate csum features settings
This revision was automatically updated to reflect the committed changes. Closed by commit rS295296: hyperv/hn: Avoid duplicate csum features settings (authored by sephe). CHANGED PRIOR TO COMMIT https://reviews.freebsd.org/D5085?vs=12744&id=13030#toc REPOSITORY rS FreeBSD src repository CHANGES SINCE LAST UPDATE https://reviews.freebsd.org/D5085?vs=12744&id=13030 REVISION DETAIL https://reviews.freebsd.org/D5085 AFFECTED FILES head/sys/dev/hyperv/netvsc/hv_net_vsc.h head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c EMAIL PREFERENCES https://reviews.freebsd.org/settings/panel/emailpreferences/ To: sepherosa_gmail.com, delphij, royger, decui_microsoft.com, howard0su_gmail.com, honzhan_microsoft.com, adrian, network Cc: freebsd-net-list diff --git a/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c b/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c --- a/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c +++ b/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c @@ -176,6 +176,14 @@ CSUM_IP_ISCSI|CSUM_IP6_UDP|CSUM_IP6_TCP|CSUM_IP6_SCTP| \ CSUM_IP6_TSO|CSUM_IP6_ISCSI) +/* + * Only enable UDP checksum offloading when it is on 2012R2 or + * later. UDP checksum offloading doesn't work on earlier + * Windows releases. + */ +#define HN_CSUM_ASSIST_WIN8 (CSUM_TCP) +#define HN_CSUM_ASSIST (CSUM_UDP | CSUM_TCP) + /* XXX move to netinet/tcp_lro.h */ #define HN_LRO_HIWAT_MAX65535 #define HN_LRO_HIWAT_DEFHN_LRO_HIWAT_MAX @@ -444,15 +452,12 @@ ifp->if_capenable |= IFCAP_VLAN_HWTAGGING | IFCAP_VLAN_MTU | IFCAP_HWCSUM | IFCAP_TSO | IFCAP_LRO; - /* - * Only enable UDP checksum offloading when it is on 2012R2 or - * later. UDP checksum offloading doesn't work on earlier - * Windows releases. 
- */ + if (hv_vmbus_protocal_version >= HV_VMBUS_VERSION_WIN8_1) - ifp->if_hwassist = CSUM_TCP | CSUM_UDP | CSUM_TSO; + sc->hn_csum_assist = HN_CSUM_ASSIST; else - ifp->if_hwassist = CSUM_TCP | CSUM_TSO; + sc->hn_csum_assist = HN_CSUM_ASSIST_WIN8; + ifp->if_hwassist = sc->hn_csum_assist | CSUM_TSO; error = hv_rf_on_device_add(device_ctx, &device_info); if (error) @@ -1506,47 +1511,40 @@ error = 0; break; case SIOCSIFCAP: + NV_LOCK(sc); + mask = ifr->ifr_reqcap ^ ifp->if_capenable; if (mask & IFCAP_TXCSUM) { - if (IFCAP_TXCSUM & ifp->if_capenable) { -ifp->if_capenable &= ~IFCAP_TXCSUM; -ifp->if_hwassist &= ~(CSUM_TCP | CSUM_UDP); - } else { -ifp->if_capenable |= IFCAP_TXCSUM; -/* - * Only enable UDP checksum offloading on - * Windows Server 2012R2 or later releases. - */ -if (hv_vmbus_protocal_version >= -HV_VMBUS_VERSION_WIN8_1) { - ifp->if_hwassist |= - (CSUM_TCP | CSUM_UDP); -} else { - ifp->if_hwassist |= CSUM_TCP; -} - } + ifp->if_capenable ^= IFCAP_TXCSUM; + if (ifp->if_capenable & IFCAP_TXCSUM) +ifp->if_hwassist |= sc->hn_csum_assist; + else +ifp->if_hwassist &= ~sc->hn_csum_assist; } - if (mask & IFCAP_RXCSUM) { - if (IFCAP_RXCSUM & ifp->if_capenable) { -ifp->if_capenable &= ~IFCAP_RXCSUM; - } else { -ifp->if_capenable |= IFCAP_RXCSUM; - } - } + if (mask & IFCAP_RXCSUM) + ifp->if_capenable ^= IFCAP_RXCSUM; + if (mask & IFCAP_LRO) ifp->if_capenable ^= IFCAP_LRO; if (mask & IFCAP_TSO4) { ifp->if_capenable ^= IFCAP_TSO4; - ifp->if_hwassist ^= CSUM_IP_TSO; + if (ifp->if_capenable & IFCAP_TSO4) +ifp->if_hwassist |= CSUM_IP_TSO; + else +ifp->if_hwassist &= ~CSUM_IP_TSO; } if (mask & IFCAP_TSO6) { ifp->if_capenable ^= IFCAP_TSO6; - ifp->if_hwassist ^= CSUM_IP6_TSO; + if (ifp->if_capenable & IFCAP_TSO6) +ifp->if_hwassist |= CSUM_IP6_TSO; + else +ifp->if_hwassist &= ~CSUM_IP6_TSO; } + NV_UNLOCK(sc); error = 0; break; case SIOCADDMULTI: diff --git a/head/sys/dev/hyperv/netvsc/hv_net_vsc.h b/head/sys/dev/hyperv/netvsc/hv_net_vsc.h --- 
a/head/sys/dev/hyperv/netvsc/hv_net_vsc.h +++ b/head/sys/dev/hyperv/netvsc/hv_net_vsc.h @@ -1043,6 +1043,8 @@ u_long hn_txdma_failed; u_long hn_tx_collapsed; u_long hn_tx_chimney; + + uint64_t hn_csum_assist; } hn_softc_t; ___ freebsd-net@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
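The SIOCSIFCAP rework in D5085 follows a "toggle the capability, then set or clear the assist bits to match" pattern. A minimal sketch of that pattern, assuming made-up flag values and a hypothetical `struct ex_ifnet` (the real definitions live in the kernel headers and are not reproduced here):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define EX_IFCAP_TXCSUM	0x1
#define EX_CSUM_TCP	0x1
#define EX_CSUM_UDP	0x2
#define EX_CSUM_TSO	0x4

/* Hypothetical, reduced stand-in for struct ifnet. */
struct ex_ifnet {
	int		if_capenable;
	uint64_t	if_hwassist;
};

/*
 * Apply a TXCSUM capability request.  csum_assist plays the role of
 * sc->hn_csum_assist: the checksum bits chosen once at attach time.
 */
static void
ex_set_txcsum(struct ex_ifnet *ifp, int reqcap, uint64_t csum_assist)
{
	int mask;

	assert(ifp != NULL);
	mask = reqcap ^ ifp->if_capenable;
	if (mask & EX_IFCAP_TXCSUM) {
		ifp->if_capenable ^= EX_IFCAP_TXCSUM;
		/* Derive if_hwassist from the new state; never XOR it. */
		if (ifp->if_capenable & EX_IFCAP_TXCSUM)
			ifp->if_hwassist |= csum_assist;
		else
			ifp->if_hwassist &= ~csum_assist;
	}
}
```

Deriving `if_hwassist` from the resulting capability state, rather than toggling it, means the two fields converge even if they were out of sync beforehand. That appears to be the motivation for the TSO4/TSO6 hunks as well.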
[Differential] [Closed] D5098: hyperv/hn: Reorganize TX csum offloading
This revision was automatically updated to reflect the committed changes. Closed by commit rS295297: hyperv/hn: Reorganize TX csum offloading (authored by sephe). CHANGED PRIOR TO COMMIT https://reviews.freebsd.org/D5098?vs=12774&id=13031#toc REPOSITORY rS FreeBSD src repository CHANGES SINCE LAST UPDATE https://reviews.freebsd.org/D5098?vs=12774&id=13031 REVISION DETAIL https://reviews.freebsd.org/D5098 AFFECTED FILES head/sys/dev/hyperv/netvsc/hv_net_vsc.h head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c EMAIL PREFERENCES https://reviews.freebsd.org/settings/panel/emailpreferences/ To: sepherosa_gmail.com, delphij, royger, decui_microsoft.com, howard0su_gmail.com, honzhan_microsoft.com, adrian, network Cc: freebsd-net-list diff --git a/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c b/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c --- a/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c +++ b/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c @@ -167,16 +167,6 @@ #define HN_TXD_FLAG_DMAMAP 0x2 /* - * A unified flag for all outbound check sum flags is useful, - * and it helps avoiding unnecessary check sum calculation in - * network forwarding scenario. - */ -#define HV_CSUM_FOR_OUTBOUND \ -(CSUM_IP|CSUM_IP_UDP|CSUM_IP_TCP|CSUM_IP_SCTP|CSUM_IP_TSO| \ -CSUM_IP_ISCSI|CSUM_IP6_UDP|CSUM_IP6_TCP|CSUM_IP6_SCTP| \ -CSUM_IP6_TSO|CSUM_IP6_ISCSI) - -/* * Only enable UDP checksum offloading when it is on 2012R2 or * later. UDP checksum offloading doesn't work on earlier * Windows releases. 
@@ -265,62 +255,6 @@ #endif } -/* - * NetVsc get message transport protocol type - */ -static uint32_t get_transport_proto_type(struct mbuf *m_head) -{ - uint32_t ret_val = TRANSPORT_TYPE_NOT_IP; - uint16_t ether_type = 0; - int ether_len = 0; - struct ether_vlan_header *eh; -#ifdef INET - struct ip *iph; -#endif -#ifdef INET6 - struct ip6_hdr *ip6; -#endif - - eh = mtod(m_head, struct ether_vlan_header*); - if (eh->evl_encap_proto == htons(ETHERTYPE_VLAN)) { - ether_len = ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN; - ether_type = eh->evl_proto; - } else { - ether_len = ETHER_HDR_LEN; - ether_type = eh->evl_encap_proto; - } - - switch (ntohs(ether_type)) { -#ifdef INET6 - case ETHERTYPE_IPV6: - ip6 = (struct ip6_hdr *)(m_head->m_data + ether_len); - - if (IPPROTO_TCP == ip6->ip6_nxt) { - ret_val = TRANSPORT_TYPE_IPV6_TCP; - } else if (IPPROTO_UDP == ip6->ip6_nxt) { - ret_val = TRANSPORT_TYPE_IPV6_UDP; - } - break; -#endif -#ifdef INET - case ETHERTYPE_IP: - iph = (struct ip *)(m_head->m_data + ether_len); - - if (IPPROTO_TCP == iph->ip_p) { - ret_val = TRANSPORT_TYPE_IPV4_TCP; - } else if (IPPROTO_UDP == iph->ip_p) { - ret_val = TRANSPORT_TYPE_IPV4_UDP; - } - break; -#endif - default: - ret_val = TRANSPORT_TYPE_NOT_IP; - break; - } - - return (ret_val); -} - static int hn_ifmedia_upd(struct ifnet *ifp __unused) { @@ -783,16 +717,13 @@ hn_softc_t *sc = ifp->if_softc; struct hv_device *device_ctx = vmbus_get_devctx(sc->hn_dev); netvsc_dev *net_dev = sc->net_dev; - struct ether_vlan_header *eh; rndis_msg *rndis_mesg; rndis_packet *rndis_pkt; rndis_per_packet_info *rppi; ndis_8021q_info *rppi_vlan_info; rndis_tcp_ip_csum_info *csum_info; rndis_tcp_tso_info *tso_info; - int ether_len; uint32_t rndis_msg_size = 0; - uint32_t trans_proto_type; if ((ifp->if_drv_flags & (IFF_DRV_RUNNING | IFF_DRV_OACTIVE)) != IFF_DRV_RUNNING) @@ -872,101 +803,78 @@ m_head->m_pkthdr.ether_vtag & 0xfff; } - /* Only check the flags for outbound and ignore the ones for inbound */ - if (0 == 
(m_head->m_pkthdr.csum_flags & HV_CSUM_FOR_OUTBOUND)) { - goto pre_send; - } - - eh = mtod(m_head, struct ether_vlan_header*); - if (eh->evl_encap_proto == htons(ETHERTYPE_VLAN)) { - ether_len = ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN; - } else { - ether_len = ETHER_HDR_LEN; - } - - trans_proto_type = get_transport_proto_type(m_head); - if (TRANSPORT_TYPE_NOT_IP == trans_proto_type) { - goto pre_send; - } - - /* - * TSO packet needless to setup the send side checksum - * offload. - */ if (m_head->m_pkthdr.csum_flags & CSUM_TSO) { - goto do_tso; - } + struct ether_vlan_header *eh; + int ether_len; - /* setup checksum offload */ - rndis_msg_size += RNDIS_CSUM_PPI_SIZE; - rppi = hv_set_rppi_data(rndis_mesg, RNDIS_CSUM_PPI_SIZE, - tcpip_chksum_info); - csum_info = (rndis_tcp_ip_csum_info *)((char*)rppi + - rppi->per_packet_info_offset); - - if (trans_proto_type & (TYPE_IPV4 << 16)) { - csum_info->xmit.is_ipv4 = 1; - } else { - csum_info->xmit.is_ipv6 = 1; - } + eh = mtod(m_head, struct ether_vlan_header*); + if (eh->evl_encap_proto == htons(ETHERTYPE_VLAN)) { +ether_len = ETHER_HDR_LEN + +ETHER_VLAN_ENCAP_LEN; + } else { +ether_len = ETHER_HDR_LEN; + } - if (trans_proto_type & TYPE_TCP) { - csum_info->xmit.tcp_csum = 1; - csum_info->xmit.tcp_header_offset = 0; -
[Differential] [Closed] D5099: hyperv/hn: Enable IP header checksum offloading
This revision was automatically updated to reflect the committed changes. Closed by commit rS295298: hyperv/hn: Enable IP header checksum offloading (authored by sephe). CHANGED PRIOR TO COMMIT https://reviews.freebsd.org/D5099?vs=12775&id=13032#toc REPOSITORY rS FreeBSD src repository CHANGES SINCE LAST UPDATE https://reviews.freebsd.org/D5099?vs=12775&id=13032 REVISION DETAIL https://reviews.freebsd.org/D5099 AFFECTED FILES head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c CHANGE DETAILS diff --git a/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c b/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c --- a/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c +++ b/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c @@ -172,7 +172,7 @@ * Windows releases. */ #define HN_CSUM_ASSIST_WIN8 (CSUM_TCP) -#define HN_CSUM_ASSIST (CSUM_UDP | CSUM_TCP) +#define HN_CSUM_ASSIST (CSUM_IP | CSUM_UDP | CSUM_TCP) /* XXX move to netinet/tcp_lro.h */ #define HN_LRO_HIWAT_MAX 65535 @@ -867,6 +867,9 @@ rppi->per_packet_info_offset); csum_info->xmit.is_ipv4 = 1; + if (m_head->m_pkthdr.csum_flags & CSUM_IP) + csum_info->xmit.ip_header_csum = 1; + if (m_head->m_pkthdr.csum_flags & CSUM_TCP) { csum_info->xmit.tcp_csum = 1; csum_info->xmit.tcp_header_offset = 0; EMAIL PREFERENCES https://reviews.freebsd.org/settings/panel/emailpreferences/ To: sepherosa_gmail.com, delphij, royger, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, adrian, network Cc: freebsd-net-list
[Differential] [Closed] D5102: hyperv/hn: Enable UDP RXCSUM
This revision was automatically updated to reflect the committed changes. Closed by commit rS295299: hyperv/hn: Enable UDP RXCSUM (authored by sephe). CHANGED PRIOR TO COMMIT https://reviews.freebsd.org/D5102?vs=12780&id=13033#toc REPOSITORY rS FreeBSD src repository CHANGES SINCE LAST UPDATE https://reviews.freebsd.org/D5102?vs=12780&id=13033 REVISION DETAIL https://reviews.freebsd.org/D5102 AFFECTED FILES head/sys/dev/hyperv/netvsc/hv_net_vsc.h head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c CHANGE DETAILS diff --git a/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c b/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c --- a/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c +++ b/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c @@ -456,6 +456,8 @@ CTLFLAG_RW, &sc->hn_csum_ip, "RXCSUM IP"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "csum_tcp", CTLFLAG_RW, &sc->hn_csum_tcp, "RXCSUM TCP"); + SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "csum_udp", + CTLFLAG_RW, &sc->hn_csum_udp, "RXCSUM UDP"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "csum_trusted", CTLFLAG_RW, &sc->hn_csum_trusted, "# of TCP segements that we trust host's csum verification"); @@ -1156,20 +1158,24 @@ m_new->m_pkthdr.rcvif = ifp; /* receive side checksum offload */ - if (NULL != csum_info) { + if (csum_info != NULL) { /* IP csum offload */ if (csum_info->receive.ip_csum_succeeded) { m_new->m_pkthdr.csum_flags |= (CSUM_IP_CHECKED | CSUM_IP_VALID); sc->hn_csum_ip++; } - /* TCP csum offload */ - if (csum_info->receive.tcp_csum_succeeded) { + /* TCP/UDP csum offload */ + if (csum_info->receive.tcp_csum_succeeded || + csum_info->receive.udp_csum_succeeded) { m_new->m_pkthdr.csum_flags |= (CSUM_DATA_VALID | CSUM_PSEUDO_HDR); m_new->m_pkthdr.csum_data = 0x; - sc->hn_csum_tcp++; + if (csum_info->receive.tcp_csum_succeeded) + sc->hn_csum_tcp++; + else + sc->hn_csum_udp++; } if (csum_info->receive.ip_csum_succeeded && diff --git a/head/sys/dev/hyperv/netvsc/hv_net_vsc.h b/head/sys/dev/hyperv/netvsc/hv_net_vsc.h 
--- a/head/sys/dev/hyperv/netvsc/hv_net_vsc.h +++ b/head/sys/dev/hyperv/netvsc/hv_net_vsc.h @@ -1036,6 +1036,7 @@ u_long hn_csum_ip; u_long hn_csum_tcp; + u_long hn_csum_udp; u_long hn_csum_trusted; u_long hn_lro_tried; u_long hn_small_pkts; EMAIL PREFERENCES https://reviews.freebsd.org/settings/panel/emailpreferences/ To: sepherosa_gmail.com, delphij, royger, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, adrian, network Cc: freebsd-net-list
[Differential] [Closed] D5103: hyperv/hn: Add sysctl to trust host side UDP and IP csum verification
This revision was automatically updated to reflect the committed changes. Closed by commit rS295300: hyperv/hn: Add sysctls to trust host side UDP and IP csum verification (authored by sephe). CHANGED PRIOR TO COMMIT https://reviews.freebsd.org/D5103?vs=12782&id=13034#toc REPOSITORY rS FreeBSD src repository CHANGES SINCE LAST UPDATE https://reviews.freebsd.org/D5103?vs=12782&id=13034 REVISION DETAIL https://reviews.freebsd.org/D5103 AFFECTED FILES head/sys/dev/hyperv/netvsc/hv_net_vsc.h head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c EMAIL PREFERENCES https://reviews.freebsd.org/settings/panel/emailpreferences/ To: sepherosa_gmail.com, delphij, royger, decui_microsoft.com, howard0su_gmail.com, honzhan_microsoft.com, adrian, network Cc: freebsd-net-list diff --git a/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c b/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c --- a/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c +++ b/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c @@ -210,6 +210,14 @@ static int hn_trust_hosttcp = 1; TUNABLE_INT("dev.hn.trust_hosttcp", &hn_trust_hosttcp); +/* Trust udp datagrams verification on host side. */ +static int hn_trust_hostudp = 1; +TUNABLE_INT("dev.hn.trust_hostudp", &hn_trust_hostudp); + +/* Trust ip packets verification on host side. 
*/ +static int hn_trust_hostip = 1; +TUNABLE_INT("dev.hn.trust_hostip", &hn_trust_hostip); + #if __FreeBSD_version >= 1100045 /* Limit TSO burst size */ static int hn_tso_maxlen = 0; @@ -239,6 +247,7 @@ #ifdef HN_LRO_HIWAT static int hn_lro_hiwat_sysctl(SYSCTL_HANDLER_ARGS); #endif +static int hn_trust_hcsum_sysctl(SYSCTL_HANDLER_ARGS); static int hn_tx_chimney_size_sysctl(SYSCTL_HANDLER_ARGS); static int hn_check_iplen(const struct mbuf *, int); static int hn_create_tx_ring(struct hn_softc *sc); @@ -335,8 +344,13 @@ sc->hn_unit = unit; sc->hn_dev = dev; sc->hn_lro_hiwat = HN_LRO_HIWAT_DEF; - sc->hn_trust_hosttcp = hn_trust_hosttcp; sc->hn_direct_tx_size = hn_direct_tx_size; + if (hn_trust_hosttcp) + sc->hn_trust_hcsum |= HN_TRUST_HCSUM_TCP; + if (hn_trust_hostudp) + sc->hn_trust_hcsum |= HN_TRUST_HCSUM_UDP; + if (hn_trust_hostip) + sc->hn_trust_hcsum |= HN_TRUST_HCSUM_IP; sc->hn_tx_taskq = taskqueue_create_fast("hn_tx", M_WAITOK, taskqueue_thread_enqueue, &sc->hn_tx_taskq); @@ -448,19 +462,30 @@ CTLTYPE_INT | CTLFLAG_RW, sc, 0, hn_lro_hiwat_sysctl, "I", "LRO high watermark"); #endif - SYSCTL_ADD_INT(ctx, child, OID_AUTO, "trust_hosttcp", - CTLFLAG_RW, &sc->hn_trust_hosttcp, 0, + SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "trust_hosttcp", + CTLTYPE_INT | CTLFLAG_RW, sc, HN_TRUST_HCSUM_TCP, + hn_trust_hcsum_sysctl, "I", "Trust tcp segement verification on host side, " "when csum info is missing"); + SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "trust_hostudp", + CTLTYPE_INT | CTLFLAG_RW, sc, HN_TRUST_HCSUM_UDP, + hn_trust_hcsum_sysctl, "I", + "Trust udp datagram verification on host side, " + "when csum info is missing"); + SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "trust_hostip", + CTLTYPE_INT | CTLFLAG_RW, sc, HN_TRUST_HCSUM_IP, + hn_trust_hcsum_sysctl, "I", + "Trust ip packet verification on host side, " + "when csum info is missing"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "csum_ip", CTLFLAG_RW, &sc->hn_csum_ip, "RXCSUM IP"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "csum_tcp", 
CTLFLAG_RW, &sc->hn_csum_tcp, "RXCSUM TCP"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "csum_udp", CTLFLAG_RW, &sc->hn_csum_udp, "RXCSUM UDP"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "csum_trusted", CTLFLAG_RW, &sc->hn_csum_trusted, - "# of TCP segements that we trust host's csum verification"); + "# of packets that we trust host's csum verification"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "small_pkts", CTLFLAG_RW, &sc->hn_small_pkts, "# of small packets received"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "no_txdescs", @@ -503,6 +528,14 @@ CTLFLAG_RD, &hn_trust_hosttcp, 0, "Trust tcp segement verification on host side, " "when csum info is missing (global setting)"); + SYSCTL_ADD_INT(dc_ctx, dc_child, OID_AUTO, "trust_hostudp", + CTLFLAG_RD, &hn_trust_hostudp, 0, + "Trust udp datagram verification on host side, " + "when csum info is missing (global setting)"); + SYSCTL_ADD_INT(dc_ctx, dc_child, OID_AUTO, "trust_hostip", + CTLFLAG_RD, &hn_trust_hostip, 0, + "Trust ip packet verification on host side, " + "when csum info is missing (global setting)"); SYSCTL_ADD_INT(dc_ctx, dc_child, OID_AUTO, "tx_chimney_size", CTLFLAG_RD, &hn_tx_chimney_size, 0, "Chimney send packet size limit"); @@ -1206,15 +1239,28 @@ pr = hn_check_iplen(m_new, hoff); if (pr == IPPROTO_TCP) { -if (sc->hn_trust_hosttcp) { +if (sc->hn_trust_hcsum & HN_TRUST_HCSUM_TCP) { sc->hn_csum_trusted++; m_new->m_pkthdr.csum_flags |= (CS
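D5103 folds the three "trust host checksum" switches into one bit field, `sc->hn_trust_hcsum`, and registers a single handler three times with a different bit as `arg2`. The handler body is not visible in the excerpt above, so the following is only a guess at the bit manipulation involved: the flag values and the `ex_*` helpers are assumptions, and only the `HN_TRUST_HCSUM_*` names and the tunable-to-flag mapping come from the diff.

```c
#include <assert.h>

/* Assumed values; the real definitions are in hv_net_vsc.h (not shown). */
#define HN_TRUST_HCSUM_IP	0x0001
#define HN_TRUST_HCSUM_TCP	0x0002
#define HN_TRUST_HCSUM_UDP	0x0004
#define EX_TRUST_HCSUM_ALL	\
	(HN_TRUST_HCSUM_IP | HN_TRUST_HCSUM_TCP | HN_TRUST_HCSUM_UDP)

/* Build the initial field from the three loader tunables, as in attach. */
static int
ex_trust_hcsum_init(int trust_hosttcp, int trust_hostudp, int trust_hostip)
{
	int hcsum = 0;

	if (trust_hosttcp)
		hcsum |= HN_TRUST_HCSUM_TCP;
	if (trust_hostudp)
		hcsum |= HN_TRUST_HCSUM_UDP;
	if (trust_hostip)
		hcsum |= HN_TRUST_HCSUM_IP;
	return (hcsum);
}

/* What a read of one per-protocol sysctl would plausibly report: 0 or 1. */
static int
ex_trust_hcsum_get(int hcsum, int bit)
{
	assert((bit & EX_TRUST_HCSUM_ALL) == bit && bit != 0);
	return ((hcsum & bit) ? 1 : 0);
}

/* What a write would plausibly do: set or clear only the selected bit. */
static int
ex_trust_hcsum_set(int hcsum, int bit, int on)
{
	assert((bit & EX_TRUST_HCSUM_ALL) == bit && bit != 0);
	return (on ? (hcsum | bit) : (hcsum & ~bit));
}
```

In the driver the selected bit would arrive through the sysctl handler's `arg2`, which is how one handler can serve `trust_hosttcp`, `trust_hostudp` and `trust_hostip`.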
[Differential] [Closed] D5104: hyperv/hn: Obey IFCAP_RXCSUM
This revision was automatically updated to reflect the committed changes. Closed by commit rS295301: hyperv/hn: Obey IFCAP_RXCSUM configure (authored by sephe). CHANGED PRIOR TO COMMIT https://reviews.freebsd.org/D5104?vs=12783&id=13035#toc REPOSITORY rS FreeBSD src repository CHANGES SINCE LAST UPDATE https://reviews.freebsd.org/D5104?vs=12783&id=13035 REVISION DETAIL https://reviews.freebsd.org/D5104 AFFECTED FILES head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c CHANGE DETAILS diff --git a/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c b/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c --- a/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c +++ b/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c @@ -1142,7 +1142,7 @@ struct mbuf *m_new; struct ifnet *ifp; device_t dev = device_ctx->device; - int size, do_lro = 0; + int size, do_lro = 0, do_csum = 1; if (sc == NULL) { return (0); /* TODO: KYS how can this be! */ @@ -1190,18 +1190,21 @@ } m_new->m_pkthdr.rcvif = ifp; + if (__predict_false((ifp->if_capenable & IFCAP_RXCSUM) == 0)) + do_csum = 0; + /* receive side checksum offload */ if (csum_info != NULL) { /* IP csum offload */ - if (csum_info->receive.ip_csum_succeeded) { + if (csum_info->receive.ip_csum_succeeded && do_csum) { m_new->m_pkthdr.csum_flags |= (CSUM_IP_CHECKED | CSUM_IP_VALID); sc->hn_csum_ip++; } /* TCP/UDP csum offload */ - if (csum_info->receive.tcp_csum_succeeded || - csum_info->receive.udp_csum_succeeded) { + if ((csum_info->receive.tcp_csum_succeeded || + csum_info->receive.udp_csum_succeeded) && do_csum) { m_new->m_pkthdr.csum_flags |= (CSUM_DATA_VALID | CSUM_PSEUDO_HDR); m_new->m_pkthdr.csum_data = 0x; @@ -1239,7 +1242,8 @@ pr = hn_check_iplen(m_new, hoff); if (pr == IPPROTO_TCP) { - if (sc->hn_trust_hcsum & HN_TRUST_HCSUM_TCP) { + if (do_csum && + (sc->hn_trust_hcsum & HN_TRUST_HCSUM_TCP)) { sc->hn_csum_trusted++; m_new->m_pkthdr.csum_flags |= (CSUM_IP_CHECKED | CSUM_IP_VALID | @@ -1249,14 +1253,15 @@ /* Rely on SW csum 
verification though... */ do_lro = 1; } else if (pr == IPPROTO_UDP) { - if (sc->hn_trust_hcsum & HN_TRUST_HCSUM_UDP) { + if (do_csum && + (sc->hn_trust_hcsum & HN_TRUST_HCSUM_UDP)) { sc->hn_csum_trusted++; m_new->m_pkthdr.csum_flags |= (CSUM_IP_CHECKED | CSUM_IP_VALID | CSUM_DATA_VALID | CSUM_PSEUDO_HDR); m_new->m_pkthdr.csum_data = 0x; } - } else if (pr != IPPROTO_DONE && + } else if (pr != IPPROTO_DONE && do_csum && (sc->hn_trust_hcsum & HN_TRUST_HCSUM_IP)) { sc->hn_csum_trusted++; m_new->m_pkthdr.csum_flags |= EMAIL PREFERENCES https://reviews.freebsd.org/settings/panel/emailpreferences/ To: sepherosa_gmail.com, delphij, royger, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, adrian, network Cc: freebsd-net-list
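Taken together, D5102 and D5104 make the receive path mark a packet as checksum-verified only when the host reports success and IFCAP_RXCSUM is enabled, counting TCP and UDP separately. A reduced sketch of that decision, with hypothetical types and flag values (`ex_*`/`EX_*`); the host-trust fallback used when no checksum info is present is deliberately left out:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values standing in for the mbuf CSUM_* flags. */
#define EX_CSUM_IP_CHECKED	0x1
#define EX_CSUM_IP_VALID	0x2
#define EX_CSUM_DATA_VALID	0x4
#define EX_CSUM_PSEUDO_HDR	0x8

/* Hypothetical reduction of the host-provided receive checksum info. */
struct ex_rx_csum_info {
	unsigned ip_csum_succeeded : 1;
	unsigned tcp_csum_succeeded : 1;
	unsigned udp_csum_succeeded : 1;
};

/* Counters corresponding to hn_csum_ip, hn_csum_tcp and hn_csum_udp. */
struct ex_rx_stats {
	unsigned long csum_ip;
	unsigned long csum_tcp;
	unsigned long csum_udp;
};

/* Return the csum flags to set on the mbuf; update the counters. */
static unsigned
ex_rx_csum_flags(const struct ex_rx_csum_info *ci, int rxcsum_enabled,
    struct ex_rx_stats *st)
{
	unsigned flags = 0;
	int do_csum = rxcsum_enabled ? 1 : 0;

	assert(st != NULL);
	if (ci == NULL)
		return (0);	/* trust-host fallback omitted in this sketch */

	/* IP csum offload */
	if (ci->ip_csum_succeeded && do_csum) {
		flags |= EX_CSUM_IP_CHECKED | EX_CSUM_IP_VALID;
		st->csum_ip++;
	}
	/* TCP/UDP csum offload */
	if ((ci->tcp_csum_succeeded || ci->udp_csum_succeeded) && do_csum) {
		flags |= EX_CSUM_DATA_VALID | EX_CSUM_PSEUDO_HDR;
		if (ci->tcp_csum_succeeded)
			st->csum_tcp++;
		else
			st->csum_udp++;
	}
	return (flags);
}
```

With RXCSUM disabled the function reports no flags, so the stack would verify checksums in software.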
[Differential] [Closed] D5158: hyperv/hn: Factor out hn_encap from hn_start_locked()
This revision was automatically updated to reflect the committed changes. Closed by commit rS295302: hyperv/hn: Factor out hn_encap() from hn_start_locked() (authored by sephe). CHANGED PRIOR TO COMMIT https://reviews.freebsd.org/D5158?vs=12925&id=13036#toc REPOSITORY rS FreeBSD src repository CHANGES SINCE LAST UPDATE https://reviews.freebsd.org/D5158?vs=12925&id=13036 REVISION DETAIL https://reviews.freebsd.org/D5158 AFFECTED FILES head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c EMAIL PREFERENCES https://reviews.freebsd.org/settings/panel/emailpreferences/ To: sepherosa_gmail.com, delphij, royger, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, adrian, network Cc: freebsd-net-list diff --git a/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c b/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c --- a/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c +++ b/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c @@ -254,6 +254,7 @@ static void hn_destroy_tx_ring(struct hn_softc *sc); static void hn_start_taskfunc(void *xsc, int pending); static void hn_txeof_taskfunc(void *xsc, int pending); +static int hn_encap(struct hn_softc *, struct hn_txdesc *, struct mbuf **); static __inline void hn_set_lro_hiwat(struct hn_softc *sc, int hiwat) @@ -744,31 +745,235 @@ } /* - * Start a transmit of one or more packets + * NOTE: + * This this function fails, then both txd and m_head0 will be freed */ static int -hn_start_locked(struct ifnet *ifp, int len) +hn_encap(struct hn_softc *sc, struct hn_txdesc *txd, struct mbuf **m_head0) { - hn_softc_t *sc = ifp->if_softc; - struct hv_device *device_ctx = vmbus_get_devctx(sc->hn_dev); - netvsc_dev *net_dev = sc->net_dev; + bus_dma_segment_t segs[HN_TX_DATA_SEGCNT_MAX]; + int error, nsegs, i; + struct mbuf *m_head = *m_head0; + netvsc_packet *packet; rndis_msg *rndis_mesg; rndis_packet *rndis_pkt; rndis_per_packet_info *rppi; - ndis_8021q_info *rppi_vlan_info; - rndis_tcp_ip_csum_info *csum_info; - rndis_tcp_tso_info 
*tso_info; - uint32_t rndis_msg_size = 0; + uint32_t rndis_msg_size; + + packet = &txd->netvsc_pkt; + packet->is_data_pkt = TRUE; + packet->tot_data_buf_len = m_head->m_pkthdr.len; + + /* + * extension points to the area reserved for the + * rndis_filter_packet, which is placed just after + * the netvsc_packet (and rppi struct, if present; + * length is updated later). + */ + rndis_mesg = txd->rndis_msg; + /* XXX not necessary */ + memset(rndis_mesg, 0, HN_RNDIS_MSG_LEN); + rndis_mesg->ndis_msg_type = REMOTE_NDIS_PACKET_MSG; + + rndis_pkt = &rndis_mesg->msg.packet; + rndis_pkt->data_offset = sizeof(rndis_packet); + rndis_pkt->data_length = packet->tot_data_buf_len; + rndis_pkt->per_pkt_info_offset = sizeof(rndis_packet); + + rndis_msg_size = RNDIS_MESSAGE_SIZE(rndis_packet); + + if (m_head->m_flags & M_VLANTAG) { + ndis_8021q_info *rppi_vlan_info; + + rndis_msg_size += RNDIS_VLAN_PPI_SIZE; + rppi = hv_set_rppi_data(rndis_mesg, RNDIS_VLAN_PPI_SIZE, + ieee_8021q_info); + + rppi_vlan_info = (ndis_8021q_info *)((uint8_t *)rppi + + rppi->per_packet_info_offset); + rppi_vlan_info->u1.s1.vlan_id = + m_head->m_pkthdr.ether_vtag & 0xfff; + } + + if (m_head->m_pkthdr.csum_flags & CSUM_TSO) { + rndis_tcp_tso_info *tso_info; + struct ether_vlan_header *eh; + int ether_len; + + /* + * XXX need m_pullup and use mtodo + */ + eh = mtod(m_head, struct ether_vlan_header*); + if (eh->evl_encap_proto == htons(ETHERTYPE_VLAN)) + ether_len = ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN; + else + ether_len = ETHER_HDR_LEN; + + rndis_msg_size += RNDIS_TSO_PPI_SIZE; + rppi = hv_set_rppi_data(rndis_mesg, RNDIS_TSO_PPI_SIZE, + tcp_large_send_info); + + tso_info = (rndis_tcp_tso_info *)((uint8_t *)rppi + + rppi->per_packet_info_offset); + tso_info->lso_v2_xmit.type = + RNDIS_TCP_LARGE_SEND_OFFLOAD_V2_TYPE; + +#ifdef INET + if (m_head->m_pkthdr.csum_flags & CSUM_IP_TSO) { + struct ip *ip = + (struct ip *)(m_head->m_data + ether_len); + unsigned long iph_len = ip->ip_hl << 2; + struct tcphdr *th = + 
(struct tcphdr *)((caddr_t)ip + iph_len); + + tso_info->lso_v2_xmit.ip_version = + RNDIS_TCP_LARGE_SEND_OFFLOAD_IPV4; + ip->ip_len = 0; + ip->ip_sum = 0; + + th->th_sum = in_pseudo(ip->ip_src.s_addr, + ip->ip_dst.s_addr, htons(IPPROTO_TCP)); + } +#endif +#if defined(INET6) && defined(INET) + else +#endif +#ifdef INET6 + { + struct ip6_hdr *ip6 = (struct ip6_hdr *) + (m_head->m_data + ether_len); + struct tcphdr *th = (struct tcphdr *)(ip6 + 1); + + tso_info->lso_v2_xmit.ip_version = + RNDIS_TCP_LARGE_SEND_OFFLOAD_IPV6; + ip6->ip6_plen = 0; + th->th_sum = in6_cksum_pseudo(ip6, 0, IPPROTO_TCP, 0); + } +#endif + tso_info->lso_v2_xmit.tcp_header_offset = 0; + tso_info->lso_v2_xmit.mss = m_head->m_pkthdr.tso_segsz; + } else if (m_head->m_pkthdr.csum_flags & sc->hn_csum_assist) { + rndis_tcp_ip_csum_info *csum_info; + + rndis_
[Differential] [Closed] D5159: hyperv/hn: Recover half of the chimney sending space
This revision was automatically updated to reflect the committed changes. Closed by commit rS295303: hyperv/hn: Recover half of the chimney sending space (authored by sephe). CHANGED PRIOR TO COMMIT https://reviews.freebsd.org/D5159?vs=12926&id=13037#toc REPOSITORY rS FreeBSD src repository CHANGES SINCE LAST UPDATE https://reviews.freebsd.org/D5159?vs=12926&id=13037 REVISION DETAIL https://reviews.freebsd.org/D5159 AFFECTED FILES head/sys/dev/hyperv/netvsc/hv_net_vsc.c CHANGE DETAILS diff --git a/head/sys/dev/hyperv/netvsc/hv_net_vsc.c b/head/sys/dev/hyperv/netvsc/hv_net_vsc.c --- a/head/sys/dev/hyperv/netvsc/hv_net_vsc.c +++ b/head/sys/dev/hyperv/netvsc/hv_net_vsc.c @@ -136,15 +136,15 @@ int i; for (i = 0; i < bitsmap_words; i++) { - idx = ffs(~bitsmap[i]); + idx = ffsl(~bitsmap[i]); if (0 == idx) continue; idx--; - if (i * BITS_PER_LONG + idx >= net_dev->send_section_count) - return (ret); + KASSERT(i * BITS_PER_LONG + idx < net_dev->send_section_count, + ("invalid i %d and idx %lu", i, idx)); - if (synch_test_and_set_bit(idx, &bitsmap[i])) + if (atomic_testandset_long(&bitsmap[i], idx)) continue; ret = i * BITS_PER_LONG + idx; @@ -789,8 +789,27 @@ if (NULL != net_vsc_pkt) { if (net_vsc_pkt->send_buf_section_idx != NVSP_1_CHIMNEY_SEND_INVALID_SECTION_INDEX) { - synch_change_bit(net_vsc_pkt->send_buf_section_idx, - net_dev->send_section_bitsmap); + u_long mask; + int idx; + + idx = net_vsc_pkt->send_buf_section_idx / + BITS_PER_LONG; + KASSERT(idx < net_dev->bitsmap_words, + ("invalid section index %u", + net_vsc_pkt->send_buf_section_idx)); + mask = 1UL << + (net_vsc_pkt->send_buf_section_idx % + BITS_PER_LONG); + + KASSERT(net_dev->send_section_bitsmap[idx] & + mask, + ("index bitmap 0x%lx, section index %u, " + "bitmap idx %d, bitmask 0x%lx", + net_dev->send_section_bitsmap[idx], + net_vsc_pkt->send_buf_section_idx, + idx, mask)); + atomic_clear_long( + &net_dev->send_section_bitsmap[idx], mask); } /* Notify the layer above us */ EMAIL PREFERENCES 
  https://reviews.freebsd.org/settings/panel/emailpreferences/

To: sepherosa_gmail.com, delphij, royger, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, adrian, network
Cc: freebsd-net-list
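
Editorial note on D5159: the title appears to refer to the switch from ffs() to ffsl(). ffs() takes an int, so on a platform with a 64-bit long it only inspects the low 32 bits of each bitmap word, and sections tracked by the upper bits are never handed out. The following is a minimal userland sketch of the same allocate/free pattern, not the driver code. The names struct section_map, section_alloc() and section_free() are hypothetical; C11 atomics stand in for atomic_testandset_long() and atomic_clear_long(), assert() stands in for KASSERT(), and __builtin_ffsl() is a GCC/Clang builtin that is assumed to be available.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

#define BITS_PER_LONG   ((int)(sizeof(unsigned long) * CHAR_BIT))
#define SECTION_INVALID (-1)

/* Hypothetical stand-in for the driver's send-section bitmap. */
struct section_map {
        atomic_ulong *words;    /* one bit per section; 1 means in use */
        int nwords;             /* number of unsigned long words */
        int nsections;          /* assumed == nwords * BITS_PER_LONG */
};

/* Claim the lowest free section, or return SECTION_INVALID if none. */
static int
section_alloc(struct section_map *sm)
{
        for (int i = 0; i < sm->nwords; i++) {
                unsigned long free_bits = ~atomic_load(&sm->words[i]);
                /* Unlike ffs(), ffsl() examines every bit of a long. */
                int idx = __builtin_ffsl((long)free_bits);

                if (idx == 0)
                        continue;       /* this word is full */
                idx--;                  /* ffsl() results are 1-based */
                assert(i * BITS_PER_LONG + idx < sm->nsections);

                /* Atomic test-and-set; move on if another thread won. */
                unsigned long mask = 1UL << idx;
                if (atomic_fetch_or(&sm->words[i], mask) & mask)
                        continue;
                return (i * BITS_PER_LONG + idx);
        }
        return (SECTION_INVALID);
}

/* Release a section that was previously returned by section_alloc(). */
static void
section_free(struct section_map *sm, int section)
{
        int word = section / BITS_PER_LONG;
        unsigned long mask = 1UL << (section % BITS_PER_LONG);

        assert(word < sm->nwords);
        /* The bit must currently be set, as in the KASSERT above. */
        assert(atomic_load(&sm->words[word]) & mask);
        atomic_fetch_and(&sm->words[word], ~mask);
}
```

With a single zeroed word, repeated calls to section_alloc() hand out indices 0 through BITS_PER_LONG - 1 in order, including those above 31 on LP64 systems, which is the range an int-based ffs() would have missed.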
[Differential] [Closed] D5166: hyperv/hn: Increase LRO entry count to 128 by default
This revision was automatically updated to reflect the committed changes.
Closed by commit rS295304: hyperv/hn: Increase LRO entry count to 128 by default (authored by sephe).

CHANGED PRIOR TO COMMIT
  https://reviews.freebsd.org/D5166?vs=12947&id=13038#toc

REPOSITORY
  rS FreeBSD src repository

CHANGES SINCE LAST UPDATE
  https://reviews.freebsd.org/D5166?vs=12947&id=13038

REVISION DETAIL
  https://reviews.freebsd.org/D5166

AFFECTED FILES
  head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c

CHANGE DETAILS

diff --git a/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c b/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c
--- a/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c
+++ b/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c
@@ -132,6 +132,8 @@
 /* YYY should get it from the underlying channel */
 #define HN_TX_DESC_CNT                  512
 
+#define HN_LROENT_CNT_DEF               128
+
 #define HN_RNDIS_MSG_LEN                \
     (sizeof(rndis_msg) +                \
      RNDIS_VLAN_PPI_SIZE +              \
@@ -232,6 +234,13 @@
 static int hn_direct_tx_size = HN_DIRECT_TX_SIZE_DEF;
 TUNABLE_INT("dev.hn.direct_tx_size", &hn_direct_tx_size);
 
+#if defined(INET) || defined(INET6)
+#if __FreeBSD_version >= 1100095
+static int hn_lro_entry_count = HN_LROENT_CNT_DEF;
+TUNABLE_INT("dev.hn.lro_entry_count", &hn_lro_entry_count);
+#endif
+#endif
+
 /*
  * Forward declarations
  */
@@ -335,6 +344,11 @@
 #if __FreeBSD_version >= 1100045
         int tso_maxlen;
 #endif
+#if defined(INET) || defined(INET6)
+#if __FreeBSD_version >= 1100095
+        int lroent_cnt;
+#endif
+#endif
 
         sc = device_get_softc(dev);
         if (sc == NULL) {
@@ -417,9 +431,17 @@
         }
 
 #if defined(INET) || defined(INET6)
+#if __FreeBSD_version >= 1100095
+        lroent_cnt = hn_lro_entry_count;
+        if (lroent_cnt < TCP_LRO_ENTRIES)
+                lroent_cnt = TCP_LRO_ENTRIES;
+        tcp_lro_init_args(&sc->hn_lro, ifp, lroent_cnt, 0);
+        device_printf(dev, "LRO: entry count %d\n", lroent_cnt);
+#else
         tcp_lro_init(&sc->hn_lro);
         /* Driver private LRO settings */
         sc->hn_lro.ifp = ifp;
+#endif
 #ifdef HN_LRO_HIWAT
         sc->hn_lro.lro_hiwat = sc->hn_lro_hiwat;
 #endif
@@ -547,6 +569,12 @@
                 SYSCTL_ADD_INT(dc_ctx, dc_child, OID_AUTO, "direct_tx_size",
                     CTLFLAG_RD, &hn_direct_tx_size, 0,
                     "Size of the packet for direct transmission");
+#if defined(INET) || defined(INET6)
+#if __FreeBSD_version >= 1100095
+                SYSCTL_ADD_INT(dc_ctx, dc_child, OID_AUTO, "lro_entry_count",
+                    CTLFLAG_RD, &hn_lro_entry_count, 0, "LRO entry count");
+#endif
+#endif
         }
 
         return (0);

To: sepherosa_gmail.com, delphij, royger, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, adrian, network
Cc: freebsd-net-list
[Differential] [Closed] D5167: hyperv/hn: Move LRO flush to the channel processing rollup
This revision was automatically updated to reflect the committed changes.
Closed by commit rS295305: hyperv/hn: Move LRO flush to the channel processing rollup (authored by sephe).

CHANGED PRIOR TO COMMIT
  https://reviews.freebsd.org/D5167?vs=12948&id=13039#toc

REPOSITORY
  rS FreeBSD src repository

CHANGES SINCE LAST UPDATE
  https://reviews.freebsd.org/D5167?vs=12948&id=13039

REVISION DETAIL
  https://reviews.freebsd.org/D5167

AFFECTED FILES
  head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c

CHANGE DETAILS

diff --git a/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c b/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c
--- a/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c
+++ b/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c
@@ -764,6 +764,15 @@
 netvsc_channel_rollup(struct hv_device *device_ctx)
 {
         struct hn_softc *sc = device_get_softc(device_ctx->device);
+#if defined(INET) || defined(INET6)
+        struct lro_ctrl *lro = &sc->hn_lro;
+        struct lro_entry *queued;
+
+        while ((queued = SLIST_FIRST(&lro->lro_active)) != NULL) {
+                SLIST_REMOVE_HEAD(&lro->lro_active, next);
+                tcp_lro_flush(lro, queued);
+        }
+#endif
 
         if (!sc->hn_txeof)
                 return;
@@ -1338,18 +1347,8 @@
 }
 
 void
-netvsc_recv_rollup(struct hv_device *device_ctx)
+netvsc_recv_rollup(struct hv_device *device_ctx __unused)
 {
-#if defined(INET) || defined(INET6)
-        hn_softc_t *sc = device_get_softc(device_ctx->device);
-        struct lro_ctrl *lro = &sc->hn_lro;
-        struct lro_entry *queued;
-
-        while ((queued = SLIST_FIRST(&lro->lro_active)) != NULL) {
-                SLIST_REMOVE_HEAD(&lro->lro_active, next);
-                tcp_lro_flush(lro, queued);
-        }
-#endif
 }
 
 /*

To: sepherosa_gmail.com, delphij, royger, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, adrian, network
Cc: freebsd-net-list
___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
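
Editorial note on D5167: the loop that this change relocates drains the active LRO list by repeatedly taking the head entry, unlinking it, and flushing it. The sketch below shows that drain idiom in userland with the SLIST macros from <sys/queue.h>. It is an illustration only: struct agg_entry, struct agg_ctrl, agg_flush() and channel_rollup() are hypothetical names, and agg_flush() merely counts segments where the driver would call tcp_lro_flush().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical aggregation entry, loosely modelled on struct lro_entry. */
struct agg_entry {
        SLIST_ENTRY(agg_entry) next;
        int segments;           /* segments merged into this entry */
};
SLIST_HEAD(agg_head, agg_entry);

/* Hypothetical control block, loosely modelled on struct lro_ctrl. */
struct agg_ctrl {
        struct agg_head active; /* entries still being aggregated */
        int flushed_segments;   /* total segments passed upward */
        int flush_calls;        /* number of entries flushed */
};

/* Stand-in for tcp_lro_flush(): account for the entry and reset it. */
static void
agg_flush(struct agg_ctrl *ac, struct agg_entry *e)
{
        assert(e != NULL);
        ac->flushed_segments += e->segments;
        ac->flush_calls++;
        e->segments = 0;
}

/*
 * Called once per batch of channel processing: unlink and flush every
 * active entry so nothing stays queued until the next batch.
 */
static void
channel_rollup(struct agg_ctrl *ac)
{
        struct agg_entry *queued;

        while ((queued = SLIST_FIRST(&ac->active)) != NULL) {
                SLIST_REMOVE_HEAD(&ac->active, next);
                agg_flush(ac, queued);
        }
}
```

Each entry is unlinked before it is flushed, so the flush routine is free to reuse or recycle the entry without disturbing the list traversal.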
[Differential] [Closed] D5175: hyperv/hn: Add an option to always do transmission scheduling
This revision was automatically updated to reflect the committed changes.
Closed by commit rS295306: hyperv/hn: Add an option to always do transmission scheduling (authored by sephe).

CHANGED PRIOR TO COMMIT
  https://reviews.freebsd.org/D5175?vs=12968&id=13040#toc

REPOSITORY
  rS FreeBSD src repository

CHANGES SINCE LAST UPDATE
  https://reviews.freebsd.org/D5175?vs=12968&id=13040

REVISION DETAIL
  https://reviews.freebsd.org/D5175

AFFECTED FILES
  head/sys/dev/hyperv/netvsc/hv_net_vsc.h
  head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c

CHANGE DETAILS

diff --git a/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c b/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c
--- a/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c
+++ b/head/sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c
@@ -534,6 +534,10 @@
         SYSCTL_ADD_INT(ctx, child, OID_AUTO, "direct_tx_size",
             CTLFLAG_RW, &sc->hn_direct_tx_size, 0,
             "Size of the packet for direct transmission");
+        SYSCTL_ADD_INT(ctx, child, OID_AUTO, "sched_tx",
+            CTLFLAG_RW, &sc->hn_sched_tx, 0,
+            "Always schedule transmission "
+            "instead of doing direct transmission");
 
         if (unit == 0) {
                 struct sysctl_ctx_list *dc_ctx;
@@ -1602,26 +1606,31 @@
 static void
 hn_start(struct ifnet *ifp)
 {
-        hn_softc_t *sc;
+        struct hn_softc *sc = ifp->if_softc;
+
+        if (sc->hn_sched_tx)
+                goto do_sched;
 
-        sc = ifp->if_softc;
         if (NV_TRYLOCK(sc)) {
                 int sched;
 
                 sched = hn_start_locked(ifp, sc->hn_direct_tx_size);
                 NV_UNLOCK(sc);
                 if (!sched)
                         return;
         }
+do_sched:
         taskqueue_enqueue_fast(sc->hn_tx_taskq, &sc->hn_start_task);
 }
 
 static void
 hn_start_txeof(struct ifnet *ifp)
 {
-        hn_softc_t *sc;
+        struct hn_softc *sc = ifp->if_softc;
+
+        if (sc->hn_sched_tx)
+                goto do_sched;
 
-        sc = ifp->if_softc;
         if (NV_TRYLOCK(sc)) {
                 int sched;
 
@@ -1633,6 +1642,7 @@
                             &sc->hn_start_task);
                 }
         } else {
+do_sched:
                 /*
                  * Release the OACTIVE earlier, with the hope, that
                  * others could catch up.  The task will clear the
diff --git a/head/sys/dev/hyperv/netvsc/hv_net_vsc.h b/head/sys/dev/hyperv/netvsc/hv_net_vsc.h
--- a/head/sys/dev/hyperv/netvsc/hv_net_vsc.h
+++ b/head/sys/dev/hyperv/netvsc/hv_net_vsc.h
@@ -1023,6 +1023,7 @@
         int hn_txdesc_avail;
         int hn_txeof;
 
+        int hn_sched_tx;
         int hn_direct_tx_size;
         struct taskqueue *hn_tx_taskq;
         struct task hn_start_task;

To: sepherosa_gmail.com, delphij, royger, decui_microsoft.com, howard0su_gmail.com, adrian, network, honzhan_microsoft.com
Cc: freebsd-virtualization-list, freebsd-net-list
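
Editorial note on D5175: with the new sched_tx knob set, hn_start() skips the try-lock fast path entirely and always hands the work to the transmit task. The sketch below models only that decision. It is not the driver code: struct tx_softc, tx_start_locked() and tx_start() are hypothetical names, a C11 atomic_flag stands in for NV_TRYLOCK()/NV_UNLOCK(), a counter stands in for taskqueue_enqueue_fast(), and the goto from the diff is folded into a single condition.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical softc holding just the state needed for the decision. */
struct tx_softc {
        atomic_flag busy;       /* try-lock; stands in for NV_TRYLOCK() */
        bool sched_tx;          /* mirrors the new sched_tx knob */
        bool more_pending;      /* hook: direct path left work behind */
        int direct_runs;        /* times the direct path ran */
        int task_enqueues;      /* times the TX task was scheduled */
};

/* Stand-in for hn_start_locked(): true if more work must be deferred. */
static bool
tx_start_locked(struct tx_softc *sc)
{
        sc->direct_runs++;
        return (sc->more_pending);
}

static void
tx_start(struct tx_softc *sc)
{
        assert(sc != NULL);
        /*
         * Direct transmission is attempted only when sched_tx is off
         * and the lock is free; every other case falls through to the
         * task.
         */
        if (!sc->sched_tx && !atomic_flag_test_and_set(&sc->busy)) {
                bool sched = tx_start_locked(sc);

                atomic_flag_clear(&sc->busy);
                if (!sched)
                        return;
        }
        sc->task_enqueues++;    /* stands in for taskqueue_enqueue_fast() */
}
```

The three ways to reach the task are therefore: sched_tx is set, the lock is contended, or the direct path reports leftover work.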
[Differential] [Updated] D4825: tcp/lro: Add network driver configurable LRO entry depth
hselasky added a comment.

FYI

https://reviews.freebsd.org/D1761 might be related to this one.

Should you check that "lc->lro_hiwat" is greater or equal to "lc->ifp->if_mtu"?

--HPS

REVISION DETAIL
  https://reviews.freebsd.org/D4825

To: sepherosa_gmail.com, network, transport, adrian, delphij, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, glebius
Cc: hselasky, np, freebsd-net-list
[Differential] [Commented On] D4825: tcp/lro: Add network driver configurable LRO entry depth
sepherosa_gmail.com added a comment.

In https://reviews.freebsd.org/D4825#110653, @hselasky wrote:

> FYI
>
> https://reviews.freebsd.org/D1761 might be related to this one.
>
> Should you check that "lc->lro_hiwat" is greater or equal to "lc->ifp->if_mtu"?
>
> --HPS

I have discarded this one; please take a look at this one instead: https://reviews.freebsd.org/D5185

Thanks,
sephe

REVISION DETAIL
  https://reviews.freebsd.org/D4825

To: sepherosa_gmail.com, network, transport, adrian, delphij, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, glebius
Cc: hselasky, np, freebsd-net-list
[Differential] [Abandoned] D4825: tcp/lro: Add network driver configurable LRO entry depth
sepherosa_gmail.com abandoned this revision.
sepherosa_gmail.com added a comment.

Updated version at: https://reviews.freebsd.org/D5185

REVISION DETAIL
  https://reviews.freebsd.org/D4825

To: sepherosa_gmail.com, network, transport, adrian, delphij, decui_microsoft.com, honzhan_microsoft.com, howard0su_gmail.com, glebius
Cc: hselasky, np, freebsd-net-list