Re: cxgbe's native netmap support broken since r307394
Hi Navdeep,

Indeed, we have reviewed the code, and we think it is OK to
implement nm_os_ifnet_lock() with IFNET_RLOCK() instead of
IFNET_WLOCK().
Since IFNET_RLOCK() results in sx_slock(), this should fix the issue.

On FreeBSD, this locking is needed to protect a flag read by nm_iszombie().
However, on Linux the same lock is also needed to protect the call to
the nm_hw_register() callback, so we prefer to have a "unified"
locking scheme, i.e. always calling nm_hw_register() under the lock.

Does this make sense to you? Would it be easy for you to make a quick
test by replacing IFNET_WLOCK with IFNET_RLOCK?

Thanks,
Vincenzo

2016-12-17 23:28 GMT+01:00 Navdeep Parhar:
> Luigi, Vincenzo,
>
> The last major update to netmap (r307394 and followups) broke cxgbe's
> native netmap support. The problem is that netmap_hw_reg now holds an
> rw_lock around the driver's netmap_on/off routines. It has always been
> safe for the driver to sleep during these operations but now it panics
> instead.
>
> Why is IFNET_WLOCK needed here? It seems like a regression to disallow
> sleep on the control path.
>
> Regards,
> Navdeep
>
> begin_synchronized_op with the following non-sleepable locks held:
> exclusive rw ifnet_rw (ifnet_rw) r = 0 (0x8271d680) locked @
> /root/ws/head/sys/dev/netmap/netmap_freebsd.c:95
> stack backtrace:
> #0 0x810837a5 at witness_debugger+0xe5
> #1 0x81084d88 at witness_warn+0x3b8
> #2 0x83ef2bcc at begin_synchronized_op+0x6c
> #3 0x83f14beb at cxgbe_netmap_reg+0x5b
> #4 0x809846f1 at netmap_hw_reg+0x81
> #5 0x809806de at netmap_do_regif+0x19e
> #6 0x8098121d at netmap_ioctl+0x7ad
> #7 0x8098682f at freebsd_netmap_ioctl+0x5f

--
Vincenzo Maffione
___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
Avoid using RFC3927 outside of the link
Hi,

I have a router that is multi-homed with BGP. One of my peers is using
an RFC3927 address for the connection. If I traceroute to a host behind
that router, and the router uses a route via this peer to reply, the
ICMP reply displays that link-local IP:

1. AS12876  195-154-86-1.rev.poneytelecom.eu (195.154.86.1)        0.0%  10  0.9   0.8  0.6   1.1   0.0
2. AS12876  a9k2-49e-s46-3.dc3.poneytelecom.eu (195.154.1.86)      0.0%  10  0.8   1.1  0.7   2.2   0.5
3. AS12876  pni-th2-a9k2.th2.poneytelecom.eu (195.154.1.75)        0.0%  10  1.2   1.5  1.1   3.5   0.6
4. AS???    equinix-th2.quantic-telecom.net (195.42.144.192)       0.0%  10  1.1   1.0  0.9   1.2   0.0
5. AS198507 185.132.75.33                                          0.0%  10  7.3   7.4  7.2   7.9   0.0
6. AS???    169.254.1.2                                            0.0%  10  7.6   7.6  7.4   7.9   0.0
7. AS204092 kaiminus.swordarmor.fr (89.234.186.26)                 0.0%  10  8.1  11.4  7.8  41.5  10.6

Is it possible to avoid this behaviour and reply with the public IP
(89.234.186.1) instead?

What I am looking for is an equivalent of
`ip addr change 169.254.1.2/30 scope link` on Linux.

Thanks,
--
alarig
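For context, RFC 3927 reserves 169.254.0.0/16 for IPv4 link-local
addressing, which is why hop 6 stands out in the trace. A minimal C
sketch of that range test follows; the helper name is_rfc3927() is
hypothetical and is not part of any FreeBSD or Linux API.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: return true if the dotted-quad string lies inside
 * 169.254.0.0/16, the IPv4 link-local range defined by RFC 3927.
 */
bool is_rfc3927(const char *dotted)
{
    struct in_addr addr;

    /* inet_pton() returns 1 only for a valid IPv4 address string. */
    if (inet_pton(AF_INET, dotted, &addr) != 1)
        return false;

    /* s_addr is in network byte order; convert before masking. */
    uint32_t host = ntohl(addr.s_addr);

    /* 169 = 0xa9, 254 = 0xfe, so the /16 prefix is 0xa9fe0000. */
    return (host & 0xffff0000u) == 0xa9fe0000u;
}
```

With the addresses from the trace, is_rfc3927("169.254.1.2") is true
and is_rfc3927("89.234.186.1") is false.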
Re: Avoid using RFC3927 outside of the link
20.12.2016 1:46, Alarig Le Lay wrote:
> Is it possible to avoid this behaviour and reply with the public IP
> (89.234.186.1) instead?

try: sysctl net.inet.icmp.reply_from_interface=1
Re: Avoid using RFC3927 outside of the link
On Tue Dec 20 01:51:17 2016, Eugene Grosbein wrote:
> 20.12.2016 1:46, Alarig Le Lay wrote:
>
> > Is it possible to avoid this behaviour and reply with the public IP
> > (89.234.186.1) instead?
>
> try: sysctl net.inet.icmp.reply_from_interface=1

If an AS chooses to reach us through this peer, packets will come in
by this interface, so our router will continue to reply with the APIPA
IP for those ASes.

--
alarig
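The objection can be illustrated with a deliberately simplified model
of how a router picks the source address of an ICMP error. This is not
the kernel's code: struct iface, icmp_reply_source() and any interface
names used with it are hypothetical. It assumes that with
net.inet.icmp.reply_from_interface=1 the reply is sourced from the
address of the interface the packet arrived on, and otherwise from the
address of the interface used to send the reply.

```c
#include <stddef.h>

/* Hypothetical, minimal view of an interface: a name and one IPv4 address. */
struct iface {
    const char *name;
    const char *addr;
};

/*
 * Simplified model (an assumption, not the FreeBSD implementation):
 * choose the source address of an ICMP error generated by the router.
 */
const char *icmp_reply_source(int reply_from_interface,
                              const struct iface *rcvif,
                              const struct iface *outif)
{
    /* With the sysctl enabled, use the receiving interface's address. */
    if (reply_from_interface && rcvif != NULL)
        return rcvif->addr;

    /* Otherwise fall back to the address of the outgoing interface. */
    return outif->addr;
}
```

In this model, a probe that arrives on the public side and is answered
via the peer gets 89.234.186.1 once the sysctl is set, but a probe that
arrives on the peering interface itself still gets 169.254.1.2.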
Re: cxgbe's native netmap support broken since r307394
IFNET_RLOCK will work, thanks.

Navdeep

On Mon, Dec 19, 2016 at 3:21 AM, Vincenzo Maffione wrote:
> Hi Navdeep,
>
> Indeed, we have reviewed the code, and we think it is OK to
> implement nm_os_ifnet_lock() with IFNET_RLOCK() instead of
> IFNET_WLOCK().
> Since IFNET_RLOCK() results in sx_slock(), this should fix the issue.
>
> On FreeBSD, this locking is needed to protect a flag read by nm_iszombie().
> However, on Linux the same lock is also needed to protect the call to
> the nm_hw_register() callback, so we prefer to have a "unified"
> locking scheme, i.e. always calling nm_hw_register() under the lock.
>
> Does this make sense to you? Would it be easy for you to make a quick
> test by replacing IFNET_WLOCK with IFNET_RLOCK?
>
> Thanks,
> Vincenzo
Re: Avoid using RFC3927 outside of the link
20.12.2016 2:05, Alarig Le Lay wrote:
> On Tue Dec 20 01:51:17 2016, Eugene Grosbein wrote:
>> 20.12.2016 1:46, Alarig Le Lay wrote:
>>> Is it possible to avoid this behaviour and reply with the public IP
>>> (89.234.186.1) instead?
>>
>> try: sysctl net.inet.icmp.reply_from_interface=1
>
> If an AS chooses to reach us through this peer, packets will come in
> by this interface, so our router will continue to reply with the APIPA
> IP for those ASes.

Well, you can always use brute force instead:

ipfw nat 169 config reset ip 89.234.186.1 && \
ipfw add 60 nat 169 ip from 169.254.0.0/16 to any out xmit igb0

That's ugly but works.
Re: sonewconn: pcb [...]: Listen queue overflow to human-readable form
On 12/16/16 at 11:20P, Andrey V. Elsukov wrote:
> On 15.12.2016 20:51, hiren panchasara wrote:
> > On 12/15/16 at 05:23P, Eugene M. Zheganin wrote:
> >> Hi.
> >>
> >> Sometimes on one of my servers I got dmesg full of
> >>
> >> sonewconn: pcb 0xf80373aec000: Listen queue overflow: 49 already in
> >> queue awaiting acceptance (6 occurrences)
> > [skip]
> >>
> >> but at the time of investigation the socket is already closed and lsof
> >> cannot show me the owner. I wonder if the kernel can itself decode this
> >> output and write it in the human-readable form?
> >
> > I have this not-quite-correct patch that may help you. (If you follow the
> > discussion there, you'd know why it's not complete.)
> >
> > https://lists.freebsd.org/pipermail/freebsd-net/2014-March/038074.html
>
> Hi Hiren,
>
> I think the check for socket's domain should be enough?
>
>
> --
> WBR, Andrey V. Elsukov
> Index: sys/kern/uipc_socket.c
> ===
> --- sys/kern/uipc_socket.c    (revision 309834)
> +++ sys/kern/uipc_socket.c    (working copy)
> @@ -139,6 +139,7 @@ __FBSDID("$FreeBSD$");
>  #include
>  #include
>  #include
> +#include
>
>  #include
>
> @@ -577,10 +578,15 @@ sonewconn(struct socket *head, int connstatus)
>          overcount++;
>
>          if (ratecheck(&lastover, &overinterval)) {
> -                log(LOG_DEBUG, "%s: pcb %p: Listen queue overflow: "
> -                    "%i already in queue awaiting acceptance "
> -                    "(%d occurrences)\n",
> -                    __func__, head->so_pcb, head->so_qlen, overcount);
> +                if (INP_CHECK_SOCKAF(head, AF_INET) ||
> +                    INP_CHECK_SOCKAF(head, AF_INET6))
> +                        over = ntohs(sotoinpcb(head)->inp_lport);
> +                else
> +                        over = 0;
> +                log(LOG_DEBUG, "%s: pcb %p: Listen queue overflow on "
> +                    "port %d: %i already in queue awaiting acceptance "
> +                    "(%d occurrences)\n", __func__, head->so_pcb,
> +                    over, head->so_qlen, overcount);
>
>                  overcount = 0;
>          }

Andrey,

Thanks, this seems correct to me. :-)

Cheers,
Hiren
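The key step in the patch is converting inp_lport, which is stored in
network byte order, with ntohs() before printing it. A small userspace
sketch of the resulting message follows. The function
format_overflow() is hypothetical and only mimics the format string
from the patch; it assumes the caller passes 0 as the port for sockets
that are neither AF_INET nor AF_INET6, as the patch does.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical userspace helper: build the message the patched
 * sonewconn() would log. lport_net is the listening port in network
 * byte order, which is how the patch reads it from inp_lport.
 */
int format_overflow(char *buf, size_t len, const void *pcb,
                    uint16_t lport_net, int qlen, int overcount)
{
    /* ntohs() converts the 16-bit port to host byte order for printing. */
    return snprintf(buf, len,
        "sonewconn: pcb %p: Listen queue overflow on port %d: "
        "%i already in queue awaiting acceptance (%d occurrences)",
        pcb, ntohs(lport_net), qlen, overcount);
}
```

Calling it with htons(8080), a queue length of 49 and 6 occurrences
yields a line that names port 8080 directly, so the owner can be found
from the listening port even after the socket is gone.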
Re: Avoid using RFC3927 outside of the link
On Tue Dec 20 02:34:29 2016, Eugene Grosbein wrote:
> Well, you can always use brute force instead:
>
> ipfw nat 169 config reset ip 89.234.186.1 && \
> ipfw add 60 nat 169 ip from 169.254.0.0/16 to any out xmit igb0
>
> That's ugly but works.

It will work only as a side effect: by doing this, I will send BGP
packets from 89.234.186.1, which is an IP that the peer learned by BGP.
This will create a recursive loop, and the session will be shut down.
So no more traffic will transit through this interface, and this IP
will not be displayed anymore :p

--
alarig
Re: Avoid using RFC3927 outside of the link
On 19/12/2016 21:01, Alarig Le Lay wrote:
> On Tue Dec 20 02:34:29 2016, Eugene Grosbein wrote:
>> Well, you can always use brute force instead:
>>
>> ipfw nat 169 config reset ip 89.234.186.1 && \
>> ipfw add 60 nat 169 ip from 169.254.0.0/16 to any out xmit igb0
>>
>> That's ugly but works.
>
> It will work only as a side effect: by doing this, I will send BGP
> packets from 89.234.186.1, which is an IP that the peer learned by BGP.
> This will create a recursive loop, and the session will be shut down.
> So no more traffic will transit through this interface, and this IP
> will not be displayed anymore :p

Use valid addressing and, optionally, a working config; there is no
problem here.
Re: Avoid using RFC3927 outside of the link
20.12.2016 4:01, Alarig Le Lay wrote:
> On Tue Dec 20 02:34:29 2016, Eugene Grosbein wrote:
>> Well, you can always use brute force instead:
>>
>> ipfw nat 169 config reset ip 89.234.186.1 && \
>> ipfw add 60 nat 169 ip from 169.254.0.0/16 to any out xmit igb0
>>
>> That's ugly but works.
>
> It will work only as a side effect: by doing this, I will send BGP
> packets from 89.234.186.1, which is an IP that the peer learned by BGP.
> This will create a recursive loop, and the session will be shut down.
> So no more traffic will transit through this interface, and this IP
> will not be displayed anymore :p

You could also use another public IP as the primary address for the
interface in question and an address from 169.254.0.0/16 as a
secondary one. BGP will still work and the kernel/ICMP will use the
public IP.