RE: kernel: arpresolve: can't allocate llinfo for 65.59.233.102
Hi!

I just noticed that it is my default route that is changing for the aforementioned in the subject IP address. What the "$?% could cause that? Could MPD push that route as default? For what reason? That IP address doesn't even belong to us.

--

-Original Message-
From: Gleb Smirnoff [mailto:gleb...@freebsd.org]
Sent: September 10, 2012 10:03
To: Dominic Blais
Cc: freebsd-net@freebsd.org
Subject: Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102

On Mon, Sep 10, 2012 at 09:29:44AM -0400, Dominic Blais wrote:
D> Hi,
D>
D> We have a PPPoE server running FreeBSD 9.0-RELEASE-p3 with mpd-5.6. It used to work very well for 6 weeks and now the "Internet traffic" stops almost each day.
D>
D> Symptoms:
D>
D> - I still can ssh to the server from the LAN.
D> - The users connected to mpd by PPPoE can't access Internet.
D> - I tried to do a "whois 65.59.233.102" to know where it's from and the whois server doesn't answer, probably because the request never gets sent.
D>
D>
D> I tried to restart mpd, the users are reconnecting but they still can't access the Internet. I need to reboot to recover from that. Note that, each day it happened, it's always the same IP address (65.59.233.102) that's in the error message. That error appears up to 5 times per second.
D>
D> Actually, this server has only 1 NIC (bge0). It has an IP on that internet and also a vlan interface (vlan0 [pvid 2]) that's used for PPPoE between mpd and the pppoe clients.

I'm not sure that this message is directly related to connectivity problems. Have you tried the common debugging sequence when internet connectivity breaks: tcpdumping, looking at the ARP table, route table, etc?

--
Totus tuus, Glebius.
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102
On 2012-09-11 15:19, Dominic Blais wrote:
> Hi!
>
> I just noticed that it is my default route that is changing for the aforementioned in the subject IP address. What the "$?% could cause that? Could MPD push that route as default? For what reason? That IP address doesn't even belong to us.

Hi!

I have similar problems with 3 machines running FBSD9 as routers... and I do not use MPD at all.

http://lists.freebsd.org/pipermail/freebsd-net/2012-March/031879.html

The default route is also changing from time to time.

Best regards!
KB
Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102
On Tue, Sep 11, 2012 at 09:19:59AM -0400, Dominic Blais wrote:
D> Hi!
D>
D> I just noticed that it is my default route that is changing for the aforementioned in the subject IP address. What the "$?% could cause that? Could MPD push that route as default? For what reason? That IP address doesn't even belong to us.

Really weird... I'd suggest you to run 'route monitor' to see exact message that inserts the route into the kernel and run periodically 'fstat | grep route' to catch the application that opens routing socket.

--
Totus tuus, Glebius.
RE: kernel: arpresolve: can't allocate llinfo for 65.59.233.102
I could do something about the route monitor but not the fstat | grep route... I mean, I would have to run it in a loop endlessly until the bug happens... Even with it, it's not even sure I would catch it... Is there some way we can fstat everything so I don't miss what's happening between 2 calls to fstat?

--

-Original Message-
From: Gleb Smirnoff [mailto:gleb...@freebsd.org]
Sent: September 11, 2012 10:24
To: Dominic Blais
Cc: freebsd-net@freebsd.org
Subject: Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102

On Tue, Sep 11, 2012 at 09:19:59AM -0400, Dominic Blais wrote:
D> Hi!
D>
D> I just noticed that it is my default route that is changing for the aforementioned in the subject IP address. What the "$?% could cause that? Could MPD push that route as default? For what reason? That IP address doesn't even belong to us.

Really weird... I'd suggest you to run 'route monitor' to see exact message that inserts the route into the kernel and run periodically 'fstat | grep route' to catch the application that opens routing socket.

--
Totus tuus, Glebius.
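One possible way to shrink the gap between two fstat calls is to poll in a short loop and log only the entries that are new since the previous sample. The following is a minimal sketch, not something that was posted in this thread: it assumes `fstat` is on the PATH, it keeps any output line containing the word "route" (which is all `fstat | grep route` does), and the function names and the 0.2 second interval are made up for illustration. Polling can still miss a process that opens and closes the routing socket between two samples, so this would complement 'route monitor' rather than replace it.

```python
import subprocess
import time


def route_lines(fstat_output):
    """Return the set of fstat lines mentioning 'route' (like `grep route`)."""
    return {line for line in fstat_output.splitlines() if "route" in line}


def new_entries(previous, current):
    """Lines present in the current sample but not in the previous one."""
    return sorted(current - previous)


def watch(interval=0.2):
    """Poll fstat forever and print route-socket holders as they appear.

    The interval is an arbitrary illustrative value, not a recommendation.
    """
    seen = set()
    while True:
        out = subprocess.run(["fstat"], capture_output=True, text=True).stdout
        current = route_lines(out)
        for line in new_entries(seen, current):
            # Timestamp each new entry so it can be matched against the
            # 'route monitor' log afterwards.
            print(time.strftime("%Y-%m-%d %H:%M:%S"), line, flush=True)
        seen = current
        time.sleep(interval)
```

Calling `watch()` as root, with its output redirected to a file, would leave a timestamped record to compare with the 'route monitor' output once the default route changes.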
Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102
On 11 September 2012 09:01, Dominic Blais wrote:
> I could do something about the route monitor but not the fstat | grep
> route... I mean, I would have to run it in a loop endlessly until the bug
> happens... Even with it, it's not even sure I would catch it...

The route monitor is a good idea. It should tell us whether it's from some tool/program/network event sending a routing table update, or whether it's just plain memory corruption.

adrian
Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102
On Tue, Sep 11, 2012 at 10:29 AM, Adrian Chadd wrote:
> On 11 September 2012 09:01, Dominic Blais wrote:
>> I could do something about the route monitor but not the fstat | grep
>> route... I mean, I would have to run it in a loop endlessly until the bug
>> happens... Even with it, it's not even sure I would catch it...
>
> The route monitor is a good idea. It should tell us whether it's from
> some tool/program/network event sending a routing table update, or
> whether it's just plain memory corruption.

Could this be
http://svnweb.freebsd.org/base/head/sys/netinet/in.c?r1=226120&r2=226224&pathrev=226331
RE: kernel: arpresolve: can't allocate llinfo for 65.59.233.102
Ok, I'll try the route monitor. Please note that Krzysztof Barcikowski already did it and got nothing:

http://lists.freebsd.org/pipermail/freebsd-net/2012-March/031879.html

--

-Original Message-
From: adrian.ch...@gmail.com [mailto:adrian.ch...@gmail.com] On behalf of Adrian Chadd
Sent: September 11, 2012 13:30
To: Dominic Blais
Cc: Gleb Smirnoff; freebsd-net@freebsd.org
Subject: Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102

On 11 September 2012 09:01, Dominic Blais wrote:
> I could do something about the route monitor but not the fstat | grep
> route... I mean, I would have to run it in a loop endlessly until the bug
> happens... Even with it, it's not even sure I would catch it...

The route monitor is a good idea. It should tell us whether it's from some tool/program/network event sending a routing table update, or whether it's just plain memory corruption.

adrian
Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102
On Tue, Sep 11, 2012 at 10:35:25AM -0700, Vijay Singh wrote:
V> On Tue, Sep 11, 2012 at 10:29 AM, Adrian Chadd wrote:
V> > On 11 September 2012 09:01, Dominic Blais wrote:
V> >> I could do something about the route monitor but not the fstat | grep route... I mean, I would have to run it in a loop endlessly until the bug happens... Even with it, it's not even sure I would catch it...
V> >
V> > The route monitor is a good idea. It should tell us whether it's from
V> > some tool/program/network event sending a routing table update, or
V> > whether it's just plain memory corruption.
V>
V> Could this be http://svnweb.freebsd.org/base/head/sys/netinet/in.c?r1=226120&r2=226224&pathrev=226331

Why do you suspect this one?

--
Totus tuus, Glebius.
Re: kern/171524: [ipmi] ipmi driver crashes kernel by reboot or shutdown
The following reply was made to PR kern/171524; it has been noted by GNATS.

From: Sean Bruno
To: bug-follo...@freebsd.org, dhoj...@brainbits.net
Cc:
Subject: Re: kern/171524: [ipmi] ipmi driver crashes kernel by reboot or shutdown
Date: Tue, 11 Sep 2012 12:56:16 -0700

 It looks like the fix is not in releng_91 for release.

 http://svnweb.freebsd.org/base/stable/9/sys/dev/ipmi/ipmi.c?revision=239920&view=markup

 You can either patch the system by hand by applying this change to your local tree or update to stable/9 to fix this.

 http://www.wonkity.com/~wblock/docs/html/stable.html

 Sean
Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102
> V>
> V> Could this be
> http://svnweb.freebsd.org/base/head/sys/netinet/in.c?r1=226120&r2=226224&pathrev=226331
>
> Why do you suspect this one?

I was hitting a similar issue in 8.2. After down/up on the interface to the default gateway, I saw this message and arpresolve would never complete. I was able to step through gdb to see that the RTF_GATEWAY check was causing the problem.

-vijay
Re: kernel: arpresolve: can't allocate llinfo for 65.59.233.102
On Tue, Sep 11, 2012 at 01:03:56PM -0700, Vijay Singh wrote:
V> > V>
V> > V> Could this be http://svnweb.freebsd.org/base/head/sys/netinet/in.c?r1=226120&r2=226224&pathrev=226331
V> >
V> > Why do you suspect this one?
V>
V> I was hitting a similar issue in 8.2. After down/up on the interface
V> to the default gateway, I saw this message and arpresolve would never
V> complete. I was able to step through gdb to see that the RTF_GATEWAY
V> check was causing the problem.

Hmm, interesting. We need more debugging here. But where could 65.59.233.102 come from?

--
Totus tuus, Glebius.
Issue with igb and lagg (was Re: Problem with link aggregation + sshd)
Well, there definitely seems to be a problem with igb and lagg.

igb alone works as it should, but doesn't seem to work properly in lagg.

To be sure I started from scratch from a 9.0 release with nothing but:

/etc/rc.conf
---
ifconfig_igb0="inet ..."

ifconfig_igb1="up"
ifconfig_igb2="up"
ifconfig_igb3="up"

cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto lacp laggport igb1 laggport igb2 laggport igb3 192.168.x.x/24"

sshd_enable="YES"
---

This doesn't even manage to start sshd, it just hangs there at boot.

Disabling lagg configuration everything works correctly.

This installation is a zfs root, but I don't think this has anything to do with this.

Yes, I think that the maintainer of igb and/or lagg driver should absolutely look into this...


On 09/07/2012 12:01 PM, Simon Dick wrote:

We've had similar problems with lagg at work, each lagg is made up of one igb and one em port, sometimes for no apparent reason they seem to stop passing through traffic. The easiest way we've found to get it working again is ifconfig down and up on one of the physical interfaces. This is on 8.1


On 3 September 2012 19:25, Giulio Ferro wrote:

No idea anybody why this bug happens? Patches?


On 08/29/2012 10:22 PM, Giulio Ferro wrote:

On 08/28/2012 11:12 AM, Damien Fleuriot wrote:
> Hi Giulio,
>
> Just to clear things up:
> igb0: 192.168.9.60/24
> lagg0: 192.168.12.21/24

Yes. Actually I notice now that the lagg0 address is different from what I wrote below in my rc.conf (192.168.12.7). I've just made many tests with different configurations, but no matter, it just doesn't work...

> What's the IP of the host you're trying ssh connections from ?

I'm just trying to connect to and from management interface igb0 (192.168.9.60).

From external pc I do :
ssh myuser@192.168.9.60

From that server I do :
ssh myuser@pcaddress

Just to be more precise, the consequences are:
1) daemon sshd on the server gets stuck and becomes unkillable
2) the first connection may work, but then the program ssh on the server becomes unresponsive and unkillable

If I don't create a lagg0 interface and just connect (say) igb1 to the data switch, I've no problem and everything works.

Just to answer others' question, I connect igb1, igb2 and igb3 to the same data switch in ports configured for aggregation. I connect igb0 to another management switch (of course not configured for aggregation)

> Also, just in case, did you enable any firewall ? (PF, ipfw)

As I already said, no. Nothing is working/active on this server, just sshd.

Thank you.


On 27 August 2012 21:22, Giulio Ferro wrote:

Hi, thanks for the answer

Here is what you asked for:

# ifconfig igb0
igb0: flags=8843 metric 0 mtu 1500
        options=4401bb
        ether ...
        inet 192.168.9.60 netmask 0xff00 broadcast 192.168.9.255
        inet6 prefixlen 64 scopeid 0x1
        nd6 options=29
        media: Ethernet autoselect (1000baseT )
        status: active

# netstat -rn
Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            192.168.9.1        UGS         0        0   igb0
127.0.0.1          link#12            UH          0        0    lo0
192.168.9.0/24     link#1             U           0       14   igb0
192.168.9.60       link#1             UHS         0        0    lo0
192.168.12.0/24    link#13            U           0      109  lagg0
192.168.12.21      link#13            UHS         0        0    lo0

Internet6:
Destination                       Gateway                       Flags      Netif Expire
::/96                             ::1                           UGRS         lo0
::1                               link#12                       UH           lo0
:::0.0.0.0/96                     ::1                           UGRS         lo0
fe80::/10                         ::1                           UGRS         lo0
fe80::%igb0/64                    link#1                        U           igb0
fe80::ea39:35ff:feb6:a0d4%igb0    link#1                        UHS          lo0
fe80::%igb1/64                    link#2                        U           igb1
fe80::ea39:35ff:feb6:a0d5%igb1    link#2                        UHS          lo0
fe80::%igb2/64                    link#3                        U           igb2
fe80::ea39:35ff:feb6:a0d6%igb2    link#3                        UHS          lo0
fe80::%igb3/64                    link#4                        U           igb3
fe80::ea39:35ff:feb6:a0d7%igb3    link#4                        UHS          lo0
fe80::%lo0/64                     link#12                       U            lo0
fe80::1%lo0                       link#12                       UHS          lo0
fe80::%lagg0/64                   link#13                       U          lagg0
fe80::ea39:35ff:feb6:a0d5%lagg0   link#13                       UHS          lo0
ff01::%igb0/32                    fe80::ea39:35ff:feb6:a0d4%igb0 U          igb0
ff01::%igb1/32                    fe80::ea39:35ff:feb6:a0d5%igb1 U          igb1
ff01::%igb2/32                    fe80::ea39:35ff:feb6:a0d6%igb2 U          igb2
ff01::%igb3/32                    fe80::ea39:35ff:feb6:a0d7%igb3 U          igb3
ff01::%lo0/32                     ::1                           U            lo0
Re: Issue with igb and lagg (was Re: Problem with link aggregation + sshd)
On Sep 11, 2012 2:12 PM, "Giulio Ferro" wrote:
>
> Well, there definitely seems to be a problem with igb and lagg.
>
> igb alone works as it should, but doesn't seem to work properly in lagg.
>
> To be sure I started from scratch from a 9.0 release with nothing but:
>
> /etc/rc.conf
> ---
> ifconfig_igb0="inet ..."
>
> ifconfig_igb1="up"
> ifconfig_igb2="up"
> ifconfig_igb3="up"
>
> cloned_interfaces="lagg0"
> ifconfig_lagg0="laggproto lacp laggport igb1 laggport igb2 laggport igb3 192.168.x.x/24"
>
> sshd_enable="YES"
> ---
>
> This doesn't even manage to start sshd, it just hangs there at boot.
>
> Disabling lagg configuration everything works correctly.
>

Just curious: does it work if you split the lagg configuration from the IP config:

ifconfig_lagg0="laggproto ..."
ifconfig_lagg0_alias0="inet 192..."

I've had problems in the past with cloned interfaces not working right if you do everything in one ifconfig line. Never spent much time debugging it, though, as the split config always worked.
Re: Issue with igb and lagg (was Re: Problem with link aggregation + sshd)
On Tue, 11 Sep 2012, Giulio Ferro wrote:

> Well, there definitely seems to be a problem with igb and lagg.
>
> igb alone works as it should, but doesn't seem to work properly in lagg.
>
> To be sure I started from scratch from a 9.0 release with nothing but:
>
> /etc/rc.conf
> ---
> ifconfig_igb0="inet ..."
>
> ifconfig_igb1="up"
> ifconfig_igb2="up"
> ifconfig_igb3="up"
>
> cloned_interfaces="lagg0"
> ifconfig_lagg0="laggproto lacp laggport igb1 laggport igb2 laggport igb3

My rc.conf is something like this:

#
# For now, force ath0 to use the same MAC address as xl0.
# This works around a bug where lagg is unable to set the
# MAC address of the underlying wlan0 interface.
#
ifconfig_ath0="ether 01:02:03:04:05:06"
wlans_ath0=wlan0
ifconfig_wlan0="ssid SSID_FOO_NAME WPA"
ifconfig_xl0="up"
cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto failover laggport xl0 laggport wlan0"
ifconfig_lagg0_alias0="inet 10.0.0.4 netmask 0xff00"

I use aliasX to add the address and netmask.

--
DE
Re: Issue with igb and lagg (was Re: Problem with link aggregation + sshd)
On Tue, 11 Sep 2012, Freddie Cash wrote:

> On Sep 11, 2012 2:12 PM, "Giulio Ferro" wrote:
>>
>> cloned_interfaces="lagg0"
>> ifconfig_lagg0="laggproto lacp laggport igb1 laggport igb2 laggport igb3 192.168.x.x/24"
>>
>> sshd_enable="YES"
>> ---
>>
>> This doesn't even manage to start sshd, it just hangs there at boot.
>>
>> Disabling lagg configuration everything works correctly.
>
> Just curious: does it work if you split the lagg configuration from the IP config:
>
> ifconfig_lagg0="laggproto ..."
> ifconfig_lagg0_alias0="inet 192..."
>
> I've had problems in the past with cloned interfaces not working right if you do everything in one ifconfig line. Never spent much time debugging it, though, as the split config always worked.

This was my experience too, though it's been quite a while since I tried it combined onto the same ifconfig line.

--
DE
Re: kern/167325: [netinet] [patch] sosend sometimes return EINVAL with TSO and VLAN on 82599 NIC
On 07.09.2012 23:44, Jeremiah Lott wrote:

On Apr 27, 2012, at 2:07 AM, lini...@freebsd.org wrote:

Old Synopsis: sosend sometimes return EINVAL with TSO and VLAN on 82599 NIC
New Synopsis: [netinet] [patch] sosend sometimes return EINVAL with TSO and VLAN on 82599 NIC

http://www.freebsd.org/cgi/query-pr.cgi?pr=167325

I did an analysis of this pr a while back and I figured I'd share. Definitely looks like a real problem here, but at least in 8.2 it is difficult to hit it. First off, vlan tagging is not required to hit this. The code in question does not account for any amount of link-local header, so you can reproduce the bug even without vlans.

In order to trigger it, the tcp stack must choose to send a tso "packet" with a total size (including tcp+ip header and options, but not link-local header) between 65522 and 65535 bytes (because adding 14 byte link-local header will then exceed 64K limit). In 8.1, the tcp stack only chooses to send tso bursts that will result in full mtu-size on-wire packets. To achieve this, it will truncate the tso packet size to be a multiple of mss, not including header and tcp options. The check has been relaxed a little in head, but the same basic check is still there. None of the "normal" mtus have multiples falling in this range. To reproduce it I used an mtu of 1445. When timestamps are in use, every packet has a 40-byte tcp/ip header + 10 bytes for the timestamp option + 2 bytes pad. You can get a packet length 65523 as follows:

65523 - (40 + 10 + 2) = 65471 (size of tso packet data)
65471 / 47 = 1393 (size of data per on-wire packet)
1393 + (40 + 10 + 2) = 1445 (mtu is data + header + options + pad)

Once you set your mtu to 1445, you need a program that can get the stack to send a maximum sized packet. With the congestion window that can be more difficult than it seems. I used some python that sends enough data to open the window, sleeps long enough to drain all outstanding data, but not long enough for the congestion window to go stale and close again, then sends a bunch more data. It also helps to turn off delayed acks on the receiver. Sometimes you will not drain the entire send buffer because an ack for the final chunk is still delayed when you start the second transmit. When the problem described in the pr hits, the EINVAL from bus_dmamap_load_mbuf_sg bubbles right up to userspace.

At first I thought this was a driver bug rather than stack bug. The code in question does what it is commented to do (limit the tso packet so that ip->ip_len does not overflow). However, it also seems reasonable that the driver limit its dma tag at 64K (do we really want it allocating another whole page just for the 14 byte link-local header). Perhaps the tcp stack should ensure that the tso packet + max_linkhdr is < 64K. Comments?

Thank you for the analysis. I'm looking into it.

As an aside, the patch attached to the pr is also slightly wrong. Taking the max_linkhdr into account when rounding the packet to be a multiple of mss does not make sense, it should only take it into account when calculating the max tso length.

--
Andre
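The MTU arithmetic in the analysis above can be checked with a few lines of Python. This is only a simplified model built from the numbers given in the message (52 bytes of header, options and pad per packet, a 14 byte link-local header, and a 65535 byte ceiling on ip_len); the constant and function names are made up for illustration and none of this is the kernel code itself.

```python
IP_MAXLEN = 65535            # assumed ceiling: largest value ip->ip_len can hold
LINK_HDR = 14                # link-local (Ethernet) header size from the analysis
HDR_OVERHEAD = 40 + 10 + 2   # tcp/ip header + timestamp option + pad


def max_tso_len(mtu):
    """Largest TSO packet (headers included, link header excluded) that is
    a whole number of full on-wire packets and still fits in IP_MAXLEN."""
    data_per_packet = mtu - HDR_OVERHEAD
    segments = (IP_MAXLEN - HDR_OVERHEAD) // data_per_packet
    return segments * data_per_packet + HDR_OVERHEAD


def overflows_with_link_header(mtu):
    """True if adding the link-local header pushes the packet past IP_MAXLEN."""
    return max_tso_len(mtu) + LINK_HDR > IP_MAXLEN


for mtu in (1445, 1500, 9000):
    print(mtu, max_tso_len(mtu), overflows_with_link_header(mtu))
```

Under these assumptions an mtu of 1445 gives a 65523 byte TSO packet, which lands in the problematic range, while 1500 and 9000 stay well below it. That matches the remark that none of the "normal" mtus trigger the problem.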