On 2022-05-01 12:58, Mark Millard wrote:
On 2022-May-1, at 12:15, Mark Millard <mark...@yahoo.com> wrote:

On 2022-May-1, at 11:12, bob prohaska <f...@www.zefox.net> wrote:

On Sat, Apr 30, 2022 at 06:39:57PM -0700, Bakul Shah wrote:

On Apr 29, 2022, at 7:12 PM, bob prohaska <f...@www.zefox.net> wrote:

Since about December of 2021 I've been noticing problems with wired network connectivity on a pair of Raspberry Pi 3 machines. One runs stable-13.1, the other runs -current, both are up to date as of a few days ago. Essentially both machines fail to respond to inbound network connections via ssh or ping after reboot. If I get on the serial console and start an outbound ping to anywhere, both machines respond to incoming pings with about a 65% packet loss.

Suggest running tcpdump on the rpi3 to see what is going on when connected to the public vs private net.

Public net first, since that's where the machine is now. Gateway.zefox.net is the name of my router's public interface, dcn.org belongs to my isp and fusionbroadband is their service provider. While on the -current Pi3 serial console (with no outbound ping running) and no inbound traffic from my hosts I see after a couple minutes:

root@www:/mnt # tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ue0, link-type EN10MB (Ethernet), capture size 262144 bytes
10:39:40.887853 ARP, Request who-has www.zefox.org tell gateway.zefox.net, length 46
10:39:40.887929 ARP, Reply www.zefox.org is-at b8:27:eb:71:46:4e (oui Unknown), length 28
10:39:40.893220 ARP, Request who-has 50-1-20-1.dsl.static.fusionbroadband.com tell www.zefox.org, length 28
10:39:40.915469 ARP, Reply 50-1-20-1.dsl.static.fusionbroadband.com is-at 00:1b:90:d2:4a:c4 (oui Unknown), length 50
10:39:40.915529 IP www.zefox.org.50714 > spoke.dcn.davis.ca.us.domain: 51409+ PTR? 28.20.1.50.in-addr.arpa. (41)
10:39:40.943602 IP spoke.dcn.davis.ca.us.domain > www.zefox.org.50714: 51409 1/3/6 PTR www.zefox.org. (265)
10:39:40.945416 IP www.zefox.org.15986 > spoke.dcn.davis.ca.us.domain: 44966+ PTR? 31.20.1.50.in-addr.arpa. (41)
10:39:40.973487 IP spoke.dcn.davis.ca.us.domain > www.zefox.org.15986: 44966 1/3/6 PTR gateway.zefox.net. (266)
10:39:40.975037 IP www.zefox.org.57611 > spoke.dcn.davis.ca.us.domain: 31749+ PTR? 1.20.1.50.in-addr.arpa. (40)
10:39:46.288219 IP www.zefox.org.49710 > wheel.dcn.davis.ca.us.domain: 31749+ PTR? 1.20.1.50.in-addr.arpa. (40)
10:39:46.316239 IP wheel.dcn.davis.ca.us.domain > www.zefox.org.49710: 31749 1/3/6 PTR 50-1-20-1.dsl.static.fusionbroadband.com. (291)
10:39:46.318267 IP www.zefox.org.17061 > spoke.dcn.davis.ca.us.domain: 37579+ PTR? 2.253.150.168.in-addr.arpa. (44)
10:39:46.346851 IP spoke.dcn.davis.ca.us.domain > www.zefox.org.17061: 37579* 1/2/2 PTR spoke.dcn.davis.ca.us. (145)
10:39:46.348674 IP www.zefox.org.40440 > spoke.dcn.davis.ca.us.domain: 20572+ PTR? 1.253.150.168.in-addr.arpa. (44)
10:39:51.420705 IP www.zefox.org.64019 > wheel.dcn.davis.ca.us.domain: 20572+ PTR? 1.253.150.168.in-addr.arpa. (44)
10:39:51.448850 IP wheel.dcn.davis.ca.us.domain > www.zefox.org.64019: 20572* 1/2/2 PTR wheel.dcn.davis.ca.us. (145)
10:40:40.147603 ARP, Request who-has 50-1-20-1.dsl.static.fusionbroadband.com tell ns1.zefox.net, length 46
10:40:40.148844 IP www.zefox.org.46127 > spoke.dcn.davis.ca.us.domain: 12186+ PTR? 29.20.1.50.in-addr.arpa. (41)
10:40:40.176486 IP spoke.dcn.davis.ca.us.domain > www.zefox.org.46127: 12186 1/3/6 PTR ns1.zefox.net. (262)
10:40:57.688225 ARP, Request who-has www.zefox.org tell gateway.zefox.net, length 46
10:40:57.688305 ARP, Reply www.zefox.org is-at b8:27:eb:71:46:4e (oui Unknown), length 28
10:42:14.488727 ARP, Request who-has www.zefox.org tell gateway.zefox.net, length 46
10:42:14.488804 ARP, Reply www.zefox.org is-at b8:27:eb:71:46:4e (oui Unknown), length 28
10:42:43.761226 ARP, Request who-has 50-1-20-1.dsl.static.fusionbroadband.com tell www.zefox.com, length 46
10:42:43.762522 IP www.zefox.org.56181 > spoke.dcn.davis.ca.us.domain: 28779+ PTR? 26.20.1.50.in-addr.arpa. (41)
10:42:43.790361 IP spoke.dcn.davis.ca.us.domain > www.zefox.org.56181: 28779 1/3/6 PTR www.zefox.com. (265)
10:43:31.289103 ARP, Request who-has www.zefox.org tell gateway.zefox.net, length 46
10:43:31.289181 ARP, Reply www.zefox.org is-at b8:27:eb:71:46:4e (oui Unknown), length 28

If I now start an inbound ping from one of my hosts it gets no reply and tcpdump reports no additional traffic. With an outbound ping running there's at least a sparse reply.

^C
28 packets captured
28 packets received by filter
0 packets dropped by kernel
root@www:/mnt #

The "oui unknown" looks like some sort of failure.....

Can you ping www.zefox.org? I have no outside vantage point. There is still no outbound ping running and I would expect you'll get no or very sparse reply.

Thus far only the two Pi3s suffer from connectivity problems; Pi2s and a Pi4 have no difficulty on the same address block. Is there a switch for tcpdump that will limit records to relevant traffic? Otherwise it's a flood. These results were obtained after standing idle overnight and are rather different (in ways I don't understand) from behavior immediately after reboot, I'll have to repeat as I learn more.

I wonder if there is a notable difference between monitoring traffic from 2 places: A) from the machine seeing the problem vs. B) from a machine not having problems but connected where all the traffic would be on the wire it is connected to.
It may be that monitoring from both and comparing/contrasting the reported traffic from the two provides additional evidence. There may be modes of monitoring that are relevant for this. But I'm not familiar with any detail here.

For reference:

# ping www.zefox.org
PING www.zefox.org (50.1.20.28): 56 data bytes
^C
--- www.zefox.org ping statistics ---
32 packets transmitted, 0 packets received, 100.0% packet loss

I found the command traceroute and it reports:

# traceroute www.zefox.org
traceroute to www.zefox.org (50.1.20.28), 64 hops max, 40 byte packets
 1  192.168.1.1 (192.168.1.1)  0.697 ms  0.486 ms  1.277 ms
 2  172.30.26.66 (172.30.26.66)  30.019 ms  172.30.26.67 (172.30.26.67)  41.720 ms  172.30.26.66 (172.30.26.66)  28.645 ms
 3  68.85.243.125 (68.85.243.125)  8.967 ms  68.85.243.77 (68.85.243.77)  11.462 ms  68.85.243.125 (68.85.243.125)  10.254 ms
 4  24.124.129.106 (24.124.129.106)  7.510 ms  96.216.60.165 (96.216.60.165)  10.176 ms  24.124.129.106 (24.124.129.106)  8.945 ms
 5  68.85.243.197 (68.85.243.197)  10.837 ms  96.216.60.165 (96.216.60.165)  10.252 ms  68.85.243.197 (68.85.243.197)  16.036 ms
 6  68.85.243.197 (68.85.243.197)  14.660 ms  be-36211-cs01.seattle.wa.ibone.comcast.net (68.86.93.49)  14.629 ms  68.85.243.197 (68.85.243.197)  8.849 ms
 7  be-2412-pe12.seattle.wa.ibone.comcast.net (96.110.34.142)  14.607 ms  be-36221-cs02.seattle.wa.ibone.comcast.net (68.86.93.53)  14.122 ms  be-2212-pe12.seattle.wa.ibone.comcast.net (96.110.34.134)  13.877 ms
 8  be-2412-pe12.seattle.wa.ibone.comcast.net (96.110.34.142)  14.133 ms *  13.663 ms
 9  be2075.ccr21.sfo01.atlas.cogentco.com (154.54.0.233)  30.176 ms *  be3717.ccr22.sfo01.atlas.cogentco.com (154.54.86.209)  29.002 ms
10  be3717.ccr22.sfo01.atlas.cogentco.com (154.54.86.209)  28.477 ms  be2430.ccr31.sjc04.atlas.cogentco.com (154.54.88.186)  27.203 ms  be2075.ccr21.sfo01.atlas.cogentco.com (154.54.0.233)  28.515 ms
11  38.104.141.82 (38.104.141.82)  29.820 ms  be2430.ccr31.sjc04.atlas.cogentco.com (154.54.88.186)  28.605 ms  38.104.141.82 (38.104.141.82)  33.735 ms
12  38.104.141.82 (38.104.141.82)  27.160 ms  0.xe-0-3-0.scrm-gw1.scrmca01.sonic.net (135.180.179.146)  32.336 ms  38.104.141.82 (38.104.141.82)  31.867 ms
13  0.xe-0-0-0.cr1.scrmca13.sonic.net (135.180.179.166)  31.761 ms  0.xe-0-3-0.scrm-gw1.scrmca01.sonic.net (135.180.179.146)  29.864 ms  0.xe-0-0-0.cr1.scrmca13.sonic.net (135.180.179.166)  31.711 ms
14  0.xe-0-0-0.cr1.scrmca13.sonic.net (135.180.179.166)  30.373 ms  gig1-1-1.gw.wscrca11.sonic.net (50.1.36.106)  35.567 ms  0.xe-0-0-0.cr1.scrmca13.sonic.net (135.180.179.166)  31.146 ms
15  gig1-1-1.gw.davsca11.sonic.net (50.1.36.110)  31.513 ms  gig1-1-1.gw.wscrca11.sonic.net (50.1.36.106)  31.203 ms  gig1-1-1.gw.davsca11.sonic.net (50.1.36.110)  31.354 ms
16  gig1-1-1.gw.davsca11.sonic.net (50.1.36.110)  30.125 ms *  31.996 ms
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *
^C

(There did not seem to be much point in having it continue.)

I found and built a port called net/mtr-nox11 ("My traceroute") and tried it, letting it just run. The initial try eventually got a connection but reported a 99.2% packet loss as of when I captured the below:

                             My traceroute  [v0.95]
amd64_ZFS (192.168.1.120) -> www.zefox.org (50.1.20.28)   2022-05-01T12:40:22-0700
Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                                                   Packets               Pings
 Host                                            Loss%  Snt  Last   Avg  Best  Wrst StDev
 1. 192.168.1.1                                   0.0%  135   0.4   0.8   0.1   3.1   0.4
 2. 172.30.26.66                                  0.0%  134  28.2  26.1   9.3 132.7  18.1
 3. 68.85.243.77                                  0.0%  134   8.6   9.0   7.5  11.2   0.8
 4. 24.124.129.106                                0.0%  134  10.2   9.1   7.6  13.4   0.9
 5. 96.216.60.165                                 0.0%  134   9.0   9.1   7.8  14.3   0.9
 6. 68.85.243.197                                 0.0%  134  14.4  13.6   9.2  44.3   5.4
 7. be-36241-cs04.seattle.wa.ibone.comcast.net    0.0%  134  16.8  14.9  13.0  22.6   1.1
 8. be-2412-pe12.seattle.wa.ibone.comcast.net     0.0%  134  13.5  15.0  12.8  46.4   3.2
 9. (waiting for reply)
10. be2075.ccr21.sfo01.atlas.cogentco.com         0.0%  134  29.3  29.0  26.7  54.1   2.9
11. be2379.ccr31.sjc04.atlas.cogentco.com         0.0%  134  28.0  28.7  27.1  40.3   1.3
12. 38.104.141.82                                 0.0%  134  28.0  33.8  26.6 114.8  16.5
13. 0.xe-0-3-0.scrm-gw1.scrmca01.sonic.net        0.0%  134  30.9  31.0  29.0  33.7   0.8
14. 0.xe-0-0-0.cr1.scrmca13.sonic.net             0.0%  134  31.1  32.3  29.3  93.2   6.7
15. gig1-1-1.gw.wscrca11.sonic.net                0.0%  134  31.3  34.9  29.5 330.4  26.5
16. gig1-1-1.gw.davsca11.sonic.net                0.0%  134  32.8  32.1  29.9  44.1   1.7
17. (waiting for reply)
18. (waiting for reply)
19. www.zefox.org                                99.2%  134  74.9  74.9  74.9  74.9   0.0

I stopped and restarted it and so far no connection -- waiting even longer than that first time: Snt is now over 600. Rows 18 and 19 have not shown up, the last is 17.

. . . (some more time goes by) . . .

I have now stopped it, avoiding the extra load on the machines and network. Looks like there is some problem getting past gig1-1-1.gw.davsca11.sonic.net .
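
On the question above about a tcpdump switch to limit records to relevant traffic: tcpdump accepts a filter expression after its options, and -n turns off name lookups. The PTR queries in the capture above look like tcpdump's own reverse lookups, so -n should remove most of that noise, though that is an inference from the log and not something confirmed in this thread. A minimal sketch, assuming the interface is ue0 as in the capture, and using 50.1.20.31 (the address that resolved to gateway.zefox.net) only as a placeholder for whichever host sends the test pings. filter_for is a hypothetical helper name, not part of tcpdump:

```shell
#!/bin/sh
# filter_for is a hypothetical helper: it builds a pcap filter that keeps
# all ARP traffic plus ICMP to or from the single peer given as $1.
filter_for() {
    printf 'arp or (icmp and host %s)' "$1"
}

# Print the command instead of running it, since capturing needs root.
#   -n  do not convert addresses to names (no reverse DNS lookups)
#   -i  capture on the named interface (ue0 is taken from the capture above)
# 50.1.20.31 is a placeholder peer address; substitute the pinging host.
echo "tcpdump -n -i ue0 '$(filter_for 50.1.20.31)'"
```

Run the printed command as root on the Pi3 serial console. Dropping the host clause (leaving 'arp or icmp') shows ARP and ICMP from every peer, which may be the better starting point while it is still unclear whether inbound pings reach the interface at all.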

Apologies in advance if I'm just making noise. But here's what I see on a 10Gb network attempting the same traceroute(8):

# traceroute www.zefox.org
traceroute to www.zefox.org (50.1.20.28), 64 hops max, 40 byte packets
 1  static-24-113-41-1.wavecable.com (24.113.41.1)  19.918 ms  16.258 ms  13.852 ms
 2  174.127.183.72 (174.127.183.72)  18.036 ms  19.647 ms  18.428 ms
 3  be4.cr2-sea-b.bb.as11404.net (174.127.137.16)  16.318 ms  19.963 ms  22.306 ms
 4  be1.cr2-sea-a.bb.as11404.net (174.127.149.136)  19.391 ms  14.457 ms  15.808 ms
 5  sea-b2-link.ip.twelve99.net (62.115.49.138)  19.613 ms  22.770 ms  20.330 ms
 6  sjo-b23-link.ip.twelve99.net (62.115.118.169)  39.478 ms  32.428 ms  34.416 ms
 7  palo-b24-link.ip.twelve99.net (62.115.115.216)  70.207 ms  41.846 ms  37.838 ms
 8  sonicnet-ic350733-palo-b24.ip.twelve99-cust.net (62.115.181.227)  44.718 ms  33.959 ms  42.723 ms
 9  0.xe-0-3-0.scrm-gw1.scrmca01.sonic.net (135.180.179.146)  41.699 ms  42.660 ms  114.578 ms
10  0.xe-0-0-0.cr1.scrmca13.sonic.net (135.180.179.166)  47.851 ms  51.590 ms  41.286 ms
11  gig1-1-1.gw.wscrca11.sonic.net (50.1.36.106)  51.199 ms  39.567 ms  40.553 ms
12  gig1-1-1.gw.davsca11.sonic.net (50.1.36.110)  45.005 ms  44.096 ms  41.183 ms
13  * * *
14  * www.zefox.org (50.1.20.28)  62.422 ms *

A trip to sonic.net indicates they brag about having better privacy than their competition. Are they using any privacy extensions that may affect your ability to ping(8) || traceroute(8) -- TCP/UDP/ICMP? Or is it just that gig1-1-1.gw.davsca11.sonic.net's BGP is out of date (stale)?

HTH

--Chris
=== Mark Millard marklmi at yahoo.com