On 08/09/2018 01:38 AM, Vieri Di Paola via Shorewall-users wrote:
> Hi,
>
> I've encountered a weird issue.
>
> I have 3 ISP links (WAN) connected to a shorewall gateway, each on their own
> NIC.
>
> After about 24 hours working with apparently no issues, I start to get
> network issues on only one of the three.
>
> A simple test from the shorewall gateway shows the following packet loss when
> pinging from the NIC that's connected to the failing ISP:
>
> # shorewall reset ; ping -n -I enp9s6 8.8.8.8 ; shorewall dump > /home/vieri/swdump
> Shorewall Counters Reset
> PING 8.8.8.8 (8.8.8.8) from 192.168.101.2 enp9s6: 56(84) bytes of data.
> 64 bytes from 8.8.8.8: icmp_seq=12 ttl=120 time=11.1 ms
> 64 bytes from 8.8.8.8: icmp_seq=13 ttl=120 time=11.1 ms
> 64 bytes from 8.8.8.8: icmp_seq=14 ttl=120 time=11.1 ms
> 64 bytes from 8.8.8.8: icmp_seq=15 ttl=120 time=10.9 ms
> 64 bytes from 8.8.8.8: icmp_seq=16 ttl=120 time=11.0 ms
> 64 bytes from 8.8.8.8: icmp_seq=17 ttl=120 time=11.0 ms
> 64 bytes from 8.8.8.8: icmp_seq=18 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=19 ttl=120 time=11.1 ms
> 64 bytes from 8.8.8.8: icmp_seq=20 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=21 ttl=120 time=11.1 ms
> 64 bytes from 8.8.8.8: icmp_seq=22 ttl=120 time=11.1 ms
> 64 bytes from 8.8.8.8: icmp_seq=23 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=24 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=25 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=26 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=27 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=28 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=29 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=30 ttl=120 time=11.4 ms
> 64 bytes from 8.8.8.8: icmp_seq=31 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=32 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=33 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=34 ttl=120 time=11.5 ms
> 64 bytes from 8.8.8.8: icmp_seq=35 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=36 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=37 ttl=120 time=11.4 ms
> 64 bytes from 8.8.8.8: icmp_seq=38 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=39 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=40 ttl=120 time=11.8 ms
> 64 bytes from 8.8.8.8: icmp_seq=41 ttl=120 time=11.6 ms
> 64 bytes from 8.8.8.8: icmp_seq=42 ttl=120 time=11.4 ms
> ^C
> --- 8.8.8.8 ping statistics ---
> 42 packets transmitted, 31 received, 26% packet loss, time 41698ms
> rtt min/avg/max/mdev = 10.981/11.303/11.890/0.212 ms
>
> The same test on the other 2 ISP links are OK.
>
> Hence, if ISP3 is the failing link and ISP1, ISP2 are OK, I try to move some
> traffic from ISP3 to ISP2 like so in the mangle file:
>
> MARK(2):P ${HMAN_EXTRA_CORP_NETWORKS}
> (2: ISP2, 3: ISP3,
> HMAN_EXTRA_CORP_NETWORKS="192.168.210.0/23,192.168.212.0/24")
>
> Now, the same ping test from the NIC that's connected to ISP2 starts showing
> the same packet loss stats while the test on the NIC connected to ISP3 has 0%
> packet loss.
>
> Wherever I move the traffic with this line in the mangle file, I get ICMP
> packet loss, ie., moving it back to MARK(3) (ISP3) shows packet loss again
> only on that line.
>
> The shorewall dump taken during the test above is here:
>
> https://drive.google.com/open?id=1a6RlQhi2w_JJF9ZuFt6aI9G-JAQbFC9n
>
> Finally, to top it all off, if I reboot the modem/router on the ISP3 link,
> all's well again (no packet loss whatsoever, no matter which rule I use in
> the mangle file). Until the next day...
>
> So, how can I go about this to determine what's causing this issue? My
> Internet Provider has already passed the buck and thinks that it's an issue
> with my shorewall gateway...
>
> Help appreciated.
>
I don't see anything in the dump that explains this behavior. I do,
however, notice this conntrack table entry:

    icmp 1 29 src=192.168.101.2 dst=8.8.8.8 type=8 code=0 id=3380
    packets=42 bytes=3528 src=8.8.8.8 dst=192.168.101.2 type=0 code=0
    id=3380 packets=31 bytes=2604 mark=3 use=1

'mark=3' indicates that the flow is using the correct interface (enp9s6).

My suggestion for debugging this further is to use a packet sniffer to
see what is happening on the wire during the period of loss:

a) Are the echo-request packets being sent?
b) If not, is there unsuccessful ARPing occurring?

-Tom
-- 
Tom Eastep        \   Q: What do you get when you cross a mobster with
Shoreline,         \     an international standard?
Washington, USA     \ A: Someone who makes you an offer you can't
http://shorewall.org \   understand
                      \_______________________________________________
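
As a cross-check on the conntrack entry quoted above, here is a minimal Python sketch that pulls the two packets= counters and the mark out of that line and compares them with what ping reported. The helper names (parse_icmp_entry, loss_percent) are made up for illustration, and the sketch assumes the original-direction counters appear before the reply-direction ones, as they do in the quoted entry.

```python
import re

# Conntrack entry copied verbatim from the reply above.
ENTRY = (
    "icmp 1 29 src=192.168.101.2 dst=8.8.8.8 type=8 code=0 id=3380 "
    "packets=42 bytes=3528 src=8.8.8.8 dst=192.168.101.2 type=0 code=0 "
    "id=3380 packets=31 bytes=2604 mark=3 use=1"
)


def parse_icmp_entry(entry):
    """Return (requests, replies, mark) for one conntrack ICMP line.

    Assumption: the first packets= counter is the original direction
    (echo-request) and the second is the reply direction (echo-reply).
    """
    packets = [int(n) for n in re.findall(r"\bpackets=(\d+)", entry)]
    if len(packets) != 2:
        raise ValueError("expected exactly two packets= counters")
    mark = re.search(r"\bmark=(\d+)", entry)
    return packets[0], packets[1], int(mark.group(1)) if mark else None


def loss_percent(sent, received):
    """Percentage of requests that never got a reply."""
    if sent == 0:
        return 0.0
    return 100.0 * (sent - received) / sent


requests, replies, mark = parse_icmp_entry(ENTRY)
print(f"requests={requests} replies={replies} mark={mark} "
      f"loss={loss_percent(requests, replies):.1f}%")
```

For the entry above this gives 42 requests and 31 replies with mark 3, which is about 26% loss and agrees with the ping statistics. Note that conntrack counting a request only shows that the packet passed through netfilter on the gateway, not that it was actually transmitted, which is why the packet capture is still needed to answer (a) and (b).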
_______________________________________________
Shorewall-users mailing list
Shorewall-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/shorewall-users