Hi,

I've encountered a weird issue.

I have 3 ISP links (WAN) connected to a shorewall gateway, each on their own 
NIC.

After about 24 hours working with apparently no issues, I start to get network 
issues on only one of the three.

A simple test from the shorewall gateway shows the following packet loss when 
pinging from the NIC that's connected to the failing ISP:

# shorewall reset ; ping -n -I enp9s6 8.8.8.8 ; shorewall dump > /home/vieri/swdump
Shorewall Counters Reset
PING 8.8.8.8 (8.8.8.8) from 192.168.101.2 enp9s6: 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=12 ttl=120 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=13 ttl=120 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=14 ttl=120 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=15 ttl=120 time=10.9 ms
64 bytes from 8.8.8.8: icmp_seq=16 ttl=120 time=11.0 ms
64 bytes from 8.8.8.8: icmp_seq=17 ttl=120 time=11.0 ms
64 bytes from 8.8.8.8: icmp_seq=18 ttl=120 time=11.2 ms
64 bytes from 8.8.8.8: icmp_seq=19 ttl=120 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=20 ttl=120 time=11.3 ms
64 bytes from 8.8.8.8: icmp_seq=21 ttl=120 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=22 ttl=120 time=11.1 ms
64 bytes from 8.8.8.8: icmp_seq=23 ttl=120 time=11.2 ms
64 bytes from 8.8.8.8: icmp_seq=24 ttl=120 time=11.2 ms
64 bytes from 8.8.8.8: icmp_seq=25 ttl=120 time=11.2 ms
64 bytes from 8.8.8.8: icmp_seq=26 ttl=120 time=11.3 ms
64 bytes from 8.8.8.8: icmp_seq=27 ttl=120 time=11.2 ms
64 bytes from 8.8.8.8: icmp_seq=28 ttl=120 time=11.2 ms
64 bytes from 8.8.8.8: icmp_seq=29 ttl=120 time=11.3 ms
64 bytes from 8.8.8.8: icmp_seq=30 ttl=120 time=11.4 ms
64 bytes from 8.8.8.8: icmp_seq=31 ttl=120 time=11.3 ms
64 bytes from 8.8.8.8: icmp_seq=32 ttl=120 time=11.2 ms
64 bytes from 8.8.8.8: icmp_seq=33 ttl=120 time=11.3 ms
64 bytes from 8.8.8.8: icmp_seq=34 ttl=120 time=11.5 ms
64 bytes from 8.8.8.8: icmp_seq=35 ttl=120 time=11.3 ms
64 bytes from 8.8.8.8: icmp_seq=36 ttl=120 time=11.3 ms
64 bytes from 8.8.8.8: icmp_seq=37 ttl=120 time=11.4 ms
64 bytes from 8.8.8.8: icmp_seq=38 ttl=120 time=11.3 ms
64 bytes from 8.8.8.8: icmp_seq=39 ttl=120 time=11.3 ms
64 bytes from 8.8.8.8: icmp_seq=40 ttl=120 time=11.8 ms
64 bytes from 8.8.8.8: icmp_seq=41 ttl=120 time=11.6 ms
64 bytes from 8.8.8.8: icmp_seq=42 ttl=120 time=11.4 ms
^C
--- 8.8.8.8 ping statistics ---
42 packets transmitted, 31 received, 26% packet loss, time 41698ms
rtt min/avg/max/mdev = 10.981/11.303/11.890/0.212 ms
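
To see which packets are actually lost, here is a minimal sketch that lists the
icmp_seq numbers missing from saved ping output. The function name missing_seqs
and the file name ping.log are placeholders of mine, and it assumes the ping
output was captured to a file (for example by piping it through tee).

```shell
# List the icmp_seq numbers that never got a reply.
# Usage: missing_seqs TRANSMITTED < ping.log
# TRANSMITTED is the "packets transmitted" count from the ping summary.
missing_seqs() {
    awk -v total="$1" '
        # Remember every sequence number that appears in a reply line.
        match($0, /icmp_seq=[0-9]+/) {
            # "icmp_seq=" is 9 characters long; keep only the digits.
            seen[substr($0, RSTART + 9, RLENGTH - 9) + 0] = 1
        }
        END {
            out = ""
            for (i = 1; i <= total; i++)
                if (!(i in seen))
                    out = out (out == "" ? "" : " ") i
            print out
        }
    '
}
```

For the run above, `missing_seqs 42 < ping.log` would list 1 through 11: the
replies only start at icmp_seq=12 and none are missing after that.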

The same test on the other 2 ISP links is OK.

Since ISP3 is the failing link and ISP1 and ISP2 are OK, I tried moving some 
traffic from ISP3 to ISP2 with this entry in the mangle file:

MARK(2):P       ${HMAN_EXTRA_CORP_NETWORKS}
(2: ISP2, 3: ISP3, HMAN_EXTRA_CORP_NETWORKS="192.168.210.0/23,192.168.212.0/24")

Now, the same ping test from the NIC that's connected to ISP2 starts showing 
the same packet loss stats while the test on the NIC connected to ISP3 has 0% 
packet loss.

Whichever link I move the traffic to with this line in the mangle file gets the 
ICMP packet loss, i.e., moving it back to MARK(3) (ISP3) shows packet loss again 
only on that link.

The shorewall dump taken during the test above is here:

https://drive.google.com/open?id=1a6RlQhi2w_JJF9ZuFt6aI9G-JAQbFC9n

Finally, to top it all off, if I reboot the modem/router on the ISP3 link, 
all's well again (no packet loss whatsoever, no matter which rule I use in the 
mangle file). Until the next day...

So, how can I go about determining what's causing this issue? My Internet 
Provider has already passed the buck and thinks that it's an issue with my 
shorewall gateway...

Help appreciated.

Thanks,

Vieri

_______________________________________________
Shorewall-users mailing list
Shorewall-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/shorewall-users
