Hello, I did another test. This time I changed the order: first I triggered the IPSec policy, then tried to ping in parallel with a big packet size. I could reproduce the issue again, but the trace was completely different. Maybe this time I've captured the trace for the problematic connection?
[Mi Okt 16 14:43:15 2019] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.2.18-200.fc30.x86_64 #1
[Mi Okt 16 14:43:15 2019] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
[Mi Okt 16 14:43:15 2019] RIP: 0010:fq_enqueue+0x531/0x610 [sch_fq]
[Mi Okt 16 14:43:15 2019] Code: c2 48 8b 3d 49 28 00 00 48 89 ee e8 39 0a ec c1 45 85 ed 75 d4 49 8b 2f e9 f1 fb ff ff 48 83 83 30 02 00 00 01 e9 fd fc ff ff <0f> 0b e9 83 fc ff ff 4c 89 ea 31 c0 e9 be fc ff ff 41 8b 46 08 39
[Mi Okt 16 14:43:15 2019] RSP: 0018:ffff948640748ae8 EFLAGS: 00010206
[Mi Okt 16 14:43:15 2019] RAX: 0000000df8475800 RBX: ffff895036e25c00 RCX: 0000000000000018
[Mi Okt 16 14:43:15 2019] RDX: 00000e7ebca89351 RSI: 0014ca96f0789efd RDI: ffffffff83423a60
[Mi Okt 16 14:43:15 2019] RBP: 15ce211ccec6bca9 R08: 000000002a91ed5a R09: 0000000000000001
[Mi Okt 16 14:43:15 2019] R10: ffff948640748a80 R11: 00000000bc3ed800 R12: ffff895034e00900
[Mi Okt 16 14:43:15 2019] R13: ffff89503720f2a0 R14: 00000000000004d3 R15: ffff895038809748
[Mi Okt 16 14:43:15 2019] FS:  0000000000000000(0000) GS:ffff89503bb80000(0000) knlGS:0000000000000000
[Mi Okt 16 14:43:15 2019] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Mi Okt 16 14:43:15 2019] CR2: 00007f3235d5d000 CR3: 00000001334a6003 CR4: 00000000000606e0
[Mi Okt 16 14:43:15 2019] Call Trace:
[Mi Okt 16 14:43:15 2019]  <IRQ>
[Mi Okt 16 14:43:15 2019]  ? __qdisc_run+0x151/0x540
[Mi Okt 16 14:43:15 2019]  __dev_queue_xmit+0x45b/0x960
[Mi Okt 16 14:43:15 2019]  ? mod_timer+0x15b/0x300
[Mi Okt 16 14:43:15 2019]  __neigh_update+0x432/0x9f0
[Mi Okt 16 14:43:15 2019]  arp_process+0x255/0x7e0
[Mi Okt 16 14:43:15 2019]  ? ip_forward+0x3c5/0x480
[Mi Okt 16 14:43:15 2019]  arp_rcv+0x18c/0x1d0
[Mi Okt 16 14:43:15 2019]  __netif_receive_skb_one_core+0x87/0x90
[Mi Okt 16 14:43:15 2019]  netif_receive_skb_internal+0x41/0xb0
[Mi Okt 16 14:43:15 2019]  napi_gro_receive+0xf6/0x160
[Mi Okt 16 14:43:15 2019]  e1000_clean_rx_irq+0x2ac/0x570 [e1000]
[Mi Okt 16 14:43:15 2019]  e1000_clean+0x278/0x5f0 [e1000]
[Mi Okt 16 14:43:15 2019]  net_rx_action+0x148/0x3b0
[Mi Okt 16 14:43:15 2019]  __do_softirq+0xed/0x30e
[Mi Okt 16 14:43:15 2019]  irq_exit+0xf1/0x100
[Mi Okt 16 14:43:15 2019]  do_IRQ+0x81/0xe0
[Mi Okt 16 14:43:15 2019]  common_interrupt+0xf/0xf
[Mi Okt 16 14:43:15 2019]  </IRQ>
[Mi Okt 16 14:43:15 2019] RIP: 0010:native_safe_halt+0xe/0x10
[Mi Okt 16 14:43:15 2019] Code: 90 90 90 90 90 90 90 90 90 90 90 90 e9 07 00 00 00 0f 00 2d 36 06 44 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d 26 06 44 00 fb f4 <c3> 90 66 66 66 66 90 41 54 55 53 e8 e2 01 79 ff 65 8b 2d cb 7f 64
[Mi Okt 16 14:43:15 2019] RSP: 0018:ffff9486406bbeb0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd5
[Mi Okt 16 14:43:15 2019] RAX: ffffffff829c7d70 RBX: 0000000000000003 RCX: 0000000000000000
[Mi Okt 16 14:43:15 2019] RDX: 0000000000000003 RSI: 0000000000000003 RDI: ffff89503bb9c500
[Mi Okt 16 14:43:15 2019] RBP: 0000000000000003 R08: 00000066a172847b R09: 0000000000000000
[Mi Okt 16 14:43:15 2019] R10: ffff89503bb97888 R11: 0000000000000000 R12: ffff89503a84be80
[Mi Okt 16 14:43:15 2019] R13: 0000000000000000 R14: 0000000000000000 R15: ffff89503a84be80
[Mi Okt 16 14:43:15 2019]  ? __cpuidle_text_start+0x8/0x8
[Mi Okt 16 14:43:15 2019]  default_idle+0x1a/0x140
[Mi Okt 16 14:43:15 2019]  do_idle+0x1fb/0x260
[Mi Okt 16 14:43:15 2019]  ? do_idle+0x17b/0x260
[Mi Okt 16 14:43:15 2019]  cpu_startup_entry+0x19/0x20
[Mi Okt 16 14:43:15 2019]  start_secondary+0x17d/0x1d0
[Mi Okt 16 14:43:15 2019]  secondary_startup_64+0xa4/0xb0
[Mi Okt 16 14:43:15 2019] ---[ end trace 601c6ff4362427bd ]---

Regards,
--
Thomas Bartschies
CVK IT Systeme

-----Original Message-----
From: Bartschies, Thomas
Sent: Tuesday, October 15, 2019 12:12
To: 'Eric Dumazet' <eric.duma...@gmail.com>; 'David Ahern' <dsah...@gmail.com>; 'netdev@vger.kernel.org' <netdev@vger.kernel.org>
Subject: RE: RE: big ICMP requests get disrupted on IPSec tunnel activation

Hello Eric,

no problem, I've just done that. Here are the results; I hope they help. I also ran a debug ftrace in parallel, as I had tried before, but it contained no conclusive information.

[Di Okt 15 11:51:06 2019] WARNING: CPU: 1 PID: 0 at net/sched/sch_fq.c:389 fq_enqueue+0x531/0x610 [sch_fq]
[Di Okt 15 11:51:06 2019] Modules linked in: twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 twofish_common camellia_generic camellia_aesni_avx_x86_64 camellia_x86_64 serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic blowfish_generic blowfish_x86_64 blowfish_common cast5_avx_x86_64 cast5_generic cast_common des_generic cmac xcbc rmd160 sha512_ssse3 sha512_generic af_key xt_CHECKSUM ipt_rpfilter xt_statistic xt_connlimit nf_conncount ip_set_hash_ip br_netfilter bridge stp llc xt_HL xt_hl ipt_REJECT nf_reject_ipv4 xt_REDIRECT xt_recent xt_realm xt_CT xt_NETMAP xt_MASQUERADE xt_esp ipt_ECN xt_ecn xt_comment ipt_CLUSTERIP ipt_ah xt_addrtype act_police cls_basic cls_flow cls_fw cls_u32 sch_tbf sch_prio sch_htb sch_hfsc sch_ingress sch_sfq xt_set ip_set xt_NFLOG nf_log_ipv4 nf_log_common xt_LOG nf_conntrack_sane nf_conntrack_netlink nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip nf_nat_pptp nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_conntrack_tftp
[Di Okt 15 11:51:06 2019]  nf_conntrack_sip nf_conntrack_pptp nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp ts_kmp nf_conntrack_amanda xt_TPROXY nf_tproxy_ipv6 nf_tproxy_ipv4 xt_time xt_TCPMSS xt_tcpmss xt_sctp xt_policy xt_pkttype xt_physdev xt_owner xt_NFQUEUE xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_iprange xt_helper xt_hashlimit xt_DSCP xt_dscp xt_dccp xt_connmark xt_CLASSIFY xt_AUDIT iptable_raw iptable_nat nf_nat iptable_mangle iptable_filter vmw_vsock_vmci_transport vsock ip6t_REJECT nf_reject_ipv6 xt_state xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c nfnetlink_log nfnetlink ip6table_filter ip6_tables sb_edac crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_rapl_perf joydev vmw_balloon vmw_vmci i2c_piix4 sch_fq tcp_illinois binfmt_misc ip_tables vmwgfx drm_kms_helper ttm drm crc32c_intel serio_raw mptspi e1000 scsi_transport_spi mptscsih mptbase ata_generic pata_acpi
[Di Okt 15 11:51:06 2019] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.2.18-200.fc30.x86_64 #1
[Di Okt 15 11:51:06 2019] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
[Di Okt 15 11:51:06 2019] RIP: 0010:fq_enqueue+0x531/0x610 [sch_fq]
[Di Okt 15 11:51:06 2019] Code: db 48 8b 3d 49 28 00 00 48 89 ee e8 39 ba 03 db 45 85 ed 75 d4 49 8b 2f e9 f1 fb ff ff 48 83 83 30 02 00 00 01 e9 fd fc ff ff <0f> 0b e9 83 fc ff ff 4c 89 ea 31 c0 e9 be fc ff ff 41 8b 46 08 39
[Di Okt 15 11:51:06 2019] RSP: 0018:ffffa2fe806f0a58 EFLAGS: 00010206
[Di Okt 15 11:51:06 2019] RAX: 0000000df8475800 RBX: ffff973638a32800 RCX: 0000000000000000
[Di Okt 15 11:51:06 2019] RDX: 000003df1924d162 RSI: 0000000000000004 RDI: ffffffffc02845b5
[Di Okt 15 11:51:06 2019] RBP: 15cdc829bb26f866 R08: 0000000000000000 R09: ffff97363afc4e00
[Di Okt 15 11:51:06 2019] R10: 0000000000000078 R11: ffff97363808b088 R12: ffff97363821ce00
[Di Okt 15 11:51:06 2019] R13: ffff973634620cb0 R14: 00000000000003cb R15: ffff973635965c70
[Di Okt 15 11:51:06 2019] FS:  0000000000000000(0000) GS:ffff97363ba80000(0000) knlGS:0000000000000000
[Di Okt 15 11:51:06 2019] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Di Okt 15 11:51:06 2019] CR2: 00005616fa9f90b8 CR3: 0000000135084005 CR4: 00000000000606e0
[Di Okt 15 11:51:06 2019] Call Trace:
[Di Okt 15 11:51:06 2019]  <IRQ>
[Di Okt 15 11:51:06 2019]  ? fq_flow_purge+0x70/0x70 [sch_fq]
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  __dev_queue_xmit+0x45b/0x960
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  ip_finish_output2+0x2c8/0x580
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  ip_do_fragment+0x69d/0x7a0
[Di Okt 15 11:51:06 2019]  ? __ip_flush_pending_frames.isra.0+0x80/0x80
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  ip_output+0x71/0xf0
[Di Okt 15 11:51:06 2019]  ? ip_finish_output2+0x580/0x580
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  ip_forward+0x3c5/0x480
[Di Okt 15 11:51:06 2019]  ? ip_defrag.cold+0x37/0x37
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  ip_rcv+0xbc/0xd0
[Di Okt 15 11:51:06 2019]  ? ip_rcv_finish_core.isra.0+0x390/0x390
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  __netif_receive_skb_one_core+0x80/0x90
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  netif_receive_skb_internal+0x41/0xb0
[Di Okt 15 11:51:06 2019]  ? return_to_handler+0x15/0x36
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  napi_gro_receive+0xf6/0x160
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  e1000_clean_rx_irq+0x2ac/0x570 [e1000]
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  e1000_clean+0x278/0x5f0 [e1000]
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  net_rx_action+0x148/0x3b0
[Di Okt 15 11:51:06 2019]  ? napi_complete_done+0x110/0x110
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  __do_softirq+0xed/0x30e
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  irq_exit+0xf1/0x100
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  do_IRQ+0x81/0xe0
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  common_interrupt+0xf/0xf
[Di Okt 15 11:51:06 2019]  </IRQ>
[Di Okt 15 11:51:06 2019] RIP: 0010:native_safe_halt+0xe/0x10
[Di Okt 15 11:51:06 2019] Code: 90 90 90 90 90 90 90 90 90 90 90 90 e9 07 00 00 00 0f 00 2d 46 06 44 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d 36 06 44 00 fb f4 <c3> 90 e8 eb 98 03 00 41 54 55 53 e8 f2 01 79 ff 65 8b 2d db 7f 64
[Di Okt 15 11:51:06 2019] RSP: 0018:ffffa2fe806abeb0 EFLAGS: 00000282 ORIG_RAX: ffffffffffffffd3
[Di Okt 15 11:51:06 2019] RAX: ffffffff9b9c7d60 RBX: 0000000000000001 RCX: 0000000000000000
[Di Okt 15 11:51:06 2019] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff9b1140d6
[Di Okt 15 11:51:06 2019] RBP: 0000000000000001 R08: 0000000000000000 R09: ffff97363afc4e00
[Di Okt 15 11:51:06 2019] R10: 0000000000000c78 R11: ffff973638092c88 R12: ffff97363a84ddc0
[Di Okt 15 11:51:06 2019] R13: 0000000000000000 R14: 0000000000000000 R15: ffff97363a84ddc0
[Di Okt 15 11:51:06 2019]  ? __cpuidle_text_start+0x8/0x8
[Di Okt 15 11:51:06 2019]  ? do_idle+0x1f6/0x260
[Di Okt 15 11:51:06 2019]  default_idle+0x1a/0x140
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  do_idle+0x1fb/0x260
[Di Okt 15 11:51:06 2019]  ? cpu_startup_entry+0x19/0x20
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  cpu_startup_entry+0x19/0x20
[Di Okt 15 11:51:06 2019]  start_secondary+0x17d/0x1d0
[Di Okt 15 11:51:06 2019]  secondary_startup_64+0xa4/0xb0
[Di Okt 15 11:51:06 2019] ---[ end trace 34f95424acff5ddb ]---

Regards,
--
Thomas Bartschies
CVK IT Systeme

-----Original Message-----
From: Eric Dumazet [mailto:eric.duma...@gmail.com]
Sent: Monday, October 14, 2019 17:32
To: Bartschies, Thomas <thomas.bartsch...@cvk.de>; 'David Ahern' <dsah...@gmail.com>; 'netdev@vger.kernel.org' <netdev@vger.kernel.org>
Subject: Re: RE: big ICMP requests get disrupted on IPSec tunnel activation

On 10/14/19 7:02 AM, Bartschies, Thomas wrote:
> Hello,
>
> it took a while to build a test system for bisecting the issue. Finally I've
> identified the patch that causes my problems.
> BTW, the fq packet network scheduler is in use.
>
> It's
> [PATCH net-next] tcp/fq: move back to CLOCK_MONOTONIC
>
> In the recent TCP/EDT patch series, I switched TCP and sch_fq clocks
> from MONOTONIC to TAI, in order to meet the choice done earlier for
> sch_etf packet scheduler.
>
> But sure enough, this broke some setups were the TAI clock jumps forward
> (by almost 50 year...), as reported by Leonard Crestez.
>
> If we want to converge later, we'll probably need to add an skb field to
> differentiate the clock bases, or a socket option.
>
> In the meantime, an UDP application will need to use CLOCK_MONOTONIC
> base for its SCM_TXTIME timestamps if using fq packet scheduler.
>
> Fixes: 72b0094f9182 ("tcp: switch tcp_clock_ns() to CLOCK_TAI base")
> Fixes: 142537e41923 ("net_sched: sch_fq: switch to CLOCK_TAI")
> Fixes: fd2bca2aa789 ("tcp: switch internal pacing timer to CLOCK_TAI")
> Signed-off-by: Eric Dumazet <edumazet@xxxxxxxxxx>
> Reported-by: Leonard Crestez <leonard.crestez@xxxxxxx>
>
> ----
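
[Note: the commit message above refers to the SO_TXTIME / SCM_TXTIME socket API. As a rough illustration of the application-side requirement it describes, a minimal sketch of a UDP sender requesting CLOCK_MONOTONIC-based transmit times could look like the following -- assuming the standard struct sock_txtime from linux/net_tstamp.h; the destination address, port, payload and the 1 ms offset are placeholders, not taken from this thread.]

#include <arpa/inet.h>
#include <linux/net_tstamp.h>	/* struct sock_txtime */
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

#ifndef SO_TXTIME
#define SO_TXTIME	61		/* per asm-generic/socket.h */
#define SCM_TXTIME	SO_TXTIME
#endif

int main(void)
{
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	/* Tell the kernel which clock the per-packet txtime values use.
	 * With sch_fq this has to be CLOCK_MONOTONIC after the revert. */
	struct sock_txtime cfg = { .clockid = CLOCK_MONOTONIC, .flags = 0 };
	setsockopt(fd, SOL_SOCKET, SO_TXTIME, &cfg, sizeof(cfg));

	struct sockaddr_in dst = { .sin_family = AF_INET, .sin_port = htons(9) };
	inet_pton(AF_INET, "192.0.2.1", &dst.sin_addr);	/* placeholder target */

	/* Desired transmit time: now + 1 ms, on the same clock base as fq. */
	struct timespec ts;
	clock_gettime(CLOCK_MONOTONIC, &ts);
	uint64_t txtime = ts.tv_sec * 1000000000ULL + ts.tv_nsec + 1000000;

	char payload[32] = "txtime test";
	struct iovec iov = { .iov_base = payload, .iov_len = sizeof(payload) };
	char cbuf[CMSG_SPACE(sizeof(txtime))];
	memset(cbuf, 0, sizeof(cbuf));

	struct msghdr msg = {
		.msg_name = &dst, .msg_namelen = sizeof(dst),
		.msg_iov = &iov, .msg_iovlen = 1,
		.msg_control = cbuf, .msg_controllen = sizeof(cbuf),
	};
	struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
	cm->cmsg_level = SOL_SOCKET;
	cm->cmsg_type = SCM_TXTIME;
	cm->cmsg_len = CMSG_LEN(sizeof(txtime));
	memcpy(CMSG_DATA(cm), &txtime, sizeof(txtime));

	sendmsg(fd, &msg, 0);	/* packet is paced by fq at "txtime" */
	close(fd);
	return 0;
}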
>
> After reverting it in a current 5.2.18 kernel, the problem disappears.
> There were some post fixes for other issues caused by this patch.
> These fixed other similar issues, but not mine. I've already tried to
> set the tstamp to zero in xfrm4_output.c, but with no luck so far. I'm pretty
> sure that reverting the clock patch isn't the proper solution for upstream.
> So in what other way can this be fixed?

Thanks a lot Thomas for this report !

I guess you could add a debug check in fq to let us know the call graph.

Something like the following :

diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
index 98dd87ce15108cfe1c011da44ba32f97763776c8..2aa41a39e81b94f3b7092dc51b91829f5929634d 100644
--- a/net/sched/sch_fq.c
+++ b/net/sched/sch_fq.c
@@ -380,9 +380,14 @@ static void flow_queue_add(struct fq_flow *flow, struct sk_buff *skb)
 {
 	struct rb_node **p, *parent;
 	struct sk_buff *head, *aux;
+	s64 delay;
 
 	fq_skb_cb(skb)->time_to_send = skb->tstamp ?: ktime_get_ns();
 
+	/* We should really add a TCA_FQ_MAX_HORIZON at some point :( */
+	delay = fq_skb_cb(skb)->time_to_send - ktime_get_ns();
+	WARN_ON_ONCE(delay > 60 * NSEC_PER_SEC);
+
 	head = flow->head;
 	if (!head ||
 	    fq_skb_cb(skb)->time_to_send >= fq_skb_cb(flow->tail)->time_to_send) {

>
> ---
> [PATCH net] net: clear skb->tstamp in bridge forwarding path
>
> Matteo reported forwarding issues inside the linux bridge, if the enslaved
> interfaces use the fq qdisc.
>
> Similar to commit 8203e2d844d3 ("net: clear skb->tstamp in forwarding
> paths"), we need to clear the tstamp field in the bridge forwarding path.
>
> Fixes: 80b14dee2bea ("net: Add a new socket option for a future transmit time.")
> Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC")
> Reported-and-tested-by: Matteo Croce <mcr...@redhat.com>
> Signed-off-by: Paolo Abeni <pab...@redhat.com>
>
> and
>
> net: clear skb->tstamp in forwarding paths
>
> Sergey reported that forwarding was no longer working if fq packet scheduler
> was used.
>
> This is caused by the recent switch to EDT model, since incoming
> packets might have been timestamped by __net_timestamp()
>
> __net_timestamp() uses ktime_get_real(), while fq expects packets using
> CLOCK_MONOTONIC base.
>
> The fix is to clear skb->tstamp in forwarding paths.
>
> Fixes: 80b14dee ("net: Add a new socket option for a future transmit time.")
> Fixes: fb420d5d ("tcp/fq: move back to CLOCK_MONOTONIC")
> Signed-off-by: Eric Dumazet <eduma...@google.com>
> Reported-by: Sergey Matyukevich <geoma...@gmail.com>
> Tested-by: Sergey Matyukevich <geoma...@gmail.com>
> Signed-off-by: David S. Miller <da...@davemloft.net>
>
> Best regards,
> --
> Thomas Bartschies
> CVK IT Systeme
>
>
> -----Original Message-----
> From: Bartschies, Thomas
> Sent: Tuesday, September 17, 2019 09:28
> To: 'David Ahern' <dsah...@gmail.com>; 'netdev@vger.kernel.org' <netdev@vger.kernel.org>
> Subject: RE: big ICMP requests get disrupted on IPSec tunnel activation
>
> Hello,
>
> thanks for the suggestion. Running pmtu.sh with kernel versions 4.19, 4.20
> and even 5.2.13 made no difference. All tests were successful every time.
>
> My external ping tests are still failing with the newer kernels, though. I ran
> the script after triggering my problem, to make sure all possible side effects
> were present.
>
> Please keep in mind that even when the ICMP requests stall, other connections
> still go through, like e.g. ssh or tracepath. I would expect that all connection
> types would be affected if this is an MTU problem. Am I wrong?
>
> Any suggestions for more tests to isolate the cause?
>
> Best regards,
> --
> Thomas Bartschies
> CVK IT Systeme
>
> -----Original Message-----
> From: David Ahern [mailto:dsah...@gmail.com]
> Sent: Friday, September 13, 2019 19:13
> To: Bartschies, Thomas <thomas.bartsch...@cvk.de>; 'netdev@vger.kernel.org' <netdev@vger.kernel.org>
> Subject: Re: big ICMP requests get disrupted on IPSec tunnel activation
>
> On 9/13/19 9:59 AM, Bartschies, Thomas wrote:
>> Hello everyone,
>>
>> since kernel 4.20 we're observing a strange behaviour when sending big ICMP
>> packets. An example is a packet size of 3000 bytes.
>> The packets should be forwarded by a linux gateway (firewall) having
>> multiple interfaces, also acting as a VPN gateway.
>>
>> Test steps:
>> 1. Disabled all iptables rules
>> 2. Enabled the VPN IPSec policies.
>> 3. Start a ping with packet size (e.g. 3000 bytes) from a client in the DMZ,
>>    passing the machine, targeting another LAN machine
>> 4. Ping works
>> 5. Enable a VPN policy by sending pings from the gateway to a tunnel target.
>>    The system tries to create the tunnel
>> 6. The ping from 3. immediately stalls. No error messages. It just stops.
>> 7. Stop the ping from 3. Start another without the packet size parameter. It stalls also.
>>
>> Result:
>> Connections from the client to other services on the LAN machine
>> still work. Tracepath works. Only ICMP requests do not pass the
>> gateway anymore. tcpdump sees them on the incoming interface, but not on
>> the outgoing LAN interface. ICMP requests to any other target IP address in
>> the LAN still work, until one uses a bigger packet size. Then these
>> alternative connections stall also.
>>
>> Flushing the policy table has no effect. Flushing the conntrack table has no
>> effect. Setting rp_filter to loose (2) has no effect.
>> Flushing the route cache has no effect.
>>
>> Only a reboot of the gateway restores normal behavior.
>>
>> What can be the cause? Is this a networking bug?
>>
>
> Some of these most likely will fail due to other reasons, but can you run
> 'tools/testing/selftests/net/pmtu.sh'[1] on 4.19 and then 4.20 and compare
> results. Hopefully it will shed some light on the problem and can be used to
> bisect to a commit that caused the regression.
>
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/net/pmtu.sh
>
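
[Note: the two commits quoted earlier in this thread, and the change attempted in xfrm4_output.c, all follow the same pattern. A minimal sketch of that pattern is shown below; the helper name is made up for illustration and is not an upstream function.]

#include <linux/skbuff.h>

/*
 * Forwarded skbs may still carry a CLOCK_REALTIME receive stamp set by
 * __net_timestamp(). sch_fq treats a non-zero skb->tstamp as a
 * CLOCK_MONOTONIC transmit time, so a stale stamp schedules the packet
 * decades into the future. The forwarding-path fixes therefore simply
 * zero the field before the skb re-enters a transmit qdisc.
 */
static inline void skb_clear_fwd_tstamp(struct sk_buff *skb)
{
	skb->tstamp = 0;	/* fq_enqueue() then falls back to ktime_get_ns() */
}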