That information is all in the email. The OpenFlow trace shows that the pipeline is fine. This is why I'm worried about a deeper issue with the kernel, the openvswitch kernel module, or connection tracking.

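One way to spot packets like frame 14 in a capture from the physical interface is to flag anything that still carries a private source address while heading to a public destination. The following is only a minimal sketch: it assumes the capture has already been exported to (source, destination) address pairs, and the function name `missed_snat` is hypothetical, not part of OVS or any capture tool.

```python
import ipaddress


def missed_snat(packets):
    """Return the (src, dst) pairs whose source is private but whose
    destination is public, i.e. packets that look like they skipped SNAT.

    `packets` is an iterable of (src, dst) address strings; this input
    format is an assumption made for the example.
    """
    suspects = []
    for src, dst in packets:
        src_ip = ipaddress.ip_address(src)
        dst_ip = ipaddress.ip_address(dst)
        if src_ip.is_private and not dst_ip.is_private:
            suspects.append((src, dst))
    return suspects


# Frame 14 from the capture quoted below: VM private IP towards Cloudflare.
print(missed_snat([("172.27.18.244", "104.18.2.35")]))
```

A hit here only says the packet left the node without translation; it does not say whether the datapath flow, conntrack, or the offload path is at fault.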
On Wed, Apr 17, 2024 at 16:33 Flavio Leitner <f...@sysclose.org> wrote:

> On Wed, 17 Apr 2024 12:26:27 -0700
> Gavin McKee <gavmcke...@googlemail.com> wrote:
>
> > Hi Flavio,
> >
> > I had to restart Open vSwitch across 16 machines to resolve the
> > issue for a customer. I think it will occur again and when it does
> > I'll use that command to gather the tc information.
> >
> > Until then I think I have found why the issue is occurring.
> >
> > Take a look at the output below (this is a packet capture from the
> > physical interface on the compute node, so traffic that has gone
> > through the OVS OpenFlow pipeline) - we make a 3 way handshake with R2,
> > and establish the connection. A packet goes missing - TLS
> > handshake, it then appears that it hasn't gone through NAT as it's
> > using the private IP of the VM.
>
>
> If that is the case, you might be able to see if the data path
> is matching correctly and the actions are using NAT with
> ovs-appctl ofproto/trace.
>
> fbl
>
>
> >
> > Take a look at frame 14
> >
> > No.  Time             Source         Destination  Protocol  Length  Info                                                           Delta
> > 14   09:24:08.064432  172.27.18.244  104.18.2.35  TLSv1     502     Client Hello, Alert (Level: Fatal, Description: Decode Error)  2.362983
> >
> > Frame 14: 502 bytes on wire (4016 bits), 502 bytes captured (4016 bits)
> > Ethernet II, Src: 4e:42:14:a1:2a:fb (4e:42:14:a1:2a:fb), Dst: IETF-VRRP-VRID_ff (00:00:5e:00:01:ff)
> > 802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 120
> > Internet Protocol Version 4, Src: 172.27.18.244, Dst: 104.18.2.35
> > Transmission Control Protocol, Src Port: 57394, Dst Port: 443, Seq: 1, Ack: 1, Len: 444
> > Transport Layer Security
> >     TLSv1 Record Layer: Handshake Protocol: Client Hello
> >         Content Type: Handshake (22)
> >         Version: TLS 1.0 (0x0301)
> >         Length: 432
> >         Handshake Protocol: Client Hello
> >     TLSv1 Record Layer: Alert (Level: Fatal, Description: Decode Error)
> >         Content Type: Alert (21)
> >         Version: TLS 1.0 (0x0301)
> >         Length: 2
> >         Alert Message
> >             Level: Fatal (2)
> >             Description: Decode Error (50)
> >
> > ------------------------------------------------------------------------
> >
> > On Wed, 17 Apr 2024 at 11:28, Flavio Leitner <f...@sysclose.org> wrote:
> > >
> > > Hi Gavin,
> > >
> > > It would be helpful if you can provide some TC dumps from the
> > > "good" state to the "bad" state to see how it was and what changes.
> > > Something like:
> > >
> > > # tc -s filter show dev enp148s0f0_1 ingress
> > >
> > > I haven't checked the attached files, but one suggestion is to
> > > check if this is not a csum issue.
> > >
> > > Thanks,
> > > fbl
> > >
> > >
> > > On Tue, 16 Apr 2024 13:17:10 -0700
> > > Gavin McKee via discuss <ovs-discuss@openvswitch.org> wrote:
> > >
> > > > Adding information relating to the Open vSwitch kernel module
> > > > @Ilya Maximets @Numan Siddique Can either of you help out here?
> > > >
> > > > modinfo openvswitch
> > > > filename:       /lib/modules/5.14.0-362.8.1.el9_3.x86_64/kernel/net/openvswitch/openvswitch.ko.xz
> > > > alias:          net-pf-16-proto-16-family-ovs_ct_limit
> > > > alias:          net-pf-16-proto-16-family-ovs_meter
> > > > alias:          net-pf-16-proto-16-family-ovs_packet
> > > > alias:          net-pf-16-proto-16-family-ovs_flow
> > > > alias:          net-pf-16-proto-16-family-ovs_vport
> > > > alias:          net-pf-16-proto-16-family-ovs_datapath
> > > > license:        GPL
> > > > description:    Open vSwitch switching datapath
> > > > rhelversion:    9.3
> > > > srcversion:     8A2159D727C8BADC82261B8
> > > > depends:        nf_conntrack,nf_conncount,libcrc32c,nf_nat
> > > > retpoline:      Y
> > > > intree:         Y
> > > > name:           openvswitch
> > > > vermagic:       5.14.0-362.8.1.el9_3.x86_64 SMP preempt mod_unload modversions
> > > > sig_id:         PKCS#7
> > > > signer:         Rocky kernel signing key
> > > > sig_key:        17:CA:DE:1F:EC:D1:59:2D:9F:52:34:C6:7C:09:06:81:3D:74:7C:F7
> > > > sig_hashalgo:   sha256
> > > > signature:      67:31:56:70:86:DB:57:69:8D:4A:9B:A7:ED:17:F3:67:65:98:97:08:
> > > >                 1F:FB:4D:F8:A8:2D:7C:A7:7D:3A:57:85:CA:67:9D:82:72:EB:54:14:
> > > >                 F2:BB:40:78:AD:85:56:2D:EF:D5:00:95:38:A4:86:9F:5F:29:1A:81:
> > > >                 32:94:B4:87:41:94:A0:3E:71:A5:97:44:2E:42:DD:F7:42:6B:69:94:
> > > >                 E3:AB:6E:E5:4F:C9:60:57:70:07:5F:CA:C7:83:7A:2F:C7:81:62:FF:
> > > >                 53:AF:AC:2B:06:D8:08:D3:1D:A7:F0:43:10:98:DE:B1:62:AE:89:A5:
> > > >                 FE:EF:74:09:0F:2D:0F:D9:73:A5:59:75:D0:87:1E:EA:3A:40:86:1E:
> > > >                 76:E5:E7:3B:59:2E:3A:7E:65:F3:92:A1:B4:84:48:3F:43:A0:D7:1C:
> > > >                 21:29:E0:B6:D1:10:36:15:88:43:6A:11:8F:55:EE:1B:F9:53:3B:86:
> > > >                 EF:81:71:17:81:08:EC:53:30:D6:69:8E:13:11:D5:DF:15:75:88:50:
> > > >                 69:19:51:3B:41:6B:6F:E0:7A:30:33:32:E6:60:18:02:A6:0C:63:9B:
> > > >                 C5:D7:2F:6A:D0:BA:45:03:19:0E:21:E8:18:FB:E8:D1:C1:33:05:36:
> > > >                 1F:9B:0F:29:3F:05:51:7A:30:86:88:B7:C7:44:2E:2B:50:F9:EF:4F:
> > > >                 D4:70:EA:1B:33:E2:F0:E3:E2:88:00:E5:BF:06:E2:D4:B7:81:EE:6E:
> > > >                 89:02:18:65:8B:1C:84:42:2F:89:14:63:1D:51:70:37:42:C5:68:DD:
> > > >                 4D:12:7B:07:33:2B:C6:BC:8F:7F:23:D7:58:DF:47:AC:DE:08:67:FE:
> > > >                 CB:E8:E6:4D:95:2F:6B:F5:07:4D:32:92:80:0A:7C:D1:B6:81:EE:AB:
> > > >                 26:C3:C6:22:77:00:5E:64:DE:96:0E:9F:A4:A0:F0:45:9F:19:73:EB:
> > > >                 CC:60:AE:E9:63:E2:6D:2E:BA:65:9B:BD:04:CC:13:C2:55:88:05:03:
> > > >                 1B:30:18:8B
> > > >
> > > > On Tue, 16 Apr 2024 at 11:12, Gavin McKee
> > > > <gavmcke...@googlemail.com> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > I need some help with strange OVS behaviours.
> > > > >
> > > > > ovs-vsctl (Open vSwitch) 3.2.2
> > > > > ovn-controller 23.09.1
> > > > > Open vSwitch Library 3.2.2
> > > > >
> > > > > TLDR: We need to restart Open vSwitch in order for TLS traffic
> > > > > to work between a VM and Cloudflare R2. After restarting Open
> > > > > vSwitch the TLS connection works fine.
> > > > > (see attached pcap tls-error.txt)
> > > > >
> > > > > See the attached OpenFlow traces - they show a flow trace from
> > > > > Open vSwitch.
> > > > >
> > > > > Also there is a retis trace (retis tool discussed at Open
> > > > > vSwitch conference 2023).
> > > > >
> > > > > Note the drop (TC_INGRESS) in this file
> > > > > + 1702601116185568 [swapper/140] 0 [tp] skb:kfree_skb #60c81b6b91e2cff284fb3a3d65800 (skb 18386033671255367680) n 3 drop (TC_INGRESS)
> > > > >   if 21 (enp148s0f0_1) rxif 21 172.27.18.244.57394 > 104.18.2.35.443 ttl 63 tos 0x0 id 26162 off 0 [DF] len 477 proto TCP (6) flags [P.] seq 792060930:792061367 ack 951229219 win 11
> > > > >
> > > > > Again, once I restart Open vSwitch the problem goes away for a
> > > > > time and comes back sometime later (not sure what that time
> > > > > frame is but it's a recurring issue.)
> > > > _______________________________________________
> > > > discuss mailing list
> > > > disc...@openvswitch.org
> > > > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> > >
> > >
> > > --
> > > fbl
>
> --
> fbl
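
The "good" versus "bad" tc dump comparison suggested in the thread can be done with a plain text diff once both outputs of `tc -s filter show dev enp148s0f0_1 ingress` have been saved. A minimal sketch, assuming the two dumps are already available as strings; the helper name `diff_snapshots` and the labels are hypothetical, and nothing here parses the tc output itself:

```python
import difflib


def diff_snapshots(good, bad, good_label="good", bad_label="bad"):
    """Return unified-diff lines between two saved tc dumps.

    `good` and `bad` are the raw text of the two dumps. An empty list
    means the dumps are identical.
    """
    return list(
        difflib.unified_diff(
            good.splitlines(),
            bad.splitlines(),
            fromfile=good_label,
            tofile=bad_label,
            lineterm="",
        )
    )
```

Because `tc -s` includes packet and byte counters, those lines will differ between any two dumps; the lines worth reading are filters or actions that appear in only one of the two states.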