Hi Flavio,

I had to restart Open vSwitch across 16 machines to resolve the
issue for a customer.  I think it will occur again, and when it does
I'll use that command to gather the tc information.
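
A minimal sketch of how the "good" and "bad" tc dumps could be
captured and compared, assuming Python 3 is available on the compute
node.  The helper names snapshot_tc and diff_dumps are hypothetical and
not part of any OVS or iproute2 tooling; the tc command line itself is
the one suggested in this thread.

```python
import difflib
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def snapshot_tc(dev, out_dir=".", cmd=None):
    """Save `tc -s filter show dev <dev> ingress` output to a timestamped file."""
    # cmd can be overridden (assumption: only useful for trying this out
    # on a machine without tc installed).
    cmd = cmd or ["tc", "-s", "filter", "show", "dev", dev, "ingress"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    path = Path(out_dir) / f"tc-{dev}-{stamp}.txt"
    path.write_text(result.stdout)
    return path


def diff_dumps(good_text, bad_text):
    """Return a unified diff between a "good" dump and a "bad" dump."""
    return "\n".join(
        difflib.unified_diff(
            good_text.splitlines(),
            bad_text.splitlines(),
            fromfile="good",
            tofile="bad",
            lineterm="",
        )
    )
```

Calling snapshot_tc("enp148s0f0_1") once after a restart and once when
the problem returns, then passing the two file contents to diff_dumps,
would show which filters changed.  Because -s includes statistics, the
counters will differ between any two dumps, so the interesting lines are
the ones where the filter or action itself changes.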

Until then, I think I have found why the issue is occurring.

Take a look at the output below.  This is a packet capture from the
physical interface on the compute node, so the traffic has already gone
through the OVS OpenFlow pipeline.  We complete the 3-way handshake with
R2 and establish the connection.  The TLS handshake packet (the Client
Hello) then goes missing, and when it does show up it appears that it
hasn't gone through NAT, as it's still using the private IP of the VM.

Take a look at frame 14:

No.     Time               Source                Destination           Protocol Length Info                                                            Delta
     14 09:24:08.064432    172.27.18.244         104.18.2.35           TLSv1    502    Client Hello, Alert (Level: Fatal, Description: Decode Error)   2.362983

Frame 14: 502 bytes on wire (4016 bits), 502 bytes captured (4016 bits)
Ethernet II, Src: 4e:42:14:a1:2a:fb (4e:42:14:a1:2a:fb), Dst: IETF-VRRP-VRID_ff (00:00:5e:00:01:ff)
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 120
Internet Protocol Version 4, Src: 172.27.18.244, Dst: 104.18.2.35
Transmission Control Protocol, Src Port: 57394, Dst Port: 443, Seq: 1, Ack: 1, Len: 444
Transport Layer Security
    TLSv1 Record Layer: Handshake Protocol: Client Hello
        Content Type: Handshake (22)
        Version: TLS 1.0 (0x0301)
        Length: 432
        Handshake Protocol: Client Hello
    TLSv1 Record Layer: Alert (Level: Fatal, Description: Decode Error)
        Content Type: Alert (21)
        Version: TLS 1.0 (0x0301)
        Length: 2
        Alert Message
            Level: Fatal (2)
            Description: Decode Error (50)
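
To spot this kind of frame without reading the whole capture, a check
along the following lines could be used.  This is only a sketch: the
function name find_unnatted is hypothetical, the (frame, source,
destination) tuples are copied by hand from the attached capture, and
it assumes that any packet seen on the physical interface with a
private source address and a public destination has skipped NAT.  Note
that Python's is_private also covers loopback, link-local and other
reserved ranges, not just RFC 1918.

```python
import ipaddress


def find_unnatted(packets):
    """Return frame numbers with a private source and a public destination."""
    flagged = []
    for frame, src, dst in packets:
        src_private = ipaddress.ip_address(src).is_private
        dst_private = ipaddress.ip_address(dst).is_private
        if src_private and not dst_private:
            flagged.append(frame)
    return flagged


# (frame number, source, destination) taken from the capture in this thread.
packets = [
    (4, "204.52.24.116", "104.18.2.35"),
    (5, "104.18.2.35", "204.52.24.116"),
    (6, "204.52.24.116", "104.18.2.35"),
    (9, "204.52.24.116", "104.18.2.35"),
    (14, "172.27.18.244", "104.18.2.35"),
]

print(find_unnatted(packets))  # [14]
```

Frames 4, 6 and 9 leave with the translated address 204.52.24.116,
while frame 14 on the same connection (source port 57394) leaves with
the VM's address 172.27.18.244.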



------------------------------------------------------------------------------------------------------------

On Wed, 17 Apr 2024 at 11:28, Flavio Leitner <f...@sysclose.org> wrote:
>
>
> Hi Gavin,
>
> It would be helpful if you can provide some TC dumps from the
> "good" state to the "bad" state to see how it was and what changes.
> Something like:
>
> # tc -s filter show dev enp148s0f0_1 ingress
>
> I haven't checked the attached files, but one suggestion is to
> check if this is not a csum issue.
>
> Thanks,
> fbl
>
>
> On Tue, 16 Apr 2024 13:17:10 -0700
> Gavin McKee via discuss <ovs-discuss@openvswitch.org> wrote:
>
> > Adding information relating to the Open vSwitch kernel module.
> > @Ilya Maximets @Numan Siddique  Can either of you help out here?
> >
> >
> > modinfo openvswitch
> > filename:       /lib/modules/5.14.0-362.8.1.el9_3.x86_64/kernel/net/openvswitch/openvswitch.ko.xz
> > alias:          net-pf-16-proto-16-family-ovs_ct_limit
> > alias:          net-pf-16-proto-16-family-ovs_meter
> > alias:          net-pf-16-proto-16-family-ovs_packet
> > alias:          net-pf-16-proto-16-family-ovs_flow
> > alias:          net-pf-16-proto-16-family-ovs_vport
> > alias:          net-pf-16-proto-16-family-ovs_datapath
> > license:        GPL
> > description:    Open vSwitch switching datapath
> > rhelversion:    9.3
> > srcversion:     8A2159D727C8BADC82261B8
> > depends:        nf_conntrack,nf_conncount,libcrc32c,nf_nat
> > retpoline:      Y
> > intree:         Y
> > name:           openvswitch
> > vermagic:       5.14.0-362.8.1.el9_3.x86_64 SMP preempt mod_unload modversions
> > sig_id:         PKCS#7
> > signer:         Rocky kernel signing key
> > sig_key:        17:CA:DE:1F:EC:D1:59:2D:9F:52:34:C6:7C:09:06:81:3D:74:7C:F7
> > sig_hashalgo:   sha256
> > signature:
> > 67:31:56:70:86:DB:57:69:8D:4A:9B:A7:ED:17:F3:67:65:98:97:08:
> > 1F:FB:4D:F8:A8:2D:7C:A7:7D:3A:57:85:CA:67:9D:82:72:EB:54:14:
> > F2:BB:40:78:AD:85:56:2D:EF:D5:00:95:38:A4:86:9F:5F:29:1A:81:
> > 32:94:B4:87:41:94:A0:3E:71:A5:97:44:2E:42:DD:F7:42:6B:69:94:
> > E3:AB:6E:E5:4F:C9:60:57:70:07:5F:CA:C7:83:7A:2F:C7:81:62:FF:
> > 53:AF:AC:2B:06:D8:08:D3:1D:A7:F0:43:10:98:DE:B1:62:AE:89:A5:
> > FE:EF:74:09:0F:2D:0F:D9:73:A5:59:75:D0:87:1E:EA:3A:40:86:1E:
> > 76:E5:E7:3B:59:2E:3A:7E:65:F3:92:A1:B4:84:48:3F:43:A0:D7:1C:
> > 21:29:E0:B6:D1:10:36:15:88:43:6A:11:8F:55:EE:1B:F9:53:3B:86:
> > EF:81:71:17:81:08:EC:53:30:D6:69:8E:13:11:D5:DF:15:75:88:50:
> > 69:19:51:3B:41:6B:6F:E0:7A:30:33:32:E6:60:18:02:A6:0C:63:9B:
> > C5:D7:2F:6A:D0:BA:45:03:19:0E:21:E8:18:FB:E8:D1:C1:33:05:36:
> > 1F:9B:0F:29:3F:05:51:7A:30:86:88:B7:C7:44:2E:2B:50:F9:EF:4F:
> > D4:70:EA:1B:33:E2:F0:E3:E2:88:00:E5:BF:06:E2:D4:B7:81:EE:6E:
> > 89:02:18:65:8B:1C:84:42:2F:89:14:63:1D:51:70:37:42:C5:68:DD:
> > 4D:12:7B:07:33:2B:C6:BC:8F:7F:23:D7:58:DF:47:AC:DE:08:67:FE:
> > CB:E8:E6:4D:95:2F:6B:F5:07:4D:32:92:80:0A:7C:D1:B6:81:EE:AB:
> > 26:C3:C6:22:77:00:5E:64:DE:96:0E:9F:A4:A0:F0:45:9F:19:73:EB:
> > CC:60:AE:E9:63:E2:6D:2E:BA:65:9B:BD:04:CC:13:C2:55:88:05:03:
> > 1B:30:18:8B
> >
> > On Tue, 16 Apr 2024 at 11:12, Gavin McKee <gavmcke...@googlemail.com>
> > wrote:
> > >
> > > Hi,
> > >
> > > I need some help with strange OVS behaviours.
> > >
> > > ovs-vsctl (Open vSwitch) 3.2.2
> > > ovn-controller 23.09.1
> > > Open vSwitch Library 3.2.2
> > >
> > > TL;DR: We need to restart Open vSwitch in order for TLS traffic to
> > > work between a VM and Cloudflare R2.  After restarting Open vSwitch
> > > the TLS connection works fine.
> > > (see attached pcap tls-error.txt)
> > >
> > > See the attached OpenFlow traces - they show a flow trace from Open
> > > vSwitch.
> > >
> > > Also there is a retis trace (the retis tool was discussed at the Open
> > > vSwitch conference 2023).
> > >
> > > Note the drop (TC_INGRESS) in this file:
> > >   + 1702601116185568 [swapper/140] 0 [tp] skb:kfree_skb #60c81b6b91e2cff284fb3a3d65800 (skb 18386033671255367680) n 3 drop (TC_INGRESS)
> > >     if 21 (enp148s0f0_1) rxif 21 172.27.18.244.57394 > 104.18.2.35.443 ttl 63 tos 0x0 id 26162 off 0 [DF] len 477 proto TCP (6) flags [P.] seq 792060930:792061367 ack 951229219 win 11
> > >
> > > Again, once I restart Open vSwitch the problem goes away for a time
> > > and comes back sometime later (I'm not sure what the time frame is,
> > > but it's a recurring issue).
> > _______________________________________________
> > discuss mailing list
> > disc...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>
>
>
> --
> fbl
No.     Time               Source                Destination           Protocol Length Info                                                            Delta
      4 09:23:40.660635    204.52.24.116         104.18.2.35           TCP      70     57394 → 443 [SYN] Seq=0 Win=42340 Len=0 MSS=1460 SACK_PERM WS=4096 10.014701

Frame 4: 70 bytes on wire (560 bits), 70 bytes captured (560 bits)
Ethernet II, Src: 4e:42:14:a1:2a:fb (4e:42:14:a1:2a:fb), Dst: IETF-VRRP-VRID_ff (00:00:5e:00:01:ff)
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 120
Internet Protocol Version 4, Src: 204.52.24.116, Dst: 104.18.2.35
Transmission Control Protocol, Src Port: 57394, Dst Port: 443, Seq: 0, Len: 0

No.     Time               Source                Destination           Protocol Length Info                                                            Delta
      5 09:23:40.666095    104.18.2.35           204.52.24.116         TCP      66     443 → 57394 [SYN, ACK] Seq=0 Ack=1 Win=64240 Len=0 MSS=1400 SACK_PERM WS=8192 0.005460

Frame 5: 66 bytes on wire (528 bits), 66 bytes captured (528 bits)
Ethernet II, Src: Mellanox_4a:c0:fd (9c:05:91:4a:c0:fd), Dst: 4e:42:14:a1:2a:fb (4e:42:14:a1:2a:fb)
Internet Protocol Version 4, Src: 104.18.2.35, Dst: 204.52.24.116
Transmission Control Protocol, Src Port: 443, Dst Port: 57394, Seq: 0, Ack: 1, Len: 0

No.     Time               Source                Destination           Protocol Length Info                                                            Delta
      6 09:23:40.666194    204.52.24.116         104.18.2.35           TCP      58     57394 → 443 [ACK] Seq=1 Ack=1 Win=45056 Len=0                 0.000099

Frame 6: 58 bytes on wire (464 bits), 58 bytes captured (464 bits)
Ethernet II, Src: 4e:42:14:a1:2a:fb (4e:42:14:a1:2a:fb), Dst: IETF-VRRP-VRID_ff (00:00:5e:00:01:ff)
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 120
Internet Protocol Version 4, Src: 204.52.24.116, Dst: 104.18.2.35
Transmission Control Protocol, Src Port: 57394, Dst Port: 443, Seq: 1, Ack: 1, Len: 0

No.     Time               Source                Destination           Protocol Length Info                                                            Delta
      8 09:23:55.673177    104.18.2.35           204.52.24.116         TCP      60     443 → 57394 [FIN, ACK] Seq=1 Ack=1 Win=65536 Len=0            12.696825

Frame 8: 60 bytes on wire (480 bits), 60 bytes captured (480 bits)
Ethernet II, Src: Mellanox_4a:c0:fd (9c:05:91:4a:c0:fd), Dst: 4e:42:14:a1:2a:fb (4e:42:14:a1:2a:fb)
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 120
Internet Protocol Version 4, Src: 104.18.2.35, Dst: 204.52.24.116
Transmission Control Protocol, Src Port: 443, Dst Port: 57394, Seq: 1, Ack: 1, Len: 0

No.     Time               Source                Destination           Protocol Length Info                                                            Delta
      9 09:23:55.676533    204.52.24.116         104.18.2.35           TLSv1    65     [TCP Previous segment not captured] , Alert (Level: Fatal, Description: Decode Error) 0.003356

Frame 9: 65 bytes on wire (520 bits), 65 bytes captured (520 bits)
Ethernet II, Src: 4e:42:14:a1:2a:fb (4e:42:14:a1:2a:fb), Dst: IETF-VRRP-VRID_ff (00:00:5e:00:01:ff)
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 120
Internet Protocol Version 4, Src: 204.52.24.116, Dst: 104.18.2.35
Transmission Control Protocol, Src Port: 57394, Dst Port: 443, Seq: 438, Ack: 2, Len: 7
Transport Layer Security
    TLSv1 Record Layer: Alert (Level: Fatal, Description: Decode Error)
        Content Type: Alert (21)
        Version: TLS 1.0 (0x0301)
        Length: 2
        Alert Message
            Level: Fatal (2)
            Description: Decode Error (50)

No.     Time               Source                Destination           Protocol Length Info                                                            Delta
     10 09:23:55.681947    104.18.2.35           204.52.24.116         TCP      56     443 → 57394 [RST] Seq=2 Win=0 Len=0                           0.005414

Frame 10: 56 bytes on wire (448 bits), 56 bytes captured (448 bits)
Ethernet II, Src: Mellanox_4a:c0:fd (9c:05:91:4a:c0:fd), Dst: 4e:42:14:a1:2a:fb (4e:42:14:a1:2a:fb)
Internet Protocol Version 4, Src: 104.18.2.35, Dst: 204.52.24.116
Transmission Control Protocol, Src Port: 443, Dst Port: 57394, Seq: 2, Len: 0

No.     Time               Source                Destination           Protocol Length Info                                                            Delta
     14 09:24:08.064432    172.27.18.244         104.18.2.35           TLSv1    502    Client Hello, Alert (Level: Fatal, Description: Decode Error)   2.362983

Frame 14: 502 bytes on wire (4016 bits), 502 bytes captured (4016 bits)
Ethernet II, Src: 4e:42:14:a1:2a:fb (4e:42:14:a1:2a:fb), Dst: IETF-VRRP-VRID_ff (00:00:5e:00:01:ff)
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 120
Internet Protocol Version 4, Src: 172.27.18.244, Dst: 104.18.2.35
Transmission Control Protocol, Src Port: 57394, Dst Port: 443, Seq: 1, Ack: 1, Len: 444
Transport Layer Security
    TLSv1 Record Layer: Handshake Protocol: Client Hello
        Content Type: Handshake (22)
        Version: TLS 1.0 (0x0301)
        Length: 432
        Handshake Protocol: Client Hello
    TLSv1 Record Layer: Alert (Level: Fatal, Description: Decode Error)
        Content Type: Alert (21)
        Version: TLS 1.0 (0x0301)
        Length: 2
        Alert Message
            Level: Fatal (2)
            Description: Decode Error (50)