Hi Gavin

On 4/18/24 02:38, Gavin McKee via discuss wrote:
This is an example.

Again the TCP 3 handshake completes , but the next packet fails to NAT
and goes out onto the physical network using the private address .  An
example of this is in the packet trace I provided.


Given you were using retis in your initial troubleshooting, you can use it with the additional "ct" collector to see if the kernel datapath is retrieving the right conntrack entry and what's its state. If that shows some unexpected conntrack entry change, it can be confirmed by monitoring "conntrack -E".

Additionally, a dp flow dump (ovs-appctl dpctl/dump-flows -m) when the problem is happening might be also useful.

--
Adrián


ovs-appctl ofproto/trace br-int
in_port=7753,dl_src=4e:42:14:a1:2a:fb,dl_dst=1a:16:b1:58:e1:cd,tcp,nw_src=172.27.18.244,nw_dst=104.18.3.35,nw_ttl=32,tcp_src=52776,tcp_dst=443,tcp_flags=2
Flow: 
tcp,in_port=7753,vlan_tci=0x0000,dl_src=4e:42:14:a1:2a:fb,dl_dst=1a:16:b1:58:e1:cd,nw_src=172.27.18.244,nw_dst=104.18.3.35,nw_tos=0,nw_ecn=0,nw_ttl=32,nw_frag=no,tp_src=52776,tp_dst=443,tcp_flags=syn

bridge("br-int")
----------------
  0. in_port=7753, priority 100, cookie 0xc33f39c4
     set_field:0x16d->reg13
     set_field:0x155->reg11
     set_field:0x1cf->reg12
     set_field:0x12->metadata
     set_field:0xca->reg14
     resubmit(,8)
  8. metadata=0x12, priority 50, cookie 0x1645d3f2
     set_field:0/0x1000->reg10
     resubmit(,73)
     73. 
ip,reg14=0xca,metadata=0x12,dl_src=4e:42:14:a1:2a:fb,nw_src=172.27.18.244,
priority 90, cookie 0xc33f39c4
             set_field:0/0x1000->reg10
     move:NXM_NX_REG10[12]->NXM_NX_XXREG0[111]
      -> NXM_NX_XXREG0[111] is now 0
     resubmit(,9)
  9. metadata=0x12, priority 0, cookie 0xcc4fd106
     resubmit(,10)
10. metadata=0x12, priority 0, cookie 0x9e10ad0e
     resubmit(,11)
11. metadata=0x12, priority 0, cookie 0x557f3249
     resubmit(,12)
12. ip,metadata=0x12, priority 100, cookie 0x14131a67
     set_field:0x1000000000000000000000000/0x1000000000000000000000000->xxreg0
     resubmit(,13)
13. metadata=0x12, priority 0, cookie 0x85f9ed4f
     resubmit(,14)
14. ip,reg0=0x1/0x1,metadata=0x12, priority 100, cookie 0x279651c
     ct(table=15,zone=NXM_NX_REG13[0..15])
     drop
      -> A clone of the packet is forked to recirculate. The forked
pipeline will be resumed at table 15.
      -> Sets the packet to an untracked state, and clears all the
conntrack fields.

Final flow: 
tcp,reg0=0x1,reg11=0x155,reg12=0x1cf,reg13=0x16d,reg14=0xca,metadata=0x12,in_port=7753,vlan_tci=0x0000,dl_src=4e:42:14:a1:2a:fb,dl_dst=1a:16:b1:58:e1:cd,nw_src=172.27.18.244,nw_dst=104.18.3.35,nw_tos=0,nw_ecn=0,nw_ttl=32,nw_frag=no,tp_src=52776,tp_dst=443,tcp_flags=syn
Megaflow: 
recirc_id=0,eth,tcp,in_port=7753,dl_src=4e:42:14:a1:2a:fb,dl_dst=1a:16:b1:58:e1:cd,nw_src=172.27.18.244,nw_frag=no
Datapath actions: ct(zone=365),recirc(0x13b5a)

===============================================================================
recirc(0x13b5a) - resume conntrack with default ct_state=trk|new (use
--ct-next to customize)
===============================================================================

Flow: 
recirc_id=0x13b5a,ct_state=new|trk,ct_zone=365,eth,tcp,reg0=0x1,reg11=0x155,reg12=0x1cf,reg13=0x16d,reg14=0xca,metadata=0x12,in_port=7753,vlan_tci=0x0000,dl_src=4e:42:14:a1:2a:fb,dl_dst=1a:16:b1:58:e1:cd,nw_src=172.27.18.244,nw_dst=104.18.3.35,nw_tos=0,nw_ecn=0,nw_ttl=32,nw_frag=no,tp_src=52776,tp_dst=443,tcp_flags=syn

bridge("br-int")
----------------
     thaw
         Resuming from table 15
15. ct_state=+new-est+trk,metadata=0x12, priority 7, cookie 0xa9e0ee6f
     set_field:0x80000000000000000000000000/0x80000000000000000000000000->xxreg0
     
set_field:0x200000000000000000000000000/0x200000000000000000000000000->xxreg0
     resubmit(,16)
16. conj_id=2865573479,tcp,reg0=0x80/0x80,reg14=0xca,metadata=0x12,
priority 3000, cookie 0xabdec111
     set_field:0x1000000000000/0x1000000000000->xreg4
     set_field:0x2000000000000000000000000/0x2000000000000000000000000->xxreg0
     resubmit(,17)
17. reg8=0x10000/0x10000,metadata=0x12, priority 1000, cookie 0x8171c04a
     set_field:0/0x1000000000000->xreg4
     set_field:0/0x2000000000000->xreg4
     set_field:0/0x4000000000000->xreg4
     resubmit(,18)
18. metadata=0x12, priority 0, cookie 0x62454929
     resubmit(,19)
19. metadata=0x12, priority 0, cookie 0x3bb47080
     resubmit(,20)
20. metadata=0x12, priority 0, cookie 0x73face9d
     resubmit(,21)
21. metadata=0x12, priority 0, cookie 0x8e46634d
     resubmit(,22)
22. metadata=0x12, priority 0, cookie 0xbddac461
     resubmit(,23)
23. metadata=0x12, priority 0, cookie 0x9a32c0de
     resubmit(,24)
24. metadata=0x12, priority 0, cookie 0x79b1c074
     resubmit(,25)
25. metadata=0x12, priority 0, cookie 0xc02374d3
     resubmit(,26)
26. metadata=0x12, priority 0, cookie 0x218dc750
     resubmit(,27)
27. metadata=0x12, priority 0, cookie 0xf6943631
     set_field:0/0x1000000000000->xreg4
     set_field:0/0x2000000000000->xreg4
     set_field:0/0x4000000000000->xreg4
     resubmit(,28)
28. ip,reg0=0x2/0x2002,metadata=0x12, priority 100, cookie 0x7fdca4fb
     ct(commit,zone=NXM_NX_REG13[0..15],nat(src),exec(set_field:0/0x1->ct_mark))
     nat(src)
     set_field:0/0x1->ct_mark
      -> Sets the packet to an untracked state, and clears all the
conntrack fields.
     resubmit(,29)
29. metadata=0x12, priority 0, cookie 0x34e5fac7
     resubmit(,30)
30. metadata=0x12, priority 0, cookie 0xac3f53bd
     resubmit(,31)
31. metadata=0x12, priority 0, cookie 0x54000bd5
     resubmit(,32)
32. metadata=0x12, priority 0, cookie 0x29ab49f3
     resubmit(,33)
33. metadata=0x12, priority 0, cookie 0xae6aefcb
     resubmit(,34)
34. metadata=0x12, priority 0, cookie 0x632dc93b
     resubmit(,35)
35. metadata=0x12,dl_dst=1a:16:b1:58:e1:cd, priority 50, cookie 0x3fa33935
     set_field:0x1->reg15
     resubmit(,37)
37. priority 0
     resubmit(,39)
39. priority 0
     resubmit(,40)
40. reg15=0x1,metadata=0x12, priority 100, cookie 0x95c785db
     set_field:0x155->reg11
     set_field:0x1cf->reg12
     resubmit(,41)
41. priority 0
     set_field:0->reg0
     set_field:0->reg1
     set_field:0->reg2
     set_field:0->reg3
     set_field:0->reg4
     set_field:0->reg5
     set_field:0->reg6
     set_field:0->reg7
     set_field:0->reg8
     set_field:0->reg9
     resubmit(,42)
42. ip,reg15=0x1,metadata=0x12, priority 110, cookie 0xc9a79824
     resubmit(,43)
43. ip,reg15=0x1,metadata=0x12, priority 110, cookie 0xac7d5d78
     resubmit(,44)
44. metadata=0x12, priority 0, cookie 0xa1ddb4f6
     resubmit(,45)
45. ct_state=-trk,metadata=0x12, priority 5, cookie 0xb2622a65
     
set_field:0x100000000000000000000000000/0x100000000000000000000000000->xxreg0
     
set_field:0x200000000000000000000000000/0x200000000000000000000000000->xxreg0
     resubmit(,46)
46. metadata=0x12, priority 0, cookie 0x267ae0e3
     resubmit(,47)
47. metadata=0x12, priority 0, cookie 0x914392c3
     set_field:0/0x1000000000000->xreg4
     set_field:0/0x2000000000000->xreg4
     set_field:0/0x4000000000000->xreg4
     resubmit(,48)
48. metadata=0x12, priority 0, cookie 0xc4f6a97f
     resubmit(,49)
49. metadata=0x12, priority 0, cookie 0x8ebb0c71
     resubmit(,50)
50. metadata=0x12, priority 0, cookie 0x61e385f4
     resubmit(,51)
51. metadata=0x12, priority 0, cookie 0x4c722ce1
     set_field:0/0x1000->reg10
     resubmit(,75)
     75. No match.
             drop
     move:NXM_NX_REG10[12]->NXM_NX_XXREG0[111]
      -> NXM_NX_XXREG0[111] is now 0
     resubmit(,52)
52. metadata=0x12, priority 0, cookie 0x3c01b89d
     resubmit(,64)
64. priority 0
     resubmit(,65)
65. reg15=0x1,metadata=0x12, priority 100, cookie 0x95c785db
     
clone(ct_clear,set_field:0->reg11,set_field:0->reg12,set_field:0->reg13,set_field:0x4cc->reg11,set_field:0x383->reg12,set_field:0x13->metadata,set_field:0x1->reg14,set_field:0->reg10,set_field:0->reg15,set_field:0->reg0,set_field:0->reg1,set_field:0->reg2,set_field:0->reg3,set_field:0->reg4,set_field:0->reg5,set_field:0->reg6,set_field:0->reg7,set_field:0->reg8,set_field:0->reg9,resubmit(,8))
     ct_clear
     set_field:0->reg11
     set_field:0->reg12
     set_field:0->reg13
     set_field:0x4cc->reg11
     set_field:0x383->reg12
     set_field:0x13->metadata
     set_field:0x1->reg14
     set_field:0->reg10
     set_field:0->reg15
     set_field:0->reg0
     set_field:0->reg1
     set_field:0->reg2
     set_field:0->reg3
     set_field:0->reg4
     set_field:0->reg5
     set_field:0->reg6
     set_field:0->reg7
     set_field:0->reg8
     set_field:0->reg9
     resubmit(,8)
  8. reg14=0x1,metadata=0x13,dl_dst=1a:16:b1:58:e1:cd, priority 50,
cookie 0x8522832c
     
set_field:0x1a16b158e1cd0000000000000000/0xffffffffffff0000000000000000->xxreg0
     resubmit(,9)
  9. metadata=0x13, priority 0, cookie 0xef4dd8a9
     set_field:0x4/0x4->xreg4
     resubmit(,10)
10. reg9=0x4/0x4,metadata=0x13, priority 100, cookie 0x37718dd7
     resubmit(,79)
     79. 
ip,reg14=0x1,metadata=0x13,dl_src=4e:42:14:a1:2a:fb,nw_src=172.27.18.244,
priority 100, cookie 0x2e544767
             drop
     resubmit(,11)
11. metadata=0x13, priority 0, cookie 0x3c2eb1b
     resubmit(,12)
12. metadata=0x13, priority 0, cookie 0x44f62446
     resubmit(,13)
13. metadata=0x13, priority 0, cookie 0x3842bec6
     resubmit(,14)
14. metadata=0x13, priority 0, cookie 0xfd7b2ab9
     resubmit(,15)
15. metadata=0x13, priority 0, cookie 0xefbd7e27
     resubmit(,16)
16. metadata=0x13, priority 0, cookie 0xf439d853
     resubmit(,17)
17. metadata=0x13, priority 0, cookie 0x123f01f0
     resubmit(,18)
18. metadata=0x13, priority 0, cookie 0x142cd59b
     resubmit(,19)
19. metadata=0x13, priority 0, cookie 0x297e0190
     resubmit(,20)
20. metadata=0x13, priority 0, cookie 0x61e9e3c7
     set_field:0/0xffffffff->xxreg1
     resubmit(,21)
21. ip,reg7=0,metadata=0x13, priority 1, cookie 0x9f114f3d
     dec_ttl()
     set_field:0/0xffff00000000->xreg4
     
set_field:0x64646401000000000000000000000000/0xffffffff000000000000000000000000->xxreg0
     set_field:0x646464030000000000000000/0xffffffff0000000000000000->xxreg0
     set_field:2a:d5:d1:e8:89:cc->eth_src
     set_field:0x2->reg15
     set_field:0x1/0x1->reg10
     resubmit(,22)
22. reg8=0/0xffff,metadata=0x13, priority 150, cookie 0x6ece899a
     resubmit(,23)
23. metadata=0x13, priority 0, cookie 0x72437536
     set_field:0/0xffff00000000->xreg4
     resubmit(,24)
24. reg8=0/0xffff,metadata=0x13, priority 150, cookie 0xce06f1
     resubmit(,25)
25. ip,metadata=0x13, priority 1, cookie 0xf690342c
     push:NXM_NX_REG0[]
     push:NXM_NX_XXREG0[96..127]
     pop:NXM_NX_REG0[]
      -> NXM_NX_REG0[] is now 0x64646401
     set_field:00:00:00:00:00:00->eth_dst
     resubmit(,66)
     66. reg0=0x64646401,reg15=0x2,metadata=0x13, priority 100, cookie 
0x97b4353d
             set_field:00:00:5e:00:01:ff->eth_dst
             set_field:0x40/0x40->reg10
     pop:NXM_NX_REG0[]
      -> NXM_NX_REG0[] is now 0x64646401
     resubmit(,26)
26. metadata=0x13, priority 0, cookie 0x6c2ad4cc
     resubmit(,27)
27. metadata=0x13, priority 0, cookie 0xadc9fb4f
     resubmit(,28)
28. ip,reg15=0x2,metadata=0x13,nw_src=172.27.18.244, priority 100,
cookie 0xfa923492
     set_field:4e:42:14:a1:2a:fb->eth_src
     set_field:0xcc3418740000000000000000/0xffffffff0000000000000000->xxreg0
     resubmit(,29)
29. metadata=0x13, priority 0, cookie 0x571bbf77
     resubmit(,37)
37. priority 0
     resubmit(,39)
39. priority 0
     resubmit(,40)
40. reg15=0x2,metadata=0x13, priority 100, cookie 0x407086d0
     set_field:0x4cc->reg11
     set_field:0x383->reg12
     resubmit(,41)
41. priority 0
     set_field:0->reg0
     set_field:0->reg1
     set_field:0->reg2
     set_field:0->reg3
     set_field:0->reg4
     set_field:0->reg5
     set_field:0->reg6
     set_field:0->reg7
     set_field:0->reg8
     set_field:0->reg9
     resubmit(,42)
42. metadata=0x13, priority 0, cookie 0x4c3b7fcc
     set_field:0/0x10->xreg4
     resubmit(,43)
43. ip,reg15=0x2,metadata=0x13,nw_src=172.27.18.244, priority 100,
cookie 0x540f90b2
     set_field:4e:42:14:a1:2a:fb->eth_src
     ct(table=44,zone=NXM_NX_REG11[0..15],nat)
     nat
      -> A clone of the packet is forked to recirculate. The forked
pipeline will be resumed at table 44.
      -> Sets the packet to an untracked state, and clears all the
conntrack fields.

Final flow: 
recirc_id=0x13b5a,eth,tcp,reg0=0x300,reg11=0x155,reg12=0x1cf,reg13=0x16d,reg14=0xca,reg15=0x1,metadata=0x12,in_port=7753,vlan_tci=0x0000,dl_src=4e:42:14:a1:2a:fb,dl_dst=1a:16:b1:58:e1:cd,nw_src=172.27.18.244,nw_dst=104.18.3.35,nw_tos=0,nw_ecn=0,nw_ttl=32,nw_frag=no,tp_src=52776,tp_dst=443,tcp_flags=syn
Megaflow: 
recirc_id=0x13b5a,ct_state=+new-est-rel-rpl-inv+trk,ct_mark=0/0x1,eth,tcp,in_port=7753,dl_src=4e:42:14:a1:2a:fb,dl_dst=1a:16:b1:58:e1:cd,nw_src=172.27.18.244,nw_dst=104.0.0.0/5,nw_ttl=32,nw_frag=no,tp_src=0x8000/0x8000,tp_dst=0x180/0xffc0
Datapath actions:
ct(commit,zone=365,mark=0/0x1,nat(src)),set(eth(dst=00:00:5e:00:01:ff)),set(ipv4(ttl=31)),ct(zone=1228,nat),recirc(0x13b5c)

===============================================================================
recirc(0x13b5c) - resume conntrack with default ct_state=trk|new (use
--ct-next to customize)
Replacing src/dst IP/ports to simulate NAT:
  Initial flow:
  Modified flow:
===============================================================================

Flow: 
recirc_id=0x13b5c,ct_state=new|trk,ct_zone=1228,eth,tcp,reg10=0x41,reg11=0x4cc,reg12=0x383,reg14=0x1,reg15=0x2,metadata=0x13,in_port=7753,vlan_tci=0x0000,dl_src=4e:42:14:a1:2a:fb,dl_dst=00:00:5e:00:01:ff,nw_src=172.27.18.244,nw_dst=104.18.3.35,nw_tos=0,nw_ecn=0,nw_ttl=31,nw_frag=no,tp_src=52776,tp_dst=443,tcp_flags=syn

bridge("br-int")
----------------
     thaw
         Resuming from table 44
44. metadata=0x13, priority 0, cookie 0x48ef0c42
     resubmit(,45)
45. ct_state=-rpl+trk,ip,reg15=0x2,metadata=0x13,nw_src=172.27.18.244,
priority 161, cookie 0x5414262a
     set_field:4e:42:14:a1:2a:fb->eth_src
     ct(commit,table=46,zone=NXM_NX_REG12[0..15],nat(src=204.52.24.116))
     nat(src=204.52.24.116)
      -> A clone of the packet is forked to recirculate. The forked
pipeline will be resumed at table 46.
      -> Sets the packet to an untracked state, and clears all the
conntrack fields.

Final flow: 
recirc_id=0x13b5c,eth,tcp,reg10=0x41,reg11=0x4cc,reg12=0x383,reg14=0x1,reg15=0x2,metadata=0x13,in_port=7753,vlan_tci=0x0000,dl_src=4e:42:14:a1:2a:fb,dl_dst=00:00:5e:00:01:ff,nw_src=172.27.18.244,nw_dst=104.18.3.35,nw_tos=0,nw_ecn=0,nw_ttl=31,nw_frag=no,tp_src=52776,tp_dst=443,tcp_flags=syn
Megaflow: 
recirc_id=0x13b5c,ct_state=-rpl+trk,eth,ip,in_port=7753,dl_src=4e:42:14:a1:2a:fb,nw_src=172.27.18.244,nw_frag=no
Datapath actions: ct(commit,zone=899,nat(src=204.52.24.116)),recirc(0x25f9b)

===============================================================================
recirc(0x25f9b) - resume conntrack with default ct_state=trk|new (use
--ct-next to customize)
Replacing src/dst IP/ports to simulate NAT:
  Initial flow: nw_src=172.27.18.244,tp_src=52776,nw_dst=104.18.3.35,tp_dst=443
  Modified flow: nw_src=204.52.24.116,tp_src=52776,nw_dst=104.18.3.35,tp_dst=443
===============================================================================

Flow: 
recirc_id=0x25f9b,ct_state=new|trk,ct_zone=899,eth,tcp,reg10=0x41,reg11=0x4cc,reg12=0x383,reg14=0x1,reg15=0x2,metadata=0x13,in_port=7753,vlan_tci=0x0000,dl_src=4e:42:14:a1:2a:fb,dl_dst=00:00:5e:00:01:ff,nw_src=204.52.24.116,nw_dst=104.18.3.35,nw_tos=0,nw_ecn=0,nw_ttl=31,nw_frag=no,tp_src=52776,tp_dst=443,tcp_flags=syn

bridge("br-int")
----------------
     thaw
         Resuming from table 46
46. metadata=0x13, priority 0, cookie 0x3800cb88
     resubmit(,47)
47. metadata=0x13, priority 0, cookie 0xc2575be2
     resubmit(,48)
48. reg15=0x2,metadata=0x13, priority 100, cookie 0xaf19a22a
     resubmit(,64)
64. reg10=0x1/0x1,reg15=0x2,metadata=0x13, priority 100, cookie 0x407086d0
     push:NXM_OF_IN_PORT[]
     set_field:ANY->in_port
     resubmit(,65)
     65. reg15=0x2,metadata=0x13, priority 100, cookie 0x407086d0
             
clone(ct_clear,set_field:0->reg11,set_field:0->reg12,set_field:0->reg13,set_field:0x2c2->reg11,set_field:0x327->reg12,set_field:0x1->metadata,set_field:0xa->reg14,set_field:0->reg10,set_field:0->reg15,set_field:0->reg0,set_field:0->reg1,set_field:0->reg2,set_field:0->reg3,set_field:0->reg4,set_field:0->reg5,set_field:0->reg6,set_field:0->reg7,set_field:0->reg8,set_field:0->reg9,resubmit(,8))
             ct_clear
             set_field:0->reg11
             set_field:0->reg12
             set_field:0->reg13
             set_field:0x2c2->reg11
             set_field:0x327->reg12
             set_field:0x1->metadata
             set_field:0xa->reg14
             set_field:0->reg10
             set_field:0->reg15
             set_field:0->reg0
             set_field:0->reg1
             set_field:0->reg2
             set_field:0->reg3
             set_field:0->reg4
             set_field:0->reg5
             set_field:0->reg6
             set_field:0->reg7
             set_field:0->reg8
             set_field:0->reg9
             resubmit(,8)
          8. metadata=0x1, priority 50, cookie 0x1645d3f2
             set_field:0/0x1000->reg10
             resubmit(,73)
             73. No match.
                     drop
             move:NXM_NX_REG10[12]->NXM_NX_XXREG0[111]
              -> NXM_NX_XXREG0[111] is now 0
             resubmit(,9)
          9. metadata=0x1, priority 0, cookie 0xcc4fd106
             resubmit(,10)
         10. metadata=0x1, priority 0, cookie 0x9e10ad0e
             resubmit(,11)
         11. metadata=0x1, priority 0, cookie 0x557f3249
             resubmit(,12)
         12. metadata=0x1, priority 0, cookie 0x915c56b1
             resubmit(,13)
         13. ip,reg14=0xa,metadata=0x1, priority 110, cookie 0xccdcd3e1
             resubmit(,14)
         14. metadata=0x1, priority 0, cookie 0x134ce32f
             resubmit(,15)
         15. metadata=0x1, priority 65535, cookie 0x49627e5f
             resubmit(,16)
         16. metadata=0x1, priority 65535, cookie 0xc947843d
             resubmit(,17)
         17. metadata=0x1, priority 0, cookie 0xd2fb4d2
             resubmit(,18)
         18. metadata=0x1, priority 0, cookie 0x62454929
             resubmit(,19)
         19. metadata=0x1, priority 0, cookie 0x3bb47080
             resubmit(,20)
         20. metadata=0x1, priority 0, cookie 0x73face9d
             resubmit(,21)
         21. metadata=0x1, priority 0, cookie 0x8e46634d
             resubmit(,22)
         22. metadata=0x1, priority 0, cookie 0xbddac461
             resubmit(,23)
         23. metadata=0x1, priority 0, cookie 0x9a32c0de
             resubmit(,24)
         24. metadata=0x1, priority 0, cookie 0x79b1c074
             resubmit(,25)
         25. metadata=0x1, priority 0, cookie 0xc02374d3
             resubmit(,26)
         26. metadata=0x1, priority 0, cookie 0x218dc750
             resubmit(,27)
         27. metadata=0x1, priority 0, cookie 0x8db4eebc
             resubmit(,28)
         28. metadata=0x1, priority 0, cookie 0x6deecbbe
             resubmit(,29)
         29. metadata=0x1, priority 0, cookie 0x34e5fac7
             resubmit(,30)
         30. metadata=0x1, priority 0, cookie 0xac3f53bd
             resubmit(,31)
         31. metadata=0x1, priority 0, cookie 0x54000bd5
             resubmit(,32)
         32. metadata=0x1, priority 0, cookie 0x29ab49f3
             resubmit(,33)
         33. metadata=0x1, priority 0, cookie 0xae6aefcb
             resubmit(,34)
         34. metadata=0x1, priority 0, cookie 0x632dc93b
             resubmit(,35)
         35. metadata=0x1, priority 0, cookie 0x9961115c
             set_field:0->reg15
             resubmit(,71)
             71. No match.
                     drop
             resubmit(,36)
         36. reg15=0,metadata=0x1, priority 50, cookie 0x62255993
             set_field:0x8001->reg15
             resubmit(,37)
         37. priority 0
             resubmit(,39)
         39. priority 0
             resubmit(,40)
         40. reg15=0x8001,metadata=0x1, priority 100, cookie 0x11699429
             set_field:0x14->reg13
             set_field:0x1->reg15
             resubmit(,41)
             41. priority 0
                     set_field:0->reg0
                     set_field:0->reg1
                     set_field:0->reg2
                     set_field:0->reg3
                     set_field:0->reg4
                     set_field:0->reg5
                     set_field:0->reg6
                     set_field:0->reg7
                     set_field:0->reg8
                     set_field:0->reg9
                     resubmit(,42)
                 42. metadata=0x1, priority 0, cookie 0x9ad1480d
                     resubmit(,43)
                 43. ip,reg15=0x1,metadata=0x1, priority 110, cookie 0xca4977d3
                     ct_clear
                     resubmit(,44)
                 44. metadata=0x1, priority 0, cookie 0xa1ddb4f6
                     resubmit(,45)
                 45. metadata=0x1, priority 65535, cookie 0x91b4350e
                     resubmit(,46)
                 46. metadata=0x1, priority 65535, cookie 0xfc9c351d
                     resubmit(,47)
                 47. metadata=0x1, priority 0, cookie 0xb984696e
                     resubmit(,48)
                 48. metadata=0x1, priority 0, cookie 0xc4f6a97f
                     resubmit(,49)
                 49. metadata=0x1, priority 0, cookie 0x8ebb0c71
                     resubmit(,50)
                 50. metadata=0x1, priority 0, cookie 0x61e385f4
                     resubmit(,51)
                 51. metadata=0x1, priority 0, cookie 0x4c722ce1
                     set_field:0/0x1000->reg10
                     resubmit(,75)
                     75. No match.
                             drop
                     move:NXM_NX_REG10[12]->NXM_NX_XXREG0[111]
                      -> NXM_NX_XXREG0[111] is now 0
                     resubmit(,52)
                 52. metadata=0x1, priority 0, cookie 0x3c01b89d
                     resubmit(,64)
                 64. priority 0
                     resubmit(,65)
                 65. reg15=0x1,metadata=0x1, priority 100, cookie 0x4581da82
                     push_vlan:0x8100
                     set_field:4216->vlan_vid
                     output:7754

                     bridge("br-provider")
                     ---------------------
                          0. priority 0
                             NORMAL
                              -> forwarding to learned port
                     pop_vlan
             set_field:0x8001->reg15
     pop:NXM_OF_IN_PORT[]
      -> NXM_OF_IN_PORT[] is now 7753

Final flow: unchanged
Megaflow: 
recirc_id=0x25f9b,ct_state=+new-est-rel-rpl-inv+trk,ct_mark=0/0x1,eth,ip,in_port=7753,dl_src=4e:42:14:a1:2a:fb,dl_dst=00:00:5e:00:01:ff,nw_dst=0.0.0.0/1,nw_frag=no
Datapath actions: ct_clear,push_vlan(vid=120,pcp=0),5

On Wed, 17 Apr 2024 at 17:32, Gavin McKee <gavmcke...@googlemail.com> wrote:

That information is all in the email

The openflow trace is showing that the pipeline is fine . This is why I’m 
worried about a deeper issue with the kernal / openvswitch kernal module / 
connection tracking



On Wed, Apr 17, 2024 at 16:33 Flavio Leitner <f...@sysclose.org> wrote:

On Wed, 17 Apr 2024 12:26:27 -0700
Gavin McKee <gavmcke...@googlemail.com> wrote:

Hi Flavio,

I had to restart the Open vSwitch across 16 machines to resolve the
issue for a customer .  I think it will occur again and when it does
I'll use that command to gather the tc information.

Until then I think I have found why the issue is occurring .

Take a look at the output below (this is a packet capture from the
physical interface on the compute node , so traffic that has gone
through the OVS Openflow pipeline) - we make a 3 way handshake with R2
, and establish the connection.  A packet goes missing - TLS
handshake, it then appears that it hasn't gone through NAT as it's
using the Private IP of the VM  .


If that is the case, you might be able to see if the data path
is matching correctly and the actions are using NAT with
ovs-appctl ofproto/trace.

fbl



Take a look at frame 14

No.     Time               Source                Destination
Protocol Length Info
          Delta
      14 09:24:08.064432    172.27.18.244         104.18.2.35
TLSv1    502    Client Hello, Alert (Level: Fatal, Description: Decode
Error)   2.362983

Frame 14: 502 bytes on wire (4016 bits), 502 bytes captured (4016
bits) Ethernet II, Src: 4e:42:14:a1:2a:fb (4e:42:14:a1:2a:fb), Dst:
IETF-VRRP-VRID_ff (00:00:5e:00:01:ff)
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 120
Internet Protocol Version 4, Src: 172.27.18.244, Dst: 104.18.2.35
Transmission Control Protocol, Src Port: 57394, Dst Port: 443, Seq: 1,
Ack: 1, Len: 444
Transport Layer Security
     TLSv1 Record Layer: Handshake Protocol: Client Hello
         Content Type: Handshake (22)
         Version: TLS 1.0 (0x0301)
         Length: 432
         Handshake Protocol: Client Hello
     TLSv1 Record Layer: Alert (Level: Fatal, Description: Decode
Error) Content Type: Alert (21)
         Version: TLS 1.0 (0x0301)
         Length: 2
         Alert Message
             Level: Fatal (2)
             Description: Decode Error (50)



------------------------------------------------------------------------------------------------------------

On Wed, 17 Apr 2024 at 11:28, Flavio Leitner <f...@sysclose.org> wrote:


Hi Gavin,

It would be helpful if you can provide some TC dumps from the
"good" state to the "bad" state to see how it was and what changes.
Something like:

# tc -s filter show dev enp148s0f0_1 ingress

I haven't checked the attached files, but one suggestion is to
check if this is not a csum issue.

Thanks,
fbl


On Tue, 16 Apr 2024 13:17:10 -0700
Gavin McKee via discuss <ovs-discuss@openvswitch.org> wrote:

Adding information relating to the Open VSwitch kernal module
@Ilya Maximets @Numan Siddique  Can either of you help out here?


modinfo openvswitch
filename:
/lib/modules/5.14.0-362.8.1.el9_3.x86_64/kernel/net/openvswitch/openvswitch.ko.xz
alias:          net-pf-16-proto-16-family-ovs_ct_limit
alias:          net-pf-16-proto-16-family-ovs_meter
alias:          net-pf-16-proto-16-family-ovs_packet
alias:          net-pf-16-proto-16-family-ovs_flow
alias:          net-pf-16-proto-16-family-ovs_vport
alias:          net-pf-16-proto-16-family-ovs_datapath
license:        GPL
description:    Open vSwitch switching datapath
rhelversion:    9.3
srcversion:     8A2159D727C8BADC82261B8
depends:        nf_conntrack,nf_conncount,libcrc32c,nf_nat
retpoline:      Y
intree:         Y
name:           openvswitch
vermagic:       5.14.0-362.8.1.el9_3.x86_64 SMP preempt mod_unload
modversions sig_id:         PKCS#7
signer:         Rocky kernel signing key
sig_key:
17:CA:DE:1F:EC:D1:59:2D:9F:52:34:C6:7C:09:06:81:3D:74:7C:F7
sig_hashalgo:   sha256 signature:
67:31:56:70:86:DB:57:69:8D:4A:9B:A7:ED:17:F3:67:65:98:97:08:
1F:FB:4D:F8:A8:2D:7C:A7:7D:3A:57:85:CA:67:9D:82:72:EB:54:14:
F2:BB:40:78:AD:85:56:2D:EF:D5:00:95:38:A4:86:9F:5F:29:1A:81:
32:94:B4:87:41:94:A0:3E:71:A5:97:44:2E:42:DD:F7:42:6B:69:94:
E3:AB:6E:E5:4F:C9:60:57:70:07:5F:CA:C7:83:7A:2F:C7:81:62:FF:
53:AF:AC:2B:06:D8:08:D3:1D:A7:F0:43:10:98:DE:B1:62:AE:89:A5:
FE:EF:74:09:0F:2D:0F:D9:73:A5:59:75:D0:87:1E:EA:3A:40:86:1E:
76:E5:E7:3B:59:2E:3A:7E:65:F3:92:A1:B4:84:48:3F:43:A0:D7:1C:
21:29:E0:B6:D1:10:36:15:88:43:6A:11:8F:55:EE:1B:F9:53:3B:86:
EF:81:71:17:81:08:EC:53:30:D6:69:8E:13:11:D5:DF:15:75:88:50:
69:19:51:3B:41:6B:6F:E0:7A:30:33:32:E6:60:18:02:A6:0C:63:9B:
C5:D7:2F:6A:D0:BA:45:03:19:0E:21:E8:18:FB:E8:D1:C1:33:05:36:
1F:9B:0F:29:3F:05:51:7A:30:86:88:B7:C7:44:2E:2B:50:F9:EF:4F:
D4:70:EA:1B:33:E2:F0:E3:E2:88:00:E5:BF:06:E2:D4:B7:81:EE:6E:
89:02:18:65:8B:1C:84:42:2F:89:14:63:1D:51:70:37:42:C5:68:DD:
4D:12:7B:07:33:2B:C6:BC:8F:7F:23:D7:58:DF:47:AC:DE:08:67:FE:
CB:E8:E6:4D:95:2F:6B:F5:07:4D:32:92:80:0A:7C:D1:B6:81:EE:AB:
26:C3:C6:22:77:00:5E:64:DE:96:0E:9F:A4:A0:F0:45:9F:19:73:EB:
CC:60:AE:E9:63:E2:6D:2E:BA:65:9B:BD:04:CC:13:C2:55:88:05:03:
1B:30:18:8B

On Tue, 16 Apr 2024 at 11:12, Gavin McKee
<gavmcke...@googlemail.com> wrote:

Hi,

I need some help with strange OVS behaviours.

ovs-vsctl (Open vSwitch) 3.2.2
ovn-controller 23.09.1
Open vSwitch Library 3.2.2

TLDR: We need to restart Open VSwitch in order for TLS traffic
to work between a VM and Cloudflare R2.  After restarting Open
VSwitch the TLS connection works fine.
(see attached pcap tls-error.txt)

See the attached openflow traces - they show a flow trace from
Open Vswitch.

Also there is a retis trace (retis tool discussed at Open
VSwitch conference 2023).

Note the drop (TC_INGRESS) in this file
   + 1702601116185568 [swapper/140] 0 [tp] skb:kfree_skb
#60c81b6b91e2cff284fb3a3d65800 (skb 18386033671255367680) n 3
drop (TC_INGRESS)
     if 21 (enp148s0f0_1) rxif 21 172.27.18.244.57394 >
104.18.2.35.443 ttl 63 tos 0x0 id 26162 off 0 [DF] len 477 proto
TCP (6) flags [P.] seq 792060930:792061367 ack 951229219 win 11

Again , once I restart Open vSwitch the problem goes away for a
time and comes back sometime later (not sure what that time
frame is but its a recurring issue.)
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss



--
fbl



--
fbl
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss

Reply via email to