On Mon, Feb 17, 2025 at 08:33:10AM +0100, Ales Musil via discuss wrote:
> On Fri, Feb 14, 2025 at 3:57 PM Piotr Misiak via discuss <
> ovs-discuss@openvswitch.org> wrote:
>
> > Hi,
>
> Hi Piotr,
> thank you for contacting us.
>
> > We are running several OpenStack/OVN regions of different sizes.
> > All of them have external networks connected to the Internet.
> > We receive a lot of packets for unused (not yet provisioned)
> > destination IP addresses; I guess some bots are scanning the
> > Internet. This creates a lot of ARP requests that cannot be
> > answered, because those IP addresses are not configured anywhere
> > yet.
> >
> > A few days ago we upgraded one of our regions from OVN 22.09 to
> > OVN 24.03, and we suddenly started having critical issues with DNS
> > resolution on VMs running in OpenStack.
> > Essentially none of the DNS requests were successful; some responses
> > came back after 5 minutes, sometimes even after 30 minutes. Yes,
> > minutes, not seconds.
>
> Slightly related: there was recently an improvement to DNS handling
> where the cache is no longer processed by pinctrl only [0], and later
> another change to avoid mutex contention as much as possible [1]. I
> believe both of those would help in your case to some extent.
>
> > After some debugging we identified the problematic OpenFlow flows
> > which send ARP request packets to the ovn-controllers.
> > Those flows are created because we have around 400 ports in the
> > external network, so the packet-flooding flow has to be split.
> > The flows are installed at the beginning of OF table 39 with
> > priority 110 and include 170 resubmits:
>
> Those flows are related to multicast groups, in this case "_MC_flood".
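[Editor's note: the size of the flood group Ales mentions can be checked directly in the southbound DB. This is a sketch, assuming `ovn-sbctl` access on a node that can reach the SB database; nothing here besides the group name "_MC_flood" comes from the thread.]

```shell
# List every _MC_flood multicast group and show which datapath it
# belongs to and which ports it floods to. A group with ~400 ports
# is what produces the split flood flows described below.
ovn-sbctl --columns=datapath,ports find Multicast_Group name=_MC_flood
```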
> > cookie=0x28ef9c32, duration=829.596s, table=39, n_packets=117482,
> > n_bytes=4947460, idle_age=0, hard_age=58,
> > priority=110,reg6=0x9001,reg15=0x8000,metadata=0xba
> > actions=load:0->NXM_NX_REG6[],load:0x5a3->NXM_NX_REG15[],resubmit(,41),load:0x21af->NXM_NX_REG15[],resubmit(,41),load:0x8f->NXM_NX_REG15[],resubmit(,41),load:0x1374->NXM_NX_REG15[],resubmit(,41),load:0x5f->NXM_NX_REG15[],resubmit(,41),load:0x10b->NXM_NX_REG15[],resubmit(,41),load:0x106->NXM_NX_REG15[],resubmit(,41),load:0x13d9->NXM_NX_REG15[],resubmit(,41),load:0x4d->NXM_NX_REG15[],resubmit(,41),load:0x2202->NXM_NX_REG15[],resubmit(,41),load:0xb4->NXM_NX_REG15[],resubmit(,41),load:0x25ed->NXM_NX_REG15[],resubmit(,41),load:0x1b59->NXM_NX_REG15[],resubmit(,41),load:0x26b2->NXM_NX_REG15[],resubmit(,41),load:0x6a->NXM_NX_REG15[],resubmit(,41)
> > <<< CUT >>>
> > load:0x169a->NXM_NX_REG15[],resubmit(,41),controller(userdata=00.00.00.1b.00.00.00.00.00.00.90.01.00.00.80.00.27)
> >
> > There is also a second flow with 170 resubmits and a controller()
> > action at the end:
> > controller(userdata=00.00.00.1b.00.00.00.00.00.00.90.02.00.00.80.00.27)
> >
> > and a third flow with a smaller number of resubmits and no
> > controller() action. In total we have around 400 resubmits.
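[Editor's note: a quick way to count how many of these flood flows end in a controller() continuation is to grep the table-39 dump. The snippet below is an offline sketch: the two sample lines stand in for real `ovs-ofctl -O OpenFlow15 dump-flows br-int table=39` output ("br-int" being the usual, but here assumed, integration bridge name).]

```shell
# Count flood flows that hand the packet to ovn-controller for
# continuation. Sample lines are abbreviated stand-ins for real output.
sample='priority=110,reg6=0x9001,reg15=0x8000,metadata=0xba actions=resubmit(,41),controller(userdata=00.00.00.1b)
priority=110,reg6=0x9002,reg15=0x8000,metadata=0xba actions=resubmit(,41)'
printf '%s\n' "$sample" | grep -c 'controller(userdata='
```

Against a live chassis, replace the `printf` with the `ovs-ofctl` dump; each match is one extra round trip through pinctrl per flooded packet.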
> > This was introduced in version 24.03 by this commit:
> >
> > https://github.com/ovn-org/ovn/commit/325c7b203d8bfd12bc1285ad11390c1a55cd6717
> >
> > What we see in the ovn-controller logs:
> >
> > 2025-02-12T20:35:41.490Z|10791|pinctrl(ovn_pinctrl0)|DBG|pinctrl
> > received packet-in | opcode=unrecognized(27)| OF_Table_ID=39|
> > OF_Cookie_ID=0x28ef9c32| in-port=60| src-mac=4e:15:bc:ac:36:45,
> > dst-mac=ff:ff:ff:ff:ff:ff| src-ip=A.A.A.A, dst-ip=B.B.B.B
> > 2025-02-12T20:35:41.500Z|10792|pinctrl(ovn_pinctrl0)|DBG|pinctrl
> > received packet-in | opcode=unrecognized(27)| OF_Table_ID=39|
> > OF_Cookie_ID=0x28ef9c32| in-port=65533| src-mac=4e:15:bc:ac:36:45,
> > dst-mac=ff:ff:ff:ff:ff:ff| src-ip=A.A.A.A, dst-ip=B.B.B.B
> >
> > As you can see, the same packet is looped through ovn-controller
> > twice. That is because we have 400 ports, which are covered by
> > three OpenFlow flows.
> >
> > The frustrating part is that those packets are dropped at the end
> > of the OpenFlow table chain in the datapath anyway. So they kill
> > our ovn-controllers' performance only to be dropped in the end.
> > I'm including a small part of a packet trace here:
> >
> > 39. reg15=0x8000,metadata=0xba, priority 100, cookie 0x28ef9c32
> >     set_field:0->reg6
> >     set_field:0xe8->reg15
> >     resubmit(,41)
> > 41. priority 0
> >     set_field:0->reg0
> >     set_field:0->reg1
> >     set_field:0->reg2
> >     set_field:0->reg3
> >     set_field:0->reg4
> >     set_field:0->reg5
> >     set_field:0->reg6
> >     set_field:0->reg7
> >     set_field:0->reg8
> >     set_field:0->reg9
> >     resubmit(,42)
> > 42. metadata=0xba, priority 0, cookie 0x3372823b
> >     resubmit(,43)
> > 43. metadata=0xba,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00,
> >     priority 110, cookie 0xaabcf4fa
> >     resubmit(,44)
> > 44. metadata=0xba, priority 0, cookie 0x9b7d541f
> >     resubmit(,45)
> > 45. metadata=0xba, priority 65535, cookie 0xedb6d3de
> >     resubmit(,46)
> > 46. metadata=0xba, priority 65535, cookie 0x1dbceae
> >     resubmit(,47)
> > 47. metadata=0xba, priority 0, cookie 0xc1c2a264
> >     resubmit(,48)
> > 48. metadata=0xba, priority 0, cookie 0x640d65ba
> >     resubmit(,49)
> > 49. metadata=0xba, priority 0, cookie 0x78f2abc0
> >     resubmit(,50)
> > 50. metadata=0xba, priority 0, cookie 0x7b63c11c
> >     resubmit(,51)
> > 51. metadata=0xba,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00,
> >     priority 100, cookie 0xb055fd1c
> >     set_field:0/0x8000000000000000000000000000->xxreg0
> >     resubmit(,52)
> > 52. metadata=0xba, priority 0, cookie 0x4dd5d603
> >     resubmit(,64)
> > 64. priority 0
> >     resubmit(,65)
> > 65. reg15=0xe8,metadata=0xba, priority 100, cookie 0xfab6eb
> >     clone(ct_clear,set_field:0->reg11,set_field:0->reg12,set_field:0/0xffff->reg13,set_field:0x25b->reg11,set_field:0x30a->reg12,set_field:0x252->metadata,set_field:0x1->reg14,set_field:0->reg10,set_field:0->reg15,set_field:0->reg0,set_field:0->reg1,set_field:0->reg2,set_field:0->reg3,set_field:0->reg4,set_field:0->reg5,set_field:0->reg6,set_field:0->reg7,set_field:0->reg8,set_field:0->reg9,resubmit(,8))
> >     ct_clear
> >     set_field:0->reg11
> >     set_field:0->reg12
> >     set_field:0/0xffff->reg13
> >     set_field:0x25b->reg11
> >     set_field:0x30a->reg12
> >     set_field:0x252->metadata
> >     set_field:0x1->reg14
> >     set_field:0->reg10
> >     set_field:0->reg15
> >     set_field:0->reg0
> >     set_field:0->reg1
> >     set_field:0->reg2
> >     set_field:0->reg3
> >     set_field:0->reg4
> >     set_field:0->reg5
> >     set_field:0->reg6
> >     set_field:0->reg7
> >     set_field:0->reg8
> >     set_field:0->reg9
> >     resubmit(,8)
> > 8.  reg14=0x1,metadata=0x252,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00,
> >     priority 50, cookie 0x33587607
> >     set_field:0xfa163e9f2f460000000000000000/0xffffffffffff0000000000000000->xxreg0
> >     resubmit(,9)
> > 9.  metadata=0x252, priority 0, cookie 0x671d3d97
> >     set_field:0x4/0x4->xreg4
> >     resubmit(,10)
> > 10. reg9=0x4/0x4,metadata=0x252, priority 100, cookie 0xd21e0659
> >     resubmit(,79)
> > 79. reg0=0x2, priority 0
> >     drop
> >     resubmit(,11)
> > 11. arp,metadata=0x252, priority 85, cookie 0xb5758416
> >     drop
> >
> > What can we do to improve the handling of those ARP packets so that
> > they are not sent to the ovn-controllers?
>
> I'm not sure there is a way to avoid sending them to ovn-controller
> when the multicast group is large.
>
> > Maybe they could be dropped somewhere earlier in the table chain?
> > They are requesting a MAC address which OVN doesn't know. Why does
> > it try to flood them to all router ports in the external network?
>
> At this point in the pipeline OVN doesn't know that this IP/MAC is
> unknown. And because the packet is a multicast one, OVN basically does
> what a normal network would do: flood it to all ports on the switch.
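[Editor's note: a trace like the one above can be reproduced on a chassis with `ofproto/trace`. This is a sketch only: the bridge name "br-int", the in_port, and the ARP addresses are placeholders; only the MACs are taken from the log lines above.]

```shell
# Replay a broadcast ARP request through the OpenFlow pipeline to see
# the full resubmit chain and any controller() continuations.
# All concrete values besides the MACs from the log are placeholders.
ovs-appctl ofproto/trace br-int \
    'in_port=60,dl_src=4e:15:bc:ac:36:45,dl_dst=ff:ff:ff:ff:ff:ff,arp,arp_op=1'
```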
Hi Ales, Hi Piotr,

we had a similar issue in the past. However, I am not sure whether our
solution will also work in your case.

What we did is configure the external LS (the one that does all this
flooding) with other_config:broadcast-arps-to-all-routers=false. This
ensures that any ARP/ND request that is not handled by the ARP
responder flows is not flooded to LRs. In your case that would probably
mean the packets get dropped.

Note that this breaks GARPs from the upstream switches, since they will
be dropped too. In our case that is not an issue, since we use stable
virtual MACs/IPs.

Let me know if that helps. If not, then we will also need to find a
solution for this before our upcoming 24.03 upgrade :)

Thanks a lot,
Felix

> > Maybe we could implement this "too big" OpenFlow rule in a
> > different way and loop it inside the fast datapath, if possible?
>
> Unfortunately not. OvS would still try to fit it into a single
> buffer; it doesn't matter whether it's one long action list or
> multiple resubmits. Unless there is an action that needs to be
> executed before continuation, e.g. a controller action, we would
> still have the issue that the commit tried to fix.
>
> > I also noticed that IPv6 NS packets are processed via
> > ovn-controller. Why can't OVS create the responses inside the fast
> > datapath, similar to the way it creates responses to ARP requests
> > for known MACs?
>
> This is a known limitation of OvS; there was an attempt to make it
> work, but it didn't lead anywhere [2]. We should probably revisit
> that. Once there is OvS support, we could easily change OVN to do it
> directly, as we do for ARP.
>
> > This issue had a big influence on our cloud, because the same
> > ovn-controller thread is responsible for DHCP, DNS interception,
> > and IPv6 NS packets, and when it was overloaded none of those
> > services worked.
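[Editor's note: Felix's knob can be set on the northbound Logical_Switch record. The switch name "ext-net" below is a placeholder, not from the thread; and as Felix warns, this also drops GARPs from upstream switches.]

```shell
# Stop flooding ARP/ND requests that miss the ARP responder flows to
# the router ports of the external logical switch.
# "ext-net" is a placeholder for the external network's switch name.
ovn-nbctl set Logical_Switch ext-net \
    other_config:broadcast-arps-to-all-routers=false
```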
> > Another thing, quite misleading, is those "opcode=unrecognized(27)"
> > entries in the ovn-controller log. I guess they are unrecognized
> > only because the mentioned commit didn't add the new action name
> > mapping here:
> >
> > https://github.com/ovn-org/ovn/blob/ed2790153c07a376890f28b0a16bc321e3af016b/lib/actions.c#L5977
>
> Good catch; looking at it, we might actually be missing more of
> those.
>
> > To recover our region, we disabled the DNS interception and lowered
> > the number of ARP requests by increasing
> > "net.ipv4.neigh.default.retrans_time_ms" on our upstream gateways.
> > Those changes lowered the number of packets sent to the
> > ovn-controllers from around 500 p/s to 200 p/s and stabilized our
> > region.
> > Nevertheless, this OVN performance issue is still there.
>
> If I may suggest it, another potential mitigation might be to add a
> stateless ACL that ensures the ARP packets are dropped before
> reaching the flood flows. Would that be an option? This would really
> be just a mitigation until we have a proper solution. Speaking of a
> proper solution, given the need for this, it would probably be CoPP
> for this controller action, so we don't end up with an overloaded
> pinctrl thread. There is a downside to CoPP, though, as we might drop
> legitimate packets that need flooding.
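[Editor's note: Ales's ACL suggestion could look roughly like the following. The switch name, priority, and match are assumptions, not from the thread, and whether ARP traffic actually reaches the ACL stage should be verified against the deployed OVN version before relying on this.]

```shell
# Sketch of the suggested mitigation: drop ARP requests on the
# external logical switch before they reach the flood flows.
# "ext-net", the priority (1000), and the match are placeholders.
ovn-nbctl acl-add ext-net from-lport 1000 'arp && arp.op == 1' drop
```

Note that, like Felix's knob, an unconditional drop like this would also discard legitimate GARPs from upstream devices.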
> > Thanks for your attention,
> > Piotr Misiak
> > _______________________________________________
> > discuss mailing list
> > disc...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>
> Thanks,
> Ales
>
> [0] https://github.com/ovn-org/ovn/commit/817d4e53
> [1] https://github.com/ovn-org/ovn/commit/eba60b27
> [2] http://patchwork.ozlabs.org/project/openvswitch/patch/20200928134947.48269-1-fankaixi...@bytedance.com