Hello,

I've been hitting the same issue with OVN 20.09 from the CentOS NFV SIG repo. Is there a chance to backport this change to 20.09?
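For anyone who wants to check whether a node is affected, the symptom described in the quoted thread (a table=65 flow whose output action points at an ofport that no longer exists) can be detected with a rough sketch like the one below. It only parses text in the format shown by `ovs-ofctl dump-flows br-int table=65` in the thread. The function name `find_stale_output_flows` and the `live_ofports` parameter are hypothetical, and how you collect the set of current ofports (for example from the `ofport` column of the OVS Interface table) is left as an assumption.

```python
import re

# Matches a numeric port in an "output:N" action as printed by ovs-ofctl dump-flows.
OUTPUT_RE = re.compile(r"\boutput:(\d+)")


def find_stale_output_flows(dump_flows_text, live_ofports):
    """Return (ofport, flow_line) pairs whose output port is not in live_ofports.

    dump_flows_text: text output of a flow dump (hypothetical input, one flow per line).
    live_ofports: set of ofport numbers that currently exist on the bridge.
    """
    stale = []
    for line in dump_flows_text.splitlines():
        # Only inspect the actions part of each flow line.
        _, sep, actions = line.partition("actions=")
        if not sep:
            continue
        for match in OUTPUT_RE.finditer(actions):
            ofport = int(match.group(1))
            if ofport not in live_ofports:
                stale.append((ofport, line.strip()))
    return stale


if __name__ == "__main__":
    # Flow line taken from the thread; 745 is the ofport seen in table=0 there.
    sample = (
        " cookie=0x38937d8e, duration=40593.699s, table=65, n_packets=1848, "
        "n_bytes=98960, idle_age=2599, priority=100,reg15=0x1,metadata=0x18d "
        "actions=output:737\n"
    )
    for ofport, flow in find_stale_output_flows(sample, live_ofports={745}):
        print(f"stale ofport {ofport}: {flow}")
```

This only flags the mismatch. As noted in the thread, restarting ovn-controller makes it recalculate the local flows.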

On Thu, 17 Dec 2020 at 11:05, Daniel Alvarez Sanchez <[email protected]> wrote:

> On Tue, Dec 15, 2020 at 11:39 AM Krzysztof Klimonda <[email protected]> wrote:
>
>> Hi,
>>
>> Just as a quick update - I've updated our ovn version to 20.12.0 snapshot
>> (d8bc0377c) and so far the problem hasn't yet reoccurred after over 24
>> hours of tempest testing.
>
> We could reproduce the issue with 20.12 and master. Also this is not
> related exclusively to localports but to any port potentially.
> Dumitru posted a fix for this:
>
> http://patchwork.ozlabs.org/project/ovn/patch/[email protected]/
>
> Thanks!
> daniel
>
>> Best Regards,
>> -Chris
>>
>> On Tue, Dec 15, 2020, at 11:13, Daniel Alvarez Sanchez wrote:
>>
>> Hey Krzysztof,
>>
>> On Fri, Nov 20, 2020 at 1:17 PM Krzysztof Klimonda <[email protected]> wrote:
>>
>> Hi,
>>
>> Doing some tempest runs on our pre-prod environment (stable/ussuri with
>> ovn 20.06.2 release) I've noticed that some network connectivity tests were
>> failing randomly. I've reproduced that by continuously rescuing and
>> unrescuing an instance - network connectivity from and to the VM works in
>> general (dhcp is fine, access from outside is fine), however the VM has no
>> access to its metadata server (via the 169.254.169.254 ip address). Tracing
>> a packet from the VM to metadata via:
>>
>> ----8<----8<----8<----
>> ovs-appctl ofproto/trace br-int in_port=tapa489d406-91,dl_src=fa:16:3e:2c:b0:fd,dl_dst=fa:16:3e:8b:b5:39
>> ----8<----8<----8<----
>>
>> ends with
>>
>> ----8<----8<----8<----
>> 65. reg15=0x1,metadata=0x97e, priority 100, cookie 0x15ec4875
>>     output:1187
>>      >> Nonexistent output port
>> ----8<----8<----8<----
>>
>> And I can verify that there is no flow for the actual ovnmeta tap
>> interface (tap67731b0a-c0):
>>
>> ----8<----8<----8<----
>> # docker exec -it openvswitch_vswitchd ovs-ofctl dump-flows br-int |grep -E output:'("tap67731b0a-c0"|1187)'
>>  cookie=0x15ec4875, duration=1868.378s, table=65, n_packets=524, n_bytes=40856, priority=100,reg15=0x1,metadata=0x97e actions=output:1187
>> #
>> ----8<----8<----8<----
>>
>> From ovs-vswitchd.log it seems the interface tap67731b0a-c0 was added
>> with index 1187, then deleted, and re-added with index 1189 - that's
>> probably due to the fact that that is the only VM in that network and I'm
>> constantly hard rebooting it via rescue/unrescue:
>>
>> ----8<----8<----8<----
>> 2020-11-20T11:41:18.347Z|08043|bridge|INFO|bridge br-int: added interface tap67731b0a-c0 on port 1187
>> 2020-11-20T11:41:30.813Z|08044|bridge|INFO|bridge br-int: deleted interface tapa489d406-91 on port 1186
>> 2020-11-20T11:41:30.816Z|08045|bridge|WARN|could not open network device tapa489d406-91 (No such device)
>> 2020-11-20T11:41:31.040Z|08046|bridge|INFO|bridge br-int: deleted interface tap67731b0a-c0 on port 1187
>> 2020-11-20T11:41:31.044Z|08047|bridge|WARN|could not open network device tapa489d406-91 (No such device)
>> 2020-11-20T11:41:31.050Z|08048|bridge|WARN|could not open network device tapa489d406-91 (No such device)
>> 2020-11-20T11:41:31.235Z|08049|connmgr|INFO|br-int<->unix#31: 2069 flow_mods in the last 43 s (858 adds, 814 deletes, 397 modifications)
>> 2020-11-20T11:41:33.057Z|08050|bridge|INFO|bridge br-int: added interface tapa489d406-91 on port 1188
>> 2020-11-20T11:41:33.582Z|08051|bridge|INFO|bridge br-int: added interface tap67731b0a-c0 on port 1189
>> 2020-11-20T11:42:31.235Z|08052|connmgr|INFO|br-int<->unix#31: 168 flow_mods in the 2 s starting 59 s ago (114 adds, 10 deletes, 44 modifications)
>> ----8<----8<----8<----
>>
>> Once I restart ovn-controller it recalculates local ovs flows and the
>> problem is fixed so I'm assuming it's a local problem and not related to NB
>> and SB databases.
>>
>>
>> I have seen exactly the same with 20.09: for the same port, the input
>> and output ofports do not match:
>>
>> bash-4.4# ovs-ofctl dump-flows br-int table=0 | grep 745
>>  cookie=0x38937d8e, duration=40387.372s, table=0, n_packets=1863, n_bytes=111678, idle_age=1, priority=100,in_port=745 actions=load:0x4b->NXM_NX_REG13[],load:0x6a->NXM_NX_REG11[],load:0x69->NXM_NX_REG12[],load:0x18d->OXM_OF_METADATA[],load:0x1->NXM_NX_REG14[],resubmit(,8)
>>
>> bash-4.4# ovs-ofctl dump-flows br-int table=65 | grep 8937d8e
>>  cookie=0x38937d8e, duration=40593.699s, table=65, n_packets=1848, n_bytes=98960, idle_age=2599, priority=100,reg15=0x1,metadata=0x18d actions=output:737
>>
>> In table=0, the ofport is fine (745) but in the output stage it is using
>> a different one (737).
>>
>> By checking the OVS database transaction history, that port, at some
>> point, had the id 737:
>>
>> record 6516: 2020-12-14 22:22:54.184
>>
>>   table Interface row "tap71a5dfc1-10" (073801e2):
>>     ofport=737
>>   table Open_vSwitch row 1d9566c8 (1d9566c8):
>>     cur_cfg=2023
>>
>> So it looks like ovn-controller is not updating the ofport in the
>> physical flows for the output stage.
>>
>> We'll try to figure out if this happens also in master.
>>
>> Thanks,
>> daniel
>>
>>
>> --
>> Krzysztof Klimonda
>> [email protected]
>> _______________________________________________
>> discuss mailing list
>> [email protected]
>> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss

--
Michał Nasiadka
[email protected]
