On 05/08/2024 19:16, Ihar Hrachyshka wrote:
Hi,
I don't know what the libvirt OVN hook is, though. Do you have a link?
https://docs.openvswitch.org/en/latest/howto/libvirt/
https://www.redhat.com/sysadmin/libvirt-open-vswitch
This suggests that `requested-chassis` for the OVN port is, at least
for a moment, set to `null`, while both hosts have the interface with
`iface-id` set to the port ID. Your hook should avoid this: always keep
`requested-chassis` pointing to one host or the other, and never clear
it completely. Alternatively, create the destination OVS interface
only when migration is complete.
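A minimal sketch of the "never clear it" rule, assuming the hook shells out to `ovn-nbctl`. The helper names `requested_chassis_cmd` and `migration_steps` are hypothetical, and the port and host names are simply the ones from the logs in this thread. The code only builds the command lines; running them (for example with `subprocess.run`) is left to the hook.

```python
from typing import List


def requested_chassis_cmd(port: str, chassis: str) -> List[str]:
    """Build an ovn-nbctl argv that pins `port` to `chassis`.

    An empty chassis is rejected so the option is never cleared.
    """
    if not port or not chassis:
        raise ValueError("port and chassis must both be non-empty")
    return [
        "ovn-nbctl", "set", "Logical_Switch_Port", port,
        f"options:requested-chassis={chassis}",
    ]


def migration_steps(port: str, source: str, dest: str) -> List[List[str]]:
    """Commands for one migration: pin to source first, then repoint to dest.

    There is no intermediate step that removes the option.
    """
    return [
        requested_chassis_cmd(port, source),  # before/while migrating
        requested_chassis_cmd(port, dest),    # once the guest runs on dest
    ]


if __name__ == "__main__":
    for argv in migration_steps(
        "c7e2c10e-43f1-11ef-b3c5-a8698c171668", "pcacn001", "pcacn004"
    ):
        print(" ".join(argv))
```

Using `set Logical_Switch_Port ... options:requested-chassis=...` changes only that one key, so other options on the port are left alone.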
BTW, 100 flaps in 50s is actually not too bad; it was a lot
worse before:
https://github.com/ovn-org/ovn/commit/4dc4bc7fdb848bcc626becbd2c80ffef8a39ff9a
---
A while ago, I wrote a blog post walking through the binding process
for OVN. It covers live migration and port binding flapping, as well as
the additional-requested-chassis feature that is helpful for live
migration (though I suspect you don't use it). I hope it may be of
help: https://ihar.dev/posts/ovn-chassis-binding-walkthru
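For illustration only, here is a sketch of how a CMS might build the option value for that feature. It assumes, per my reading of the ovn-nb documentation for recent OVN releases, that `options:requested-chassis` accepts a comma-separated list whose first entry is the main chassis and whose remaining entries are additional chassis allowed to bind the port. Please check this against the OVN version you run. The helper name `requested_chassis_value` is hypothetical.

```python
from typing import Iterable


def requested_chassis_value(main: str, additional: Iterable[str] = ()) -> str:
    """Return the value for options:requested-chassis.

    Assumption: the first name is the main chassis, the rest are
    additional chassis that may bind the port during migration.
    """
    names = [main, *additional]
    for name in names:
        if not name or "," in name:
            raise ValueError(f"invalid chassis name: {name!r}")
    return ",".join(names)


if __name__ == "__main__":
    # During migration: source is main, destination is additional.
    print(requested_chassis_value("pcacn001", ["pcacn004"]))
    # After migration: only the destination remains.
    print(requested_chassis_value("pcacn004"))
```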
Thanks, that is interesting, but I guess this would require hooks
between libvirt's live migration and the CMS, so that the CMS can
adjust `options:requested-chassis`.
Brendan
On Fri, Jul 26, 2024 at 9:41 AM Brendan Doyle via discuss
<ovs-discuss@openvswitch.org> wrote:
Hi Folks,
We have a pair of VMs with their network interfaces connected
into OVN via the libvirt OVN hook. When live migrating these, I see
intense port flapping, with each chassis repeatedly claiming the
interfaces and configuring them into the southbound DB/OVS, bringing
the ports up/down constantly. I see the sequence of events below
repeated up to 100 times in a 50s period as the interfaces are
migrated from one chassis to another.
Is this expected behavior, and what interface outage should we expect
during this period? We have a heartbeat running with a 30s timeout,
and it intermittently fails during these migrations.
On each chassis I see this sequence of events repeated 90 to 100
times.
On pcacn001, we see:
2024-07-21T09:36:35.262Z|304798|binding|INFO|Changing chassis for
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 from pcacn004 to
pcacn001
2024-07-21T09:36:35.262Z|304799|binding|INFO|c7e2c10e-43f1-11ef-b3c5-a8698c171668:
Claiming 00:13:97:f3:ea:30 11.11.1.2
2024-07-21T09:36:35.265Z|304800|binding|INFO|Setting lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in Southbound
2024-07-21T09:36:35.583Z|304806|binding|INFO|Removing lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed in OVS
2024-07-21T09:36:35.584Z|304809|binding|INFO|Setting lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed in OVS
2024-07-21T09:36:35.746Z|304810|binding|INFO|Setting lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in Southbound
2024-07-21T09:36:35.891Z|304811|binding|INFO|Changing chassis for
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 from pcacn004 to
pcacn001
2024-07-21T09:36:35.891Z|304812|binding|INFO|c7e2c10e-43f1-11ef-b3c5-a8698c171668:
Claiming 00:13:97:f3:ea:30 11.11.1.2
2024-07-21T09:36:35.967Z|304813|binding|INFO|Setting lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in Southbound
2024-07-21T09:36:36.189Z|304817|binding|INFO|Setting lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in Southbound
2024-07-21T09:36:36.338Z|304818|binding|INFO|Removing lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed in OVS
2024-07-21T09:36:36.338Z|304818|binding|INFO|Removing lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed in OVS
2024-07-21T09:36:36.339Z|304819|binding|INFO|Setting lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 down in Southbound
2024-07-21T09:36:36.422Z|304821|binding|INFO|Setting lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed in OVS
2024-07-21T09:36:36.422Z|304822|binding|INFO|Setting lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in Southbound
This moving of the port from pcacn004 to pcacn001 and setting the port
down/up repeats 90-100 times until:
2024-07-21T09:37:23.945Z|306386|binding|INFO|Releasing lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 from this chassis.
That is 49 seconds of this. But on pcacn004, we also see it doing the same:
2024-07-21T09:36:35.253Z|301841|binding|INFO|Changing chassis for
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 from pcacn001 to
pcacn004.
2024-07-21T09:36:35.253Z|301842|binding|INFO|c7e2c10e-43f1-11ef-b3c5-a8698c171668:
Claiming 00:13:97:f3:ea:30 11.11.1.2
2024-07-21T09:36:35.259Z|301843|binding|INFO|Setting lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 down in Southbound
2024-07-21T09:36:35.452Z|301846|binding|INFO|Setting lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed in OVS
2024-07-21T09:36:35.483Z|301850|binding|INFO|Removing lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed in OVS
2024-07-21T09:36:35.483Z|301851|binding|INFO|Setting lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 down in Southbound
2024-07-21T09:36:35.562Z|301852|binding|INFO|Setting lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed in OVS
2024-07-21T09:36:35.562Z|301853|binding|INFO|Setting lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in Southbound
2024-07-21T09:36:35.743Z|301855|binding|INFO|Removing lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed in OVS
2024-07-21T09:36:35.743Z|301856|binding|INFO|Setting lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 down in Southbound
2024-07-21T09:36:35.817Z|301857|binding|INFO|Setting lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed in OVS
2024-07-21T09:36:35.817Z|301858|binding|INFO|Setting lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in Southbound
which continues until it finally gets the port at:
2024-07-21T09:37:23.793Z|303182|binding|INFO|Changing chassis for
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 from pcacn001 to
pcacn004.
2024-07-21T09:37:23.793Z|303183|binding|INFO|c7e2c10e-43f1-11ef-b3c5-a8698c171668:
Claiming 00:13:97:f3:ea:30 11.11.1.2
2024-07-21T09:37:23.857Z|303184|binding|INFO|Setting lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in Southbound
2024-07-21T09:37:23.949Z|303185|binding|INFO|Claiming lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 for this chassis.
2024-07-21T09:37:23.949Z|303186|binding|INFO|c7e2c10e-43f1-11ef-b3c5-a8698c171668:
Claiming 00:13:97:f3:ea:30 11.11.1.2
2024-07-21T09:37:23.951Z|303187|binding|INFO|Setting lport
c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in Southbound
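To put a number on the flapping, the excerpts above can be summarised with a small script. This is a rough sketch, not an OVN tool: the function name `summarize` is hypothetical, and the two regular expressions only match the message texts shown in this thread. Whitespace is collapsed first so that log lines wrapped by a mail client still match.

```python
import re
from collections import Counter
from typing import Tuple

# Patterns taken from the ovn-controller messages quoted in this thread.
CLAIM = re.compile(r"Changing chassis for lport (\S+) from (\S+) to (\S+)")
DOWN = re.compile(r"Setting lport (\S+) down in Southbound")


def summarize(log_text: str) -> Tuple[Counter, Counter]:
    """Count chassis changes per (lport, new chassis) and 'down' events per lport."""
    flat = " ".join(log_text.split())  # undo line wrapping
    claims = Counter(
        (m.group(1), m.group(3).rstrip("."))  # some lines end with a period
        for m in CLAIM.finditer(flat)
    )
    downs = Counter(m.group(1) for m in DOWN.finditer(flat))
    return claims, downs


if __name__ == "__main__":
    import sys

    claims, downs = summarize(sys.stdin.read())
    for (lport, chassis), n in claims.items():
        print(f"{lport}: moved to {chassis} {n} time(s)")
    for lport, n in downs.items():
        print(f"{lport}: set down {n} time(s)")
```

It reads a log from standard input, so the ovn-controller log from each chassis can be piped through it separately and the counts compared.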
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss