On 05/08/2024 19:16, Ihar Hrachyshka wrote:
Hi,

I don't know what the libvirt OVN hook is, though. Do you have a link?

https://docs.openvswitch.org/en/latest/howto/libvirt/
https://www.redhat.com/sysadmin/libvirt-open-vswitch


This suggests that `requested-chassis` for the OVN port is, at least for a moment, set to `null`, while both hosts have the interface with `iface-id` set to the port ID. Your hook should avoid this: always have `requested-chassis` pointing to one host or the other, and never clear it completely. Alternatively, create the destination OVS interface only when migration is complete.
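
A minimal sketch of that rule, assuming the hook can run `ovn-nbctl` against the northbound DB. The function names and the two-step ordering are hypothetical and not taken from any existing hook; the command uses the generic `ovn-nbctl set Logical_Switch_Port <port> options:requested-chassis=<chassis>` form. The helper refuses to build a command with an empty chassis, so the option is only ever moved from one host to the other and never cleared.

```python
from typing import List


def requested_chassis_cmd(port: str, chassis: str) -> List[str]:
    """Build the ovn-nbctl argv that pins a logical switch port to one chassis.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    if not port or not chassis:
        # Never emit a command that would leave requested-chassis empty.
        raise ValueError("port and chassis must both be non-empty")
    return [
        "ovn-nbctl", "set", "Logical_Switch_Port", port,
        f"options:requested-chassis={chassis}",
    ]


def migration_commands(port: str, source: str, destination: str) -> List[List[str]]:
    """Commands for the two (assumed) hook phases of a live migration.

    1. Before migration starts: pin the port to the source chassis.
    2. Once migration completes: repoint it to the destination chassis.
    """
    return [
        requested_chassis_cmd(port, source),
        requested_chassis_cmd(port, destination),
    ]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` at the matching point in the migration; when and where those points occur depends on the hook and the CMS.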

BTW, 100 flaps in 50s is actually not too bad; it was a lot worse before: https://github.com/ovn-org/ovn/commit/4dc4bc7fdb848bcc626becbd2c80ffef8a39ff9a

---

A while ago, I wrote a blog post walking through the binding process for OVN. It covers live migration and port binding flapping, as well as the additional-requested-chassis feature that is helpful for live migration (though I suspect you don't use it). I hope it may be of help: https://ihar.dev/posts/ovn-chassis-binding-walkthru

Thanks, that is interesting, but I guess this would require hooks between libvirt's live migration
and the CMS, so that the CMS can adjust options:requested-chassis.


Brendan


On Fri, Jul 26, 2024 at 9:41 AM Brendan Doyle via discuss <ovs-discuss@openvswitch.org> wrote:

    Hi Folks,


    We have a pair of VMs with their network interfaces connected into OVN
    via the libvirt OVN hook. When doing live migration of these I see
    intense port flapping, with each chassis repeatedly claiming the
    interfaces and configuring them into the southbound DB/OVS, bringing
    the ports up/down constantly. I see the sequence of events below
    repeated up to 100 times in a 50s period as the interfaces are
    migrated from one chassis to another.

    Is this expected behavior, and what is the expected interface outage
    during this period? We have a heartbeat running with a 30s timeout,
    and that intermittently fails during these migrations.

    On each chassis I see this sequence of events repeated 90 to 100 times:


    In pcacn001, we see:

    2024-07-21T09:36:35.262Z|304798|binding|INFO|Changing chassis for lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 from pcacn004 to pcacn001
    2024-07-21T09:36:35.262Z|304799|binding|INFO|c7e2c10e-43f1-11ef-b3c5-a8698c171668: Claiming 00:13:97:f3:ea:30 11.11.1.2
    2024-07-21T09:36:35.265Z|304800|binding|INFO|Setting lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in Southbound
    2024-07-21T09:36:35.583Z|304806|binding|INFO|Removing lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed in OVS
    2024-07-21T09:36:35.584Z|304809|binding|INFO|Setting lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed in OVS
    2024-07-21T09:36:35.746Z|304810|binding|INFO|Setting lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in Southbound
    2024-07-21T09:36:35.891Z|304811|binding|INFO|Changing chassis for lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 from pcacn004 to pcacn001
    2024-07-21T09:36:35.891Z|304812|binding|INFO|c7e2c10e-43f1-11ef-b3c5-a8698c171668: Claiming 00:13:97:f3:ea:30 11.11.1.2
    2024-07-21T09:36:35.967Z|304813|binding|INFO|Setting lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in Southbound
    2024-07-21T09:36:36.189Z|304817|binding|INFO|Setting lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in Southbound
    2024-07-21T09:36:36.338Z|304818|binding|INFO|Removing lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed in OVS
    2024-07-21T09:36:36.338Z|304818|binding|INFO|Removing lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed in OVS
    2024-07-21T09:36:36.339Z|304819|binding|INFO|Setting lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 down in Southbound
    2024-07-21T09:36:36.422Z|304821|binding|INFO|Setting lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed in OVS
    2024-07-21T09:36:36.422Z|304822|binding|INFO|Setting lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in Southbound

    This moving of the port from pcacn004 to pcacn001 and setting the port
    down/up repeats 90-100 times until:

    2024-07-21T09:37:23.945Z|306386|binding|INFO|Releasing lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 from this chassis.

    That is 49 seconds of this. But in pcacn004 we also see it doing the same:

    2024-07-21T09:36:35.253Z|301841|binding|INFO|Changing chassis for lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 from pcacn001 to pcacn004.
    2024-07-21T09:36:35.253Z|301842|binding|INFO|c7e2c10e-43f1-11ef-b3c5-a8698c171668: Claiming 00:13:97:f3:ea:30 11.11.1.2
    2024-07-21T09:36:35.259Z|301843|binding|INFO|Setting lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 down in Southbound
    2024-07-21T09:36:35.452Z|301846|binding|INFO|Setting lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed in OVS
    2024-07-21T09:36:35.483Z|301850|binding|INFO|Removing lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed in OVS
    2024-07-21T09:36:35.483Z|301851|binding|INFO|Setting lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 down in Southbound
    2024-07-21T09:36:35.562Z|301852|binding|INFO|Setting lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed in OVS
    2024-07-21T09:36:35.562Z|301853|binding|INFO|Setting lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in Southbound
    2024-07-21T09:36:35.743Z|301855|binding|INFO|Removing lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed in OVS
    2024-07-21T09:36:35.743Z|301856|binding|INFO|Setting lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 down in Southbound
    2024-07-21T09:36:35.817Z|301857|binding|INFO|Setting lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed in OVS
    2024-07-21T09:36:35.817Z|301858|binding|INFO|Setting lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in Southbound

    This continues until it finally gets the port at:

    2024-07-21T09:37:23.793Z|303182|binding|INFO|Changing chassis for lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 from pcacn001 to pcacn004.
    2024-07-21T09:37:23.793Z|303183|binding|INFO|c7e2c10e-43f1-11ef-b3c5-a8698c171668: Claiming 00:13:97:f3:ea:30 11.11.1.2
    2024-07-21T09:37:23.857Z|303184|binding|INFO|Setting lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in Southbound
    2024-07-21T09:37:23.949Z|303185|binding|INFO|Claiming lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 for this chassis.
    2024-07-21T09:37:23.949Z|303186|binding|INFO|c7e2c10e-43f1-11ef-b3c5-a8698c171668: Claiming 00:13:97:f3:ea:30 11.11.1.2
    2024-07-21T09:37:23.951Z|303187|binding|INFO|Setting lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in Southbound
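
One way to put a number on the flapping seen in logs like these is to count the "Changing chassis for lport" entries per port. The following is a rough sketch, assuming each ovn-controller log entry sits on a single line in the `timestamp|seq|module|level|message` layout shown above; the function name is made up for illustration.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the binding message that records a port moving between chassis.
# The optional trailing dot covers both variants seen in the logs.
CHANGE_RE = re.compile(
    r"\|binding\|INFO\|Changing chassis for lport (\S+) from (\S+) to (\S+?)\.?$"
)


def count_chassis_changes(lines: Iterable[str]) -> Counter:
    """Count 'Changing chassis' events per logical port ID."""
    counts: Counter = Counter()
    for line in lines:
        match = CHANGE_RE.search(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the lines of an ovn-controller log from each chassis (for example via `open(path)`) gives a per-port flap count that can be compared before and after changing how the hook handles `requested-chassis`.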
    _______________________________________________
    discuss mailing list
    disc...@openvswitch.org
    https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
    