On 06/08/2024 17:06, Ihar Hrachyshka wrote:
On Tue, Aug 6, 2024 at 11:44 AM <brendan.do...@oracle.com> wrote:
On 06/08/2024 16:22, Ihar Hrachyshka wrote:
On Tue, Aug 6, 2024 at 6:09 AM <brendan.do...@oracle.com> wrote:
On 05/08/2024 19:16, Ihar Hrachyshka wrote:
Hi,
I don't know what the libvirt OVN hook is, though. Do you have a
link?
https://docs.openvswitch.org/en/latest/howto/libvirt/
https://www.redhat.com/sysadmin/libvirt-open-vswitch
This *seems* Open vSwitch specific? Is there some additional
integration for OVN that is not covered in the links above? I
imagine, at the very least, you'd have to set iface-id on OVS
interfaces to establish a mapping between OVN LSPs and OVS
interfaces.
Yes, the libvirt OVN hook creates the interface IDs.
It looks like the OVS (note: OVS, not OVN) hook fills in iface-ids for
VM OVS interfaces. It looks like the rest of the integration - OVN LSP
management - is done outside of the hook (by your automation?)
I assume this is done, somehow. Wonder if you have a link to this
code so we can check what it does to attach LSPs.
I'm not sure; I'd imagine it is in the base libvirt code. The way
we use it:
Create a libvirt "ovn-network" type:
cat ovn-network.xml
<network>
  <name>ovn-network</name>
  <forward mode='bridge'/>
  <bridge name='br-int'/>
  <virtualport type='openvswitch'/>
</network>
virsh net-create ovn-network.xml
Then, when creating a VM, add a NIC: choose "ovn-network" as the Network
source and, under Virtual port, set the type to "openvswitch".
ovs-vsctl get interface vnetX external_ids:iface-id
ovs-vsctl get interface vnetX external_ids:attached-mac
ovn-nbctl lsp-add <switch name> iface-id
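
The three commands above could be wrapped by a CMS roughly as follows. This is only a sketch under assumptions, not the code of the CMS mentioned in this thread: the helper names are hypothetical, and the `lsp-set-addresses` step is assumed (the thread only shows the MAC being read from `attached-mac`, not what is done with it).

```python
import subprocess
from typing import List


def get_external_id_cmd(iface: str, key: str) -> List[str]:
    """Build the argv that reads one external_ids key from an OVS interface."""
    return ["ovs-vsctl", "get", "interface", iface, f"external_ids:{key}"]


def lsp_commands(switch: str, iface_id: str, mac: str) -> List[List[str]]:
    """Build the argv lists that create the LSP named after iface-id.

    Assumption: the port's addresses are set from attached-mac.
    """
    return [
        ["ovn-nbctl", "lsp-add", switch, iface_id],
        ["ovn-nbctl", "lsp-set-addresses", iface_id, mac],
    ]


def read_external_id(iface: str, key: str) -> str:
    """Run ovs-vsctl and strip the quotes it may print around string values."""
    out = subprocess.run(
        get_external_id_cmd(iface, key),
        check=True, capture_output=True, text=True,
    ).stdout
    return out.strip().strip('"')
```

On a host with OVS and OVN installed, `read_external_id("vnetX", "iface-id")` and `read_external_id("vnetX", "attached-mac")` would supply the arguments for `lsp_commands`, whose result can be passed to `subprocess.run` one command at a time.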
Ah I see, thanks for this clarification.
So you create LSPs yourself?
Yes, we have our own CMS that does this :)
You will have to manage the requested-chassis setting on the LSP to
reflect the "current" host for the port (=the host that actively runs
the live-migrated VM). This is what a CMS like OpenStack does. It may be
tricky to quickly (as in: sub-second range) detect the switch from
source to destination host, but since you have ~50 seconds of port
flapping, it may be the least of your problems.
I'm hoping that using multi-chassis port bindings, as described in your
write-up, will help to avoid or reduce this.
This suggests that `requested-chassis` for the OVN port is,
at least for a moment, set to `null`, while both hosts have
the interface with `iface-id` set to the port ID. Your hook
should avoid this. (Always have requested-chassis pointing to
one host or the other; never clear it completely.)
(Alternatively, create the destination OVS interface only
when migration is complete.)
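
The advice above (always keep requested-chassis pointing at one host, never clear it) could be sketched like this. The function names and the two-step plan are hypothetical; the sketch uses the generic `ovn-nbctl set` database command so that other options on the port are left untouched. The multi-chassis variant assumes an OVN release that accepts a comma-separated list in requested-chassis, which should be checked against the ovn-nb documentation for the version in use.

```python
from typing import List


def pin_port_cmd(lsp: str, chassis: str) -> List[str]:
    """argv that sets requested-chassis on one LSP without touching other options."""
    return ["ovn-nbctl", "set", "Logical_Switch_Port", lsp,
            f"options:requested-chassis={chassis}"]


def migration_plan(lsp: str, source: str, destination: str) -> List[List[str]]:
    """Two steps for a CMS hook: pin to the source before the destination
    interface appears, then re-pin to the destination once the VM runs there.
    The option is overwritten, never removed."""
    return [pin_port_cmd(lsp, source), pin_port_cmd(lsp, destination)]


def multichassis_cmd(lsp: str, source: str, destination: str) -> List[str]:
    """Assumed multi-chassis variant: both hosts listed for the duration of
    the migration (main chassis first)."""
    return pin_port_cmd(lsp, f"{source},{destination}")
```

For the port in this thread, `migration_plan("c7e2c10e-43f1-11ef-b3c5-a8698c171668", "pcacn001", "pcacn004")` yields the command to run before migration starts and the command to run once the VM is active on pcacn004.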
BTW, 100 flaps in 50s is actually not too bad; it was
a lot worse before:
https://github.com/ovn-org/ovn/commit/4dc4bc7fdb848bcc626becbd2c80ffef8a39ff9a
---
A while ago, I wrote a blog post walking through the binding
process for OVN. It covers live migration and port binding
flapping, as well as the additional-requested-chassis
feature that is helpful for live migration (though I suspect
you don't use it). I hope it may be of help:
https://ihar.dev/posts/ovn-chassis-binding-walkthru
Thanks, that is interesting. I guess this would require
hooks between libvirt's live migration and the CMS, so that
the CMS can adjust options:requested-chassis.
Yes. But I think as long as you have multiple OVS interfaces with
the same iface-id on different hosts, you have to use
requested-chassis to identify which host is "current" at any
particular moment.
Brendan
On Fri, Jul 26, 2024 at 9:41 AM Brendan Doyle via discuss
<ovs-discuss@openvswitch.org> wrote:
Hi Folks,
We have a pair of VMs with their network interfaces connected into
OVN via the libvirt OVN hook. When doing live migration of these I
see intense port flapping, with each chassis repeatedly claiming the
interfaces and configuring them into the southbound DB/OVS, bringing
the ports up/down constantly. I see the sequence of events below
repeated up to 100 times in a 50s period as the interfaces are
migrated from one chassis to another.
Is this expected behavior, and what is the expected interface outage
during this period? We have a heartbeat running with a 30s timeout,
and that intermittently fails during these migrations.
On each chassis I see this sequence of events repeated
90 to 100 times.
On pcacn001, we see:
2024-07-21T09:36:35.262Z|304798|binding|INFO|Changing
chassis for lport c7e2c10e-43f1-11ef-b3c5-a8698c171668
from pcacn004 to pcacn001
2024-07-21T09:36:35.262Z|304799|binding|INFO|c7e2c10e-43f1-11ef-b3c5-a8698c171668:
Claiming 00:13:97:f3:ea:30 11.11.1.2
2024-07-21T09:36:35.265Z|304800|binding|INFO|Setting
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in
Southbound
2024-07-21T09:36:35.583Z|304806|binding|INFO|Removing
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed
in OVS
2024-07-21T09:36:35.584Z|304809|binding|INFO|Setting
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed
in OVS
2024-07-21T09:36:35.746Z|304810|binding|INFO|Setting
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in
Southbound
2024-07-21T09:36:35.891Z|304811|binding|INFO|Changing
chassis for lport c7e2c10e-43f1-11ef-b3c5-a8698c171668
from pcacn004 to pcacn001
2024-07-21T09:36:35.891Z|304812|binding|INFO|c7e2c10e-43f1-11ef-b3c5-a8698c171668:
Claiming 00:13:97:f3:ea:30 11.11.1.2
2024-07-21T09:36:35.967Z|304813|binding|INFO|Setting
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in
Southbound
2024-07-21T09:36:36.189Z|304817|binding|INFO|Setting
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in
Southbound
2024-07-21T09:36:36.338Z|304818|binding|INFO|Removing
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed
in OVS
2024-07-21T09:36:36.338Z|304818|binding|INFO|Removing
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed
in OVS
2024-07-21T09:36:36.339Z|304819|binding|INFO|Setting
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 down in
Southbound
2024-07-21T09:36:36.422Z|304821|binding|INFO|Setting
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed
in OVS
2024-07-21T09:36:36.422Z|304822|binding|INFO|Setting
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in
Southbound
This moving of the port from pcacn004 to pcacn001, and setting the
port down/up, repeats 90-100 times until:
2024-07-21T09:37:23.945Z|306386|binding|INFO|Releasing
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 from this
chassis.
That is 49 seconds of this. But on pcacn004 we also see it doing
the same:
2024-07-21T09:36:35.253Z|301841|binding|INFO|Changing
chassis for lport c7e2c10e-43f1-11ef-b3c5-a8698c171668
from pcacn001 to pcacn004.
2024-07-21T09:36:35.253Z|301842|binding|INFO|c7e2c10e-43f1-11ef-b3c5-a8698c171668:
Claiming 00:13:97:f3:ea:30 11.11.1.2
2024-07-21T09:36:35.259Z|301843|binding|INFO|Setting
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 down in
Southbound
2024-07-21T09:36:35.452Z|301846|binding|INFO|Setting
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed
in OVS
2024-07-21T09:36:35.483Z|301850|binding|INFO|Removing
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed
in OVS
2024-07-21T09:36:35.483Z|301851|binding|INFO|Setting
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 down in
Southbound
2024-07-21T09:36:35.562Z|301852|binding|INFO|Setting
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed
in OVS
2024-07-21T09:36:35.562Z|301853|binding|INFO|Setting
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in
Southbound
2024-07-21T09:36:35.743Z|301855|binding|INFO|Removing
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed
in OVS
2024-07-21T09:36:35.743Z|301856|binding|INFO|Setting
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 down in
Southbound
2024-07-21T09:36:35.817Z|301857|binding|INFO|Setting
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 ovn-installed
in OVS
2024-07-21T09:36:35.817Z|301858|binding|INFO|Setting
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in
Southbound
This continues until it finally gets the port at:
2024-07-21T09:37:23.793Z|303182|binding|INFO|Changing
chassis for lport c7e2c10e-43f1-11ef-b3c5-a8698c171668
from pcacn001 to pcacn004.
2024-07-21T09:37:23.793Z|303183|binding|INFO|c7e2c10e-43f1-11ef-b3c5-a8698c171668:
Claiming 00:13:97:f3:ea:30 11.11.1.2
2024-07-21T09:37:23.857Z|303184|binding|INFO|Setting
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in
Southbound
2024-07-21T09:37:23.949Z|303185|binding|INFO|Claiming
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 for this
chassis.
2024-07-21T09:37:23.949Z|303186|binding|INFO|c7e2c10e-43f1-11ef-b3c5-a8698c171668:
Claiming 00:13:97:f3:ea:30 11.11.1.2
2024-07-21T09:37:23.951Z|303187|binding|INFO|Setting
lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in Southbound
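
To put a number on the flapping, the binding messages in an ovn-controller log can be tallied. The following is a rough sketch, assuming each log entry sits on a single line (the entries above are wrapped by the mail client); the function name and counter keys are hypothetical, and the sample lines are unwrapped copies of entries quoted in this message.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the message part of an ovn-controller "binding" INFO log entry.
BINDING_RE = re.compile(r"\|binding\|INFO\|(.*)$")

SAMPLE = [
    "2024-07-21T09:36:35.262Z|304798|binding|INFO|Changing chassis for lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 from pcacn004 to pcacn001",
    "2024-07-21T09:36:35.265Z|304800|binding|INFO|Setting lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 up in Southbound",
    "2024-07-21T09:36:36.339Z|304819|binding|INFO|Setting lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 down in Southbound",
    "2024-07-21T09:37:23.945Z|306386|binding|INFO|Releasing lport c7e2c10e-43f1-11ef-b3c5-a8698c171668 from this chassis.",
]


def summarize(lines: Iterable[str]) -> Counter:
    """Count chassis changes, up/down transitions and releases."""
    counts: Counter = Counter()
    for line in lines:
        match = BINDING_RE.search(line.strip())
        if not match:
            continue  # not a binding INFO entry
        msg = match.group(1)
        if msg.startswith("Changing chassis for lport"):
            counts["chassis_change"] += 1
        elif msg.startswith("Setting lport") and msg.endswith("up in Southbound"):
            counts["up"] += 1
        elif msg.startswith("Setting lport") and msg.endswith("down in Southbound"):
            counts["down"] += 1
        elif msg.startswith("Releasing lport"):
            counts["release"] += 1
    return counts


if __name__ == "__main__":
    print(dict(summarize(SAMPLE)))
```

Running `summarize` over the real log file of each chassis (opened with `open(path)`) gives per-host totals that can be compared before and after changing how requested-chassis is managed.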
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss