Public bug reported:

I am running the ovn sandbox, a second chassis, and neutron. I synchronize the neutron database with the databases of the sandbox, run neutron-server, and possibly run a few ovs-vsctl commands on the chassis to set up ovs ports.

I notice that some commands on the chassis can trigger some sort of infinite loop in neutron. For example, running

  ovs-vsctl set open . external-ids:ovn-cms-options=enable-chassis-as-gw
  ovs-vsctl set open . external-ids:ovn-cms-options=xx
  ovs-vsctl set open . external-ids:ovn-cms-options=enable-chassis-as-gw

on the second chassis will trigger transactions "in a loop" on the neutron-server:

  ...
  Successfully bumped revision number for resource f32ac6cc (type: ports) to 571
  Router 079cde19-0b92-48f8-bef2-5e35b939a7a1 is bound to host sandbox
  Running txn n=1 command(idx=0): CheckRevisionNumberCommand
  Running txn n=1 command(idx=1): UpdateLRouterPortCommand
  Running txn n=1 command(idx=2): SetLRouterPortInLSwitchPortCommand
  Successfully bumped revision number for resource f32ac6cc (type: router_ports) to 572
  Running txn n=1 command(idx=0): CheckRevisionNumberCommand
  Running txn n=1 command(idx=1): SetLSwitchPortCommand
  Running txn n=1 command(idx=2): PgDelPortCommand
  Successfully bumped revision number for resource f32ac6cc (type: ports) to 572
  Router 079cde19-0b92-48f8-bef2-5e35b939a7a1 is bound to host sandbox
  Running txn n=1 command(idx=0): CheckRevisionNumberCommand
  Running txn n=1 command(idx=1): UpdateLRouterPortCommand
  Running txn n=1 command(idx=2): SetLRouterPortInLSwitchPortCommand
  Successfully bumped revision number for resource f32ac6cc (type: router_ports) to 573
  Running txn n=1 command(idx=0): CheckRevisionNumberCommand
  Running txn n=1 command(idx=1): SetLSwitchPortCommand
  Running txn n=1 command(idx=2): PgDelPortCommand
  ...

This is not limited to changes of external-ids:ovn-cms-options; other ovs-vsctl commands can trigger the same issue. neutron-server CPU consumption jumps to 100% and the revision_number of the ports keeps increasing. Restarting neutron-server fixes the issue temporarily.

I am not sure how to provide a simple reproducer because I did not find any instructions for running neutron standalone with two OVN chassis.

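One way to confirm the loop from a neutron-server log is to track the "Successfully bumped revision number" lines per resource. The following is only a minimal sketch: it assumes the log lines contain the message format shown in the excerpt above, and the function name `find_revision_loops` and its `threshold` parameter are hypothetical, not part of neutron.

```python
import re
from collections import defaultdict

# Matches messages such as:
#   Successfully bumped revision number for resource f32ac6cc (type: ports) to 571
# search() is used below so that timestamps or logger prefixes are tolerated.
BUMP_RE = re.compile(
    r"Successfully bumped revision number for resource (\S+) "
    r"\(type: (\w+)\) to (\d+)"
)


def find_revision_loops(lines, threshold=3):
    """Return {(resource, type): [revisions]} for every resource that was
    bumped at least `threshold` times with strictly increasing revisions.

    `threshold` is an arbitrary cut-off chosen for illustration.
    """
    bumps = defaultdict(list)
    for line in lines:
        match = BUMP_RE.search(line)
        if match:
            resource, rtype, revision = match.groups()
            bumps[(resource, rtype)].append(int(revision))
    return {
        key: revs
        for key, revs in bumps.items()
        if len(revs) >= threshold
        and all(a < b for a, b in zip(revs, revs[1:]))
    }
```

Feeding it an open log file (any iterable of lines works) returns the resources whose revision number only ever goes up; during the loop described here, the same port would be expected to show up under both the `ports` and `router_ports` types.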
I will investigate what is happening locally.

Version: main branch of OVN (d41a337fe3b608a8f90de8722d148344011f0bd8) and of Neutron (94d36862c207b1e4d984d28874ca2f3bd09c855f).

It's not a blocker as long as it happens only on my laptop.

** Affects: neutron
     Importance: Undecided
         Status: New

** Tags: ovn

** Attachment added: "logs of one loop"
   https://bugs.launchpad.net/bugs/1926838/+attachment/5494052/+files/logs1

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1926838

Title:
  [OVN] infinite loop in ovsdb_monitor

Status in neutron:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1926838/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp