** Also affects: neutron/kilo
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1533454

Title:
  L3 agent unable to update HA router state after race between HA router
  creating and deleting

Status in neutron:
  Fix Released
Status in neutron kilo series:
  New

Bug description:
  The router L3 HA binding process does not take into account the fact
  that the port it is binding to the agent can be concurrently deleted.

  Details:

  When the neutron server deletes all the resources of an HA router, the
  L3 agents are not immediately aware of it, so a race can occur in a
  sequence like this:

  1. The neutron server deletes all resources of an HA router.
  2. The RPC fanout reaches L3 agent 1, on which the HA router was in
     master state.
  3. On L3 agent 2, the 'backup' router promotes itself to master and
     sends the neutron server an HA router state change notification.
  4. PortNotFound is raised in the function that updates HA router
     states. (The DB error seems to no longer occur.)

  How do steps 2 and 3 happen?

  Suppose that L3 agent 2 hosts many more HA routers than L3 agent 1, or
  that for any other reason L3 agent 2 receives or processes the
  deletion RPC later than L3 agent 1. When L3 agent 1 removes the HA
  router's keepalived process, the backup router on L3 agent 2 soon
  detects this via the VRRP protocol. At that point, on L3 agent 2 the
  router deletion RPC is still waiting in the RouterUpdate queue, or is
  at some intermediate step of the HA router deletion procedure, and
  router_info still contains the router's info. So L3 agent 2 runs the
  state change procedure, that is, it notifies the neutron server to
  update the router state.
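
The race in steps 1 to 4 can be illustrated with a small self-contained simulation. This is only a sketch under stated assumptions: `FakeServer`, `simulate_race`, `update_router_state_tolerant`, the `"r1"`/`"agent1"`/`"agent2"` identifiers, and the local `PortNotFound` class are all hypothetical stand-ins, not neutron's real classes or RPC API. The tolerant variant shows one possible way to absorb the late notification; it is not necessarily the fix that was released.

```python
# Self-contained simulation of the race. Every name here is hypothetical
# and does not mirror neutron's real classes or RPC API.


class PortNotFound(Exception):
    """Stand-in for the error raised when an HA port no longer exists."""


class FakeServer:
    def __init__(self):
        # Maps (router_id, agent_host) -> HA state of that agent's HA port.
        self.ha_ports = {}

    def create_ha_router(self, router_id, hosts):
        # The first host starts as master, the rest as backup.
        for i, host in enumerate(hosts):
            self.ha_ports[(router_id, host)] = "master" if i == 0 else "backup"

    def delete_ha_router(self, router_id):
        # Step 1: the server removes all resources of the HA router.
        for key in [k for k in self.ha_ports if k[0] == router_id]:
            del self.ha_ports[key]

    def update_router_state(self, router_id, host, state):
        # Step 4: the port targeted by the update may already be gone.
        if (router_id, host) not in self.ha_ports:
            raise PortNotFound(f"HA port for {router_id} on {host}")
        self.ha_ports[(router_id, host)] = state

    def update_router_state_tolerant(self, router_id, host, state):
        # Assumed mitigation: treat a missing port as "router already
        # deleted" and report that the update was skipped.
        try:
            self.update_router_state(router_id, host, state)
            return True
        except PortNotFound:
            return False


def simulate_race(server):
    server.create_ha_router("r1", ["agent1", "agent2"])
    server.delete_ha_router("r1")  # step 1
    # Step 2: agent1 handles the delete RPC and stops keepalived.
    # Step 3: agent2 still holds stale router info, sees the VRRP
    # failover, and reports itself as master.
    return ("r1", "agent2", "master")
```

Passing the tuple returned by `simulate_race` to `update_router_state` raises `PortNotFound`, which corresponds to step 4, while `update_router_state_tolerant` returns `False` for the same notification.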