Public bug reported:

Reproduction:
* Create a new router
* Attach an external interface. external_gateway_added executes successfully, but processing fails some time before self.ex_gw_port = self.get_ex_gw_port() is reached. (An example of such a failure would be an RPC error when trying to update FIP statuses. In that case extra routes would not be configured either, and post-router-creation events would not be sent, which means that, for example, the metadata proxy would not be started.)
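The failure window in the reproduction can be sketched with a toy model. This is not the neutron L3 agent code: ToyRouter, process(), device_vips and VIPDuplicateAddressError are hypothetical stand-ins, and the only things taken from the report are the names ex_gw_port and external_gateway_added and the fact that the cache is assigned after the gateway has been configured.

```python
class VIPDuplicateAddressError(Exception):
    """Hypothetical stand-in for the VIPDuplicateAddressException in the report."""


class ToyRouter:
    """Toy model only; this does not mirror the real RouterInfo classes."""

    def __init__(self):
        self.ex_gw_port = None    # local cache, None until a pass completes
        self.device_vips = set()  # stands in for state on the external device

    def external_gateway_added(self, port):
        # Assumed non-idempotent: configuring the same VIP twice is an error.
        if port in self.device_vips:
            raise VIPDuplicateAddressError(port)
        self.device_vips.add(port)

    def process(self, port, fail_after_gateway=False):
        # The gateway is (re)added whenever the cache says there is none.
        if self.ex_gw_port is None:
            self.external_gateway_added(port)
        if fail_after_gateway:
            # For example, an RPC error while updating FIP statuses.
            raise RuntimeError("simulated RPC failure")
        # The cache is only updated at the very end of a successful pass.
        self.ex_gw_port = port


router = ToyRouter()

# First pass: the gateway is configured, then a later step fails.
try:
    router.process("gw-port", fail_after_gateway=True)
except RuntimeError:
    pass

# The device is configured, but the cache still claims there is no gateway.
assert router.ex_gw_port is None
assert "gw-port" in router.device_vips

# Any later update repeats the gateway add and trips over existing state.
try:
    router.process("gw-port")
    second_update_failed = False
except VIPDuplicateAddressError:
    second_update_failed = True
```

In this sketch every later call to process() fails in the same way, because the cache is never updated once the device and the cache disagree.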
Any follow-up update to the router (add/remove interface, add/remove FIP) will then fail on non-idempotent operations on the external device. This is because any update will try to add the gateway again (because self.ex_gw_port is still None). Even without a specific failure, reconfiguring the external device is wasteful.

HA routers in particular will fail by throwing VIPDuplicateAddressException for the external device's VIP. This behavior was changed in a recent Mitaka patch (https://review.openstack.org/#/c/196893/50/neutron/agent/l3/ha_router.py), so this affects Juno to Liberty but not master and future releases.

The impact on legacy or distributed routers is less severe, as their process_external and routes_updated seem to be idempotent. This was verified against master via a makeshift functional test; I could not vouch for previous releases.

Severity: It's severe for HA routers from Juno to Liberty, but not as much for other router types or HA routers on master.

** Affects: neutron
     Importance: Medium
         Status: New

** Tags: l3-dvr-backlog l3-ha l3-ipam-dhcp
-- 
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1523999

Title:
  Any error in L3 agent after external gateway is configured but before the local cache is updated results in errors in subsequent router updates

Status in neutron:
  New