Public bug reported:

If neutron-server runs with two or more API workers, or two neutron-server nodes sit behind a load balancer, live-migrating a VM can make l2 pop fail (not always). The reason is:

1. When nova finishes live-migrating a VM, it updates the port's host id to the destination host.

2. One neutron-server worker receives this request and does l2 pop: it sees that the port's host id has changed while its status is still ACTIVE, so it records the port in its own process memory (self.migrated_ports).

3. When the l2 agent scans this port and updates its status ACTIVE -> BUILD -> ACTIVE, a different neutron-server worker may receive that RPC request. That worker has no record of the migration in its memory, so l2 pop fails for this port.

The sketch after this list illustrates the race; the relevant code from the l2pop mechanism driver is quoted below it.
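A minimal, self-contained sketch of the race (hypothetical names, not neutron code): each worker process keeps its own migrated_ports dict, so the worker that handles the follow-up status RPC may never have seen the migration.

# Hypothetical illustration, not neutron code: two workers, each with
# its own in-process dict.
migrated_ports_w1 = {}    # worker 1's memory
migrated_ports_w2 = {}    # worker 2's memory

def worker1_handles_host_update(port_id, orig_host):
    # Step 2 above: worker 1 sees the host changed while the status is
    # ACTIVE and records the original binding -- but only in its own
    # process memory.
    migrated_ports_w1[port_id] = orig_host

def worker2_handles_status_rpc(port_id):
    # Step 3 above: worker 2 gets the BUILD->ACTIVE RPC, but its own
    # dict has no record of the migration, so the fdb update is
    # skipped.
    orig_host = migrated_ports_w2.get(port_id)
    if orig_host is None:
        print("l2 pop fails: no migration record for %s" % port_id)
    else:
        print("send fdb entries for %s (was on %s)" % (port_id, orig_host))

worker1_handles_host_update("port-1", "compute-1")
worker2_handles_status_rpc("port-1")   # -> l2 pop fails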
def update_port_postcommit(self, context):
    ...
    # (elided above: port and orig are taken from context.current and
    # context.original)
    if port['device_owner'] == const.DEVICE_OWNER_DVR_INTERFACE:
        if context.status == const.PORT_STATUS_ACTIVE:
            self._update_port_up(context)
        if context.status == const.PORT_STATUS_DOWN:
            agent_host = context.host
            fdb_entries = self._get_agent_fdb(
                context, port, agent_host)
            self.L2populationAgentNotify.remove_fdb_entries(
                self.rpc_ctx, fdb_entries)
    elif (context.host != context.original_host
          and context.status == const.PORT_STATUS_ACTIVE
          and not self.migrated_ports.get(orig['id'])):
        # The port has been migrated. We have to store the original
        # binding to send appropriate fdb once the port will be set
        # on the destination host
        self.migrated_ports[orig['id']] = (
            (orig, context.original_host))

** Affects: neutron
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1493341
Title: l2 pop failed if live-migrate a VM with multiple neutron-server workers
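A possible direction for a fix (a sketch under assumptions, not a patch that exists upstream): keep the migration marker in storage shared by all workers, e.g. the neutron database, instead of the per-process self.migrated_ports dict. Minimal illustration, with sqlite standing in for a shared database:

import sqlite3

# sqlite stands in for a store reachable by every worker; an in-memory
# sqlite DB is NOT actually shared across processes, it just keeps the
# demo self-contained.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE migrated_ports (port_id TEXT PRIMARY KEY, "
           "orig_host TEXT)")

def record_migration(port_id, orig_host):
    # Whichever worker sees the host-id change records the original
    # binding in the shared store...
    db.execute("INSERT OR REPLACE INTO migrated_ports VALUES (?, ?)",
               (port_id, orig_host))

def pop_migration(port_id):
    # ...and whichever worker later gets the status RPC can read and
    # clear it, regardless of which process recorded it.
    row = db.execute("SELECT orig_host FROM migrated_ports "
                     "WHERE port_id = ?", (port_id,)).fetchone()
    if row:
        db.execute("DELETE FROM migrated_ports WHERE port_id = ?",
                   (port_id,))
        return row[0]
    return None

record_migration("port-1", "compute-1")    # worker A, step 2
print(pop_migration("port-1"))             # worker B, step 3 -> compute-1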