Github user ustcweizhou commented on a diff in the pull request: https://github.com/apache/cloudstack/pull/1198#discussion_r47204914 --- Diff: engine/orchestration/src/org/apache/cloudstack/engine/orchestration/NetworkOrchestrator.java --- @@ -2558,6 +2575,62 @@ public boolean restartNetwork(Long networkId, Account callerAccount, User caller } } + /* If there are redundant routers in the isolated network, we follow the steps to make the network working better + * (1) destroy backup router if exists + * (2) create new backup router + * (3) destroy master router (then the backup will become master) + * (4) create a new router as backup router. + */ + private boolean restartGuestNetworkWithRedundantRouters(NetworkVO network, List<DomainRouterVO> routers, ReservationContext context) throws ResourceUnavailableException, ConcurrentOperationException, InsufficientCapacityException { + Account caller = CallContext.current().getCallingAccount(); + long callerUserId = CallContext.current().getCallingUserId(); + + // check the master and backup redundant state + DomainRouterVO masterRouter = null; + DomainRouterVO backupRouter = null; + if (routers != null && routers.size() == 1) { + masterRouter = routers.get(0); + } if (routers != null && routers.size() == 2) { + DomainRouterVO router1 = routers.get(0); + DomainRouterVO router2 = routers.get(1); + if (router1.getRedundantState() == RedundantState.MASTER || router2.getRedundantState() == RedundantState.BACKUP) { + masterRouter = router1; + backupRouter = router2; + } else if (router1.getRedundantState() == RedundantState.BACKUP || router2.getRedundantState() == RedundantState.MASTER) { + masterRouter = router2; + backupRouter = router1; + } else { // both routers are in UNKNOWN state + masterRouter = router1; + backupRouter = router2; + } + } + + NetworkOfferingVO offering = _networkOfferingDao.findByIdIncludingRemoved(network.getNetworkOfferingId()); + DeployDestination dest = new DeployDestination(_dcDao.findById(network.getDataCenterId()), null, null, null); + List<Provider> providersToImplement = getNetworkProviders(network.getId()); + + // destroy backup router + if (backupRouter != null) { + _routerService.destroyRouter(backupRouter.getId(), caller, callerUserId); + } + // create new backup router + implementNetworkElements(dest, context, network, offering, providersToImplement); + + // destroy master router + if (masterRouter != null) { + try { + Thread.sleep(10000); // wait 10s for the keepalived/conntrackd on backup router --- End diff -- @wilderrodrigues I made this change for 4.2.1 at first. I met some issues during the restartnetwork because of the keepalived/conntrackd, hence I added the sleep. 10 seconds is enough to wait the services to be running. I did not check the difference between keepalived on Debian 7.0.0 and Debian 7.9.0, so I thought the issue still exist. I am busy on other issues and have no time on writing the test, automated test and improvement are welcome.
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---