Thanks Sunil for confirmation. Btw, I have raised YARN-7453 <https://issues.apache.org/jira/browse/YARN-7453> JIRA to track this issue.
- Rohith Sharma K S

On 7 November 2017 at 16:44, Sunil G <sun...@apache.org> wrote:

> Hi Subru and Arun.
>
> Thanks for driving 2.9 release. Great work!
>
> I installed cluster built from source.
> - Ran few MR jobs with application priority enabled. Runs fine.
> - Accessed new UI and it also seems fine.
>
> However I am also getting same issue as Rohith reported.
> - Started an HA cluster
> - Pushed RM to standby
> - Pushed back RM to active then seeing an exception.
>
> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
>     at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:146)
>     at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894)
>
> Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
>     at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:949)
>
> Will check and post more details,
>
> - Sunil
>
>
> On Tue, Nov 7, 2017 at 12:47 PM Rohith Sharma K S <rohithsharm...@apache.org> wrote:
>
> > Thanks Subru/Arun for the great work!
> >
> > Downloaded source and built from it. Deployed RM HA non-secured cluster
> > along with new YARN UI and ATSv2.
> >
> > I am facing basic RM HA switch issue after first time successful start.
> > *Can anyone else is facing this issue?*
> >
> > When RM is switched from ACTIVE to STANDBY to ACTIVE, RM never switch to
> > active successfully. Exception trace I see from the log is
> >
> > 2017-11-07 12:35:56,540 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election
> > org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
> >     at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:146)
> >     at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894)
> >     at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:473)
> >     at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> >     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> > Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when transitioning to Active mode
> >     at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:325)
> >     at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
> >     ... 4 more
> > Caused by: org.apache.hadoop.service.ServiceStateException: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
> >     at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
> >     at org.apache.hadoop.service.AbstractService.start(AbstractService.java:205)
> >     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1131)
> >     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1171)
> >     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1167)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:422)
> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1886)
> >     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1167)
> >     at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:320)
> >     ... 5 more
> > Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
> >     at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
> >     at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:949)
> >     at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:915)
> >     at org.apache.curator.framework.imps.CuratorTransactionImpl.doOperation(CuratorTransactionImpl.java:159)
> >     at org.apache.curator.framework.imps.CuratorTransactionImpl.access$200(CuratorTransactionImpl.java:44)
> >     at org.apache.curator.framework.imps.CuratorTransactionImpl$2.call(CuratorTransactionImpl.java:129)
> >     at org.apache.curator.framework.imps.CuratorTransactionImpl$2.call(CuratorTransactionImpl.java:125)
> >     at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
> >     at org.apache.curator.framework.imps.CuratorTransactionImpl.commit(CuratorTransactionImpl.java:122)
> >     at org.apache.hadoop.util.curator.ZKCuratorManager$SafeTransaction.commit(ZKCuratorManager.java:403)
> >     at org.apache.hadoop.util.curator.ZKCuratorManager.safeSetData(ZKCuratorManager.java:372)
> >     at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.getAndIncrementEpoch(ZKRMStateStore.java:493)
> >     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:754)
> >     at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
> >     ... 13 more
> >
> > Thanks & Regards
> > Rohith Sharma K S
> >
> > On 4 November 2017 at 04:20, Arun Suresh <asur...@apache.org> wrote:
> >
> > > Hi folks,
> > >
> > > Apache Hadoop 2.9.0 is the first stable release of Hadoop 2.9 line and
> > > will be the latest stable/production release for Apache Hadoop - it
> > > includes 30 New Features with 500+ subtasks, 407 Improvements, 787 Bug
> > > fixes new fixed issues since 2.8.2.
> > >
> > > More information about the 2.9.0 release plan can be found here:
> > > https://cwiki.apache.org/confluence/display/HADOOP/Roadmap#Roadmap-Version2.9
> > >
> > > New RC is available at:
> > > http://home.apache.org/~asuresh/hadoop-2.9.0-RC0/
> > >
> > > The RC tag in git is: release-2.9.0-RC0, and the latest commit id is:
> > > 6697f0c18b12f1bdb99cbdf81394091f4fef1f0a
> > >
> > > The maven artifacts are available via repository.apache.org at:
> > > https://repository.apache.org/content/repositories/orgapachehadoop-1065/
> > >
> > > Please try the release and vote; the vote will run for the usual 5
> > > days, ending on 11/10/2017 4pm PST time.
> > >
> > > Thanks,
> > >
> > > Arun/Subru