[ https://issues.apache.org/jira/browse/HADOOP-19337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiaoqiao He resolved HADOOP-19337.
----------------------------------
    Fix Version/s: 3.5.0
     Hadoop Flags: Reviewed
       Resolution: Fixed

> Fix ZKFailoverController NPE issue due to integer overflow in parseInt when initHM.
> -----------------------------------------------------------------------------------
>
>                 Key: HADOOP-19337
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19337
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: common
>            Reporter: ConfX
>            Assignee: ConfX
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 3.5.0
>
>
> h3. *What Happened:*
> A null pointer exception occurs when trying to shut down healthMonitor.
> healthMonitor is never initialized if ha.health-monitor.rpc-timeout.ms is
> set to 4294967295. The healthMonitor constructor uses parseInt() to parse
> configuration values, and this value exceeds the int range (maximum
> 2147483647), so parseInt throws an exception during healthMonitor
> initialization.
> h3. *Buggy Code:*
>
> {code:java}
> try {
>   initRPC();
>   initHM(); // -> This throws a java.lang.NumberFormatException and healthMonitor is not initialized
>   startRPC();
>   mainLoop();
> } catch (Exception e) {
>   LOG.error("The failover controller encounters runtime error: ", e);
>   throw e;
> }
> ...
>
>   healthMonitor.shutdown(); // -> NPE if healthMonitor is not initialized
>   healthMonitor.join();
> }
> {code}
>
>
> {code:java}
> Caused by: java.lang.NullPointerException
>   at org.apache.hadoop.ha.ZKFailoverController.doRun(ZKFailoverController.java:266)
>   at org.apache.hadoop.ha.ZKFailoverController.access$000(ZKFailoverController.java:65)
>   at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:186)
>   at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:182)
>   at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:520)
>   at org.apache.hadoop.ha.ZKFailoverController.run(ZKFailoverController.java:182)
>   at org.apache.hadoop.ha.MiniZKFCCluster$DummyZKFCThread.doWork(MiniZKFCCluster.java:301)
>   at org.apache.hadoop.test.MultithreadedTestUtil$TestingThread.run(MultithreadedTestUtil.java:189)
> {code}
> h3. *How to Reproduce:*
> (1) Set ha.health-monitor.rpc-timeout.ms to 4294967295.
> (2) Run the test:
> org.apache.hadoop.ha.TestZKFailoverController#testVerifyObserverState

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org
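
The failure chain described in the report can be reproduced in isolation. The sketch below is not Hadoop code: `ParseIntOverflowSketch`, `Monitor`, and `run` are hypothetical names, and the null check in the `finally` block is only one possible guard, not necessarily the change that was merged for HADOOP-19337. It assumes the timeout string reaches `Integer.parseInt` unchanged, as the report states.

```java
public class ParseIntOverflowSketch {

    /** Hypothetical stand-in for a monitor object; not Hadoop's HealthMonitor. */
    static final class Monitor {
        final int rpcTimeoutMs;

        Monitor(String rpcTimeoutMs) {
            // Integer.parseInt throws NumberFormatException for any value
            // outside the int range, e.g. 4294967295 > 2147483647.
            this.rpcTimeoutMs = Integer.parseInt(rpcTimeoutMs);
        }

        void shutdown() {
            // Nothing to release in this sketch.
        }
    }

    /**
     * Tries to build a Monitor from the given timeout string and always runs
     * the cleanup step. Returns a short label describing what happened.
     */
    static String run(String rpcTimeoutMs) {
        Monitor monitor = null;
        String outcome;
        try {
            monitor = new Monitor(rpcTimeoutMs); // may throw before assignment
            outcome = "started";
        } catch (NumberFormatException e) {
            outcome = "NumberFormatException";
        } finally {
            // Without this null check, the cleanup would throw a
            // NullPointerException whenever the constructor failed.
            if (monitor != null) {
                monitor.shutdown();
            }
        }
        return outcome;
    }

    public static void main(String[] args) {
        System.out.println(run("45000"));      // in range
        System.out.println(run("4294967295")); // out of int range
    }
}
```

With the guard in place, the out-of-range value surfaces as the original `NumberFormatException` instead of being masked by a `NullPointerException` raised during cleanup.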