[ 
https://issues.apache.org/jira/browse/HADOOP-19337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He resolved HADOOP-19337.
----------------------------------
    Fix Version/s: 3.5.0
     Hadoop Flags: Reviewed
       Resolution: Fixed

> Fix ZKFailoverController NPE issue due to integer overflow in parseInt when 
> initHM.
> -----------------------------------------------------------------------------------
>
>                 Key: HADOOP-19337
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19337
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: common
>            Reporter: ConfX
>            Assignee: ConfX
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 3.5.0
>
>
> h3. *What Happened:* 
> A NullPointerException occurs when ZKFailoverController tries to shut down 
> healthMonitor. healthMonitor is never initialized if 
> ha.health-monitor.rpc-timeout.ms is set to 4294967295. The healthMonitor 
> constructor uses parseInt() to parse configuration values; 4294967295 exceeds 
> Integer.MAX_VALUE (2147483647), so parseInt() throws a NumberFormatException 
> during healthMonitor initialization. 
> h3. *Buggy Code:* 
>  
> {code:java}
> try {
>   initRPC();
>   initHM(); // -> throws java.lang.NumberFormatException; healthMonitor is not initialized
>   startRPC();
>   mainLoop();
> } catch (Exception e) {
>   LOG.error("The failover controller encounters runtime error: ", e);
>   throw e;
> } 
>   ...
>   
>   healthMonitor.shutdown(); // -> NPE if healthMonitor is not initialized
>   healthMonitor.join();
> } {code}
>  
>  
> {code:java}
> Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.ha.ZKFailoverController.doRun(ZKFailoverController.java:266)
>         at org.apache.hadoop.ha.ZKFailoverController.access$000(ZKFailoverController.java:65)
>         at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:186)
>         at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:182)
>         at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:520)
>         at org.apache.hadoop.ha.ZKFailoverController.run(ZKFailoverController.java:182)
>         at org.apache.hadoop.ha.MiniZKFCCluster$DummyZKFCThread.doWork(MiniZKFCCluster.java:301)
>         at org.apache.hadoop.test.MultithreadedTestUtil$TestingThread.run(MultithreadedTestUtil.java:189)
>  {code}
> h3. *How to Reproduce:*
> (1) Set ha.health-monitor.rpc-timeout.ms to 4294967295.  
> (2) Run the test 
> org.apache.hadoop.ha.TestZKFailoverController#testVerifyObserverState
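
The overflow described in the report can be reproduced without Hadoop. The sketch below is only an illustration using the JDK: the class name RpcTimeoutParseDemo and the helper fitsInInt are hypothetical and are not part of Hadoop or of the fix. It shows that the reported value is rejected by Integer.parseInt() while the same text parses as a long.

```java
class RpcTimeoutParseDemo {
    // Hypothetical helper: true when the decimal text fits in a 32-bit signed int.
    static boolean fitsInInt(String value) {
        try {
            Integer.parseInt(value);
            return true;
        } catch (NumberFormatException e) {
            // parseInt rejects text outside [Integer.MIN_VALUE, Integer.MAX_VALUE].
            return false;
        }
    }

    public static void main(String[] args) {
        String reported = "4294967295"; // value taken from the report
        System.out.println("Integer.MAX_VALUE = " + Integer.MAX_VALUE);
        System.out.println(reported + " fits in int: " + fitsInInt(reported));
        // The same text is a valid long, so it is only the int range that is exceeded.
        System.out.println(reported + " as long: " + Long.parseLong(reported));
    }
}
```

Any timeout above 2147483647 ms would presumably hit the same exception, since the limit is the int range rather than this particular value.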

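The shutdown sequence quoted above dereferences healthMonitor even when initialization failed. One way to avoid the NPE is a null check before shutdown; the following is a minimal sketch of that idea, not the actual patch. GuardedShutdownSketch and Monitor are hypothetical stand-ins for the Hadoop classes, and the sketch assumes the cleanup runs in a finally block, which the report elides.

```java
class GuardedShutdownSketch {
    /** Hypothetical stand-in for the health monitor; not the Hadoop class. */
    static class Monitor {
        boolean stopped = false;

        Monitor(String timeoutMs) {
            // Mirrors the reported failure mode: parseInt rejects out-of-range text.
            Integer.parseInt(timeoutMs);
        }

        void shutdown() {
            stopped = true;
        }
    }

    private Monitor monitor; // stays null if the constructor throws

    /** Returns the simple name of the exception that reaches the caller, or "none". */
    String run(String timeoutMs, boolean guard) {
        try {
            try {
                monitor = new Monitor(timeoutMs);
            } finally {
                // Assumed cleanup path. Without the guard, a null monitor raises an
                // NPE here that replaces the original NumberFormatException.
                if (!guard || monitor != null) {
                    monitor.shutdown();
                }
            }
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(new GuardedShutdownSketch().run("4294967295", false));
        System.out.println(new GuardedShutdownSketch().run("4294967295", true));
        System.out.println(new GuardedShutdownSketch().run("5000", true));
    }
}
```

With the guard in place the caller sees the NumberFormatException that points at the bad configuration value, instead of an NPE that hides it.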

