Hrishikesh Gadre created HDFS-13799:
---------------------------------------

             Summary: TestEditLogTailer#testTriggersLogRollsForAllStandbyNN 
test fails due to missing synchronization between rollEditsRpcExecutor and 
tailerThread shutdown
                 Key: HDFS-13799
                 URL: https://issues.apache.org/jira/browse/HDFS-13799
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: ha
    Affects Versions: 3.0.0
            Reporter: Hrishikesh Gadre


The TestEditLogTailer#testTriggersLogRollsForAllStandbyNN unit test is failing 
in our internal environment with the following error:
{noformat}
java.lang.AssertionError: Test resulted in an unexpected exit
        at org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testTriggersLogRollsForAllStandbyNN(TestEditLogTailer.java:245)
{noformat}
The test fails because of the following error during shutdown of the 
MiniDFSCluster:
{noformat}
2018-07-31 21:59:27,806 [main] INFO  hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdown(1965)) - Shutting down the Mini HDFS Cluster
2018-07-31 21:59:27,806 [main] FATAL hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdown(1968)) - Test resulted in an unexpected exit
1: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@1ce1d2b6 rejected from java.util.concurrent.ThreadPoolExecutor@12263f5a[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
        at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:265)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:441)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397)
        at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:482)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:393)
Caused by: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@1ce1d2b6 rejected from java.util.concurrent.ThreadPoolExecutor@12263f5a[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
        at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
        at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
        at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:134)
        at java.util.concurrent.Executors$DelegatedExecutorService.submit(Executors.java:681)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:351)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:411)
        ... 4 more
{noformat}
The EditLogTailer class does not appear to handle shutdown correctly. 
Specifically, the EditLogTailer#stop() method shuts down the 
rollEditsRpcExecutor executor service before setting the tailerThread#shouldRun 
flag. This creates a race condition: the tailerThread can still try to submit a 
new task to an executor service that has already been asked to shut down. If 
that happens, the submission fails with an unexpected 
RejectedExecutionException, which the tailer thread treats as fatal, resulting 
in a test failure. The fix should be to properly synchronize the shutdown of 
tailerThread with that of rollEditsRpcExecutor.
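
As a rough illustration of the ordering problem, here is a minimal, 
self-contained sketch. It is not the Hadoop code and not necessarily the fix 
that should be committed: TailerShutdownSketch, runOnce and the empty 
"roll edits" task are hypothetical stand-ins, and only the names 
rollEditsRpcExecutor, tailerThread and shouldRun are borrowed from the 
description above. The sketch stops the submitting thread first and only then 
shuts down the executor, so no task can be submitted after shutdown.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for EditLogTailer; illustrates shutdown ordering only. */
class TailerShutdownSketch {
  private final ExecutorService rollEditsRpcExecutor =
      Executors.newSingleThreadExecutor();
  private final AtomicInteger rejections = new AtomicInteger();
  private volatile boolean shouldRun = true;
  private final Thread tailerThread;

  TailerShutdownSketch() {
    // Placeholder for the real "trigger log roll" RPC (assumption).
    Runnable rollEdits = () -> { };
    tailerThread = new Thread(() -> {
      while (shouldRun) {
        try {
          rollEditsRpcExecutor.submit(rollEdits);
          Thread.sleep(1);
        } catch (RejectedExecutionException e) {
          // With the stop() ordering below this should never be reached.
          rejections.incrementAndGet();
        } catch (InterruptedException e) {
          // Interrupted by stop(); the loop re-checks shouldRun and exits.
        }
      }
    });
  }

  void start() {
    tailerThread.start();
  }

  /** Stop the submitter first, then shut down the executor it submits to. */
  void stop() {
    shouldRun = false;
    tailerThread.interrupt();
    try {
      tailerThread.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    // No thread can submit any more, so shutting down is now safe.
    rollEditsRpcExecutor.shutdown();
  }

  /** Runs one start/stop cycle and returns the number of rejected submissions. */
  static int runOnce(long runMillis) {
    TailerShutdownSketch sketch = new TailerShutdownSketch();
    sketch.start();
    try {
      Thread.sleep(runMillis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    sketch.stop();
    return sketch.rejections.get();
  }

  public static void main(String[] args) {
    System.out.println("rejected submissions: " + runOnce(50));
  }
}
```

Moving rollEditsRpcExecutor.shutdown() to the top of stop() in this sketch 
reproduces the ordering described above, in which a submission can race with 
the shutdown and be rejected.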


