[ https://issues.apache.org/jira/browse/HIVE-13429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15234068#comment-15234068 ]
Hive QA commented on HIVE-13429:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12797803/HIVE-13429.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 9970 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.metastore.TestMetaStoreAuthorization.testMetaStoreAuthorization
org.apache.hadoop.hive.ql.security.TestExtendedAcls.org.apache.hadoop.hive.ql.security.TestExtendedAcls
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testConnection
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testNegativeTokenAuth
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testProxyAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7535/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7535/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7535/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12797803 - PreCommit-HIVE-TRUNK-Build

> Tool to remove dangling scratch dir
> -----------------------------------
>
>                 Key: HIVE-13429
>                 URL: https://issues.apache.org/jira/browse/HIVE-13429
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>         Attachments: HIVE-13429.1.patch, HIVE-13429.2.patch,
> HIVE-13429.3.patch, HIVE-13429.4.patch
>
>
> We have seen cases where a user leaves the scratch dir behind, which
> eventually eats up HDFS storage. This can happen when a VM restarts and
> leaves Hive no chance to run its shutdown hook. It applies to both HiveCli
> and HiveServer2. Here we provide an external tool to clear dead scratch dirs
> as needed.
> We need a way to identify which scratch dirs are in use. We will rely on the
> HDFS write lock for that. Here is how the HDFS write lock works:
> 1. An HDFS client opens an HDFS file for write and only closes it at
> shutdown
> 2. The cleanup process can try to open the HDFS file for write. If the client
> holding this file is still running, we will get an exception. Otherwise, we
> know the client is dead
> 3. If the HDFS client dies without closing the HDFS file, the NN will reclaim
> the lease after 10 min, i.e., the HDFS file held by the dead client becomes
> writable again after 10 min
> So here is how we remove dangling scratch directories in Hive:
> 1. HiveCli/HiveServer2 opens a well-named lock file in the scratch directory
> and only closes it when it is about to drop the scratch directory
> 2. A command line tool, cleardanglingscratchdir, will check every scratch
> directory and try to open the lock file for write. If it does not get an
> exception, the owner is dead and we can safely remove the scratch directory
> 3. The 10 min window means it is possible that a HiveCli/HiveServer2 is dead
> but we still cannot reclaim the scratch directory for another 10 min. But
> this should be tolerable

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
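
The lock-file scheme in the quoted issue description can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the code in the attached patch and not the HDFS client API: `ScratchDirLeaseSketch`, `LeaseTable`, the `/lock` file name, and the example path are all hypothetical, the "NN" is replaced by a plain map, and the 10 minute window is taken directly from the description. A live session is modelled as one that keeps renewing its lease; a crashed session simply stops renewing.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the lock-file scheme; hypothetical names, no Hive or HDFS calls. */
final class ScratchDirLeaseSketch {
    /** Assumed lease recovery window, taken from the "10 min" in the description. */
    static final long LEASE_TIMEOUT_MS = 10 * 60 * 1000L;

    /** Stand-in for the NameNode's record of which lock files are open for write. */
    static final class LeaseTable {
        // lock file path -> time (ms) the holder last renewed its lease
        private final Map<String, Long> lastRenewed = new HashMap<>();

        /** A live session opens its lock file for write, or renews the lease. */
        void openOrRenew(String lockFile, long nowMs) {
            lastRenewed.put(lockFile, nowMs);
        }

        /** A session closes the lock file just before dropping its scratch dir. */
        void close(String lockFile) {
            lastRenewed.remove(lockFile);
        }

        /** The cleaner's probe: would an open-for-write succeed right now? */
        boolean canOpenForWrite(String lockFile, long nowMs) {
            Long renewed = lastRenewed.get(lockFile);
            return renewed == null || nowMs - renewed >= LEASE_TIMEOUT_MS;
        }
    }

    /** True when the owner is judged dead, so the scratch dir may be removed. */
    static boolean isDangling(LeaseTable leases, String scratchDir, long nowMs) {
        // "/lock" is a hypothetical name for the well-named lock file.
        return leases.canOpenForWrite(scratchDir + "/lock", nowMs);
    }

    public static void main(String[] args) {
        LeaseTable leases = new LeaseTable();
        String dir = "/tmp/hive/alice/session-1"; // hypothetical scratch dir

        // The session takes the lock at t=0, then crashes without closing it.
        leases.openOrRenew(dir + "/lock", 0L);

        // Five minutes later the lease is still held, so the cleaner must wait.
        System.out.println("dangling at 5 min:  " + isDangling(leases, dir, 5 * 60 * 1000L));
        // Once the window has passed, the probe succeeds and the dir can go.
        System.out.println("dangling at 10 min: " + isDangling(leases, dir, 10 * 60 * 1000L));
    }
}
```

The model makes the trade-off in step 3 concrete: between the crash and the end of the lease window the cleaner cannot tell a dead owner from a live one, so it leaves the directory alone.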