[ https://issues.apache.org/jira/browse/HIVE-20025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16532351#comment-16532351 ]
Hive QA commented on HIVE-20025:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12930097/HIVE-20025.04.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.
{color:green}SUCCESS:{color} +1 due to 14637 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12367/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12367/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12367/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12930097 - PreCommit-HIVE-Build

> Clean-up of event files created by HiveProtoLoggingHook.
> --------------------------------------------------------
>
>                 Key: HIVE-20025
>                 URL: https://issues.apache.org/jira/browse/HIVE-20025
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.0.0
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>            Priority: Major
>              Labels: Hive, hooks, pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-20025.01.patch, HIVE-20025.02.patch, HIVE-20025.03.patch, HIVE-20025.04.patch
>
>
> Currently, HiveProtoLoggingHook writes event data to HDFS, and the number of files can grow very large.
> Since the files are created under a folder whose path includes the date, Hive should have a way to clean up data older than a configured time/date. This can be a job that runs as infrequently as once a day.
> The retention time should default to 1 week. There should also be a sane upper bound on the number of files, so that when a large cluster generates a spike of files, we don't force the cluster to fall over.
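The description calls for a TTL-based purge of the date-partitioned event directories. Below is a minimal sketch of such a cleaner, assuming partition-style directory names like date=2018-07-04 under a configurable base path; the class name, default path, and 7-day TTL are illustrative assumptions, not the actual HIVE-20025 patch.

{code:java}
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Minimal sketch of a TTL-based cleaner for date-partitioned event
 * directories. Layout, names, and defaults here are assumptions for
 * illustration; a real implementation would wire them to HiveConf.
 */
public class ProtoEventDirCleaner {

  private static final DateTimeFormatter DIR_DATE =
      DateTimeFormatter.ofPattern("yyyy-MM-dd");

  public static void main(String[] args) throws Exception {
    // Hypothetical defaults; command-line args override them.
    Path baseDir = new Path(args.length > 0 ? args[0] : "/tmp/hive/query_data");
    long ttlDays = args.length > 1 ? Long.parseLong(args[1]) : 7;

    FileSystem fs = baseDir.getFileSystem(new Configuration());
    LocalDate cutoff = LocalDate.now().minusDays(ttlDays);

    for (FileStatus status : fs.listStatus(baseDir)) {
      String name = status.getPath().getName();
      // Only consider partition-style directories such as "date=2018-07-04".
      if (!status.isDirectory() || !name.startsWith("date=")) {
        continue;
      }
      try {
        LocalDate dirDate =
            LocalDate.parse(name.substring("date=".length()), DIR_DATE);
        if (dirDate.isBefore(cutoff)) {
          // Recursively delete the expired partition directory.
          fs.delete(status.getPath(), true);
        }
      } catch (DateTimeParseException e) {
        // Skip directories whose names don't parse as dates.
      }
    }
  }
}
{code}

Such a task could be scheduled once a day, as the description suggests; the same directory scan could also enforce the file-count upper bound by deleting the oldest partitions first once the total exceeds the limit.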