[ https://issues.apache.org/jira/browse/HIVE-20259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16567500#comment-16567500 ]
Jason Dere commented on HIVE-20259:
-----------------------------------

Attached a patch with DirectoryMarkerUpdate and DirectoryMarkerCleanup utility classes. DirectoryMarkerUpdate creates a .cacheupdate file in the cache directory to indicate that the directory should not be cleaned up by any other process performing DirectoryMarkerCleanup. The cleanup uses the last-modified time of the .cacheupdate file to decide whether the directory should be cleaned up: if the instance running the cleanup determines that this time is too old, the directory is deleted.

Another option, rather than relying on the last-modified time of the .cacheupdate file, would be for the .cacheupdate file contents to hold a long value, written as a string, indicating when this directory should be considered stale and safe to delete. The benefit is that the decision about when to clean up the directory would depend on the settings of the application that wrote the .cacheupdate file, rather than on the settings of the application instance that is performing the cleanup. The cost is more file operations, since the cleanup would have to read the file rather than just look at the file metadata.

> Cleanup of results cache directory
> ----------------------------------
>
>                 Key: HIVE-20259
>                 URL: https://issues.apache.org/jira/browse/HIVE-20259
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>            Priority: Major
>         Attachments: HIVE-20259.1.patch
>
>
> The query results cache directory is currently deleted at process exit. This
> does not work in the case of a kill -9 or a sudden process exit of Hive.
> There should be some cleanup mechanism in place to take care of any old cache
> directories that were not deleted at process exit.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
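
For illustration, the first scheme described in the comment (staleness judged from the last-modified time of the .cacheupdate file) might look roughly like the sketch below. It is not the code from HIVE-20259.1.patch. The class name MarkerCleanupSketch, its method names, and the use of java.nio on a local filesystem are all assumptions made for the example, as is the choice to leave directories without a marker untouched.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; names and behavior are assumptions, not the patch's API.
final class MarkerCleanupSketch {
    static final String MARKER = ".cacheupdate";

    /** Creates the marker if it is missing, then sets its last-modified time to 'now'. */
    static void touchMarker(Path cacheDir, Instant now) throws IOException {
        Path marker = cacheDir.resolve(MARKER);
        if (Files.notExists(marker)) {
            Files.createFile(marker);
        }
        Files.setLastModifiedTime(marker, FileTime.from(now));
    }

    /** A directory is stale when its marker was last modified more than maxAge before 'now'. */
    static boolean isStale(Path cacheDir, Duration maxAge, Instant now) throws IOException {
        Path marker = cacheDir.resolve(MARKER);
        if (Files.notExists(marker)) {
            return false; // assumption: directories without a marker are never deleted
        }
        Instant lastUpdate = Files.getLastModifiedTime(marker).toInstant();
        return lastUpdate.plus(maxAge).isBefore(now);
    }

    /** Deletes every stale child directory of 'root' and returns the deleted paths. */
    static List<Path> cleanup(Path root, Duration maxAge, Instant now) throws IOException {
        List<Path> candidates;
        try (Stream<Path> children = Files.list(root)) {
            candidates = children.filter(Files::isDirectory).collect(Collectors.toList());
        }
        List<Path> deleted = new ArrayList<>();
        for (Path dir : candidates) {
            if (isStale(dir, maxAge, now)) {
                deleteRecursively(dir);
                deleted.add(dir);
            }
        }
        return deleted;
    }

    /** Deletes a directory tree, children before parents. */
    private static void deleteRecursively(Path dir) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

In this sketch maxAge belongs to the instance running the cleanup, which is the dependency the second option in the comment would remove. The check and the delete are separate steps, so a writer that refreshes its marker between them can still lose its directory.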