[ 
https://issues.apache.org/jira/browse/HIVE-12353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088121#comment-15088121
 ] 

Alan Gates commented on HIVE-12353:
-----------------------------------

One thought I had was, would it make more sense to create a separate table 
COMPLETED_COMPACTION (like COMPLETED_TXN_COMPONENTS)?  This would give you a 
place to store stuff, possibly for a longer time, without worrying about 
clogging up COMPACTION_QUEUE.  It might also help with HIVE-11444, as you could 
generate stats off this table.

> When Compactor fails it calls CompactionTxnHandler.markedCleaned().  it 
> should not.
> -----------------------------------------------------------------------------------
>
>                 Key: HIVE-12353
>                 URL: https://issues.apache.org/jira/browse/HIVE-12353
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>            Priority: Blocker
>
> One of the things that this method does is delete entries from TXN_COMPONENTS 
> for partition that it was trying to compact.
> This causes Aborted transactions in TXNS to become empty according to
> CompactionTxnHandler.cleanEmptyAbortedTxns() which means they can now be 
> deleted.  
> Once they are deleted, data that belongs to these txns is deemed committed...
> We should extend COMPACTION_QUEUE state with 'f' and 's' (failed, success) 
> states.  We should also not delete then entry from markedCleaned()
> We'll have separate process that cleans 'f' and 's' records after X minutes 
> (or after > N records for a given partition exist).
> This allows SHOW COMPACTIONS to show some history info and how many times 
> compaction failed on a given partition (subject to retention interval) so 
> that we don't have to call markCleaned() on Compactor failures at the same 
> time preventing Compactor to constantly getting stuck on the same bad 
> partition/table.
> Ideally we'd want to include END_TIME field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to