[jira] [Commented] (HIVE-21402) Compaction state remains 'working' when major compaction fails

Peter Vary (JIRA) Mon, 11 Mar 2019 08:08:58 -0700


    [ 
https://issues.apache.org/jira/browse/HIVE-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16789655#comment-16789655
 ]


Peter Vary commented on HIVE-21402:
-----------------------------------

Yeah, and that catch just prints the error to the log and leaves the 
compaction in "working" status. That left me scratching my head for a while :D

My understanding of compaction is the following (mostly from the documentation 
ATM):
 * If a compaction fails then it is put into the COMPLETED_COMPACTIONS table 
with the status marked as failed, and it will be retried later if the 
conditions are still met.
 * If the number of failures for that compaction exceeds 
{{metastore.compactor.initiator.failed.compacts.threshold}} then it will not be 
scheduled again.
 * If the initiator thread finds a compaction that has been in the "working" 
state for longer than {{hive.compactor.worker.timeout}} then it is put back 
to the "initiated" state, so it will be queued again later. The config comment 
says "declared failed" but I think it does not put a new entry into the 
COMPLETED_COMPACTIONS table, so it is not counted when checking against 
failed.compacts.threshold.
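
A rough sketch of the bookkeeping described in the list above, as I understand it. This is an in-memory toy and not Hive code: the class, its fields and its methods are all made up for illustration, and whether the threshold check is ">" or ">=" is an assumption (">=" is used here).

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; every name here is hypothetical and none of it is Hive code.
final class CompactionRetrySketch {
    enum State { INITIATED, WORKING }

    // Stand-in for one entry of the compaction queue.
    static final class QueueEntry {
        State state = State.INITIATED;
        long startedMs;
    }

    // Stand-in for the completed-compactions history: true = failed, false = succeeded.
    private final List<Boolean> history = new ArrayList<>();
    private final int failedThreshold;   // plays the role of failed.compacts.threshold
    private final long workerTimeoutMs;  // plays the role of hive.compactor.worker.timeout

    CompactionRetrySketch(int failedThreshold, long workerTimeoutMs) {
        this.failedThreshold = failedThreshold;
        this.workerTimeoutMs = workerTimeoutMs;
    }

    // A failure that is caught and marked ends up in the history.
    void markFailed() {
        history.add(true);
    }

    // Assumption: scheduling stops once the failure count reaches the threshold.
    boolean mayScheduleAgain() {
        long failures = history.stream().filter(failed -> failed).count();
        return failures < failedThreshold;
    }

    // A timed-out "working" entry is re-queued without adding anything to the history,
    // so it never counts against the threshold.
    void requeueIfTimedOut(QueueEntry entry, long nowMs) {
        if (entry.state == State.WORKING && nowMs - entry.startedMs > workerTimeoutMs) {
            entry.state = State.INITIATED;
        }
    }

    public static void main(String[] args) {
        CompactionRetrySketch sketch = new CompactionRetrySketch(2, 1_000L);
        sketch.markFailed();
        sketch.markFailed();
        System.out.println("schedule again after 2 marked failures? " + sketch.mayScheduleAgain());

        QueueEntry stuck = new QueueEntry();
        stuck.state = State.WORKING;
        stuck.startedMs = 0L;
        sketch.requeueIfTimedOut(stuck, 5_000L);
        System.out.println("stuck entry is now " + stuck.state
            + ", history size " + sketch.history.size());
    }
}
```

The point of the toy is only that the two paths differ: marked failures accumulate towards the threshold, while a timed-out "working" entry goes back to "initiated" with the history unchanged.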

So if my understanding of the above process is correct, then if we catch 
Throwable we will have a few (by default 2) failed compactions very close 
to each other. On the other hand, if we do not catch Throwable, we will have 
a compaction that stays in "working" forever.
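
To make the difference concrete, here is a minimal standalone sketch of why the handler is skipped. It is not the Hive worker code: the State enum, runJob() and both run methods are hypothetical stand-ins, and the Error is thrown by hand to simulate a class missing from the classpath.

```java
// Standalone illustration; all names are hypothetical, not taken from Hive.
final class CompactionCatchSketch {
    enum State { WORKING, FAILED, SUCCEEDED }

    // Simulates a job that dies with an Error rather than an Exception.
    static void runJob() {
        throw new NoClassDefFoundError("simulated missing class");
    }

    // Mirrors a handler that only catches Exception.
    static State runCatchingException() {
        State state = State.WORKING;
        try {
            try {
                runJob();
                state = State.SUCCEEDED;
            } catch (Exception e) {
                // Never reached: NoClassDefFoundError is an Error, not an Exception.
                state = State.FAILED;
            }
        } catch (Throwable escaped) {
            // Stands in for an outer handler that only logs; state is left untouched.
        }
        return state;
    }

    // Same job, but the handler catches Throwable, so the failure gets recorded.
    static State runCatchingThrowable() {
        State state = State.WORKING;
        try {
            runJob();
            state = State.SUCCEEDED;
        } catch (Throwable t) {
            state = State.FAILED;
        }
        return state;
    }

    public static void main(String[] args) {
        System.out.println("catch (Exception): " + runCatchingException());
        System.out.println("catch (Throwable): " + runCatchingThrowable());
    }
}
```

With catch (Exception) the state stays WORKING, and with catch (Throwable) it becomes FAILED, which matches the two outcomes described above.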

Or maybe I am totally off - learning/learning/learning :) :) :)

Thanks,

Peter

 

> Compaction state remains 'working' when major compaction fails
> --------------------------------------------------------------
>
>                 Key: HIVE-21402
>                 URL: https://issues.apache.org/jira/browse/HIVE-21402
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 4.0.0
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>            Priority: Major
>         Attachments: HIVE-21402.patch
>
>
> When calcite is not on the HMS classpath and query-based compaction is 
> enabled, the compaction fails with a NoClassDefFoundError. Since the catch 
> block only catches Exception, the following code block is not executed:
> {code:java}
> } catch (Exception e) {
>   LOG.error("Caught exception while trying to compact " + ci +
>       ".  Marking failed to avoid repeated failures, " +
>       StringUtils.stringifyException(e));
>   msc.markFailed(CompactionInfo.compactionInfoToStruct(ci));
>   msc.abortTxns(Collections.singletonList(compactorTxnId));
> }
> {code}
> So the compaction is not set to failed.
> It would be better to catch Throwable instead of Exception.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
