[ https://issues.apache.org/jira/browse/HIVE-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16789655#comment-16789655 ]
Peter Vary commented on HIVE-21402:
-----------------------------------

Yeah, and that catch just prints the error to the log and leaves the compaction in "working" status. That left me scratching my head for a while :D

My understanding of the compaction process is the following (mostly from the documentation ATM):
* If a compaction fails, it is put into the COMPLETED_COMPACTION table with its status marked as failed, and it will be retried later if the conditions are still met.
* If the number of failures for that compaction exceeds {{metastore.compactor.initiator.failed.compacts.threshold}}, it will not be scheduled again.
* If the initiator thread finds a compaction that has been in the "working" state for longer than {{hive.compactor.worker.timeout}}, it puts it back into the "initiated" state, so it will be queued again later. The config comment says "declared failed", but I think it does not add a new entry to the COMPLETED_COMPACTION table, so it is not counted when checking against the failed.compacts.threshold.

So if my understanding of the above process is correct: if we catch Throwable, we will have a few (by default 2) failed compactions very close to each other. On the other hand, if we do not catch Throwable, the compaction never gets marked as failed; it stays "working" until the timeout, gets requeued, and repeats that cycle forever.

Or maybe I am totally off - learning/learning/learning :) :) :)

Thanks,
Peter

> Compaction state remains 'working' when major compaction fails
> --------------------------------------------------------------
>
>                 Key: HIVE-21402
>                 URL: https://issues.apache.org/jira/browse/HIVE-21402
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 4.0.0
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>            Priority: Major
>         Attachments: HIVE-21402.patch
>
>
> When calcite is not on the HMS classpath, and query based compaction is
> enabled, then the compaction fails with a NoClassDefFoundError.
> Since the catch block only catches Exceptions, the following code block is
> not executed:
> {code:java}
>     } catch (Exception e) {
>       LOG.error("Caught exception while trying to compact " + ci +
>           ". Marking failed to avoid repeated failures, " +
>           StringUtils.stringifyException(e));
>       msc.markFailed(CompactionInfo.compactionInfoToStruct(ci));
>       msc.abortTxns(Collections.singletonList(compactorTxnId));
>     }
> {code}
> So the compaction is not set to failed.
> It would be better to catch Throwable instead of Exception.
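To make the difference concrete, here is a minimal, self-contained sketch of the two catch variants. It is not the Hive Worker code: the {{CompactionCatchDemo}} class, the {{State}} enum, the {{Job}} interface and the {{simulate}} helper are all hypothetical names made up for this illustration, and the outer catch only stands in for the handler that logs the error and moves on.

```java
// Hypothetical illustration only; none of these names exist in Hive.
final class CompactionCatchDemo {

    enum State { INITIATED, WORKING, FAILED, SUCCEEDED }

    /** Stand-in for the work a compaction does. */
    interface Job {
        void run() throws Exception;
    }

    /** Stand-in for a compaction queue entry. */
    static final class Compaction {
        State state = State.INITIATED;
    }

    /** Mirrors the current behaviour: only Exception marks the entry failed. */
    static void workerCatchingException(Compaction c, Job job) {
        c.state = State.WORKING;
        try {
            job.run();
            c.state = State.SUCCEEDED;
        } catch (Exception e) {
            c.state = State.FAILED;
        }
    }

    /** Proposed behaviour: Errors such as NoClassDefFoundError are handled too. */
    static void workerCatchingThrowable(Compaction c, Job job) {
        c.state = State.WORKING;
        try {
            job.run();
            c.state = State.SUCCEEDED;
        } catch (Throwable t) {
            c.state = State.FAILED;
        }
    }

    /** Runs one variant and returns the state the entry is left in. */
    static State simulate(boolean catchThrowable, Job job) {
        Compaction c = new Compaction();
        try {
            if (catchThrowable) {
                workerCatchingThrowable(c, job);
            } else {
                workerCatchingException(c, job);
            }
        } catch (Throwable t) {
            // Stand-in for an outer handler that only logs and does not
            // touch the compaction state.
            System.err.println("Logged and ignored: " + t);
        }
        return c.state;
    }

    public static void main(String[] args) {
        // Simulates a class missing from the classpath (an Error, not an Exception).
        Job missingClass = () -> {
            throw new NoClassDefFoundError("hypothetical/MissingClass");
        };
        System.out.println("catch (Exception): " + simulate(false, missingClass));
        System.out.println("catch (Throwable): " + simulate(true, missingClass));
    }
}
```

With the simulated NoClassDefFoundError, the Exception-only variant leaves the entry in WORKING, while the Throwable variant moves it to FAILED. An ordinary Exception ends in FAILED with either variant, which is why the problem only shows up for Errors.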