[ https://issues.apache.org/jira/browse/HIVE-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16428599#comment-16428599 ]
Eugene Koifman commented on HIVE-19124:
---------------------------------------

The flow for Acid compaction is
1. Initiator - uses some heuristics (like number of delta files, number of aborted txns, etc.) to schedule compactions.
2. Worker - picks an item from the compaction queue (put there either by the Initiator or via an explicit Alter Table) and runs a job to produce new files.
3. Cleaner - removes 'obsolete' files when it's safe to do so.

For MM tables the flow so far was
1. Initiator - looks for a sufficient number of aborted txns and schedules compaction.
2. Worker - deletes the delta_x_x dirs where x is aborted.
3. Cleaner - does nothing.

compaction_queue/complete_compaction_queue are metastore tables representing the queue (the latter keeps historical info) and drive SHOW COMPACTIONS.

With this patch, Alter Table Compact for an MM table will do the IOW + delete aborted if any, but the auto-initiated compaction will also do IOW. Do we want that? Should Initiator have some logic to decide if IOW is needed? Maybe this can be a follow-up ticket.

Longer term I'd like to remove the Cleaner part altogether and move that logic to Worker - maybe 3.1 timeframe.

{{//txnManager.closeTxnManager();}} - I don't think this is needed unless Session is shut down.

This is done in the cleaner
{noformat}
// TODO: Also delete obsolete directories? How do we account for readers?
/*List<FileStatus> obsolete = dir.getObsolete();
for (FileStatus stat : obsolete) {
  filesToDelete.add(stat.getPath());
}*/
{noformat}
Can you explain this?
{{// TODO: move to global? should be ok if it's always the same thread.}}
There should be some logic to shut down the session if there are any errors. I've seen situations where Session init fails, but it's still attached to ThreadLocal and so every Worker in that thread will always get a bad session.
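To illustrate the concern about a failed Session staying attached to a ThreadLocal, here is a minimal Java sketch. It is not Hive code: WorkerSessionHolder, Session, runWithSession and hasSession are hypothetical names invented for this example, and the real session class and its init/close calls are assumed rather than shown. The only point is that any error detaches the session from the thread, so the next Worker run on that thread starts from a fresh one.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical per-thread session holder; a sketch, not Hive's actual API. */
final class WorkerSessionHolder {

    /** Placeholder session type invented for this example. */
    static final class Session {
    }

    // One session per worker thread, created lazily.
    private static final ThreadLocal<Session> CURRENT = new ThreadLocal<>();

    private WorkerSessionHolder() {
    }

    /** Returns the thread's session, creating it on first use. */
    static Session getOrCreate(Supplier<Session> factory) {
        Session session = CURRENT.get();
        if (session == null) {
            // If the factory throws, nothing has been attached yet.
            session = factory.get();
            CURRENT.set(session);
        }
        return session;
    }

    /**
     * Runs the work with the thread's session. On any runtime error the
     * session is detached, so a later call on this thread builds a new one
     * instead of reusing a possibly broken session.
     */
    static <T> T runWithSession(Supplier<Session> factory, Function<Session, T> work) {
        try {
            return work.apply(getOrCreate(factory));
        } catch (RuntimeException e) {
            CURRENT.remove();
            throw e;
        }
    }

    /** True if a session is currently attached to the calling thread. */
    static boolean hasSession() {
        return CURRENT.get() != null;
    }
}
```

A real version would presumably also close the session's resources before detaching it; that step is left out because it depends on the actual session class.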
HIVE-18808 is an example, it has links to others

> implement a basic major compactor for MM tables
> -----------------------------------------------
>
>                 Key: HIVE-19124
>                 URL: https://issues.apache.org/jira/browse/HIVE-19124
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>            Priority: Major
>              Labels: mm-gap-2
>         Attachments: HIVE-19124.patch
>
>
> For now, it will run a query directly and only major compactions will be
> supported.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)