[ https://issues.apache.org/jira/browse/HIVE-26825?focusedWorklogId=836574&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-836574 ]

ASF GitHub Bot logged work on HIVE-26825:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 03/Jan/23 10:02
            Start Date: 03/Jan/23 10:02
    Worklog Time Spent: 10m
      Work Description: veghlaci05 commented on code in PR #3864:
URL: https://github.com/apache/hive/pull/3864#discussion_r1060420473


##########
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MetaStoreCompactorThread.java:
##########
@@ -133,4 +138,14 @@ protected static long updateCycleDurationMetric(String metric, long startedAt) {
     }
     return 0;
   }
+  <T extends TBase<T,?>> T computeIfAbsent(Optional<Cache<String, TBase>> metaCache, String key, Callable<T> callable) throws Exception {

Review Comment:
   Nit: add a new line between methods.


##########
ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestCleaner.java:
##########
@@ -1093,5 +1095,34 @@ public void testReady() throws Exception {
     Assert.assertEquals(TxnStore.CLEANING_RESPONSE, rsp.getCompacts().get(0).getState());
   }
+  @Test
+  public void testMetaCache() throws Exception {
+    conf.setBoolVar(HIVE_COMPACTOR_DELAYED_CLEANUP_ENABLED, false);
+
+    Table t = newTable("default", "retry_test", false);
+
+    addBaseFile(t, null, 20L, 20);
+    addDeltaFile(t, null, 21L, 22L, 2);
+    addDeltaFile(t, null, 23L, 24L, 2);
+    burnThroughTransactions("default", "retry_test", 25);
+
+    CompactionRequest rqst = new CompactionRequest("default", "retry_test", CompactionType.MAJOR);
+    long compactTxn = compactInTxn(rqst);
+    addBaseFile(t, null, 25L, 25, compactTxn);
+
+    //Prevent cleaner from marking the compaction as cleaned
+    TxnStore mockedHandler = spy(txnHandler);
+    doThrow(new RuntimeException()).when(mockedHandler).markCleaned(nullable(CompactionInfo.class));
+    Cleaner cleaner = Mockito.spy(new Cleaner());
+    cleaner.setConf(conf);
+    cleaner.init(new AtomicBoolean(true));
+    cleaner.run();
+
+    ShowCompactResponse rsp = txnHandler.showCompact(new ShowCompactRequest());
+    List<ShowCompactResponseElement> compacts = rsp.getCompacts();
+    Assert.assertEquals(1, compacts.size());
+    Mockito.verify(cleaner, times(1)).resolveTable(Mockito.any());

Review Comment:
   For a single run that does not remove any files, `resolveTable` is called only once anyway, so this test does not verify that the cache is used. You should run cleaning for the **same table** twice and assert that `resolveTable` is called only once, while `computeIfAbsent` is called twice.


Issue Time Tracking
-------------------

    Worklog Id:     (was: 836574)
    Time Spent: 40m  (was: 0.5h)

> Compactor: Cleaner shouldn't fetch table details again and again for
> partitioned tables
> ---------------------------------------------------------------------------------------
>
>                 Key: HIVE-26825
>                 URL: https://issues.apache.org/jira/browse/HIVE-26825
>             Project: Hive
>          Issue Type: Improvement
>          Components: Transactions
>            Reporter: KIRTI RUGE
>            Assignee: KIRTI RUGE
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Cleaner shouldn't fetch table/partition details for all its partitions.
> When there are a large number of databases/tables, it takes a lot of time for
> the Initiator to complete its initial iteration, and the load on the DB also goes higher.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)