Ádám Szita created HIVE-22705:
---------------------------------

             Summary: LLAP cache is polluted by query-based compactor
                 Key: HIVE-22705
                 URL: https://issues.apache.org/jira/browse/HIVE-22705
             Project: Hive
          Issue Type: Improvement
            Reporter: Ádám Szita
            Assignee: Ádám Szita

One of the steps in query-based compaction is verifying the ACID sort order with the _validate_acid_sort_order_ UDF. This is a prerequisite for the actual compaction, and it is done by a [query that reads the whole table content|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MajorQueryCompactor.java#L161-L167].

As a result, the whole table content gets loaded into the LLAP cache. This content is not useful and only pollutes the cache space, because it can never be used again: cache content is bound to files (file IDs), and compaction replaces those files with new ones.

I propose we disable LLAP caching in the session that runs the query-based compaction's queries.
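
A minimal sketch of what a session-scoped override could look like, not the actual patch. It models the configuration with plain `java.util.Properties` so that it runs without Hive on the classpath; in Hive the same idea would apply to the `HiveConf` copy used by the compactor's session. `hive.llap.io.enabled` is the existing setting that toggles the LLAP IO layer (which includes the cache); whether the fix should use this property or a narrower one is an assumption here, and the class and method names are hypothetical.

```java
import java.util.Properties;

public class CompactionSessionConf {
    // Existing Hive setting that enables or disables the LLAP IO layer.
    // Using it for this fix is an assumption, not something the issue states.
    static final String LLAP_IO_ENABLED = "hive.llap.io.enabled";

    // Hypothetical helper: copy the shared configuration and turn LLAP IO off
    // in the copy only, so other sessions keep their caching behaviour.
    static Properties forCompaction(Properties shared) {
        Properties session = new Properties();
        session.putAll(shared);
        session.setProperty(LLAP_IO_ENABLED, "false");
        return session;
    }

    public static void main(String[] args) {
        Properties shared = new Properties();
        shared.setProperty(LLAP_IO_ENABLED, "true");

        Properties session = forCompaction(shared);

        System.out.println("shared:  " + shared.getProperty(LLAP_IO_ENABLED));
        System.out.println("session: " + session.getProperty(LLAP_IO_ENABLED));
    }
}
```

The point of copying rather than mutating the shared configuration is that only the compactor's own queries bypass the cache; regular user queries on the same daemons are unaffected.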