Re: [DISCUSS] Compactor (Query vs MR) roadmap

2022-02-03 Thread Stamatis Zampetakis
Hi Karen, Many thanks for joining the discussion. The fact that there are two components with quite a bit of overlap in their behavior is not something that can be easily maintained in the long term. Additionally, I have the impression that some commercial offerings of Hive are using the QB compa

Re: [DISCUSS] Compactor (Query vs MR) roadmap

2022-02-02 Thread Karen Coppage
Hi Stamatis, Thanks for your questions. You bring up good points. A bit about the state of the two compaction implementations: MR compaction (uses class CompactorMR) is older and more stable. I have only seen a couple of bugs in the past few years. QB (query-based) compaction is required when YARN

[DISCUSS] Compactor (Query vs MR) roadmap

2022-01-31 Thread Stamatis Zampetakis
Hi all, In the current master, there are two approaches for performing compactions of ACID tables [1]:
* using hard-coded MapReduce jobs (aka. CompactorMR [2]);
* using HiveQL queries (aka. QueryCompactor [3]) and delegating the execution to the underlying engine (MR, Tez, other);
The motivation