[DISCUSS] Properties for scheduling compactions on specific queues

2022-01-31 Thread Stamatis Zampetakis
Hi all, This email is an attempt to converge on which Hive/Tez/MR properties should be used to schedule a compaction on specific queues. For those who are not familiar with how queues are used, the YARN capacity scheduler documentation [1] gives the general idea. Using specific queues

Re: [DISCUSS] Properties for scheduling compactions on specific queues

2022-01-31 Thread Alessandro Solimando
Hi Stamatis, the proposal seems reasonable to me. I think that setting the two properties you mention should lead to the same result regardless of the underlying execution engine in use. I also agree that we should deprecate the per-execution-engine properties. Best regards,

[jira] [Created] (HIVE-25915) Query based MINOR compaction fails with NPE if the data is loaded into the ACID table

2022-01-31 Thread Jira
László Végh created HIVE-25915: Summary: Query based MINOR compaction fails with NPE if the data is loaded into the ACID table Key: HIVE-25915 URL: https://issues.apache.org/jira/browse/HIVE-25915 Proje

[jira] [Created] (HIVE-25916) Optimise updateCompactionMetricsData

2022-01-31 Thread Jira
László Pintér created HIVE-25916: Summary: Optimise updateCompactionMetricsData Key: HIVE-25916 URL: https://issues.apache.org/jira/browse/HIVE-25916 Project: Hive Issue Type: Improvement

[jira] [Created] (HIVE-25917) Use default value for 'hive.default.nulls.last' when no config is available instead of false

2022-01-31 Thread Alessandro Solimando (Jira)
Alessandro Solimando created HIVE-25917: Summary: Use default value for 'hive.default.nulls.last' when no config is available instead of false Key: HIVE-25917 URL: https://issues.apache.org/jira/browse/HIVE

[DISCUSS] Compactor (Query vs MR) roadmap

2022-01-31 Thread Stamatis Zampetakis
Hi all, In the current master, there are two approaches for performing compactions of ACID tables [1]: * using hard-coded MapReduce jobs (a.k.a. CompactorMR [2]); * using HiveQL queries (a.k.a. QueryCompactor [3]) and delegating the execution to the underlying engine (MR, Tez, other). The motivation