Hive Query Optimization

2019-08-26 Thread Soupam Mandal
We have 7 tables, each partitioned by record_date. There is a query that inner-joins all of these tables on consumer_id, and the join spans multiple partitions. Currently, querying 1 week of data takes a very long time, around 20-30 mins. We want to optimize t

Re: Hive Major Compaction fails (cleaning step)

2019-08-26 Thread David Morin
Something weird. Instead of using the hive keytab to insert data, I've found that if the base directory (/apps/hive/warehouse/test.db/mytab/base_013) belongs to the 'hive' user, then the compaction succeeds (cleaning step is ok). Even if the delta_ directories don't belong to hive. Weird, isn't

Hive on Tez : yarn logs missing timestamp

2019-08-26 Thread Viral Bajaria
Hi, We are using Hive on Tez (see versions below) and aren't able to get the TezChild class to log the timestamp, even though tez-container-log4j.properties has the ISO time in the logger pattern. Sample Logs: [TezChild] INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator - FS[3]: records written -