We have 7 tables, each partitioned by record_date. One query performs an
inner join across all of these tables on consumer_id, so the join spans
multiple partitions. Currently, querying 1 week of data takes a very long
time, around 20-30 minutes. We want to optimize
t
Something weird.
Rather than using the hive keytab to insert data, I've found that if the
base directory (/apps/hive/warehouse/test.db/mytab/base_013) belongs to
the 'hive' user, the compaction succeeds (the cleaning step is OK), even
if the delta_ directories don't belong to hive.
Weird, isn't
Hi,
We are using Hive on Tez (see versions below) and can't get the
TezChild class to log the timestamp, even though
tez-container-log4j.properties has the ISO time in the logger pattern.
Sample Logs:
[TezChild] INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator -
FS[3]: records written -