If you want compaction to run automatically, you should not put
NO_AUTO_COMPACTION in the table properties.
First question: did you turn on the compactor on your metastore Thrift
server? To do this you need to set a couple of values in the
metastore's hive-site.xml:
hive.compactor.initiator.on=true
hive.compactor.worker.threads=1 # or more
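In hive-site.xml form, those two settings would look roughly like the
fragment below. Treat it as a sketch: only the two property names and
values come from the lines above, the enclosing <configuration> element
is the usual Hadoop config layout, and 1 worker thread is just the
minimum, not a tuning recommendation. The metastore typically has to be
restarted before it picks up the change.

```xml
<configuration>
  <!-- Run the compaction initiator inside this metastore instance -->
  <property>
    <name>hive.compactor.initiator.on</name>
    <value>true</value>
  </property>
  <!-- Number of compactor worker threads; must be at least 1 (or more) -->
  <property>
    <name>hive.compactor.worker.threads</name>
    <value>1</value>
  </property>
</configuration>
```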
Alan.
Sachin Pasalkar <mailto:sachin_pasal...@symantec.com>
September 14, 2015 at 3:03
Hi,
We are writing ORC files directly from a Storm topology instead of using
Hive streaming (due to performance issues with our data). However, we
want to compact the data, so we added the
"NO_AUTO_COMPACTION"="false" option to the table we created to read
the data (1.6 GB scattered across multiple small ORC files). Does
"NO_AUTO_COMPACTION" mean the data will not be compacted while Hive
streaming is used? If not, why did it not compact our data into one file?
We also tried triggering compaction manually from Java code using
org.apache.hadoop.hive.metastore.txn.TxnHandler's compact API. When we
run SHOW COMPACTIONS, it shows that compaction has started, but it
still does not work. I don't want to run manual commands from the
command line.
Is there any way to do this?
PS: We are writing all files in one directory only.
Thanks,
Sachin