I thought an ORC file could be generated only by running a Hive query on a staging table and inserting into an ORC table. If there is an option to generate an ORC file on the client side using Java code, could you share that code or any related links?

Thanks,
Chandra

From: Abhay Bansal [mailto:abhaybansal.1...@gmail.com]
Sent: Thursday, April 03, 2014 11:06 AM
To: user@hive.apache.org
Subject: Predicate pushdown optimisation not working for ORC

I am new to Hive, so I apologise for asking such a basic question.

The following exercise was done with Hive 0.12 and Hadoop 0.20.203. I created an ORC file from Java and loaded it into a table with the same schema. I checked the configuration property

    <property><name>hive.optimize.ppd</name><value>true</value></property>

which should ideally enable the PPD optimisation. I then ran the query

    select sourceipv4address,sessionid,url from test where sourceipv4address="dummy";

To see whether the PPD optimisation was working, I checked the Hadoop logs, where I found:

    ./userlogs/job_201404010833_0036/attempt_201404010833_0036_m_000000_0/syslog:2014-04-03 05:01:39,913 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: included column ids = 3,8,13
    ./userlogs/job_201404010833_0036/attempt_201404010833_0036_m_000000_0/syslog:2014-04-03 05:01:39,914 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: included columns names = sourceipv4address,sessionid,url
    ./userlogs/job_201404010833_0036/attempt_201404010833_0036_m_000000_0/syslog:2014-04-03 05:01:39,914 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: No ORC pushdown predicate

I am not sure which part I missed. Any help would be appreciated.

Thanks,
-Abhay
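
A possible lead on the "No ORC pushdown predicate" log line, offered as a sketch rather than a confirmed diagnosis: to my knowledge, in Hive 0.12 hive.optimize.ppd only moves filters earlier in the operator plan, and the filter expression is handed to the ORC reader only when hive.optimize.index.filter is also true (it defaults to false). That second property does not appear in the original message, so treat it as an assumption to verify against your own build:

```xml
<!-- Assumption: this property is not mentioned in the original message.
     Verify the name and default against your Hive version before relying on it. -->
<property>
  <name>hive.optimize.index.filter</name>
  <value>true</value>
</property>
```

The same setting can be tried for a single session with `set hive.optimize.index.filter=true;` before rerunning the query, then checking the task syslog again for the OrcInputFormat lines.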