I am new to Hive, so I apologise for asking such a basic question. The following exercise was done with Hive 0.12 and Hadoop 0.20.203.
I created an ORC file from Java and pushed it into a table with the same schema. I checked that the configuration property

<property><name>hive.optimize.ppd</name><value>true</value></property>

is set, which should ideally enable the predicate pushdown (ppd) optimisation. I then ran the query:

select sourceipv4address,sessionid,url from test where sourceipv4address="dummy";

To see whether the ppd optimisation was working, I checked the Hadoop logs, where I found:

./userlogs/job_201404010833_0036/attempt_201404010833_0036_m_000000_0/syslog:2014-04-03 05:01:39,913 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: included column ids = 3,8,13
./userlogs/job_201404010833_0036/attempt_201404010833_0036_m_000000_0/syslog:2014-04-03 05:01:39,914 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: included columns names = sourceipv4address,sessionid,url
./userlogs/job_201404010833_0036/attempt_201404010833_0036_m_000000_0/syslog:2014-04-03 05:01:39,914 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: *No ORC pushdown predicate*

The column pruning appears to work, but the last line says no predicate was pushed down to the ORC reader. I am not sure which part I missed. Any help would be appreciated.

Thanks,
-Abhay