I am new to Hive, so apologies for asking such a basic question.

The following exercise was done with Hive 0.12 and Hadoop 0.20.203.

I created an ORC file from Java and loaded it into a table with the same
schema. I checked that the configuration contains the property
<property><name>hive.optimize.ppd</name><value>true</value></property>
which, as I understand it, should enable the predicate pushdown (PPD)
optimisation.

I ran the following query:

select sourceipv4address,sessionid,url from test where sourceipv4address="dummy";
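
For reference, a minimal sketch of how the effective setting and the query plan could be inspected from the Hive CLI. This assumes the same table and column names as above. SET with no value only prints the current value, and EXPLAIN only shows the plan without running the job, so neither changes anything:

```sql
-- Print the value of the property that the current session actually sees
SET hive.optimize.ppd;

-- Show the query plan; the filter on sourceipv4address should be visible in it
EXPLAIN
SELECT sourceipv4address, sessionid, url
FROM test
WHERE sourceipv4address = 'dummy';
```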

To see whether the PPD optimization was working, I checked the Hadoop logs,
where I found:

./userlogs/job_201404010833_0036/attempt_201404010833_0036_m_000000_0/syslog:2014-04-03 05:01:39,913 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: included column ids = 3,8,13
./userlogs/job_201404010833_0036/attempt_201404010833_0036_m_000000_0/syslog:2014-04-03 05:01:39,914 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: included columns names = sourceipv4address,sessionid,url
./userlogs/job_201404010833_0036/attempt_201404010833_0036_m_000000_0/syslog:2014-04-03 05:01:39,914 INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat: *No ORC pushdown predicate*

I am not sure what I missed. Any help would be appreciated.

Thanks,
-Abhay
