Re: ORC queries inefficient for sorted field

2014-02-24 Thread Prasanth Jayachandran
Hi Bryan, ORC indexes are used only to select stripes and row groups, not to answer queries. You can enable the hive.compute.query.using.stats flag to answer queries using metadata. When this flag is enabled, the Hive metastore is checked to see if column statistics exist for the req
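
To illustrate the distinction between selecting row groups and answering a query, here is a minimal Python sketch. It is not ORC or Hive code: the RowGroup class, its field names, and the sample values are hypothetical, and it only assumes that each row group carries min/max statistics for the 'time' column. The statistics let the reader skip row groups that cannot match, but the surviving rows are still scanned to compute the result.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowGroup:
    # Hypothetical stand-in for a row group plus its min/max index entry.
    min_time: int
    max_time: int
    rows: List[int]


def select_row_groups(groups: List[RowGroup], lower_bound: int) -> List[RowGroup]:
    """Keep only row groups that could contain a row with time > lower_bound."""
    return [g for g in groups if g.max_time > lower_bound]


def max_time_with_pruning(groups: List[RowGroup], lower_bound: int) -> Optional[int]:
    """Compute max(time) for time > lower_bound.

    The min/max statistics are used only to skip row groups; the rows of the
    remaining groups are still read and aggregated.
    """
    candidates = select_row_groups(groups, lower_bound)
    values = [t for g in candidates for t in g.rows if t > lower_bound]
    return max(values) if values else None


# Sample data sorted by time, so min/max ranges do not overlap.
groups = [
    RowGroup(1, 10, [1, 5, 10]),
    RowGroup(11, 20, [11, 15, 20]),
    RowGroup(21, 30, [21, 25, 30]),
]
```

In this sketch the index narrows the scan to the row groups that can match, which is the role described above; answering max(time) directly from stored statistics would be a separate step, which is what the metadata-based flag is meant to provide.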

ORC queries inefficient for sorted field

2014-02-22 Thread Bryan Jeffrey
Hello. I'm running Hadoop 2.2.0 and Hive 0.12.0. I have an ORC table partitioned by 'range' and sorted by 'time'. I want to select the max(time) value from the table for a given set of partitions. I begin with a query like the following: select max(time) from my_table where range > 1