Hi, I have a table pokes(foo int, bar string) on which I created a compact index on foo. I set hive.optimize.autoindex=true and then ran the query select * from pokes where bar = 'John'. However, when I enable logging, I do not see any phase in which the optimizer makes use of the index, so I suspect it did not use the index to execute the query. If so, how can I force it to use the index?
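
For reference, here is roughly how I set things up (pokes_index is just the name I gave my index), together with the extra settings I have seen recommended for getting the optimizer to rewrite queries against a compact index; I am not certain all of them apply to my build:

CREATE INDEX pokes_index ON TABLE pokes (foo)
AS 'COMPACT' WITH DEFERRED REBUILD;

-- The index table stays empty until it is rebuilt.
ALTER INDEX pokes_index ON pokes REBUILD;

-- Settings I have seen suggested for automatic index usage; the
-- minsize override matters because the default threshold is 5 GB,
-- far larger than my toy table.
SET hive.optimize.index.filter=true;
SET hive.optimize.index.filter.compact.minsize=0;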
Thank you in advance for your reply,
Mahsa

Below is the complete log:

12/05/19 17:01:37 INFO ql.Driver: <PERFLOG method=Driver.run>
12/05/19 17:01:37 INFO ql.Driver: <PERFLOG method=compile>
12/05/19 17:01:37 INFO parse.ParseDriver: Parsing command: select * from pokes where pokes.bar = 'John'
12/05/19 17:01:37 INFO parse.ParseDriver: Parse Completed
12/05/19 17:01:37 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
12/05/19 17:01:37 INFO parse.SemanticAnalyzer: Completed phase 1 of Semantic Analysis
12/05/19 17:01:37 INFO parse.SemanticAnalyzer: Get metadata for source tables
12/05/19 17:01:37 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=pokes
12/05/19 17:01:37 INFO hive.log: DDL: struct pokes { i32 foo, string bar}
12/05/19 17:01:37 INFO parse.SemanticAnalyzer: Get metadata for subqueries
12/05/19 17:01:37 INFO parse.SemanticAnalyzer: Get metadata for destination tables
12/05/19 17:01:37 INFO parse.SemanticAnalyzer: Completed getting MetaData in Semantic Analysis
12/05/19 17:01:37 INFO hive.log: DDL: struct pokes { i32 foo, string bar}
12/05/19 17:01:37 INFO ppd.OpProcFactory: Processing for FS(3)
12/05/19 17:01:37 INFO ppd.OpProcFactory: Processing for SEL(2)
12/05/19 17:01:37 INFO ppd.OpProcFactory: Processing for FIL(1)
12/05/19 17:01:37 INFO ppd.OpProcFactory: Pushdown Predicates of FIL For Alias : pokes
12/05/19 17:01:37 INFO ppd.OpProcFactory: (bar = 'John')
12/05/19 17:01:37 INFO ppd.OpProcFactory: Processing for TS(0)
12/05/19 17:01:37 INFO ppd.OpProcFactory: Pushdown Predicates of TS For Alias : pokes
12/05/19 17:01:37 INFO ppd.OpProcFactory: (bar = 'John')
12/05/19 17:01:37 INFO hive.log: DDL: struct pokes { i32 foo, string bar}
12/05/19 17:01:37 INFO hive.log: DDL: struct pokes { i32 foo, string bar}
12/05/19 17:01:37 INFO hive.log: DDL: struct pokes { i32 foo, string bar}
12/05/19 17:01:37 INFO physical.MetadataOnlyOptimizer: Looking for table scans where optimization is applicable
12/05/19 17:01:37 INFO physical.MetadataOnlyOptimizer: Found 0 metadata only table scans
12/05/19 17:01:37 INFO parse.SemanticAnalyzer: Completed plan generation
12/05/19 17:01:37 INFO ql.Driver: Semantic Analysis Completed
12/05/19 17:01:37 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:foo, type:int, comment:null), FieldSchema(name:bar, type:string, comment:null)], properties:null)
12/05/19 17:01:37 INFO ql.Driver: </PERFLOG method=compile start=1337461297242 end=1337461297288 duration=46>
12/05/19 17:01:37 INFO ql.Driver: <PERFLOG method=Driver.execute>
12/05/19 17:01:37 INFO ql.Driver: Starting command: select * from pokes where pokes.bar = 'John'
12/05/19 17:01:37 INFO ql.Driver: Total MapReduce jobs = 1
12/05/19 17:01:37 INFO ql.Driver: Launching Job 1 out of 1
12/05/19 17:01:37 INFO exec.Task: Number of reduce tasks is set to 0 since there's no reduce operator
12/05/19 17:01:37 INFO exec.ExecDriver: Using org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
12/05/19 17:01:37 INFO exec.ExecDriver: adding libjars: file:///usr/local/hive/build/dist/lib/hive-builtins-0.10.0-SNAPSHOT.jar
12/05/19 17:01:37 INFO exec.ExecDriver: Processing alias pokes
12/05/19 17:01:37 INFO exec.ExecDriver: Adding input file hdfs://localhost:54310/user/hive/warehouse/pokes
12/05/19 17:01:37 INFO exec.Utilities: Content Summary not cached for hdfs://localhost:54310/user/hive/warehouse/pokes
12/05/19 17:01:37 INFO exec.ExecDriver: Making Temp Directory: hdfs://localhost:54310/tmp/hive-hduser/hive_2012-05-19_17-01-37_242_5525578509473503196/-ext-10001
12/05/19 17:01:37 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/05/19 17:01:37 INFO io.CombineHiveInputFormat: CombineHiveInputSplit creating pool for hdfs://localhost:54310/user/hive/warehouse/pokes; using filter path hdfs://localhost:54310/user/hive/warehouse/pokes
12/05/19 17:01:37 INFO mapred.FileInputFormat: Total input paths to process : 1
12/05/19 17:01:37 INFO io.CombineHiveInputFormat: number of splits 1
12/05/19 17:01:37 INFO exec.Task: Starting Job = job_201205131854_0113, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201205131854_0113
12/05/19 17:01:37 INFO exec.Task: Kill Command = /usr/local/hadoop/bin/../bin/hadoop job -Dmapred.job.tracker=localhost:54311 -kill job_201205131854_0113
12/05/19 17:01:43 INFO exec.Task: Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
12/05/19 17:01:43 INFO exec.Task: 2012-05-19 17:01:43,764 Stage-1 map = 0%, reduce = 0%
12/05/19 17:01:46 INFO exec.Task: 2012-05-19 17:01:46,773 Stage-1 map = 100%, reduce = 0%
12/05/19 17:01:49 INFO exec.Task: 2012-05-19 17:01:49,782 Stage-1 map = 100%, reduce = 100%
12/05/19 17:01:49 INFO exec.Task: Ended Job = job_201205131854_0113
12/05/19 17:01:49 INFO exec.FileSinkOperator: Moving tmp dir: hdfs://localhost:54310/tmp/hive-hduser/hive_2012-05-19_17-01-37_242_5525578509473503196/_tmp.-ext-10001 to: hdfs://localhost:54310/tmp/hive-hduser/hive_2012-05-19_17-01-37_242_5525578509473503196/_tmp.-ext-10001.intermediate
12/05/19 17:01:49 INFO exec.FileSinkOperator: Moving tmp dir: hdfs://localhost:54310/tmp/hive-hduser/hive_2012-05-19_17-01-37_242_5525578509473503196/_tmp.-ext-10001.intermediate to: hdfs://localhost:54310/tmp/hive-hduser/hive_2012-05-19_17-01-37_242_5525578509473503196/-ext-10001
12/05/19 17:01:49 INFO ql.Driver: </PERFLOG method=Driver.execute start=1337461297288 end=1337461309816 duration=12528>
12/05/19 17:01:49 INFO ql.Driver: MapReduce Jobs Launched:
12/05/19 17:01:49 INFO ql.Driver: Job 0: Map: 1 HDFS Read: 5812 HDFS Write: 0 SUCCESS
12/05/19 17:01:49 INFO ql.Driver: Total MapReduce CPU Time Spent: 0 msec
12/05/19 17:01:49 INFO ql.Driver: OK
12/05/19 17:01:49 INFO ql.Driver: <PERFLOG method=releaseLocks>
12/05/19 17:01:49 INFO ql.Driver: </PERFLOG method=releaseLocks start=1337461309816 end=1337461309817 duration=1>
12/05/19 17:01:49 INFO ql.Driver: </PERFLOG method=Driver.run start=1337461297242 end=1337461309817 duration=12575>
12/05/19 17:01:49 INFO CliDriver: Time taken: 12.599 seconds
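Would running EXPLAIN be the right way to confirm this? My understanding (an assumption on my part, based on the wiki) is that if the optimizer had rewritten the query, the plan would include an extra stage scanning the generated index table, which I believe gets a name along the lines of default__pokes_pokes_index__:

-- Check that the index is actually registered on the table.
SHOW FORMATTED INDEX ON pokes;

-- Inspect the plan: an index-rewritten query should reference
-- the generated index table in one of its stages.
EXPLAIN SELECT * FROM pokes WHERE bar = 'John';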