Hi, Not sure if it is relevant to your problem but I'm just checking that you know about hive.optimize.index.filter.compact.minsize it's set to 5Gbytes by default and if the estimated query size is less than this then the index won't be used. HTH.
Regards Peter Marron Senior Developer, Research & Development Office: +44 (0) 118-940-7609 peter.mar...@trilliumsoftware.com<mailto:peter.mar...@trilliumsoftware.com> Theale Court First Floor, 11-13 High Street, Theale, RG7 5AH, UK [cid:image001.png@01CF2135.7231BFC0] [cid:image002.png@01CF2135.7231BFC0]<https://www.facebook.com/pages/Trillium-Software/109184815778307> [cid:image003.png@01CF2135.7231BFC0]<https://twitter.com/TrilliumSW> [cid:image004.png@01CF2135.7231BFC0]<http://www.linkedin.com/company/17710> www.trilliumsoftware.com<http://www.trilliumsoftware.com/> Be Certain About Your Data. Be Trillium Certain. From: Thilina Gunarathne [mailto:cset...@gmail.com] Sent: 03 February 2014 16:08 To: user Subject: Index not getting used for the queries Dear all, I created a compact index for a table with several hundred million records as follows. The table is partitioned by the month. The index on A and B was created successfully, but I can't see it getting used in the queries. It would be great if one of you experts can shed some light on what am I missing. I'm using hive 0.9. set hive.exec.parallel=false; CREATE INDEX idx_yyyy ON TABLE yyyy(a,b) AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED REBUILD COMMENT 'Index for yyyy table. Indexing on A and B'; ALTER INDEX idx_yyyy on yyyy REBUILD; hive> describe yyyy; OK a bigint ... b bigint .... month int hive> show index on yyyy; OK idx_yyyy yyyy a, b default__yyyy_p_idx_yyyy__ compact Index for tm top50 table. Indexing on A and B hive> explain select a,b from tm_top50_p where a=113231 and month=201308; OK ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME yyyy))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL a)) (TOK_SELEXPR (TOK_TABLE_OR_COL b))) (TOK_WHERE (and (= (TOK_TABLE_OR_COL a) 113231) (= (TOK_TABLE_OR_COL month) 201308))))) STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: yyyy TableScan alias: yyyy Filter Operator predicate: expr: (a = 113231) type: boolean Select Operator expressions: expr: a type: bigint expr: b type: bigint outputColumnNames: _col0, _col1 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-0 Fetch Operator limit: -1 thanks a lot, Thilina -- https://www.cs.indiana.edu/~tgunarat/ http://www.linkedin.com/in/thilina http://thilina.gunarathne.org
<<inline: image001.png>>
<<inline: image002.png>>
<<inline: image003.png>>
<<inline: image004.png>>