[ https://issues.apache.org/jira/browse/HIVE-13377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216620#comment-15216620 ]

Gabriel C Balan commented on HIVE-13377:
----------------------------------------

Gently pinging [~Ferd], [~dongc], [~spena], [~ashutoshc].

> Lost rows when using compact index on parquet table
> ---------------------------------------------------
>
>                 Key: HIVE-13377
>                 URL: https://issues.apache.org/jira/browse/HIVE-13377
>             Project: Hive
>          Issue Type: Bug
>          Components: Indexing
>    Affects Versions: 1.1.0
>         Environment: linux, cdh 5.5.0
>            Reporter: Gabriel C Balan
>            Priority: Minor
>
> Query with where clause on a parquet table loses rows when using a compact
> index. The query produces the right results without the index.
> {code}
> create table small_parq(i int) stored as parquet;
> insert into table small_parq values (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11);
> set hive.optimize.index.filter=true;
> set hive.optimize.index.filter.compact.minsize=50;
> create index comp_idx on table small_parq (i) as 'compact' WITH DEFERRED REBUILD;
> alter index comp_idx on small_parq rebuild;
> select * from small_parq where i=3;
> --this correctly produces 1 row (value 3).
> select * from small_parq where i=11;
> --this incorrectly produces 0 rows.
> --I see correct results when looking for a row in [1,6];
> --I see bad results when looking for a row in [7,11].
> --All is well once I disable the compact index
> set hive.optimize.index.filter.compact.minsize=50000000;
> select * from small_parq where i=11;
> --now it correctly produces 1 row (value 11).
> {code}
> It seems I can't reproduce this issue if the base table was ORC, SEQ, AVRO,
> TEXTFILE.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)