[ https://issues.apache.org/jira/browse/HIVE-11033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Prasanth Jayachandran updated HIVE-11033:
-----------------------------------------
    Attachment: HIVE-11033.patch

> BloomFilter index is not honored by ORC reader
> ----------------------------------------------
>
>                 Key: HIVE-11033
>                 URL: https://issues.apache.org/jira/browse/HIVE-11033
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.2.0
>            Reporter: Allan Yan
>         Attachments: HIVE-11033.patch
>
>
> There is a bug in the org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl class
> that prevents the bloom filter index saved in the ORC file from being used. The
> root cause is that the bloomFilterIndices variable defined in the SargApplier
> class shadows the one defined in its parent class, so the two names refer to
> different arrays. Therefore, in RecordReaderImpl.pickRowGroups()
> {code}
> protected boolean[] pickRowGroups() throws IOException {
>   // if we don't have a sarg or indexes, we read everything
>   if (sargApp == null) {
>     return null;
>   }
>   readRowIndex(currentStripe, included, sargApp.sargColumns);
>   return sargApp.pickRowGroups(stripes.get(currentStripe), indexes);
> }
> {code}
> the bloomFilterIndices array populated by readRowIndex() is not picked up by
> the sargApp object. One solution is to make SargApplier.bloomFilterIndices a
> reference to its parent counterpart.
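
The shadowing described above can be reproduced in isolation. The following is a minimal sketch, not the Hive code: the class names Reader and Applier are hypothetical stand-ins for RecordReaderImpl and SargApplier, and a String[] stands in for OrcProto.BloomFilterIndex[]. It assumes only what the report states, namely that the outer object fills its own array while the inner object consults a separately allocated array of the same name.

```java
import java.util.Arrays;

public class BloomFilterShadowingSketch {

    /** Hypothetical stand-in for RecordReaderImpl: owns the array that readRowIndex() fills. */
    static class Reader {
        final String[] bloomFilterIndices;
        final Applier applier;

        Reader(int columnCount, boolean shareArray) {
            // Allocate first so the same array can be handed to the applier.
            bloomFilterIndices = new String[columnCount];
            applier = shareArray
                    ? new Applier(bloomFilterIndices)  // fixed: one shared array
                    : new Applier(columnCount);        // buggy: applier gets its own array
        }

        /** Fills only the reader's array, as readRowIndex() is described to do. */
        void readRowIndex() {
            Arrays.fill(bloomFilterIndices, "bloom");
        }

        /** Mirrors the shape of pickRowGroups(): load the index, then ask the applier. */
        boolean applierSeesBloomFilters() {
            readRowIndex();
            return applier.hasBloomFilter(0);
        }
    }

    /** Hypothetical stand-in for SargApplier: declares a field with the same name. */
    static class Applier {
        private final String[] bloomFilterIndices;

        /** Buggy variant: a private array that nothing ever populates. */
        Applier(int columnCount) {
            this.bloomFilterIndices = new String[columnCount];
        }

        /** Fixed variant: keep a reference to the reader's array. */
        Applier(String[] shared) {
            this.bloomFilterIndices = shared;
        }

        boolean hasBloomFilter(int column) {
            return bloomFilterIndices[column] != null;
        }
    }

    public static void main(String[] args) {
        // prints "separate array: false"
        System.out.println("separate array: " + new Reader(3, false).applierSeesBloomFilters());
        // prints "shared array:   true"
        System.out.println("shared array:   " + new Reader(3, true).applierSeesBloomFilters());
    }
}
```

In the sketch, passing the array through the constructor is what makes the applier observe the entries filled in later, which is the approach the proposed solution describes.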
> {noformat}
> 18:46 $ diff src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java.original
> 174d173
> < bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
> 178c177
> < sarg, options.getColumnNames(), strideRate, types, included.length, bloomFilterIndices);
> ---
> > sarg, options.getColumnNames(), strideRate, types, included.length);
> 204a204
> > bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
> 673c673
> < List<OrcProto.Type> types, int includedCount, OrcProto.BloomFilterIndex[] bloomFilterIndices) {
> ---
> > List<OrcProto.Type> types, int includedCount) {
> 677c677
> < this.bloomFilterIndices = bloomFilterIndices;
> ---
> > bloomFilterIndices = new OrcProto.BloomFilterIndex[types.size()];
> {noformat}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)