[ https://issues.apache.org/jira/browse/HIVE-24162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HIVE-24162:
----------------------------------
    Labels: pull-request-available  (was: )

> Query based compaction looses bloom filter
> ------------------------------------------
>
>                 Key: HIVE-24162
>                 URL: https://issues.apache.org/jira/browse/HIVE-24162
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Peter Varga
>            Assignee: Peter Varga
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Steps to reproduce:*
>
> {noformat}
> +----------------------------------------------------+
> | createtab_stmt                                     |
> +----------------------------------------------------+
> | CREATE TABLE `bloomTest`(                          |
> | `msisdn` string,                                   |
> | `imsi` varchar(20),                                |
> | `imei` bigint,                                     |
> | `cell_id` bigint)                                  |
> | ROW FORMAT SERDE                                   |
> | 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'        |
> | STORED AS INPUTFORMAT                              |
> | 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'  |
> | OUTPUTFORMAT                                       |
> | 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' |
> | LOCATION                                           |
> | 's3a://dwxtpcds30-wwgq-dwx-managed/clusters/env-6cwwgq/warehouse-1580338415-7dph/warehouse/tablespace/managed/hive/del_db.db/bloomtest' |
> | TBLPROPERTIES (                                    |
> | 'bucketing_version'='2',                           |
> | 'orc.bloom.filter.columns'='msisdn,cell_id,imsi',  |
> | 'orc.bloom.filter.fpp'='0.02',                     |
> | 'transactional'='true',                            |
> | 'transactional_properties'='default',              |
> | 'transient_lastDdlTime'='1597222946')              |
> +----------------------------------------------------+
> insert into bloomTest values ("a", "b", 10, 20);
> insert into bloomTest values ("aa", "bb", 100, 200);
> insert into bloomTest values ("aaa", "bbb", 1000, 2000);
> select * from bloomTest;
> +-------------------+-----------------+-----------------+--------------------+
> | bloomtest.msisdn  | bloomtest.imsi  | bloomtest.imei  | bloomtest.cell_id  |
> +-------------------+-----------------+-----------------+--------------------+
> | a                 | b               | 10              | 20                 |
> | aa                | bb              | 100             | 200                |
> | aaa               | bbb             | 1000            | 2000               |
> +-------------------+-----------------+-----------------+--------------------+
> {noformat}
> - Compact the table:
> {code:java}
> alter table bloomTest compact 'MAJOR';
> {code}
> - Wait for the compaction to finish, then check for bloom filters in the dataset.
> - The delta has them, but the base dataset does not.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)