[ https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yongzhi Chen updated HIVE-13039:
--------------------------------
    Description:
BETWEEN becomes exclusive in parquet table when predicate pushdown is on (as it is by default in newer Hive versions). To reproduce (in a cluster, not local setup):

CREATE TABLE parquet_tbl(
  key int,
  ldate string)
PARTITIONED BY (
  lyear string )
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';

insert overwrite table parquet_tbl partition (lyear='2016') select 1, '2016-02-03' from src limit 1;

set hive.optimize.ppd.storage = true;
set hive.optimize.ppd = true;
select * from parquet_tbl where ldate between '2016-02-03' and '2016-02-03';

No row will be returned in a cluster.
But if you turn off hive.optimize.ppd, one row will be returned.

  was:
BETWEEN becomes exclusive in parquet table when predicate pushdown is on (as it is by default in newer Hive versions). To reproduce (in a cluster, not local setup):

CREATE TABLE parquet_tbl(
  key int,
  ldate string)
PARTITIONED BY (
  lyear string )
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';

insert overwrite table parquet_tbl partition (lyear='2016') select 1, '2016-02-03' from src limit 1;

set hive.optimize.ppd.storage = true;
set hive.optimize.ppd = true;
select * from parquet_tbl where ldate between '2016-02-03' and '2016-02-03';

> BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table
> ---------------------------------------------------------------------------------------
>
>                 Key: HIVE-13039
>                 URL: https://issues.apache.org/jira/browse/HIVE-13039
>             Project: Hive
>          Issue Type: Bug
>          Components: Physical Optimizer
>    Affects Versions: 1.2.1, 2.0.0
>            Reporter: Yongzhi Chen
>            Assignee: Yongzhi Chen
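
To isolate the symptom, the same query can be run under both settings and compared. The script below is only a sketch: it reuses parquet_tbl and the values from the reproduction steps, and the third query (the range written with explicit >= and <=) is an assumption to test, not a confirmed workaround.

```sql
-- Sketch only: reuses parquet_tbl from the report; not a confirmed fix.

-- 1. Pushdown enabled: per the report, this returns no rows on a cluster.
set hive.optimize.ppd = true;
set hive.optimize.ppd.storage = true;
select * from parquet_tbl
where ldate between '2016-02-03' and '2016-02-03';

-- 2. Pushdown disabled: per the report, this returns one row.
set hive.optimize.ppd = false;
select * from parquet_tbl
where ldate between '2016-02-03' and '2016-02-03';

-- 3. Same range with explicit inclusive comparisons, pushdown back on.
--    Whether this form avoids the problem is untested here (assumption).
set hive.optimize.ppd = true;
select * from parquet_tbl
where ldate >= '2016-02-03' and ldate <= '2016-02-03';
```

If queries 1 and 2 disagree, the difference comes from the pushdown path rather than from the data, since BETWEEN is inclusive on both bounds in standard SQL and both queries read the same single row.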