Mert Hocanin created HIVE-21419:
-----------------------------------

             Summary: Partition Pruning not happening when using Apache Ranger masking
                 Key: HIVE-21419
                 URL: https://issues.apache.org/jira/browse/HIVE-21419
             Project: Hive
          Issue Type: Bug
          Components: Physical Optimizer, Query Planning
    Affects Versions: 2.3.2
         Environment: I used an AWS Cloudformation script from AWS's big data blog[1]. The EMR AMI uses Hive 2.3.3 and Apache Ranger 1.0.0.

Source Table:

CREATE EXTERNAL TABLE analyst1.lineitem_partitioned (
  `l_orderkey` int,
  `l_partkey` int,
  `l_suppkey` int,
  `l_linenumber` int,
  `l_quantity` double,
  `l_extendedprice` double,
  `l_discount` double,
  `l_tax` double,
  `l_returnflag` string,
  `l_linestatus` string,
  `l_commitdate` string,
  `l_receiptdate` string,
  `l_shipinstruct` string,
  `l_shipmode` string,
  `l_comment` string
)
PARTITIONED BY (`l_shipdate` string)
STORED AS PARQUET
LOCATION '/user/analyst1/tpch/sf100/lineitem';

Destination Table:

CREATE EXTERNAL TABLE analyst1.test1 (
  l_commitdate string,
  l_receiptdate string
)
PARTITIONED BY (`l_shipdate` string)
STORED AS PARQUET
LOCATION '/user/analyst1/tpch/sf100/lineitem_parq_partitioned';

Query:

insert overwrite table analyst1.test1 PARTITION (l_shipdate)
select l_commitdate, l_receiptdate, l_shipdate
from default.lineitem_parq_partitioned
where l_shipdate = '1992-01-02';

Ranger Masking Rule:

Hive Database: analyst1
Hive Table: lineitem_partitioned
Mask Condition Option: Custom: "XXXXXX" (this replaces the column with a static string for simplicity, but our use case uses a complex UDF).

[1] https://aws.amazon.com/blogs/big-data/implementing-authorization-and-auditing-using-apache-ranger-on-amazon-emr/
            Reporter: Mert Hocanin
         Attachments: Operators-in-debugger-with-masking.png, Operators-in-debugger-without-masking.png, hive-jira-schema-explain-plan.txt


I have a partitioned table with a Ranger masking policy on a non-partition column. When I run a query against the table that includes the column with masking enabled, partition pruning no longer occurs.

To reproduce:

Create two partitioned tables. I used TPC-H tables as they are publicly available, and I will provide the schemas and queries I used.

Insert into the second table from the first table.
For example:

insert overwrite table analyst1.test1 PARTITION (l_shipdate)
select l_commitdate, l_receiptdate, l_shipdate
from analyst1.lineitem_partitioned
where l_shipdate = '1992-01-02';

I have attached the explain plans for the query both with the masking rule on l_commitdate enabled and with it disabled. I have done a bit of a deep dive and see that the pruning expression is not being set when the masking rule is enabled.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
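
One possible way to confirm the behaviour without a debugger is to compare what Hive reports it will read, once with the masking policy on l_commitdate enabled and once with it disabled. The statements below are only a sketch: they reuse the source table and filter from the report, and they assume that the EXPLAIN variants pass through the same masking rewrite as the real query, which has not been verified here.

```sql
-- Sketch only: run both statements twice, once with the Ranger masking
-- policy enabled and once with it disabled, and compare the outputs.

-- EXPLAIN DEPENDENCY reports the tables and partitions the query depends on.
-- If pruning works, input_partitions should list only the partition
-- l_shipdate=1992-01-02 of analyst1.lineitem_partitioned.
EXPLAIN DEPENDENCY
SELECT l_commitdate, l_receiptdate, l_shipdate
FROM analyst1.lineitem_partitioned
WHERE l_shipdate = '1992-01-02';

-- EXPLAIN EXTENDED prints additional plan detail, which typically includes
-- the partition paths that the scan will read.
EXPLAIN EXTENDED
SELECT l_commitdate, l_receiptdate, l_shipdate
FROM analyst1.lineitem_partitioned
WHERE l_shipdate = '1992-01-02';
```

If the list of input partitions grows from a single entry to every partition of the table when the policy is switched on, that would be consistent with the pruning expression not being set.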