Sourabh Badhya created HIVE-27731:
-------------------------------------

             Summary: Perform metadata delete when only static filters are 
present
                 Key: HIVE-27731
                 URL: https://issues.apache.org/jira/browse/HIVE-27731
             Project: Hive
          Issue Type: Improvement
            Reporter: Sourabh Badhya
            Assignee: Sourabh Badhya


When a DELETE query has only static filters, try to perform a metadata delete 
directly instead of proceeding with a positional delete.

Some relevant use cases where metadata deletes can be used - 
{code:sql}
DELETE FROM ice_table WHERE id = 1;
{code}
As seen above, the only filter is (id = 1). If the filter is on a partition 
column, a metadata delete is more efficient and does not generate additional 
files.

In partition evolution cases where a metadata delete is not possible, a 
positional delete is performed instead.
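
The decision described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not Hive or Iceberg code: the class MetadataDeleteSketch, the method planMetadataDelete and the map-based representation of data files are all hypothetical, and only identity partitioning with a single equality filter is modelled. A file can be dropped whole when its recorded partition value equals the filter value; if any file lacks a partition value for the filter column (for example, because it was written under an older partition spec), the sketch reports that a metadata delete is not possible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class MetadataDeleteSketch {

    /**
     * Hypothetical planner for "DELETE ... WHERE column = value".
     *
     * @param partitionsByFile file path mapped to the identity-partition values
     *                         recorded for that file in table metadata (assumed model)
     * @return the files that can be removed whole, or Optional.empty() when at
     *         least one file cannot be decided from metadata alone, in which
     *         case the caller would fall back to a positional delete
     */
    static Optional<List<String>> planMetadataDelete(
            Map<String, Map<String, Object>> partitionsByFile,
            String column,
            Object value) {
        List<String> filesToDrop = new ArrayList<>();
        for (Map.Entry<String, Map<String, Object>> entry : partitionsByFile.entrySet()) {
            Map<String, Object> partition = entry.getValue();
            if (!partition.containsKey(column)) {
                // The file is not partitioned by the filter column, so it may hold
                // a mix of matching and non-matching rows: metadata delete is unsafe.
                return Optional.empty();
            }
            if (value.equals(partition.get(column))) {
                // Identity partitioning: every row in this file matches the filter.
                filesToDrop.add(entry.getKey());
            }
            // Otherwise no row in this file matches, so the file is left untouched.
        }
        return Optional.of(filesToDrop);
    }
}
```

An empty result in this sketch corresponds to the fallback path mentioned above; a present result, even an empty list, means the delete can be completed by metadata changes alone.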

Another optimisation possible here is to use vectorized expressions for UDFs 
that provide them, such as year - 
{code:sql}
DELETE FROM ice_table WHERE id = 1 AND year(datecol) = 2015;
{code}
DELETE queries with multi-table scans will not be optimized by this method, 
since the WHERE clause filters are determined at runtime.

A similar optimisation exists in Spark, where a metadata delete is done 
whenever possible - 
[https://github.com/apache/iceberg/blob/master/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java#L297-L389]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
