Krisztian Kasa created HIVE-24854:
-------------------------------------

             Summary: Incremental Materialized view refresh in presence of 
update/delete operations
                 Key: HIVE-24854
                 URL: https://issues.apache.org/jira/browse/HIVE-24854
             Project: Hive
          Issue Type: Improvement
            Reporter: Krisztian Kasa
            Assignee: Krisztian Kasa


The current implementation of incremental materialized view rebuild cannot be 
used if any of the materialized view's source tables has had an update or 
delete operation since the last rebuild. In such cases a full rebuild has to 
be performed.

Steps to enable incremental rebuild:
1. Introduce a new virtual column that marks a row as deleted.
2. Execute the query in the view definition.
2.a Add a filter to each table scan so that it pulls only the rows from each 
source table that have a higher writeId than the writeId of the last rebuild - 
this is already implemented by the current incremental rebuild.
2.b Add the row-is-deleted virtual column to each table scan. In join nodes, 
if any of the branches has a deleted row, the result row is also deleted.
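
Steps 2.a and 2.b can be illustrated with a small in-memory model. This is 
only a sketch under assumptions, not Hive code: the Row class, the is_deleted 
field (a stand-in for the proposed virtual column), and the helper names are 
all hypothetical, and the model does not attempt to show how new rows are 
combined with rows that were unchanged since the last rebuild.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Row:
    key: int
    value: str
    write_id: int      # writeId of the transaction that wrote the row
    is_deleted: bool   # hypothetical stand-in for the proposed virtual column


def incremental_scan(rows: List[Row], last_rebuild_write_id: int) -> List[Row]:
    """Step 2.a: keep only rows written after the last rebuild."""
    return [r for r in rows if r.write_id > last_rebuild_write_id]


def join_on_key(left: List[Row], right: List[Row]) -> List[Tuple[int, str, str, bool]]:
    """Step 2.b: inner join on key; the result row is deleted if either input is."""
    return [
        (lhs.key, lhs.value, rhs.value, lhs.is_deleted or rhs.is_deleted)
        for lhs in left
        for rhs in right
        if lhs.key == rhs.key
    ]


orders = [Row(1, "a", 5, False), Row(2, "b", 12, False), Row(3, "c", 13, True)]
customers = [Row(1, "z", 3, False), Row(2, "x", 11, False), Row(3, "y", 11, False)]

# Assume the last rebuild happened at writeId 10.
delta = join_on_key(incremental_scan(orders, 10), incremental_scan(customers, 10))
print(delta)  # [(2, 'b', 'x', False), (3, 'c', 'y', True)]
```

The rows with key 1 are filtered out because their writeIds are not higher 
than 10, and the joined row for key 3 is flagged deleted because one of its 
inputs is.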

We should distinguish two types of view definition queries: with and without 
Aggregate.

3.a No aggregate path:
Rewrite the plan of the full rebuild to a multi insert statement with two 
insert branches: one branch to insert new rows into the materialized view 
table, and a second one to insert deleted rows into the materialized view's 
delete delta.
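
The two-branch routing can be sketched as follows. This is a toy model, not 
the actual plan rewrite: split_branches and the trailing is_deleted flag on 
each row are hypothetical, and the way deleted rows are encoded in the delete 
delta is not modeled.

```python
from typing import List, Tuple

DeltaRow = Tuple[int, str, bool]  # (key, value, is_deleted): hypothetical layout


def split_branches(
    delta: List[DeltaRow],
) -> Tuple[List[Tuple[int, str]], List[Tuple[int, str]]]:
    """Route each incremental row to the insert branch or the delete branch."""
    to_insert = [(key, value) for key, value, deleted in delta if not deleted]
    to_delete = [(key, value) for key, value, deleted in delta if deleted]
    return to_insert, to_delete


delta = [(2, "b", False), (3, "c", True), (4, "d", False)]
to_insert, to_delete = split_branches(delta)
print(to_insert)  # [(2, 'b'), (4, 'd')]
print(to_delete)  # [(3, 'c')]
```

Each input row is read once and lands in exactly one branch, which mirrors the 
idea of a single multi insert statement feeding two targets.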

3.b Aggregate path: TBD




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
