[ https://issues.apache.org/jira/browse/HIVE-27948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stamatis Zampetakis updated HIVE-27948: --------------------------------------- Attachment: materialized_view_unix_timestamp.q > Wrong results when using materialized views with non-deterministic/dynamic > functions > ------------------------------------------------------------------------------------ > > Key: HIVE-27948 > URL: https://issues.apache.org/jira/browse/HIVE-27948 > Project: Hive > Issue Type: Bug > Components: CBO > Affects Versions: 4.0.0-beta-1 > Reporter: Stamatis Zampetakis > Priority: Major > Attachments: materialized_view_unix_timestamp.q > > > There are certain SQL functions that return different results across > different executions. Usually we refer to these functions as > non-deterministic or dynamic. Some examples are: UNIX_TIMESTAMP(), > CURRENT_TIMESTAMP, CURRENT_DATE, etc. > When a materialized view definition contains such functions the queries that > are using this view may return wrong results. > Consider the following scenario where we populate the employee table with > timestamps representing the future. For making this easily reproable in > self-contained test the timestamps are only a few seconds apart. > {code:sql} > CREATE TABLE EMPS (ENAME STRING, BIRTH_EPOCH_SECS INT) STORED AS ORC > TBLPROPERTIES ('transactional'='true'); > INSERT INTO EMPS > VALUES ('Victor', UNIX_TIMESTAMP()), > ('Alex', UNIX_TIMESTAMP() + 2), > ('Bob', UNIX_TIMESTAMP() + 5), > ('Alice', UNIX_TIMESTAMP() + 10); > CREATE MATERIALIZED VIEW v_emp AS SELECT * FROM EMPS WHERE BIRTH_EPOCH_SECS > <= UNIX_TIMESTAMP(); > {code} > When the materialized view is created it is populated with only the rows that > match the timestamp at the given time. > To demonstrate the problem run the following queries with view based > rewritting disabled and enabled. > {code:sql} > set hive.materializedview.rewriting.sql=false; > SELECT * FROM EMPS WHERE BIRTH_EPOCH_SECS <= UNIX_TIMESTAMP(); > {code} > {noformat} > Victor 1702302786 > Alex 1702302788 > Bob 1702302791 > {noformat} > {code:sql} > set hive.materializedview.rewriting.sql=true; > SELECT * FROM EMPS WHERE BIRTH_EPOCH_SECS <= UNIX_TIMESTAMP(); > {code} > {noformat} > Victor 1702302786 > Alex 1702302788 > {noformat} > Naturally the second query should return more rows than the first one since > UNIX_TIMESTAMP is constantly growing. However, when view based rewritting is > in use the second query will use the results from the materialized view which > are by now obsolete (missing Bob entry). -- This message was sent by Atlassian Jira (v8.20.10#820010)