Thanks Jingsong, Jark, Knauf, Seth for sharing your thoughts. Although we discussed many details about the concept, I think it’s worth to clarify the semantic from long term goals. Temporal table concept was first imported in SQL:2011, I made some investigation of Temporal Table work mechanism in traditional DBMS which implements it like SQL Server[1], PostgreSQL[2] In DBMS, Temporal Table is implemented as a pair of tables, a current table and a history table.The current table contains the current value of each row, the history table contains each previous value for each row. Each row contains a time range constructed by RowStartTime and RowEndTime to define the period validity of the row. The RowStartTime and RowEndTime is changed by DBMS when a DML operation happened, Given a simple temporal table in SQL Server to show how it works: CREATE TABLE dbo.currency ( [currency] VARCHAR(10) NOT NULL PRIMARY KEY, [rate] INT, [RowStart] DATETIME2 GENERATED ALWAYS AS ROW START, [RowEnd] DATETIME2 GENERATED ALWAYS AS ROW END, PERIOD FOR SYSTEM_TIME (RowStart, RowEnd) ) WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.currency_History));
1> select * from currency; // The initial test data, the RowEndTime is the max value of timestamp type currency rate RowStart RowEnd ---------- ----------- -------------------------------------- -------------------------------------- Euro 114 2020-06-29 15:06:24.7459246 9999-12-31 23:59:59.9999999 US Dollar 102 2020-06-29 15:06:24.7503288 9999-12-31 23:59:59.9999999 1> UPDATE dbo.currency SET [rate] = 118 WHERE currency = 'Euro’; // UPDATE Euro currency 2> select * from currency_History; // The history table increased a record that represents the validity period of record (Euro,114) currency rate RowStart RowEnd ---------- ----------- -------------------------------------- -------------------------------------- Euro 114 2020-06-29 15:06:24.7459246 2020-06-29 15:07:01.1245406 1> DELETE FROM dbo.currency WHERE currency = 'Euro’; // DELETE Euro currency 1> select * from currency_History; currency rate RowStart RowEnd // The history table also increased another record that represents the validity period of record (Euro, 118) ---------- ----------- -------------------------------------- -------------------------------------- Euro 114 2020-06-29 15:06:24.7459246 2020-06-29 15:07:01.1245406 Euro 118 2020-06-29 15:07:01.1245406 2020-06-29 15:07:11.2981995 1> select * from currency; currency rate RowStart RowEnd // Current table only keep the latest value ---------- ----------- -------------------------------------- -------------------------------------- US Dollar 102 2020-06-29 15:06:24.7503288 9999-12-31 23:59:59.9999999 The history table is very important for history version tracking, pleas note the DELETE operation also increase a record in history table and the record’s RowEndTime is the system time that the DELETE operation happened. In one word, temporal table use time range [RowStart, RowEnd) to mark period validity, store all versions’ records in history table for history tracking, use DBMS operation time to change the RowStart or RowEnd. Back to our Flink World, temporal table with event time attribute works well in data source that contains INSERT, UPDATE messages except DELETE currently. Let us see what happened in DELETE message scenario(i.e. changelog source), both DBMS Temporal Table and other general table can capture data change by CDC tools and have same format, I used debezuim to capture a SQL server table changes: 1> select * from currency; currency rate RowStart RowEnd ---------- ----------- -------------------------------------- -------------------------------------- Euro 118 2020-06-29 15:07:01.1245406 9999-12-31 23:59:59.9999999 US Dollar 102 2020-06-29 15:06:24.7503288 9999-12-31 23:59:59.9999999 1> DELETE FROM dbo.currency WHERE currency = 'Euro’; // DELETE Euro currency 1> select * from currency_History; currency rate RowStart RowEnd ---------- ----------- -------------------------------------- -------------------------------------- Euro 118 2020-06-29 15:07:01.1245406 2020-06-29 15:07:11.2981995 { // The DELETE record produced by CDC tools(both debezuim and canal are same) "before": { "currency": "Euro", "rate": 118, "RowStart": 1593443221124540600, //2020-06-29 15:07:01.1245406 "RowEnd": -4852116231933722724 //9999-12-31 23:59:59.9999999 }, "after": null, "op": "d”, // DELETE operation "ts_ms": 15934432361354, //2020-06-29 15:07:16.354, the ’ts_ms’ value is bigger than the record delete operation time(2020-06-29 15:07:11.2981995) "transaction": null } The main problem is that the DELETE record only contains current table message which does not contain the expected RowEnd (2020-06-29 15:07:11.2981995) in history table. Without the exact RowEndTime, it’s impossible to obtain exact previous version of temporal table in Flink, `ts_ms` filed in CDC record is an approximate time of RowEndTime but it depends on CDC tool status and can not equal the RowEndTime from semantics angle. Current Temporal Table Function supports: (1) Define temporal table backed upsert data source with process time (2) Define temporal table backed upsert data source with event time I think the proposed Temporal table currently could support: (1) Define temporal table backed upsert(include delete) data source with process time (2) Define temporal table backed upsert data source with event time (3) Do not support define temporal table backed data source that contains DELETE message with event time. Because most CDC tools can not obtain the exact DELETE operation time currently, the “ts_ms” field from meta is just an approximate time which will break event time semantics. And we can support it when CDC tools have the ability to obtain/extract the DML operation time. And this has get consensus from me, Jingsong, Jark and Kurt after offline discuss, the opinions from Knauf and Seth looks like same with us. I’ll prepare a design doc for temporal table, thanks everyone involving and please let me know if you have any concern. Best, Leonard Xu [1] https://docs.microsoft.com/en-us/sql/relational-databases/tables/temporal-tables?view=sql-server-ver15 <https://docs.microsoft.com/en-us/sql/relational-databases/tables/temporal-tables?view=sql-server-ver15> [2] https://pgxn.org/dist/temporal_tables/ <https://pgxn.org/dist/temporal_tables/>