raminqaf commented on code in PR #28287: URL: https://github.com/apache/flink/pull/28287#discussion_r3413820719
########## docs/content.zh/docs/sql/materialized-table/statements.md: ########## @@ -260,6 +261,129 @@ The operation updates the materialized table similarly to [ALTER MATERIALIZED TA See [ALTER MATERIALIZED TABLE AS](#as-select_statement-1) for more details. +## Converting a Table to a Materialized Table + +This lets you adopt a materialized table on top of a table that already exists, without dropping and recreating it. + +`CREATE OR ALTER MATERIALIZED TABLE` can convert an existing regular table into a materialized table in place. The catalog object keeps its identity and underlying storage. Its kind becomes materialized table, and its schema, options, query definition, freshness, and refresh mode are taken from the conversion statement, exactly as for a newly created materialized table. After the conversion, a refresh job is launched just as it is for a newly created materialized table. + +**Enabling conversion** + +Conversion is disabled by default. It is a one-way operation: it permanently turns a regular table into a materialized table and cannot be undone, because there is no operation that converts a materialized table back into a regular table. Keeping it off by default also preserves source compatibility. A `CREATE OR ALTER MATERIALIZED TABLE` that happens to name an existing regular table keeps its previous behavior of being rejected, so no existing workflow silently changes meaning until you opt in. + +When conversion is disabled, `CREATE OR ALTER MATERIALIZED TABLE` against a regular table is rejected. To enable it, set: + +```yaml +table.materialized-table.conversion-from-table.enabled: true +``` + +The option is read at planning time from the session's root configuration, so it must be set when the `TableEnvironment` session is initialized. Set it in the cluster configuration file `config.yaml`, or in the configuration used to create the session. Changing it afterwards with a session-level `SET` statement has no effect. + +**Schema** + +The schema comes from the `CREATE OR ALTER MATERIALIZED TABLE` statement and its query, exactly as for a brand-new materialized table. Nothing is taken from the source table. These are the same rules `CREATE MATERIALIZED TABLE` already uses. + +The examples below read from a source table `orders` and convert an existing regular table `user_spending`: + +```sql +-- Source table the query reads from +CREATE TABLE orders ( + user_id BIGINT NOT NULL, + amount BIGINT, + order_time TIMESTAMP(3) +) WITH ( + 'connector' = '...' +); + +-- The existing regular table to convert. It has a primary key and a watermark, +-- which the conversion does not carry over. +CREATE TABLE user_spending ( + user_id BIGINT NOT NULL, + total_amount BIGINT, + last_order TIMESTAMP(3), + PRIMARY KEY (user_id) NOT ENFORCED, + WATERMARK FOR last_order AS last_order - INTERVAL '5' SECOND +) WITH ( + 'connector' = '...' +); +``` + +With no column list, the schema is exactly the query output: + +```sql +CREATE OR ALTER MATERIALIZED TABLE user_spending + AS SELECT user_id, SUM(amount) AS total_amount FROM orders GROUP BY user_id; +-- columns: user_id, total_amount +-- the source's last_order column, primary key, and watermark are not carried over +``` + +To keep a primary key or watermark, re-declare it in the conversion statement. A watermark needs a rowtime column, so the query must produce one: + +```sql +CREATE OR ALTER MATERIALIZED TABLE user_spending ( + PRIMARY KEY (user_id) NOT ENFORCED, + WATERMARK FOR last_order AS last_order - INTERVAL '5' SECOND Review Comment: Thanks for pointing this out. Learned something new! Updated the docs! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
