Hi Seth,

Thanks for bringing up this topic. I think the second approach is the more
generic solution, since other connectors can also benefit from it. It also
keeps the flexibility to generate random timestamps for the scenarios that
need them.

Best,
Godfrey

Seth Wiesman <sjwies...@gmail.com> wrote on Fri, Jul 24, 2020 at 11:30 PM:

> Hi everyone,
>
> Currently, the data gen table source only supports a subset of Flink SQL
> types. One missing type in particular is TIMESTAMP(3). The reason, I
> suspect, it was not added originally is that it doesn't really make sense
> to have random timestamps. What you really want is for them to be
> ascending. In the use cases of data generation, users typically don't care
> about late data. The workaround proposed in the docs is to create your
> event time attribute using a computed column.
>
> CREATE TABLE t (
>   ts AS LOCALTIMESTAMP
> ) WITH (
>   'connector' = 'datagen'
> )
>
> The problem is that this does not play well with the LIKE clause. Many
> users do not create datagen backed tables from scratch but use the LIKE
> clause to shadow a physical table in their catalog - such as Kafka.
>
> The problem is the LIKE clause does not allow redefining columns so there
> is no way to do this for a table with an event time attribute. The below
> will fail.
>
> CREATE TABLE Orders (
>   order_id BIGINT,
>   order_time TIMESTAMP(3),
>   quantity INT,
>   cost AS price * quantity,
>   WATERMARK FOR order_time AS order_time - INTERVAL '5' SECOND,
>   PRIMARY KEY (order_id) NOT ENFORCED
> ) WITH (
>   'connector' = 'kafka',
>   'topic' = 'orders',
>   'properties.bootstrap.servers' = 'localhost:9092',
>   'properties.group.id' = 'orderGroup',
>   'format' = 'csv'
> )
>
> CREATE TEMPORARY TABLE Orders WITH (
>   'connector' = 'datagen'
> ) LIKE Orders (EXCLUDING ALL)
>
>
> I see two solutions to this and would like to hear what people think.
>
> 1) Support TIMESTAMP in datagen tables but always supply strictly ascending
> timestamps. The above would now "just work". This semantic makes sense
> given the way event time attributes are used in streaming applications and
> we can clearly document the behavior.
>
> 2) Relax the constraints of the LIKE clause to allow overriding physical
> columns with computed columns. This would make it clearer to the user what
> is happening but would require substantially higher development effort and
> I don't know if this feature would add value beyond this one use case. In
> practice, this would allow the following.
>
> CREATE TEMPORARY TABLE Orders (
>   order_time AS LOCALTIMESTAMP
> ) WITH (
>   'connector' = 'datagen'
> ) LIKE Orders (EXCLUDING ALL)
>
> Please let me know what you think.
>
> Seth
>