Hi Seth,
Thanks for bringing up this topic.

I think the second approach is the more generic solution.
Other connectors can also benefit from it.
It also keeps the flexibility to generate random timestamps in scenarios
that need them.
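
To make the last point concrete, here is a rough, untested sketch of a
randomized event time attribute written as a computed column on a datagen
table declared from scratch. The table name GenOrders and the 60-second
bound are illustrative assumptions, not part of Seth's proposal;
TIMESTAMPADD and RAND_INTEGER are built-in Flink SQL functions. With the
second approach, the same expression could in principle be supplied when
overriding order_time in a LIKE table.

```sql
-- Sketch only: GenOrders and the 60-second jitter are hypothetical choices.
CREATE TEMPORARY TABLE GenOrders (
    order_id   BIGINT,
    quantity   INT,
    -- Shift the current time back by a random 0..59 seconds per row,
    -- so rows arrive out of order by at most one minute.
    order_time AS TIMESTAMPADD(SECOND, -RAND_INTEGER(60), LOCALTIMESTAMP),
    -- Bound the watermark delay by the same maximum jitter.
    WATERMARK FOR order_time AS order_time - INTERVAL '60' SECOND
) WITH (
    'connector' = 'datagen'
)
```

Replacing the expression with plain LOCALTIMESTAMP gives the ascending
variant from the docs workaround.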

Best,
Godfrey

On Fri, Jul 24, 2020 at 11:30 PM, Seth Wiesman <sjwies...@gmail.com> wrote:

> Hi everyone,
>
> Currently, the data gen table source only supports a subset of Flink SQL
> types. One missing type in particular is TIMESTAMP(3). I suspect it was
> not added originally because it doesn't really make sense
> to have random timestamps. What you really want is for them to be
> ascending. In the use cases of data generation, users typically don't care
> about late data. The workaround proposed in the docs is to create your
> event time attribute using a computed column.
>
> CREATE TABLE t (
>     ts AS LOCALTIMESTAMP
> ) WITH (
>     'connector' = 'datagen'
> )
>
> The problem is that this does not play well with the LIKE clause. Many
> users do not create datagen-backed tables from scratch but instead use the
> LIKE clause to shadow a physical table in their catalog, such as a Kafka
> table.
>
> The LIKE clause does not allow redefining columns, so there is no way to
> do this for a table with an event time attribute. The example below will
> fail.
>
> CREATE TABLE Orders (
>     order_id   BIGINT,
>     order_time TIMESTAMP(3),
>     quantity   INT,
>     cost       AS price * quantity,
>     WATERMARK FOR order_time AS order_time - INTERVAL '5' SECOND,
>     PRIMARY KEY (order_id) NOT ENFORCED
> ) WITH (
>     'connector' = 'kafka',
>     'topic' = 'orders',
>     'properties.bootstrap.servers' = 'localhost:9092',
>     'properties.group.id' = 'orderGroup',
>     'format' = 'csv'
> )
>
> CREATE TEMPORARY TABLE Orders WITH (
>     'connector' = 'datagen'
> ) LIKE Orders (EXCLUDING ALL)
>
>
> I see two solutions to this and would like to hear what people think.
>
> 1) Support TIMESTAMP in datagen tables but always supply strictly ascending
> timestamps. The above would now "just work". This semantic makes sense
> given the way event time attributes are used in streaming applications and
> we can clearly document the behavior.
>
> 2) Relax the constraints of the LIKE clause to allow overriding physical
> columns with computed columns. This would make it clearer to the user what
> is happening but would require substantially higher development effort and
> I don't know if this feature would add value beyond this one use case. In
> practice, this would allow the following.
>
> CREATE TEMPORARY TABLE Orders (
>     order_time AS LOCALTIMESTAMP
> ) WITH (
>     'connector' = 'datagen'
> ) LIKE Orders (EXCLUDING ALL)
>
> Please let me know what you think.
>
> Seth
>
