Thanks Benchao and Leonard.

'Implicit type conversion' makes sense to me. I will emphasize the
'Implicit type conversion' behavior in the document.
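
A minimal sketch of what I plan to emphasize (the table name `orders` is
hypothetical): under this proposal the same TIMESTAMP literal would resolve to
a different instant, and therefore a different snapshot, depending on the
session time zone, while a TIMESTAMP_LTZ expression is unambiguous.

    SET 'table.local-time-zone' = 'UTC';
    SELECT * FROM orders FOR SYSTEM_TIME AS OF TIMESTAMP '2023-04-27 00:00:00';
    -- would resolve to epoch millis 1682553600000

    SET 'table.local-time-zone' = 'Asia/Shanghai';
    SELECT * FROM orders FOR SYSTEM_TIME AS OF TIMESTAMP '2023-04-27 00:00:00';
    -- would resolve to epoch millis 1682524800000 (8 hours earlier)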


Best,
Feng

On Sat, Jun 10, 2023 at 10:11 AM Benchao Li <libenc...@apache.org> wrote:

> Thanks Leonard for the input, the "Implicit type conversion" way sounds good
> to me.
> I also agree that this should be done in the planner instead of in the
> connector; it'll make connector development a lot easier.
>
> On Fri, Jun 9, 2023 at 20:11 Leonard Xu <xbjt...@gmail.com> wrote:
>
> > About the semantics consideration, I have some new input after rethinking it.
> >
> > 1. We can support both TIMESTAMP and TIMESTAMP_LTZ expressions following
> > the syntax `SELECT [column_name(s)] FROM [table_name] FOR SYSTEM_TIME AS
> > OF `
> >
> > 2. For the TIMESTAMP_LTZ type, giving a long instant value to the
> > CatalogTable is pretty intuitive; for the TIMESTAMP type, it will be
> > implicitly cast to TIMESTAMP_LTZ by the planner using the session time zone
> > and then passed to the CatalogTable. This case can be thought of as a
> > function AsOfSnapshot(Table t, TIMESTAMP_LTZ arg): you can pass an arg of
> > TIMESTAMP_LTZ type, but since our framework supports implicit type
> > conversion, users can also pass an arg of TIMESTAMP type. Note that
> > Spark[1] does the implicit type conversion too.
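> >
> > For illustration (the table name is hypothetical), both of the following
> > would then be accepted, the second one via the implicit cast of the
> > TIMESTAMP literal to TIMESTAMP_LTZ using the session time zone:
> >
> >   SELECT * FROM orders FOR SYSTEM_TIME AS OF TO_TIMESTAMP_LTZ(1682553600000, 3);
> >   SELECT * FROM orders FOR SYSTEM_TIME AS OF TIMESTAMP '2023-04-27 00:00:00';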
> >
> > 3. I also considered handing the implicit type conversion over to the
> > connector instead of the planner, e.g. passing a TIMESTAMP literal and
> > letting the connector use the session time zone to perform the type
> > conversion, but this is more complicated than handling it in the planner,
> > and it's not friendly to connector developers.
> >
> > 4. The last point: TIMESTAMP_LTZ '1970-01-01 00:00:04.001' should be an
> > invalid expression, because you cannot define an instant point (i.e. the
> > TIMESTAMP_LTZ semantics in SQL) from a timestamp literal without a time zone.
> > You can use an explicit type conversion like `cast(ts_ntz as TIMESTAMP_LTZ)`
> > after `FOR SYSTEM_TIME AS OF ` if you want to use a
> > TIMESTAMP type/expression/literal without a time zone.
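> >
> > For instance (the table name is hypothetical), an explicit conversion of a
> > TIMESTAMP literal could look like the following, with the cast applying the
> > session time zone:
> >
> >   SELECT * FROM orders
> >   FOR SYSTEM_TIME AS OF CAST(TIMESTAMP '2023-04-27 00:00:00' AS TIMESTAMP_LTZ(3));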
> >
> > 5. One more point: the TIMESTAMP_LTZ type of Flink SQL supports daylight
> > saving time[2] well, which will help users avoid many corner cases.
> >
> >
> > Best,
> > Leonard
> >
> > [1]
> >
> https://github.com/apache/spark/blob/0ed48feab65f2d86f5dda3e16bd53f2f795f5bc5/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TimeTravelSpec.scala#L56
> > [2]
> >
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/timezone/#daylight-saving-time-support
> >
> >
> >
> >
> > > On Jun 9, 2023, at 1:13 PM, Benchao Li <libenc...@apache.org> wrote:
> > >
> > > As you can see, you must use `UNIX_TIMESTAMP` to do this work; that's
> > > where the time zone comes in.
> > >
> > > What I'm talking about is casting timestamp/timestamp_ltz to long
> > > directly; that's why the semantics are tricky when you are casting
> > > timestamp to long using a time zone.
> > >
> > > Other systems, such as SQL Server[1], actually use a string instead of a
> > > timestamp literal, `FOR SYSTEM_TIME AS OF '2021-01-01
> > > 00:00:00.0000000'`; I'm not sure whether they convert the string
> > > implicitly to TIMESTAMP_LTZ, or whether they just have a different
> > > definition of the syntax.
> > >
> > > But for us, we are definitely using timestamp/timestamp_ltz literals
> > > here; that's why it is special, and we must highlight the behavior that
> > > we are converting a timestamp-without-time-zone literal to long using the
> > > session time zone.
> > >
> > > [1]
> > >
> >
> https://learn.microsoft.com/en-us/sql/relational-databases/tables/temporal-table-usage-scenarios?view=sql-server-ver16
> > >
> > > On Thu, Jun 8, 2023 at 11:35 Feng Jin <jinfeng1...@gmail.com> wrote:
> > >
> > >> Hi all,
> > >>
> > >> thanks for your input
> > >>
> > >>
> > >> @Benchao
> > >>
> > >>> The type for "TIMESTAMP '2023-04-27 00:00:00'" should be "TIMESTAMP
> > >> WITHOUT TIME ZONE", converting it to unix timestamp would use UTC
> > timezone,
> > >> which is not usually expected by users.
> > >>
> > >> It was indeed the case before Flink 1.13, but now my understanding is
> > that
> > >> there have been some slight changes in the definition of TIMESTAMP.
> > >>
> > >> TIMESTAMP is currently used to specify the year, month, day, hour, minute
> > >> and second. We recommend that users use
> > >> *UNIX_TIMESTAMP(CAST(timestamp_col AS STRING))* to convert between
> > >> *TIMESTAMP values* and *long values*. The *UNIX_TIMESTAMP* function will
> > >> use the *LOCAL TIME ZONE*. Therefore, converting either TIMESTAMP or
> > >> TIMESTAMP_LTZ to long values will involve the *LOCAL TIME ZONE*.
> > >>
> > >>
> > >> Here is a test:
> > >>
> > >> Flink SQL> SET 'table.local-time-zone' = 'UTC';
> > >> Flink SQL> SELECT UNIX_TIMESTAMP(CAST(TIMESTAMP '1970-01-01 00:00:00'
> as
> > >> STRING)) as `timestamp`;
> > >> ---------------
> > >> timestamp
> > >> --------------
> > >> 0
> > >>
> > >> Flink SQL> SET 'table.local-time-zone' = 'Asia/Shanghai';
> > >> Flink SQL> SELECT UNIX_TIMESTAMP(CAST(TIMESTAMP '1970-01-01 00:00:00'
> as
> > >> STRING)) as `timestamp`;
> > >> ---------------
> > >> timestamp
> > >> --------------
> > >> -28800
> > >>
> > >> Therefore, the current conversion method exposed to users is also
> using
> > >> LOCAL TIME ZONE.
> > >>
> > >>
> > >> @yuxia
> > >>
> > >> Thank you very much for providing the list of behaviors of TIMESTAMP
> in
> > >> other systems.
> > >>
> > >>> I think we can align with them to avoid the inconsistency with other
> > >>> engines and provide convenience for the external connectors while
> > >>> integrating Flink's time travel API.
> > >>
> > >> +1 for this.
> > >>
> > >>> Regarding the inconsistency, I think we can consider time travel as a
> > >>> special case, and we do need to highlight this in this FLIP.
> > >>> As for "violate the restriction outlined in FLINK-21978[1]", since we
> > >>> cast the timestamp to epochMillis only for internal use and won't expose
> > >>> it to users, I don't think it will violate the restriction.
> > >>> Btw, please add a brief description to explain the meaning of the
> > >>> parameter `timestamp` in the method `CatalogBaseTable getTable(ObjectPath
> > >>> tablePath, long timestamp)`. Maybe something like "timestamp of the table
> > >>> snapshot, which is milliseconds since 1970-01-01 00:00:00 UTC".
> > >>
> > >> Thank you for the suggestions regarding the document. I will add them
> to
> > >> FLIP.
> > >>
> > >>
> > >> Best,
> > >> Feng
> > >>
> > >>
> > >> On Wed, Jun 7, 2023 at 12:18 PM Benchao Li <libenc...@apache.org>
> > wrote:
> > >>
> > >>> I also share the concern about the timezone problem.
> > >>>
> > >>> The type for "TIMESTAMP '2023-04-27 00:00:00'" should be "TIMESTAMP
> > >> WITHOUT
> > >>> TIME ZONE", converting it to unix timestamp would use UTC timezone,
> > which
> > >>> is not usually expected by users.
> > >>>
> > >>> If we want to stay consistent with the standard, we probably should use
> > >>> "TIMESTAMP WITH LOCAL TIME ZONE '2023-04-27 00:00:00'", whose type is
> > >>> "TIMESTAMP WITH LOCAL TIME ZONE", and converting it to a unix timestamp
> > >>> will consider the session time zone, which is the expected result. But
> > >>> it's inconvenient for users.
> > >>>
> > >>> Taking this as a special case, and converting "TIMESTAMP '2023-04-27
> > >>> 00:00:00'" to a unix timestamp with the session time zone, will be
> > >>> convenient for users, but will break the standard. I would be +0.5 for
> > >>> this choice.
> > >>>
> > >>> yuxia <luoyu...@alumni.sjtu.edu.cn> 于2023年6月7日周三 12:06写道:
> > >>>
> > >>>> Hi, Feng Jin.
> > >>>> I think Leonard's concern may be the inconsistency of the behavior of
> > >>>> TIMESTAMP '2023-04-27 00:00:00' between time travel and other SQL
> > >>>> statements.
> > >>>>
> > >>>> For the normal SQL:
> > >>>> `SELECT TIMESTAMP '2023-04-27 00:00:00'`, we won't consider the time
> > >>>> zone. But for the time travel SQL:
> > >>>> `SELECT * FROM paimon_tb FOR SYSTEM_TIME AS OF TIMESTAMP '2023-04-27
> > >>>> 00:00:00'`, we will consider the time zone and convert to a UTC
> > >>>> timestamp.
> > >>>>
> > >>>> The concern is valid. But for time travel, most engines, e.g.
> > >>>> Spark[1], Hive[2], Trino[3], also do the time conversion considering
> > >>>> the session time zone. I think we can align with them to avoid the
> > >>>> inconsistency with other engines and provide convenience for the
> > >>>> external connectors while integrating Flink's time travel API.
> > >>>>
> > >>>> Regarding the inconsistency, I think we can consider time travel as a
> > >>>> special case, and we do need to highlight this in this FLIP.
> > >>>> As for "violate the restriction outlined in FLINK-21978[1]", since we
> > >>>> cast the timestamp to epochMillis only for internal use and won't expose
> > >>>> it to users, I don't think it will violate the restriction.
> > >>>> Btw, please add a brief description to explain the meaning of the
> > >>>> parameter `timestamp` in the method `CatalogBaseTable getTable(ObjectPath
> > >>>> tablePath, long timestamp)`. Maybe something like "timestamp of the table
> > >>>> snapshot, which is milliseconds since 1970-01-01 00:00:00 UTC".
> > >>>>
> > >>>> [1]
> > >>>>
> > >>>
> > >>
> >
> https://github.com/apache/spark/blob/0ed48feab65f2d86f5dda3e16bd53f2f795f5bc5/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TimeTravelSpec.scala#L56
> > >>>> [2]
> > >>>>
> > >>>
> > >>
> >
> https://github.com/apache/hive/blob/f5e69dc38d7ff26c70be19adc9d1a3ae90dc4cf2/ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java#L989
> > >>>> [3]
> > >>>>
> > >>>
> > >>
> >
> https://github.com/trinodb/trino/blob/2433d9e60f1abb0d85c32374c1758525560e1a86/plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java#L443
> > >>>>
> > >>>>
> > >>>> Best regards,
> > >>>> Yuxia
> > >>>>
> > >>>> ----- Original Message -----
> > >>>> From: "Feng Jin" <jinfeng1...@gmail.com>
> > >>>> To: "dev" <dev@flink.apache.org>
> > >>>> Sent: Tuesday, June 6, 2023 10:15:47 PM
> > >>>> Subject: Re: [DISCUSS] FLIP-308: Support Time Travel In Batch Mode
> > >>>>
> > >>>> Hi everyone
> > >>>>
> > >>>> Thanks everyone for your input.
> > >>>>
> > >>>>
> > >>>> @Yun
> > >>>>
> > >>>>> I think you could add descriptions of how to align backfill time
> > >>> travel
> > >>>> with querying the latest data. And I think you should also update
> the
> > >>>> "Discussion thread" in the original FLIP.
> > >>>>
> > >>>> Thank you for the suggestion, I will update it in the document.
> > >>>>
> > >>>>> I have a question about getting the table schema from the catalog.
> > >> I'm
> > >>>> not sure whether the Catalog#getTable(tablePath, timestamp) will be
> > >>> called
> > >>>> only once.
> > >>>>
> > >>>> I understand that in a query, the schema of the table is determined
> > >>> before
> > >>>> execution. The schema used will be based on the latest schema within
> > >> the
> > >>>> TimeTravel period.
> > >>>>
> > >>>> In addition, due to current syntax limitations, we are unable to
> > >> support
> > >>>> the use of BETWEEN AND.
> > >>>>
> > >>>>
> > >>>> @Jing
> > >>>>
> > >>>>> Would you like to update your thoughts described in your previous
> > >>> email
> > >>>> about why SupportsTimeTravel has been rejected into the FLIP?
> > >>>>
> > >>>> Sure,  I updated the doc.
> > >>>>
> > >>>>
> > >>>>>   Since we always directly add overload methods into Catalog
> > >> according
> > >>>> to new requirements, which makes the interface bloated
> > >>>>
> > >>>> Your concern is valid. If we need to support the long type version
> in
> > >> the
> > >>>> future, we may have to add another method "getTable(ObjectPath, long
> > >>>> version)". However, I understand that
> > >>>> "Catalog.getTable(tablePath).on(timeStamp)" may not meet the
> > >>> requirements.
> > >>>> The timestamp is for Catalog's use, and Catalog obtains the
> > >> corresponding
> > >>>> schema based on this time.
> > >>>>
> > >>>>
> > >>>> @liu @Krzysztof
> > >>>>
> > >>>> I am very sorry for the unclear description in the document. I have
> > >>> updated
> > >>>> relevant descriptions regarding why it needs to be implemented in
> > >>> Catalog.
> > >>>>
> > >>>> Time travel not only requires obtaining the data at the corresponding
> > >>>> time point, but also the corresponding schema at that time point.
> > >>>>
> > >>>>
> > >>>> @Shammon
> > >>>>
> > >>>>> Flink or connector such as  iceberg/paimon can create sources from
> > >> the
> > >>>> `CatalogBaseTable` directly without the need to get the snapshot ID
> > >> from
> > >>>> `CatalogTable.getSnapshot()`.  What do you think of it?
> > >>>>
> > >>>> You are right, we don't need the getSnapshot interface for
> > >> PaimonCatalog
> > >>> or
> > >>>> IcebergCatalog tables, but we may need it for temporary tables.
> > >>>>
> > >>>>
> > >>>>
> > >>>> Best,
> > >>>> Feng
> > >>>>
> > >>>>
> > >>>> On Tue, Jun 6, 2023 at 9:32 PM Feng Jin <jinfeng1...@gmail.com>
> > wrote:
> > >>>>
> > >>>>> Sorry I replied to the wrong mail. Please ignore the last email.
> > >>>>>
> > >>>>>
> > >>>>> Hi Leonard
> > >>>>>
> > >>>>>> 1. Unification SQL
> > >>>>>
> > >>>>> I agree that it is crucial for us to support both batch and
> streaming
> > >>>>> processing.  The current design allows for the support of both
> batch
> > >>> and
> > >>>>> streaming processing. I'll update the FLIP later.
> > >>>>>
> > >>>>>
> > >>>>>> 2.Semantics
> > >>>>>
> > >>>>> In my opinion, it would be feasible to perform the conversion based
> > >> on
> > >>>> the
> > >>>>> current session time, regardless of whether it is TIMESTAMP or
> > >>>>> TIMESTAMP_LTZ.
> > >>>>>
> > >>>>> However, this may indeed violate the restriction outlined in
> > >>>>> FLINK-21978[1]  as Benchao mentioned, and I am uncertain as to
> > >> whether
> > >>> it
> > >>>>> is reasonable.
> > >>>>>
> > >>>>>
> > >>>>>>  3.  Some external systems may use timestamp value to mark a
> > >>> version,
> > >>>>> but others may use version number、file position、log offset.
> > >>>>>
> > >>>>> It is true that most systems support time-related operations, and I
> > >>>>> believe that the current design is compatible with most systems.
> > >>> However,
> > >>>>> if we want to support long data type, it may require Calcite to
> > >> support
> > >>>> the
> > >>>>> VERSION AS OF syntax. I understand that this is something that we
> may
> > >>>> need
> > >>>>> to consider in the future.
> > >>>>>
> > >>>>>
> > >>>>> Best,
> > >>>>> Feng
> > >>>>>
> > >>>>> [1] https://issues.apache.org/jira/browse/FLINK-21978
> > >>>>>
> > >>>>> On Tue, Jun 6, 2023 at 8:28 PM Leonard Xu <xbjt...@gmail.com>
> wrote:
> > >>>>>
> > >>>>>> Hi, Feng
> > >>>>>>
> > >>>>>> Thanks for driving this FLIP, very impressive feature that users
> > >> want,
> > >>>>>> I’ve some quick questions here.
> > >>>>>>
> > >>>>>> 1.Unification SQL:
> > >>>>>>        The snapshot concept exists both in batch mode and streaming
> > >>>>>> mode; could we consider a unified proposal? I think users won't want
> > >>>>>> another SQL syntax named time travel for streaming mode.
> > >>>>>>
> > >>>>>> 2.Semantics:
> > >>>>>>        Flink supports TIMESTAMP and TIMESTAMP_LTZ types. To get a long
> > >>>>>> timestamp value (getTable(ObjectPath tablePath, long timestamp)) we
> > >>>>>> need two pieces of information, i.e. a TIMESTAMP value and the current
> > >>>>>> session time zone; how do we deal with the value under the currently
> > >>>>>> proposed SQL syntax?
> > >>>>>>
> > >>>>>> 3. Is a single timestamp enough to track a snapshot (version) of an
> > >>>>>> external table? Some external systems may use a timestamp value to
> > >>>>>> mark a version, but others may use a version number, file position,
> > >>>>>> or log offset.
> > >>>>>>
> > >>>>>> Best,
> > >>>>>> Leonard
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>> On Jun 5, 2023, at 3:28 PM, Yun Tang <myas...@live.com> wrote:
> > >>>>>>>
> > >>>>>>> Hi Feng,
> > >>>>>>>
> > >>>>>>> I think this FLIP would provide one important feature to unify
> the
> > >>>>>> stream-SQL and batch-SQL when we backfill the historical data in
> > >> batch
> > >>>> mode.
> > >>>>>>>
> > >>>>>>> For the "Syntax" section, I think you could add descriptions of
> > >> how
> > >>> to
> > >>>>>> align backfill time travel with querying the latest data. And I
> > >> think
> > >>>> you
> > >>>>>> should also update the "Discussion thread" in the original FLIP.
> > >>>>>>>
> > >>>>>>> Moreover, I have a question about getting the table schema from
> > >> the
> > >>>>>> catalog. I'm not sure whether the Catalog#getTable(tablePath,
> > >>> timestamp)
> > >>>>>> will be called only once. If we have a backfill query between
> > >>> 2023-05-29
> > >>>>>> and 2023-06-04 in the past week, and the table schema changed on
> > >>>>>> 2023-06-01, will the query below detect the schema changes during
> > >>>> backfill
> > >>>>>> the whole week?
> > >>>>>>>
> > >>>>>>> SELECT * FROM paimon_tb FOR SYSTEM_TIME AS OF TIMESTAMP BETWEEN
> > >>>>>> '2023-05-29 00:00:00' AND '2023-06-05 00:00:00'
> > >>>>>>>
> > >>>>>>> Best
> > >>>>>>> Yun Tang
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> ________________________________
> > >>>>>>> From: Shammon FY <zjur...@gmail.com>
> > >>>>>>> Sent: Thursday, June 1, 2023 17:57
> > >>>>>>> To: dev@flink.apache.org <dev@flink.apache.org>
> > >>>>>>> Subject: Re: [DISCUSS] FLIP-308: Support Time Travel In Batch
> Mode
> > >>>>>>>
> > >>>>>>> Hi Feng,
> > >>>>>>>
> > >>>>>>> I have one minor comment about the public interface
> > >> `Optional<Long>
> > >>>>>>> getSnapshot()` in the `CatalogTable`.
> > >>>>>>>
> > >>>>>>> As we can get tables from the new method
> > >>> `Catalog.getTable(ObjectPath
> > >>>>>>> tablePath, long timestamp)`, I think the returned
> > >> `CatalogBaseTable`
> > >>>>>> will
> > >>>>>>> have the information of timestamp. Flink or connector such as
> > >>>>>>> iceberg/paimon can create sources from the `CatalogBaseTable`
> > >>> directly
> > >>>>>>> without the need to get the snapshot ID from
> > >>>>>> `CatalogTable.getSnapshot()`.
> > >>>>>>> What do you think of it?
> > >>>>>>>
> > >>>>>>> Best,
> > >>>>>>> Shammon FY
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Thu, Jun 1, 2023 at 7:22 AM Jing Ge
> <j...@ververica.com.invalid
> > >>>
> > >>>>>> wrote:
> > >>>>>>>
> > >>>>>>>> Hi Feng,
> > >>>>>>>>
> > >>>>>>>> Thanks for the proposal! Very interesting feature. Would you
> like
> > >>> to
> > >>>>>> update
> > >>>>>>>> your thoughts described in your previous email about why
> > >>>>>> SupportsTimeTravel
> > >>>>>>>> has been rejected into the FLIP? This will help readers
> > >> understand
> > >>>> the
> > >>>>>>>> context (in the future).
> > >>>>>>>>
> > >>>>>>>> Since we always directly add overloaded methods into Catalog
> > >>>>>>>> according to new requirements, the interface gets bloated. Just out
> > >>>>>>>> of curiosity, does it make sense to introduce some DSL design? Like
> > >>>>>>>> Catalog.getTable(tablePath).on(timeStamp),
> > >>>>>>>> Catalog.getTable(tablePath).current() for the most current
> > >> version,
> > >>>> and
> > >>>>>>>> more room for further extension like timestamp range, etc. I
> > >>> haven't
> > >>>>>> read
> > >>>>>>>> all the source code yet and I'm not sure if it is possible. But
> a
> > >>>>>>>> design like this will keep the Catalog API lean and the API/DSL
> > >>> will
> > >>>> be
> > >>>>>>>> self described and easier to use.
> > >>>>>>>>
> > >>>>>>>> Best regards,
> > >>>>>>>> Jing
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On Wed, May 31, 2023 at 12:08 PM Krzysztof Chmielewski <
> > >>>>>>>> krzysiek.chmielew...@gmail.com> wrote:
> > >>>>>>>>
> > >>>>>>>>> Ok, after a second thought I'm retracting my previous statement
> > >>>>>>>>> about the Catalog changes you proposed.
> > >>>>>>>>> I do see a benefit for the Delta connector with this change and see
> > >>>>>>>>> why this could be coupled with the Catalog.
> > >>>>>>>>>
> > >>>>>>>>> Delta Connector's SQL support also ships a Delta Catalog
> > >>>>>>>>> implementation for Flink.
> > >>>>>>>>> For the Delta Catalog, table schema information is fetched from the
> > >>>>>>>>> underlying _delta_log and not stored in the metastore. For time
> > >>>>>>>>> travel we actually had a problem: if we would like to time travel
> > >>>>>>>>> back to some old version where the schema was slightly different,
> > >>>>>>>>> then we would have a conflict, since the Catalog would return the
> > >>>>>>>>> current schema and not how it was for version X.
> > >>>>>>>>>
> > >>>>>>>>> With your change, our Delta Catalog can actually fetch the schema
> > >>>>>>>>> for version X and send it to the DeltaTableFactory. Currently, the
> > >>>>>>>>> Catalog can fetch only the current version. What we would also
> > >>>>>>>>> need, however, is the version (number/timestamp) for this table
> > >>>>>>>>> passed to the DynamicTableFactory so we could properly set up the
> > >>>>>>>>> Delta Standalone library.
> > >>>>>>>>>
> > >>>>>>>>> Regards,
> > >>>>>>>>> Krzysztof
> > >>>>>>>>>
> > >>>>>>>>> On Wed, May 31, 2023 at 10:37 Krzysztof Chmielewski <
> > >>>>>>>>> krzysiek.chmielew...@gmail.com> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Hi,
> > >>>>>>>>>> happy to see such a feature.
> > >>>>>>>>>> Small note from my end regarding Catalog changes.
> > >>>>>>>>>>
> > >>>>>>>>>> TL;DR
> > >>>>>>>>>> I don't think it is necessary to delegate this feature to the
> > >>>>>>>>>> catalog. I think that since "time travel" is a per-job/query
> > >>>>>>>>>> property, it should not be coupled with the Catalog or the table
> > >>>>>>>>>> definition. In my opinion this is something that only the
> > >>>>>>>>>> DynamicTableFactory has to know about. I would rather see this
> > >>>>>>>>>> feature as it is - a SQL syntax enhancement - but delegated
> > >>>>>>>>>> clearly to the DynamicTableFactory.
> > >>>>>>>>>>
> > >>>>>>>>>> I've implemented the time travel feature for the Delta connector
> > >>>>>>>>>> [1] using the current Flink API.
> > >>>>>>>>>> Docs are pending code review, but you can find them here [2]
> > >> and
> > >>>>>>>> examples
> > >>>>>>>>>> are available here [3]
> > >>>>>>>>>>
> > >>>>>>>>>> The time travel feature that I've implemented is based on Flink
> > >>>>>>>>>> query hints:
> > >>>>>>>>>> "SELECT * FROM sourceTable /*+ OPTIONS('versionAsOf' = '1')
> */"
> > >>>>>>>>>>
> > >>>>>>>>>> The "versionAsOf" parameter (we also have 'timestampAsOf') is
> > >>>>>>>>>> handled not by the Catalog but by the DynamicTableFactory
> > >>>>>>>>>> implementation for the Delta connector.
> > >>>>>>>>>> The value of this property is passed to the Delta Standalone
> > >>>>>>>>>> library API, which returns a table view for the given version.
> > >>>>>>>>>>
> > >>>>>>>>>> I'm not sure how/if the proposed change could benefit the Delta
> > >>>>>>>>>> connector's implementation of this feature.
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks,
> > >>>>>>>>>> Krzysztof
> > >>>>>>>>>>
> > >>>>>>>>>> [1]
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>
> > >>>
> > >>
> >
> https://github.com/delta-io/connectors/tree/flink_table_catalog_feature_branch/flink
> > >>>>>>>>>> [2]
> > >>>>>> https://github.com/kristoffSC/connectors/tree/FlinkSQL_PR_15-docs
> > >>>>>>>>>> [3]
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>
> > >>>
> > >>
> >
> https://github.com/delta-io/connectors/tree/flink_table_catalog_feature_branch/examples/flink-example/src/main/java/org/example/sql
> > >>>>>>>>>>
> > >>>>>>>>>> On Wed, May 31, 2023 at 06:03 liu ron <ron9....@gmail.com>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Hi, Feng
> > >>>>>>>>>>>
> > >>>>>>>>>>> Thanks for driving this FLIP. Time travel is very useful for
> > >>>>>>>>>>> Flink to integrate with data lake systems. I have one question:
> > >>>>>>>>>>> why is the implementation of TimeTravel delegated to the Catalog?
> > >>>>>>>>>>> Assuming that we use Flink to query a Hudi table with the time
> > >>>>>>>>>>> travel syntax, but we don't use the HudiCatalog and instead
> > >>>>>>>>>>> register the Hudi table in the InMemoryCatalog, can we support
> > >>>>>>>>>>> time travel for the Hudi table in this case?
> > >>>>>>>>>>> In contrast, I think time travel should be bound to the connector
> > >>>>>>>>>>> instead of the Catalog, so the rejected alternative should be
> > >>>>>>>>>>> considered.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Best,
> > >>>>>>>>>>> Ron
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Tue, May 30, 2023 at 09:40 yuxia <luoyu...@alumni.sjtu.edu.cn> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> Hi, Feng.
> > >>>>>>>>>>>> I notice this FLIP only supports batch mode for time travel.
> > >>>>>>>>>>>> Would it also make sense to support stream mode to read a
> > >>>>>>>>>>>> snapshot of the table as a bounded stream?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Best regards,
> > >>>>>>>>>>>> Yuxia
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> ----- Original Message -----
> > >>>>>>>>>>>> From: "Benchao Li" <libenc...@apache.org>
> > >>>>>>>>>>>> To: "dev" <dev@flink.apache.org>
> > >>>>>>>>>>>> Sent: Monday, May 29, 2023 6:04:53 PM
> > >>>>>>>>>>>> Subject: Re: [DISCUSS] FLIP-308: Support Time Travel In Batch
> > >>>>>>>>>>>> Mode
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> # Can Calcite support this syntax ` VERSION AS OF`  ?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> This also depends on whether this is defined in the standard or
> > >>>>>>>>>>>> whether any known databases have implemented it. If not, it
> > >>>>>>>>>>>> would be hard to push it to Calcite.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> # getTable(ObjectPath object, long timestamp)
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Then we again come to the problem of "casting between
> > >> timestamp
> > >>>> and
> > >>>>>>>>>>>> numeric", which has been disabled in FLINK-21978[1]. If
> > >> you're
> > >>>>>> gonna
> > >>>>>>>>> use
> > >>>>>>>>>>>> this, then we need to clarify that problem first.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-21978
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Mon, May 29, 2023 at 15:57 Feng Jin <jinfeng1...@gmail.com> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> hi, thanks for your reply.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> @Benchao
> > >>>>>>>>>>>>>> did you consider the pushdown abilities compatible
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> In the current design, the implementation of TimeTravel is
> > >>>>>>>> delegated
> > >>>>>>>>>>> to
> > >>>>>>>>>>>>> Catalog. We have added a function called
> getTable(ObjectPath
> > >>>>>>>>>>> tablePath,
> > >>>>>>>>>>>>> long timestamp) to obtain the corresponding
> CatalogBaseTable
> > >>> at
> > >>>> a
> > >>>>>>>>>>>> specific
> > >>>>>>>>>>>>> time.  Therefore, I think it will not have any impact on
> the
> > >>>>>>>>> original
> > >>>>>>>>>>>>> pushdown abilities.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I see there is a rejected  design for adding
> > >>>>>>>> SupportsTimeTravel,
> > >>>>>>>>>>> but
> > >>>>>>>>>>>> I
> > >>>>>>>>>>>>> didn't see the alternative in  the FLIP doc
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Sorry, the document description is not very clear.
> > >> Regarding
> > >>>>>>>>> whether
> > >>>>>>>>>>> to
> > >>>>>>>>>>>>> support SupportTimeTravel, I have discussed it with yuxia.
> > >>> Since
> > >>>>>>>> we
> > >>>>>>>>>>> have
> > >>>>>>>>>>>>> already passed the corresponding time in
> > >> getTable(ObjectPath,
> > >>>> long
> > >>>>>>>>>>>>> timestamp) of Catalog, SupportTimeTravel may not be
> > >> necessary.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> In getTable(ObjectPath object, long timestamp), we can
> > >> obtain
> > >>>> the
> > >>>>>>>>>>> schema
> > >>>>>>>>>>>> of
> > >>>>>>>>>>>>> the corresponding time point and put the SNAPSHOT that
> needs
> > >>> to
> > >>>> be
> > >>>>>>>>>>>> consumed
> > >>>>>>>>>>>>> into options.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> @Shammon
> > >>>>>>>>>>>>>> Could we support this in Flink too?
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> I personally think it's possible, but limited by Calcite's
> > >>>> syntax
> > >>>>>>>>>>>>> restrictions. I believe we should first support this syntax
> > >> in
> > >>>>>>>>>>> Calcite.
> > >>>>>>>>>>>>> Currently, I think it may not be easy  to support this
> > >> syntax
> > >>> in
> > >>>>>>>>>>> Flink's
> > >>>>>>>>>>>>> parser. @Benchao, what do you think? Can Calcite support
> > >> this
> > >>>>>>>> syntax
> > >>>>>>>>>>>>> ` VERSION AS OF`  ?
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>> Feng.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Fri, May 26, 2023 at 2:55 PM Shammon FY <
> > >> zjur...@gmail.com
> > >>>>
> > >>>>>>>>> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Thanks Feng, the feature of time travel sounds great!
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> In addition to SYSTEM_TIME, lake houses such as paimon and
> > >>>>>>>> iceberg
> > >>>>>>>>>>>>> support
> > >>>>>>>>>>>>>> snapshot or version. For example, users can query snapshot
> > >> 1
> > >>>> for
> > >>>>>>>>>>> paimon
> > >>>>>>>>>>>>> by
> > >>>>>>>>>>>>>> the following statement
> > >>>>>>>>>>>>>> SELECT * FROM t VERSION AS OF 1
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Could we support this in Flink too?
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>> Shammon FY
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Fri, May 26, 2023 at 1:20 PM Benchao Li <
> > >>>>>>>> libenc...@apache.org>
> > >>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Regarding the implementation, did you consider the
> > >> pushdown
> > >>>>>>>>>>> abilities
> > >>>>>>>>>>>>>>> compatible, e.g., projection pushdown, filter pushdown,
> > >>>>>>>>> partition
> > >>>>>>>>>>>>>> pushdown.
> > >>>>>>>>>>>>>>> Since `Snapshot` is not handled much in existing rules, I
> > >>>>>>>> have a
> > >>>>>>>>>>>>> concern
> > >>>>>>>>>>>>>>> about this. Of course, it depends on your implementation
> > >>>>>>>> detail,
> > >>>>>>>>>>> what
> > >>>>>>>>>>>>> is
> > >>>>>>>>>>>>>>> important is that we'd better add some cross tests for
> > >>> these.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Regarding the interface exposed to Connector, I see there
> > >>> is a
> > >>>>>>>>>>>> rejected
> > >>>>>>>>>>>>>>> design for adding SupportsTimeTravel, but I didn't see
> the
> > >>>>>>>>>>>> alternative
> > >>>>>>>>>>>>> in
> > >>>>>>>>>>>>>>> the FLIP doc. IMO, this is an important thing we need to
> > >>>>>>>> clarify
> > >>>>>>>>>>>>> because
> > >>>>>>>>>>>>>> we
> > >>>>>>>>>>>>>>> need to know whether the Connector supports this, and
> what
> > >>>>>>>>>>>>>> column/metadata
> > >>>>>>>>>>>>>>> corresponds to 'system_time'.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On Thu, May 25, 2023 at 22:50 Feng Jin <jinfeng1...@gmail.com> wrote:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Thanks for your reply
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> @Timo @BenChao @yuxia
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Sorry for the mistake. Currently, Calcite only supports the
> > >>>>>>>>>>>>>>>> `FOR SYSTEM_TIME AS OF` syntax, so we can only support `FOR
> > >>>>>>>>>>>>>>>> SYSTEM_TIME AS OF`. I've updated the syntax part of the FLIP.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> @Timo
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> We will convert it to TIMESTAMP_LTZ?
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Yes, I think we need to convert TIMESTAMP to
> > >> TIMESTAMP_LTZ
> > >>>>>>>> and
> > >>>>>>>>>>> then
> > >>>>>>>>>>>>>>> convert
> > >>>>>>>>>>>>>>>> it into a long value.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> How do we want to query the most recent version of a
> > >> table
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> I think we can use `AS OF CURRENT_TIMESTAMP`, but it does
> > >>>>>>>>>>>>>>>> cause inconsistency with the real-time concept.
> > >>>>>>>>>>>>>>>> However, from my personal understanding, the scope of `AS OF
> > >>>>>>>>>>>>>>>> CURRENT_TIMESTAMP` is the table itself, not the table record.
> > >>>>>>>>>>>>>>>> So I think using CURRENT_TIMESTAMP should also be reasonable.
> > >>>>>>>>>>>>>>>> Additionally, if no version is specified, the latest version
> > >>>>>>>>>>>>>>>> should be used by default.
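> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> For example (the table name is hypothetical), querying the
> > >>>>>>>>>>>>>>>> latest version would then look like:
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>   SELECT * FROM orders FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP;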
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>> Feng
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> On Thu, May 25, 2023 at 7:47 PM yuxia <
> > >>>>>>>>>>> luoyu...@alumni.sjtu.edu.cn
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Thanks Feng for bringing this up. It'll be great to introduce
> > >>>>>>>>>>>>>>>>> time travel to Flink to have a better integration with
> > >>>>>>>>>>>>>>>>> external data sources.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> I also share the same concern about the syntax.
> > >>>>>>>>>>>>>>>>> I see that in the part `Whether to support other syntax
> > >>>>>>>>>>>>>>>>> implementations` of this FLIP, it seems the syntax in Calcite
> > >>>>>>>>>>>>>>>>> should be `FOR SYSTEM_TIME AS OF`, right?
> > >>>>>>>>>>>>>>>>> But in the syntax part of this FLIP, it seems to be `AS OF
> > >>>>>>>>>>>>>>>>> TIMESTAMP` instead of `FOR SYSTEM_TIME AS OF`. Is it just a
> > >>>>>>>>>>>>>>>>> mistake or by design?
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Best regards,
> > >>>>>>>>>>>>>>>>> Yuxia
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> ----- Original Message -----
> > >>>>>>>>>>>>>>>>> From: "Benchao Li" <libenc...@apache.org>
> > >>>>>>>>>>>>>>>>> To: "dev" <dev@flink.apache.org>
> > >>>>>>>>>>>>>>>>> Sent: Thursday, May 25, 2023 7:27:17 PM
> > >>>>>>>>>>>>>>>>> Subject: Re: [DISCUSS] FLIP-308: Support Time Travel In
> > >>>>>>>>>>>>>>>>> Batch Mode
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Thanks Feng, it's exciting to have this ability.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Regarding the syntax section, are you proposing `AS OF`
> > >>>>>>>>>>>>>>>>> instead of `FOR SYSTEM_TIME AS OF` to do this? I know `FOR
> > >>>>>>>>>>>>>>>>> SYSTEM_TIME AS OF` is in the SQL standard and has been
> > >>>>>>>>>>>>>>>>> supported by some database vendors such as SQL Server. About
> > >>>>>>>>>>>>>>>>> `AS OF`, is it in the standard or does any database vendor
> > >>>>>>>>>>>>>>>>> support it? If yes, I think it's worth adding this support to
> > >>>>>>>>>>>>>>>>> Calcite, and I would give a hand on the Calcite side.
> > >>>>>>>>>>>>>>>>> Otherwise, I think we'd better use `FOR SYSTEM_TIME AS OF`.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> On Thu, May 25, 2023 at 19:02 Timo Walther
> > >>>>>>>>>>>>>>>>> <twal...@apache.org> wrote:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Also: How do we want to query the most recent version of a
> > >>>>>>>>>>>>>>>>>> table?
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> `AS OF CURRENT_TIMESTAMP` would be ideal, but according to
> > >>>>>>>>>>>>>>>>>> the docs, both the type is TIMESTAMP_LTZ and, what is even
> > >>>>>>>>>>>>>>>>>> more concerning, it actually is evaluated row-based:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Returns the current SQL timestamp in the local time
> > >>>>>>>>> zone,
> > >>>>>>>>>>>> the
> > >>>>>>>>>>>>>>> return
> > >>>>>>>>>>>>>>>>>> type is TIMESTAMP_LTZ(3). It is evaluated for each
> > >>>>>>>> record
> > >>>>>>>>> in
> > >>>>>>>>>>>>>>> streaming
> > >>>>>>>>>>>>>>>>>> mode. But in batch mode, it is evaluated once as the
> > >>>>>>>> query
> > >>>>>>>>>>>> starts
> > >>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>>> uses the same result for every row.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> This could make it difficult to explain in a join
> > >>>>>>>> scenario
> > >>>>>>>>>>> of
> > >>>>>>>>>>>>>>> multiple
> > >>>>>>>>>>>>>>>>>> snapshotted tables.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>>>>> Timo
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> On 25.05.23 12:29, Timo Walther wrote:
> > >>>>>>>>>>>>>>>>>>> Hi Feng,
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> thanks for proposing this FLIP. It makes a lot of
> > >>>>>>>> sense
> > >>>>>>>>> to
> > >>>>>>>>>>>>>> finally
> > >>>>>>>>>>>>>>>>>>> support querying tables at a specific point in time
> or
> > >>>>>>>>>>>>> hopefully
> > >>>>>>>>>>>>>>> also
> > >>>>>>>>>>>>>>>>>>> ranges soon. Following time-versioned tables.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Here is some feedback from my side:
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> 1. Syntax
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Can you elaborate a bit on the Calcite restrictions?
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Does Calcite currently support `AS OF` syntax for
> this
> > >>>>>>>>> but
> > >>>>>>>>>>>> not
> > >>>>>>>>>>>>>> `FOR
> > >>>>>>>>>>>>>>>>>>> SYSTEM_TIME AS OF`?
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> It would be great to support `AS OF` also for
> > >>>>>>>>>>> time-versioned
> > >>>>>>>>>>>>>> joins
> > >>>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>>>> have a unified and short syntax.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Once a fix is merged in Calcite for this, we can make
> > >>>>>>>>> this
> > >>>>>>>>>>>>>>> available
> > >>>>>>>>>>>>>>>> in
> > >>>>>>>>>>>>>>>>>>> Flink earlier by copying the corresponding classes
> > >>>>>>>> until
> > >>>>>>>>>>> the
> > >>>>>>>>>>>>> next
> > >>>>>>>>>>>>>>>>>>> Calcite upgrade is performed.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> 2. Semantics
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> How do we interpret the timestamp? In Flink we have 2
> > >>>>>>>>>>>> timestamp
> > >>>>>>>>>>>>>>> types
> > >>>>>>>>>>>>>>>>>>> (TIMESTAMP and TIMESTAMP_LTZ). If users specify AS OF
> > >>>>>>>>>>>> TIMESTAMP
> > >>>>>>>>>>>>>>>>>>> '2023-04-27 00:00:00', in which timezone will the
> > >>>>>>>>>>> timestamp
> > >>>>>>>>>>>> be?
> > >>>>>>>>>>>>>> We
> > >>>>>>>>>>>>>>>> will
> > >>>>>>>>>>>>>>>>>>> convert it to TIMESTAMP_LTZ?
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> We definitely need to clarify this because the past has
> > >>>>>>>>>>> shown
> > >>>>>>>>>>>>> that
> > >>>>>>>>>>>>>>>>>>> daylight saving times make our lives hard.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>>>>>>> Timo
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> On 25.05.23 10:57, Feng Jin wrote:
> > >>>>>>>>>>>>>>>>>>>> Hi, everyone.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> I’d like to start a discussion about FLIP-308:
> > >>>>>>>> Support
> > >>>>>>>>>>> Time
> > >>>>>>>>>>>>>> Travel
> > >>>>>>>>>>>>>>>> In
> > >>>>>>>>>>>>>>>>>>>> Batch
> > >>>>>>>>>>>>>>>>>>>> Mode [1]
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Time travel is a SQL syntax used to query historical
> > >>>>>>>>>>>> versions
> > >>>>>>>>>>>>> of
> > >>>>>>>>>>>>>>>> data.
> > >>>>>>>>>>>>>>>>>> It
> > >>>>>>>>>>>>>>>>>>>> allows users to specify a point in time and retrieve
> > >>>>>>>>> the
> > >>>>>>>>>>>> data
> > >>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>>>>> schema of
> > >>>>>>>>>>>>>>>>>>>> a table as it appeared at that time. With time
> > >>>>>>>> travel,
> > >>>>>>>>>>> users
> > >>>>>>>>>>>>> can
> > >>>>>>>>>>>>>>>>> easily
> > >>>>>>>>>>>>>>>>>>>> analyze and compare historical versions of data.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> With the widespread use of data lake systems such as
> > >>>>>>>>>>> Paimon,
> > >>>>>>>>>>>>>>>> Iceberg,
> > >>>>>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>>>>> Hudi, time travel can provide more convenience for
> > >>>>>>>>> users'
> > >>>>>>>>>>>> data
> > >>>>>>>>>>>>>>>>> analysis.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Looking forward to your opinions, any suggestions
> are
> > >>>>>>>>>>>>> welcomed.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> 1.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>
> > >>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-308%3A+Support+Time+Travel+In+Batch+Mode
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Best.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Feng
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>> Benchao Li
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>> Benchao Li
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> --
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Best,
> > >>>>>>>>>>>> Benchao Li
> > >>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>
> > >>>
> > >>>
> > >>> --
> > >>>
> > >>> Best,
> > >>> Benchao Li
> > >>>
> > >>
> > >
> > >
> > > --
> > >
> > > Best,
> > > Benchao Li
> >
> >
>
> --
>
> Best,
> Benchao Li
>
