[ https://issues.apache.org/jira/browse/FLINK-11054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16707525#comment-16707525 ]
Zhenqiu Huang commented on FLINK-11054:
---------------------------------------

[~twalthr] [~fhueske] I agree with Fabian. We ran into this issue while working on AthenaX backfill, where we want to backfill data from Hive with the same query. The {{ts}} field in our Parquet schema is a {{long}}, so I converted it to a timestamp in our internal {{ParquetInputFormat}} implementation. I think Fabian's proposal is the more reasonable and general solution. From an implementation perspective, how about adding a function called {{assignTimestamps}} to {{DataSet}}?

> Ingest Long value as TIMESTAMP attribute
> ----------------------------------------
>
>                 Key: FLINK-11054
>                 URL: https://issues.apache.org/jira/browse/FLINK-11054
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table API & SQL
>            Reporter: Fabian Hueske
>            Assignee: Zhenqiu Huang
>            Priority: Major
>
> When ingesting streaming tables, a {{Long}} value that is marked as the
> event-time timestamp is automatically converted into a {{TIMESTAMP}}
> attribute.
>
> However, batch table scans do not have similar functionality, i.e., a way to
> convert a {{Long}} into a {{TIMESTAMP}} during ingestion / table scan. This is
> relevant because features like GROUP BY windows require a {{TIMESTAMP}}
> parameter. Hence, batch queries would need to use a UDF (or, later, a built-in
> function) to manually convert a {{Long}} attribute to {{TIMESTAMP}}.
>
> Flink separates the concepts of format schema and table schema.
> I propose to automatically convert values that are defined as {{long}} in the
> format schema and as {{TIMESTAMP}} in the table schema (both for streaming
> and batch scans).
>
> Since the conversion is only done if explicitly requested (right now this
> should yield an error message), we should not break existing behavior.
>
> What do you think, [~twalthr]?
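For comparison, the streaming-side conversion the issue refers to can already be requested through the descriptor API. A minimal sketch (the field names and watermark bound are illustrative, not from the issue): the format delivers {{ts}} as a {{long}}, and declaring it as the rowtime attribute exposes it to SQL as a {{TIMESTAMP}}.

{code:java}
import org.apache.flink.table.api.Types;
import org.apache.flink.table.descriptors.Rowtime;
import org.apache.flink.table.descriptors.Schema;

// Sketch: the format schema carries "ts" as a long (epoch millis);
// declaring it as the rowtime attribute surfaces it as TIMESTAMP in SQL.
Schema schema = new Schema()
        .field("v", Types.STRING())
        .field("rowtime", Types.SQL_TIMESTAMP())
        .rowtime(new Rowtime()
                .timestampsFromField("ts")            // long field in the format
                .watermarksPeriodicBounded(60000L));  // illustrative 60s bound
{code}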
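And this is roughly the manual batch workaround the proposal would make unnecessary: a scalar UDF that converts an epoch-millis {{Long}} into a {{TIMESTAMP}} (the class, function, and field names below are illustrative):

{code:java}
import java.sql.Timestamp;
import org.apache.flink.table.functions.ScalarFunction;

// Illustrative UDF: interpret a Long as epoch milliseconds and
// convert it to a SQL TIMESTAMP so it can feed GROUP BY windows.
public class LongToTimestamp extends ScalarFunction {
    public Timestamp eval(Long epochMillis) {
        return epochMillis == null ? null : new Timestamp(epochMillis);
    }
}

// Usage in a batch query (names illustrative):
//   tableEnv.registerFunction("longToTs", new LongToTimestamp());
//   SELECT COUNT(*) FROM events
//   GROUP BY TUMBLE(longToTs(ts), INTERVAL '1' HOUR)
{code}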