+1 for FLIP-107. Reading the different parts of source records should be a key feature for Flink SQL, e.g. the metadata in CDC data, or the key and timestamp in Kafka records. The scope of FLIP-107 is too big to finish in one release IMO; maybe we can start part of the work in 1.12.
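
To make this concrete, here is a rough sketch of what such a DDL could
look like, assuming a METADATA-style column syntax lands roughly as the
FLIP proposes (the table name, topic, and connector options below are
purely illustrative):

  CREATE TABLE kafka_orders (
    order_id BIGINT,
    amount DECIMAL(10, 2),
    -- read-only metadata exposed by the Kafka connector,
    -- declared VIRTUAL so it is excluded when writing
    record_offset BIGINT METADATA FROM 'offset' VIRTUAL,
    -- the Kafka record timestamp, usable as a rowtime attribute
    ts TIMESTAMP(3) METADATA FROM 'timestamp',
    WATERMARK FOR ts AS ts - INTERVAL '5' SECOND
  ) WITH (
    'connector' = 'kafka',
    'topic' = 'orders',
    'properties.bootstrap.servers' = 'localhost:9092',
    'format' = 'json'
  );

Note that this would also address the rowtime problem Dongwon describes
below: the watermark is defined on a column derived from the Kafka
record timestamp rather than from the message body. A companion sketch
of the write path appears at the end of this message, after the quoted
thread.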
Best,
Leonard

> On Aug 11, 2020, at 19:51, Kurt Young <ykt...@gmail.com> wrote:
>
> The FLIP-107 document is relatively short, but the scope and
> implications it carries are actually very big.
> From what I can tell now, I think there is a good chance that we can
> deliver part of this FLIP in 1.12, e.g. accessing the metadata fields
> just like you mentioned.
>
> Best,
> Kurt
>
>
> On Tue, Aug 11, 2020 at 7:18 PM Dongwon Kim <eastcirc...@gmail.com> wrote:
>
>> Big +1 for this FLIP.
>>
>> Recently I've been working with some Kafka topics that carry timestamps
>> as metadata, not in the message body. I want to declare a table over
>> these topics with DDL, but "rowtime_column_name" in
>> <watermark_definition> seems to accept only existing columns:
>>
>>> <watermark_definition>:
>>>   WATERMARK FOR rowtime_column_name AS watermark_strategy_expression
>>
>> I raised this issue on the user@ list, but committers advised
>> alternative approaches that require detailed knowledge of Flink, such
>> as a custom decoding format or a conversion between the DataStream API
>> and the TableEnvironment. This runs against the main advantage of
>> Flink SQL: simplicity and ease of use. IMHO this FLIP must be
>> implemented so that users can freely derive tables from any Kafka
>> topic without having to involve the DataStream API.
>>
>> Best,
>>
>> Dongwon
>>
>> On 2020/03/01 14:30:31, Dawid Wysakowicz <d...@apache.org> wrote:
>>> Hi,
>>>
>>> I would like to propose an improvement that would enable reading table
>>> columns from different parts of source records. Besides the main
>>> payload, the majority (if not all) of sources expose additional
>>> information. It can be simple read-only metadata such as the offset or
>>> ingestion time, or readable and writable parts of the record that
>>> carry data but additionally serve other purposes (partitioning,
>>> compaction, etc.), e.g. the key or timestamp in Kafka.
>>>
>>> We should make it possible to read and write data in all of those
>>> locations. In this proposal I discuss reading this data; for
>>> completeness, it also covers partitioning when writing data out.
>>>
>>> I am looking forward to your comments.
>>>
>>> You can access the FLIP here:
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-107%3A+Reading+table+columns+from+different+parts+of+source+records?src=contextnavpagetreemode
>>>
>>> Best,
>>>
>>> Dawid
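
For completeness, here is the companion sketch of the write path Dawid
mentions, again assuming the metadata-column syntax and the Kafka
connector's key options land roughly as proposed (the table name,
topic, and options are illustrative):

  CREATE TABLE kafka_sink (
    user_id BIGINT,
    payload STRING,
    -- persisted (non-VIRTUAL) metadata: written into the
    -- Kafka record timestamp on insert
    ts TIMESTAMP(3) METADATA FROM 'timestamp'
  ) WITH (
    'connector' = 'kafka',
    'topic' = 'enriched-users',
    'properties.bootstrap.servers' = 'localhost:9092',
    -- route user_id into the Kafka record key instead of the value
    'key.fields' = 'user_id',
    'key.format' = 'json',
    'value.format' = 'json'
  );

A plain INSERT INTO kafka_sink would then populate the record key and
timestamp alongside the value, without any DataStream-level code.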