[jira] [Commented] (FLINK-31275) Flink supports reporting and storage of source/sink tables relationship

Maciej Obuchowski (Jira) Mon, 27 Nov 2023 05:32:07 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-31275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17790083#comment-17790083
 ]


Maciej Obuchowski commented on FLINK-31275:
-------------------------------------------

>Why a `LineageVertex` have multiple inputs or outputs? We hope that 
>'LineageVertex' describes a single source or sink, rather than multiple.

I have two good counterexamples, one for reads and one for writes, where a 
single source or sink describes more than one dataset:
 * KafkaSource can read from multiple topics, or even from a wildcard pattern.
 * Another case is a company that used the JDBC connector sink and had a very 
large number of destination tables (1000s), some of them with rather small 
amounts of data. The database would not work with a one-connection-per-table 
model, so I had a fork of the JDBC connector that could dynamically determine 
which table the connector should write the data to. I tried to contribute that, 
but there was no interest. [https://github.com/apache/flink/pull/15102/files]
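
To make the Kafka case concrete, here is a minimal sketch of a vertex that exposes a list of datasets rather than a single one. The `Dataset` and `Vertex` classes and the `kafkaSourceVertex` helper are hypothetical names made up for illustration; they are not the interfaces proposed in FLIP-314, and the topic list is passed in by hand rather than discovered from a real cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class MultiDatasetVertexSketch {

    /** Hypothetical dataset descriptor: one topic or one table. */
    static final class Dataset {
        final String namespace;
        final String name;

        Dataset(String namespace, String name) {
            this.namespace = namespace;
            this.name = name;
        }
    }

    /** Hypothetical vertex: one source or sink, listing every dataset it touches. */
    static final class Vertex {
        final String connector;
        final List<Dataset> datasets;

        Vertex(String connector, List<Dataset> datasets) {
            this.connector = connector;
            this.datasets = List.copyOf(datasets);
        }
    }

    /** Builds one vertex for a source subscribed by pattern; each matching topic is a dataset. */
    static Vertex kafkaSourceVertex(String cluster, Pattern topicPattern, List<String> knownTopics) {
        List<Dataset> matched = new ArrayList<>();
        for (String topic : knownTopics) {
            if (topicPattern.matcher(topic).matches()) {
                matched.add(new Dataset(cluster, topic));
            }
        }
        return new Vertex("kafka", matched);
    }

    public static void main(String[] args) {
        Vertex v = kafkaSourceVertex(
                "kafka://broker:9092",
                Pattern.compile("orders\\..*"),
                List.of("orders.eu", "orders.us", "payments"));
        // A single source vertex ends up describing two datasets.
        for (Dataset d : v.datasets) {
            System.out.println(d.namespace + "/" + d.name);
        }
    }
}
```

With a single-dataset vertex, the same source would have to either pick one topic arbitrarily or be split into artificial vertices that do not exist in the job graph.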

Flink is really flexible when it comes to the structure of a job, and that 
flexibility should be reflected in the API.

>We introduce `LineageEdge` in this FLIP to describe the relation between 
>sources and sinks instead of add `input` or `output` in `LineageVertex`.

I think those two things are separate, as different datasets in one source 
can have different output sinks.
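
A rough sketch of why the two are separate, under the assumption that an edge can carry dataset names in addition to vertex names. `DatasetEdge` and both view helpers are hypothetical and only illustrate the point; they do not mirror `LineageEdge` from the FLIP.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DatasetLevelEdgesSketch {

    /** Hypothetical edge that records which dataset of the source feeds which dataset of the sink. */
    static final class DatasetEdge {
        final String sourceVertex;
        final String sourceDataset;
        final String sinkVertex;
        final String sinkDataset;

        DatasetEdge(String sourceVertex, String sourceDataset, String sinkVertex, String sinkDataset) {
            this.sourceVertex = sourceVertex;
            this.sourceDataset = sourceDataset;
            this.sinkVertex = sinkVertex;
            this.sinkDataset = sinkDataset;
        }
    }

    /** What remains if edges only connect vertices. */
    static Set<String> vertexLevelView(List<DatasetEdge> edges) {
        Set<String> view = new LinkedHashSet<>();
        for (DatasetEdge e : edges) {
            view.add(e.sourceVertex + " -> " + e.sinkVertex);
        }
        return view;
    }

    /** What is visible when the datasets inside each vertex are kept. */
    static Set<String> datasetLevelView(List<DatasetEdge> edges) {
        Set<String> view = new LinkedHashSet<>();
        for (DatasetEdge e : edges) {
            view.add(e.sourceDataset + " -> " + e.sinkDataset);
        }
        return view;
    }

    public static void main(String[] args) {
        // One Kafka source reads two topics; each topic is routed to a different sink.
        List<DatasetEdge> edges = List.of(
                new DatasetEdge("kafka-source", "orders", "jdbc-sink-a", "warehouse.orders"),
                new DatasetEdge("kafka-source", "payments", "jdbc-sink-b", "warehouse.payments"));

        System.out.println(vertexLevelView(edges));
        System.out.println(datasetLevelView(edges));
    }
}
```

The vertex-level view only says that `kafka-source` feeds both sinks, so it cannot tell that `payments` never reaches `jdbc-sink-a`. That information lives in the datasets, which is why listing datasets on the vertex and connecting vertices with edges answer different questions.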

> Flink supports reporting and storage of source/sink tables relationship
> -----------------------------------------------------------------------
>
>                 Key: FLINK-31275
>                 URL: https://issues.apache.org/jira/browse/FLINK-31275
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Planner
>    Affects Versions: 1.18.0
>            Reporter: Fang Yong
>            Assignee: Fang Yong
>            Priority: Major
>
> FLIP-314 has been accepted 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-314%3A+Support+Customized+Job+Lineage+Listener



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
