davidradl commented on code in PR #24618:
URL: https://github.com/apache/flink/pull/24618#discussion_r1613290123


##########
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/lineage/LineageGraph.java:
##########
@@ -20,13 +20,12 @@
 package org.apache.flink.streaming.api.lineage;
 
 import org.apache.flink.annotation.PublicEvolving;
-import org.apache.flink.streaming.api.graph.StreamGraph;
 
 import java.util.List;
 
 /**
- * Job lineage is built according to {@link StreamGraph}, users can get sources, sinks and
- * relationships from lineage and manage the relationship between jobs and tables.
+ * Job lineage graph that users can get sources, sinks and relationships from lineage and manage the

Review Comment:
   > Thanks David for your comments. Yes, the documentation will be added after adding the job lineage listener, which is more user facing. It is planned in this jira https://issues.apache.org/jira/browse/FLINK-33212. This PR only considers source/sink level lineage. Column level lineage is not included in this work, so internal transformations do not need lineage info for now. Would you please elaborate more about "I assume a sink could be a source - so could be in both current lists"?
   
   Hi Peter, usually we think of lineage assets as the nodes in the lineage (e.g. OpenLineage). So the asset could be a Kafka topic, and that topic would be used as a source for some flows and as a sink for other flows. I was wondering how this fits with lineage at the table level, where there could be one table defined as a sink and another table defined as a source on the same Kafka topic. I guess when exporting / exposing to OpenLineage there could be many Flink tables referring to the same topic that would end up as one OpenLineage node. The natural way for Flink to store the lineage is at the table level rather than at the asset level. So thinking about it, I think this is fine.
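
To make the table-level versus asset-level distinction concrete, here is a minimal sketch of how an exporter might collapse several Flink tables that point at the same Kafka topic into a single lineage node. Everything in it is hypothetical and for illustration only: `AssetLineageSketch`, `TableLineage`, `Role`, `collapseToAssets` and the `kafka://...` identifiers are assumptions, not part of the lineage API in this PR or of OpenLineage.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; none of these types exist in Flink's lineage API. */
final class AssetLineageSketch {

    /** Role a table plays in a job. */
    enum Role { SOURCE, SINK }

    /** Table-level lineage entry: a Flink table, the physical asset behind it, and its role. */
    static final class TableLineage {
        final String tableName;
        final String assetId; // made-up identifier scheme, e.g. "kafka://orders"
        final Role role;

        TableLineage(String tableName, String assetId, Role role) {
            this.tableName = tableName;
            this.assetId = assetId;
            this.role = role;
        }
    }

    /** Groups table-level entries by asset id; each asset collects every role it was seen in. */
    static Map<String, Set<Role>> collapseToAssets(List<TableLineage> tables) {
        Map<String, Set<Role>> assets = new LinkedHashMap<>();
        for (TableLineage table : tables) {
            assets.computeIfAbsent(table.assetId, id -> EnumSet.noneOf(Role.class))
                  .add(table.role);
        }
        return assets;
    }

    public static void main(String[] args) {
        // Two tables on the same topic (one source, one sink) plus an unrelated sink.
        List<TableLineage> tables = Arrays.asList(
                new TableLineage("orders_in", "kafka://orders", Role.SOURCE),
                new TableLineage("orders_out", "kafka://orders", Role.SINK),
                new TableLineage("audit_log", "kafka://audit", Role.SINK));

        collapseToAssets(tables)
                .forEach((asset, roles) -> System.out.println(asset + " -> " + roles));
    }
}
```

In this sketch the three table-level entries collapse to two asset nodes, and `kafka://orders` carries both the source and the sink role. That is why keeping separate source and sink lists at the table level does not prevent an asset-level view from being derived later.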