Hi Jingsong,

As the FLIP describes, "KTable connector produces a changelog stream, where each data record represents an update or delete event." Therefore, a ktable source is an unbounded stream source. Selecting from a ktable source is similar to selecting from a kafka source with the debezium-json format, in that the query never ends and the results are continuously updated.
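To make the analogy concrete, here is a minimal sketch. The kafka DDL uses today's options; the ktable DDL is only illustrative, since the exact option set is what this FLIP is discussing:

-- Today: a Kafka source with the debezium-json format, an unbounded
-- changelog source that never ends.
CREATE TABLE users_cdc (
  user_id BIGINT,
  name STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'users',
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'debezium-json'
);

-- Proposed: a ktable source over a compacted topic. Also unbounded;
-- update and delete events are derived from the record keys.
-- Connector and option names are placeholders, not final.
CREATE TABLE users_ktable (
  user_id BIGINT,
  name STRING,
  PRIMARY KEY (user_id) NOT ENFORCED
) WITH (
  'connector' = 'ktable',
  'topic' = 'users',
  'properties.bootstrap.servers' = 'localhost:9092',
  'key.format' = 'json',
  'value.format' = 'json'
);

In both cases a query such as SELECT * FROM users_ktable runs continuously and keeps updating its result.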
It's possible to have a bounded ktable source in the future, for example by adding an option 'bounded=true' or 'end-offset=xxx'. In this way, the ktable would produce a bounded changelog stream, so I think this can be added later as a compatible feature.

I don't think we should associate the connector with KSQL-related concepts. Actually, we didn't introduce any concepts from KSQL (e.g. the Stream vs. Table notion). The "ktable" is just a connector name; we could also call it "compacted-kafka" or something else. We call it "ktable" only so that KSQL users can migrate to Flink SQL easily.

Regarding "value.fields-include": this is an option introduced in FLIP-107 for the Kafka connector, and I think we should keep the same behavior as the Kafka connector. I'm not sure what the default behavior of KSQL is, but judging from its example docs (see the "users_original" table) [1], I guess it also stores the keys in the value.
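For illustration, a sketch of what keeping the FLIP-107 behavior would look like on this connector. The 'ALL' / 'EXCEPT_KEY' values follow the current Kafka connector; the connector name is a placeholder until the FLIP is finalized:

CREATE TABLE users (
  user_id BIGINT,
  name STRING,
  PRIMARY KEY (user_id) NOT ENFORCED
) WITH (
  'connector' = 'ktable',
  'topic' = 'users',
  'properties.bootstrap.servers' = 'localhost:9092',
  'key.format' = 'json',
  'value.format' = 'json',
  -- 'ALL': the key fields are also serialized into the value, matching
  -- the KSQL "users_original" example; 'EXCEPT_KEY' would write only
  -- the non-key field 'name' into the value.
  'value.fields-include' = 'ALL'
);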
Best,
Jark

[1]: https://docs.confluent.io/current/ksqldb/tutorials/basics-local.html#create-a-stream-and-table

On Mon, 19 Oct 2020 at 18:17, Danny Chan <yuzhao....@gmail.com> wrote:

> The concept seems to conflict with the Flink abstraction "dynamic table":
> in Flink we see both "stream" and "table" as a dynamic table.
>
> I think we should first make clear how to express stream-specific and
> table-specific features on one "dynamic table". This is more natural in
> KSQL, because KSQL treats streams and tables as different abstractions for
> representing collections; in KSQL, only a table is mutable and can have a
> primary key.
>
> Does this connector belong to the "table" scope or the "stream" scope?
>
> Some of the concepts (such as a primary key on a stream) should be
> applicable to all connectors, not just Kafka. Shouldn't this be an
> extension of the existing Kafka connector instead of a totally new
> connector? What about the other connectors?
>
> Because this touches the core abstraction of Flink, we had better have a
> top-down overall design; following KSQL directly is not the answer.
>
> P.S. For the source:
>
> Shouldn't this be an extension of the existing Kafka connector instead of
> a totally new connector?
>
> How could we achieve that (e.g. set up the parallelism correctly)?
>
> Best,
> Danny Chan
>
> On 2020/10/19 5:17 PM +0800, Jingsong Li <jingsongl...@gmail.com> wrote:
> > Thanks Shengkai for your proposal.
> >
> > +1 for this feature.
> >
> > > Future Work: Support bounded KTable source
> >
> > I don't think this should be future work; it is one of the important
> > concepts of this FLIP, and we need to understand it now.
> >
> > Intuitively, a ktable in my opinion is a bounded table rather than a
> > stream, so a SELECT should produce a bounded table by default.
> >
> > I think we should list the related Kafka knowledge, because the word
> > `ktable` is easily associated with KSQL concepts. (If possible, it's
> > better to unify with them.)
> >
> > What do you think?
> >
> > > value.fields-include
> >
> > What is the default behavior of KSQL?
> >
> > Best,
> > Jingsong
> >
> > On Mon, Oct 19, 2020 at 4:33 PM Shengkai Fang <fskm...@gmail.com> wrote:
> >
> > > Hi, devs.
> > >
> > > Jark and I want to start a new FLIP to introduce the KTable connector.
> > > "KTable" is short for "Kafka Table"; it has the same semantics as the
> > > KTable notion in Kafka Streams.
> > >
> > > FLIP-149:
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-149%3A+Introduce+the+KTable+Connector
> > >
> > > Many users have already expressed their need for upsert Kafka through
> > > the mailing lists and issues. The KTable connector has several
> > > benefits for users:
> > >
> > > 1. Users are able to interpret a compacted Kafka topic as an upsert
> > > stream in Apache Flink, and are also able to write a changelog stream
> > > to Kafka (into a compacted topic).
> > > 2. As part of a real-time pipeline, users can store a join or
> > > aggregate result (which may contain updates) in a Kafka topic for
> > > further calculation.
> > > 3. The semantics of the KTable connector are the same as those of a
> > > KTable in Kafka Streams, so it is very handy for Kafka Streams and
> > > KSQL users. We have seen several questions on the mailing list asking
> > > how to model a KTable and how to join a KTable in Flink SQL.
> > >
> > > We hope it can expand the usage of Flink with Kafka.
> > >
> > > I'm looking forward to your feedback.
> > >
> > > Best,
> > > Shengkai
> >
> > --
> > Best, Jingsong Lee
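To make the second benefit in Shengkai's list concrete, a minimal sketch of storing an aggregation result through such a connector. The 'pageviews' source table and the ktable option names here are hypothetical:

-- A continuously updating aggregate written to a compacted topic;
-- each result update is an upsert keyed by user_id.
CREATE TABLE pageviews_per_user (
  user_id BIGINT,
  cnt BIGINT,
  PRIMARY KEY (user_id) NOT ENFORCED
) WITH (
  'connector' = 'ktable',  -- illustrative; the name is still under discussion
  'topic' = 'pageviews_per_user',
  'properties.bootstrap.servers' = 'localhost:9092',
  'key.format' = 'json',
  'value.format' = 'json'
);

INSERT INTO pageviews_per_user
SELECT user_id, COUNT(*) AS cnt
FROM pageviews  -- hypothetical append-only source table
GROUP BY user_id;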