Hello, mohan

> 1. Does Flink have any support to track any missed source JDBC CDC records?

The Flink CDC connectors provide exactly-once semantics, which means they won't miss records. Tip: the Flink JDBC connector only scans the database once; it cannot continuously read a CDC stream.

> 2. What is the equivalent of Kafka consumer groups?

Different databases have different CDC mechanisms: for MySQL/MariaDB it is the serverId, which identifies the reader to the database as a replica; for PostgreSQL it is the replication slot name.

> 3. Delivering to Kafka from Flink is not exactly once. Is that right?

No, both the Flink CDC connectors and the Flink Kafka connector provide exactly-once implementations.

By the way, if your destination is Elasticsearch, the quick-start demo [1] may help you.

Best,
Leonard

[1] https://ververica.github.io/flink-cdc-connectors/master/content/quickstart/mysql-postgres-tutorial.html

> Thanks
>
> On Friday, February 4, 2022, mohan radhakrishnan <radhakrishnan.mo...@gmail.com> wrote:
>
> Hello,
> So the JDBC source connector is Kafka's and the transformation is done by Flink (Flink SQL)? But I thought that connector can miss records. I started looking at Flink for this and other use cases.
> Can Flink be the alternative to Spring Cloud Stream (Kafka Streams)? Since I am still learning Flink, Kafka Streams' changelog topics, exactly-once delivery, and DLQs seemed a good fit for our critical push notifications.
>
> We also need an Elasticsearch sink.
>
> Thanks
>
> On Friday, February 4, 2022, Dawid Wysakowicz <dwysakow...@apache.org> wrote:
>
> Hi Mohan,
>
> I don't know much about Kafka Connect, so I will not talk about its features and differences from Flink. Flink on its own does not have the capability to read a CDC stream directly from a DB. However, there is the flink-cdc-connectors [1] project, which embeds the standalone Debezium engine inside a Flink source and can process a DB changelog with all the processing guarantees that Flink provides.
>
> As for the idea of processing further with Kafka Streams: why not process the data with Flink? What do you miss in Flink?
>
> Best,
>
> Dawid
>
> [1] https://github.com/ververica/flink-cdc-connectors
>
> On 04/02/2022 13:55, mohan radhakrishnan wrote:
>
> Hi,
> When I was looking into CDC, I realized Flink uses a Kafka connector to stream into Flink. The idea is to send the data forward to Kafka and consume it using Kafka Streams.
>
> Are there source DLQs or additional mechanisms to detect failures when reading from the DB?
>
> We don't want to use Debezium, and our CDC is query-based.
>
> What mechanisms does Flink have that a Kafka Connect worker does not? Kafka Connect workers can go down and source data can be lost.
>
> Does the idea of sending the data forward to Kafka and consuming it with Kafka Streams make sense? Can Flink's checkpointing feature help? I plan to use Kafka Streams for exactly-once delivery and changelog topics.
>
> Could you point me to relevant material to read?
>
> Thanks,
> Mohan
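A minimal sketch of the setup Leonard describes for questions 1 and 2, assuming the flink-cdc-connectors 2.x MySqlSource API: the server-id range is the MySQL-side equivalent of the "consumer group" question (it identifies the reader to MySQL as a replica), and enabling checkpointing is what lets the source recover with exactly-once guarantees. Hostname, credentials, database, table, and the server-id range are placeholders.

    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    import com.ververica.cdc.connectors.mysql.source.MySqlSource;
    import com.ververica.cdc.debezium.JsonDebeziumDeserializationSchema;

    public class MySqlCdcJob {
        public static void main(String[] args) throws Exception {
            // Placeholder connection details; adjust for your environment.
            MySqlSource<String> source = MySqlSource.<String>builder()
                    .hostname("localhost")
                    .port(3306)
                    .databaseList("mydb")
                    .tableList("mydb.orders")
                    .username("flinkuser")
                    .password("flinkpw")
                    // The server-id range identifies this reader to MySQL as a replica;
                    // each parallel source instance takes its own id from the range.
                    .serverId("5400-5404")
                    .deserializer(new JsonDebeziumDeserializationSchema())
                    .build();

            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // Checkpointing is required for the source's exactly-once recovery.
            env.enableCheckpointing(3000);

            env.fromSource(source, WatermarkStrategy.noWatermarks(), "MySQL CDC Source")
               .print();

            env.execute("mysql-cdc-to-stdout");
        }
    }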
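For question 3, a sketch of the Flink Kafka sink configured for exactly-once delivery, assuming the KafkaSink API introduced in Flink 1.14. It is meant to slot into the job above in place of print(); bootstrap servers, topic name, and the transactional-id prefix are placeholders.

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.base.DeliveryGuarantee;
    import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
    import org.apache.flink.connector.kafka.sink.KafkaSink;

    // Exactly-once delivery uses Kafka transactions committed on checkpoint,
    // so checkpointing must be enabled on the job as in the source sketch above.
    KafkaSink<String> kafkaSink = KafkaSink.<String>builder()
            .setBootstrapServers("localhost:9092")
            .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                    .setTopic("orders-changelog")
                    .setValueSerializationSchema(new SimpleStringSchema())
                    .build())
            .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
            .setTransactionalIdPrefix("mysql-cdc-to-kafka")
            .build();

    // Attach it to the changelog stream instead of print():
    // changelogStream.sinkTo(kafkaSink);

Downstream consumers (including Kafka Streams) should read with isolation.level=read_committed so they only see records from committed transactions.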
Flink CDC Connector provides Exactly once semantics which means they won’t miss records. Tips: The Flink JDBC Connector only Scan the database once which can not continuously read CDC stream. > 2. What is the equivalent of Kafka consumer groups ? Different database has different CDC mechanism, it’s serverId which used to mark a slave for MySQL/MariaDB, it’s slot name for PostgresSQL. > 3. Delivering to kafka from flink is not exactly once. Is that right ? No, both Flink CDC Connector and Flink Kafka Connector provide exactly once implementation. BTW, if your destination is Elasticsearch, the quick start demo[1] may help you. Best, Leonard [1] https://ververica.github.io/flink-cdc-connectors/master/content/quickstart/mysql-postgres-tutorial.html > > Thanks > > On Friday, February 4, 2022, mohan radhakrishnan > <radhakrishnan.mo...@gmail.com <mailto:radhakrishnan.mo...@gmail.com>> wrote: > Hello, > So the jdbc source connector is kafka and transformation is > done by flink (flink sql) ? But that connector can miss records. I thought. > Started looking at flink for this and other use cases. > Can I see the alternative to spring cloudstreams( kafka streams )? Since I am > learning flink, kafka streams' changelog topics and exactly-once delivery and > dlqs seemed good for our cŕitical push notifications. > > We also needed a elastic sink. > > Thanks > > On Friday, February 4, 2022, Dawid Wysakowicz <dwysakow...@apache.org > <mailto:dwysakow...@apache.org>> wrote: > Hi Mohan, > > I don't know much about Kafka Connect, so I will not talk about its features > and differences to Flink. Flink on its own does not have a capability to read > a CDC stream directly from a DB. However there is the flink-cdc-connectors[1] > projects which embeds the standalone Debezium engine inside of Flink's source > and can process DB changelog with all processing guarantees that Flink > provides. > > As for the idea of processing further with Kafka Streams. Why not process > data with Flink? What do you miss in Flink? > > Best, > > Dawid > > [1] https://github.com/ververica/flink-cdc-connectors > <https://github.com/ververica/flink-cdc-connectors> > > On 04/02/2022 13:55, mohan radhakrishnan wrote: > Hi, > When I was looking for CDC I realized Flink uses Kafka Connector to > stream to Flink. The idea is to send it forward to Kafka and consume it using > Kafka Streams. > > Are there source DLQs or additional mechanisms to detect failures to read > from the DB ? > > We don't want to use Debezium and our CDC is based on queries. > > What mechanisms does Flink have that a Kafka Connect worker does not ? Kafka > Connect workers can go down and source data can be lost. > > Does the idea to send it forward to Kafka and consume it using Kafka Streams > make sense ? The checkpointing feature of Flink can help ? I plan to use > Kafka Streams for 'Exactly-once Delivery' and changelog topics. > > Could you point out relevant material to read ? > > Thanks, > Mohan