Thanks. I looked at it. Our primary DBs are Oracle and MySQL. The Flink CDC Connector uses Debezium, I think. So Ververica doesn't have a Flink CDC connector for Oracle?
On Mon, Feb 7, 2022 at 3:03 PM Leonard Xu <xbjt...@gmail.com> wrote:

> Hello, mohan
>
> 1. Does flink have any support to track any missed source Jdbc CDC records?
>
> Flink CDC Connectors provide exactly-once semantics, which means they won't miss records. Note: the Flink JDBC connector only scans the database once; it cannot continuously read a CDC stream.
>
> 2. What is the equivalent of Kafka consumer groups?
>
> Different databases have different CDC mechanisms: for MySQL/MariaDB it is the serverId, which marks the reader as a replica; for PostgreSQL it is the replication slot name.
>
> 3. Delivering to Kafka from Flink is not exactly once. Is that right?
>
> No, both the Flink CDC Connector and the Flink Kafka Connector provide exactly-once implementations.
>
> BTW, if your destination is Elasticsearch, the quick start demo[1] may help you.
>
> Best,
> Leonard
>
> [1] https://ververica.github.io/flink-cdc-connectors/master/content/quickstart/mysql-postgres-tutorial.html
>
> Thanks
>
> On Friday, February 4, 2022, mohan radhakrishnan <radhakrishnan.mo...@gmail.com> wrote:
>
>> Hello,
>> So the JDBC source connector is Kafka and transformation is done by Flink (Flink SQL)? But that connector can miss records, I thought. Started looking at Flink for this and other use cases.
>> Can I see the alternative to Spring Cloud Stream (Kafka Streams)? Since I am learning Flink, Kafka Streams' changelog topics, exactly-once delivery, and DLQs seemed good for our critical push notifications.
>>
>> We also needed an Elasticsearch sink.
>>
>> Thanks
>>
>> On Friday, February 4, 2022, Dawid Wysakowicz <dwysakow...@apache.org> wrote:
>>
>>> Hi Mohan,
>>>
>>> I don't know much about Kafka Connect, so I will not talk about its features and differences to Flink. Flink on its own does not have a capability to read a CDC stream directly from a DB.
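[Editor's note: Leonard's point about the MySQL serverId can be seen in the connector options. Below is a minimal sketch of a `mysql-cdc` source table in Flink SQL, in the style of the quickstart tutorial he links; the table schema, host, credentials, and database/table names are placeholders, not anything from this thread.]

```sql
-- Hypothetical mysql-cdc source table; all connection details are placeholders.
CREATE TABLE orders_cdc (
  order_id INT,
  order_status STRING,
  PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = 'localhost',
  'port' = '3306',
  'username' = 'flinkuser',
  'password' = 'flinkpw',
  'database-name' = 'mydb',
  'table-name' = 'orders',
  -- The server-id range is the MySQL analogue of a consumer identity:
  -- each parallel source task registers with MySQL as a replica under
  -- one id from this range.
  'server-id' = '5400-5404'
);
```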
>>> However there is the flink-cdc-connectors[1] project, which embeds the standalone Debezium engine inside Flink's source and can process a DB changelog with all the processing guarantees that Flink provides.
>>>
>>> As for the idea of processing further with Kafka Streams: why not process the data with Flink? What do you miss in Flink?
>>>
>>> Best,
>>>
>>> Dawid
>>>
>>> [1] https://github.com/ververica/flink-cdc-connectors
>>>
>>> On 04/02/2022 13:55, mohan radhakrishnan wrote:
>>>
>>>> Hi,
>>>> When I was looking into CDC I realized Flink uses a Kafka connector to stream data into Flink. The idea is to send it forward to Kafka and consume it using Kafka Streams.
>>>>
>>>> Are there source DLQs or additional mechanisms to detect failures to read from the DB?
>>>>
>>>> We don't want to use Debezium, and our CDC is based on queries.
>>>>
>>>> What mechanisms does Flink have that a Kafka Connect worker does not? Kafka Connect workers can go down and source data can be lost.
>>>>
>>>> Does the idea of sending it forward to Kafka and consuming it using Kafka Streams make sense? Can the checkpointing feature of Flink help? I plan to use Kafka Streams for exactly-once delivery and changelog topics.
>>>>
>>>> Could you point out relevant material to read?
>>>>
>>>> Thanks,
>>>> Mohan
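[Editor's note: the "send it forward to Kafka" idea discussed in this thread can be expressed entirely in Flink SQL. A hedged sketch, assuming a CDC source table named `orders_cdc` (e.g. one defined with the `mysql-cdc` connector) already exists; the topic name, broker address, and schema are placeholders. `upsert-kafka` is used because a CDC source emits updates and deletes, which a plain append-only sink cannot represent.]

```sql
-- Hypothetical Kafka sink for the changelog; topic and broker are placeholders.
CREATE TABLE orders_kafka (
  order_id INT,
  order_status STRING,
  PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
  'connector' = 'upsert-kafka',
  'topic' = 'orders-changelog',
  'properties.bootstrap.servers' = 'broker:9092',
  'key.format' = 'json',
  'value.format' = 'json'
);

-- With checkpointing enabled, this statement runs as a continuous
-- pipeline; delivery guarantees depend on the Flink version and the
-- configured checkpointing mode.
INSERT INTO orders_kafka
SELECT order_id, order_status FROM orders_cdc;
```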