Hello, OK. I may not have understood the answer to my previous question. When I listen to https://www.youtube.com/watch?v=IOZ2Um6e430, he starts to talk about this at 20:14. Is he talking about a single Kafka Connect worker or a cluster? He mentions that it is 'at-least-once'. So Flink's version is an improvement? Does Flink's Kafka connector guarantee 'exactly-once' where a Connect cluster is only 'at-least-once'? Please bear with me.
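
Just to check that I am reading the Flink docs right, here is roughly how I would wire the Kafka sink for exactly-once. The broker address, topic name and the toy source are placeholders I made up, so please treat this as a sketch of my understanding rather than our real job:

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.base.DeliveryGuarantee;
    import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
    import org.apache.flink.connector.kafka.sink.KafkaSink;
    import org.apache.flink.streaming.api.CheckpointingMode;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class ExactlyOnceKafkaSinkSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // The EXACTLY_ONCE guarantee only takes effect together with checkpointing;
            // the sink commits its Kafka transaction when a checkpoint completes.
            env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);

            KafkaSink<String> sink = KafkaSink.<String>builder()
                    .setBootstrapServers("broker:9092")                  // placeholder broker
                    .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                            .setTopic("notifications")                   // placeholder topic
                            .setValueSerializationSchema(new SimpleStringSchema())
                            .build())
                    .setDeliverGuarantee(DeliveryGuarantee.EXACTLY_ONCE) // uses Kafka transactions
                    .setTransactionalIdPrefix("flink-notifications")     // required for EXACTLY_ONCE
                    .build();

            env.fromElements("hello", "world")   // stand-in for the real CDC stream
               .sinkTo(sink);
            env.execute("exactly-once-kafka-sink-sketch");
        }
    }

Is this the configuration that makes the Flink side exactly-once, as opposed to Kafka Connect's at-least-once delivery? (I understand the downstream consumers would also need isolation.level=read_committed for this to hold end to end.)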
This will have other consequences too, as our MQ may need an MQ connector (probably from Flink or Confluent), and different connectors may have different guarantees. To check that I understood the serverId point from Leonard's earlier reply, I have also pasted a small sketch of the MySQL CDC source below the quoted thread.

Thanks.

> 3. Delivering to kafka from flink is not exactly once. Is that right ?
>
> No, both the Flink CDC Connector and the Flink Kafka Connector provide an
> exactly-once implementation.

On Fri, Feb 11, 2022 at 1:57 PM Martijn Visser <mart...@ververica.com> wrote:

> Hi,
>
> The readme on the Flink CDC connectors [1] says that Oracle Database
> versions 11, 12 and 19 are supported with Oracle Driver 19.3.0.0.
>
> Best regards,
>
> Martijn
>
> [1]
> https://github.com/ververica/flink-cdc-connectors/blob/master/README.md
>
> On Fri, 11 Feb 2022 at 08:37, mohan radhakrishnan
> <radhakrishnan.mo...@gmail.com> wrote:
>
>> Thanks. I looked at it. Our primary DBs are Oracle and MySQL. The Flink
>> CDC Connector uses Debezium, I think. So Ververica doesn't have a Flink
>> CDC Connector for Oracle?
>>
>> On Mon, Feb 7, 2022 at 3:03 PM Leonard Xu <xbjt...@gmail.com> wrote:
>>
>>> Hello, mohan
>>>
>>> 1. Does flink have any support to track any missed source Jdbc CDC
>>> records ?
>>>
>>> The Flink CDC Connector provides exactly-once semantics, which means it
>>> won't miss records. Tip: the Flink JDBC Connector only scans the
>>> database once and cannot continuously read a CDC stream.
>>>
>>> 2. What is the equivalent of Kafka consumer groups ?
>>>
>>> Each database has its own CDC mechanism: for MySQL/MariaDB it is the
>>> serverId used to identify a replica, for PostgreSQL it is the
>>> replication slot name.
>>>
>>> 3. Delivering to kafka from flink is not exactly once. Is that right ?
>>>
>>> No, both the Flink CDC Connector and the Flink Kafka Connector provide
>>> an exactly-once implementation.
>>>
>>> BTW, if your destination is Elasticsearch, the quick start demo [1] may
>>> help you.
>>>
>>> Best,
>>> Leonard
>>>
>>> [1]
>>> https://ververica.github.io/flink-cdc-connectors/master/content/quickstart/mysql-postgres-tutorial.html
>>>
>>> Thanks
>>>
>>> On Friday, February 4, 2022, mohan radhakrishnan
>>> <radhakrishnan.mo...@gmail.com> wrote:
>>>
>>>> Hello,
>>>> So the JDBC source connector is Kafka's, and the transformation is done
>>>> by Flink (Flink SQL)? But that connector can miss records, I thought.
>>>> I started looking at Flink for this and other use cases.
>>>> Can I see the alternative to Spring Cloud Stream (Kafka Streams)?
>>>> Since I am learning Flink, Kafka Streams' changelog topics, exactly-once
>>>> delivery and DLQs seemed good for our critical push notifications.
>>>>
>>>> We also needed an Elasticsearch sink.
>>>>
>>>> Thanks
>>>>
>>>> On Friday, February 4, 2022, Dawid Wysakowicz <dwysakow...@apache.org>
>>>> wrote:
>>>>
>>>>> Hi Mohan,
>>>>>
>>>>> I don't know much about Kafka Connect, so I will not talk about its
>>>>> features and differences from Flink. Flink on its own does not have the
>>>>> capability to read a CDC stream directly from a DB. However, there is
>>>>> the flink-cdc-connectors [1] project, which embeds the standalone
>>>>> Debezium engine inside Flink's source and can process a DB changelog
>>>>> with all the processing guarantees that Flink provides.
>>>>>
>>>>> As for the idea of processing further with Kafka Streams: why not
>>>>> process the data with Flink? What do you miss in Flink?
>>>>>
>>>>> Best,
>>>>>
>>>>> Dawid
>>>>>
>>>>> [1] https://github.com/ververica/flink-cdc-connectors
>>>>>
>>>>> On 04/02/2022 13:55, mohan radhakrishnan wrote:
>>>>>
>>>>>> Hi,
>>>>>> When I was looking for CDC, I realized that Flink uses a Kafka
>>>>>> connector to stream into Flink. The idea is to send the data forward
>>>>>> to Kafka and consume it using Kafka Streams.
>>>>>>
>>>>>> Are there source DLQs or additional mechanisms to detect failures to
>>>>>> read from the DB?
>>>>>>
>>>>>> We don't want to use Debezium, and our CDC is based on queries.
>>>>>>
>>>>>> What mechanisms does Flink have that a Kafka Connect worker does not?
>>>>>> Kafka Connect workers can go down and source data can be lost.
>>>>>>
>>>>>> Does the idea of sending the data forward to Kafka and consuming it
>>>>>> with Kafka Streams make sense? Can the checkpointing feature of Flink
>>>>>> help? I plan to use Kafka Streams for exactly-once delivery and
>>>>>> changelog topics.
>>>>>>
>>>>>> Could you point out relevant material to read?
>>>>>>
>>>>>> Thanks,
>>>>>> Mohan
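
P.S. Here is the MySQL CDC source sketch I mentioned above. My reading of Leonard's answer is that the serverId plays roughly the role a consumer group id plays in Kafka: it identifies this pipeline as a replica to MySQL. This is only my attempt to follow the flink-cdc-connectors quick start; host, database, table and credentials are placeholders:

    import com.ververica.cdc.connectors.mysql.source.MySqlSource;
    import com.ververica.cdc.debezium.JsonDebeziumDeserializationSchema;
    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class MySqlCdcSketch {
        public static void main(String[] args) throws Exception {
            // Placeholder connection details; the serverId range must not collide
            // with other replicas/readers of the same MySQL instance.
            MySqlSource<String> source = MySqlSource.<String>builder()
                    .hostname("mysql-host")            // placeholder host
                    .port(3306)
                    .databaseList("appdb")             // placeholder database
                    .tableList("appdb.orders")         // placeholder table
                    .username("flink_cdc")             // placeholder credentials
                    .password("secret")
                    .serverId("5400-5404")             // the "consumer group"-like identity
                    .deserializer(new JsonDebeziumDeserializationSchema())
                    .build();

            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // Checkpointing is what gives the CDC source its exactly-once behaviour.
            env.enableCheckpointing(60_000);

            env.fromSource(source, WatermarkStrategy.noWatermarks(), "MySQL CDC Source")
               .print();
            env.execute("mysql-cdc-sketch");
        }
    }

Please correct me if the serverId analogy is off.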