Re: Kafka Streams/Connect for Persistence?

Gwen Shapira Thu, 21 Jul 2016 18:08:06 -0700

In addition, our soon-to-be-released JDBC sink connector uses the
Connect framework to do things that are kind of annoying to do
yourself:
* Convert data types
* Create tables if needed, and add columns to tables if needed, based on
the data in Kafka
* Support both insert and upsert
* Do configurable batch inserts
* Deliver exactly-once from Kafka to the DB (using upserts)
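
To illustrate the last point, here is a minimal sketch of why upserts make
redelivery harmless. It is not the connector's code: an in-memory map
stands in for a table with a primary key, and the class name, method names,
and sample records are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class UpsertReplayDemo {

    // Upsert: write each record by primary key, overwriting any existing row.
    // The map is an assumed stand-in for a database table.
    static Map<String, Integer> applyUpserts(List<Map.Entry<String, Integer>> records) {
        Map<String, Integer> table = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> record : records) {
            table.put(record.getKey(), record.getValue());
        }
        return table;
    }

    // Plain insert: every delivered record becomes a new row, duplicates included.
    static List<Map.Entry<String, Integer>> applyInserts(List<Map.Entry<String, Integer>> records) {
        return new ArrayList<>(records);
    }

    // The stream as it would look if every record were delivered exactly once.
    static List<Map.Entry<String, Integer>> exactlyOnce() {
        return List.of(Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3));
    }

    // The same stream, but the first two records are delivered twice,
    // as could happen after a crash before the offsets were committed.
    static List<Map.Entry<String, Integer>> withRedelivery() {
        return List.of(Map.entry("a", 1), Map.entry("b", 2),
                       Map.entry("a", 1), Map.entry("b", 2),
                       Map.entry("a", 3));
    }

    public static void main(String[] args) {
        System.out.println("upsert, exactly once:    " + applyUpserts(exactlyOnce()));
        System.out.println("upsert, with redelivery: " + applyUpserts(withRedelivery()));
        System.out.println("insert rows, exactly once:    " + applyInserts(exactlyOnce()).size());
        System.out.println("insert rows, with redelivery: " + applyInserts(withRedelivery()).size());
    }
}
```

With upserts, the table ends up the same whether or not records were
redelivered, because a replay from the last committed offset rewrites the
same keys in the same order. With plain inserts, the redelivered records
show up as extra rows.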


We'll notify you when we open the repository. Just a bit of cleanup left :)


On Thu, Jul 21, 2016 at 1:45 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> Hi Mathieu,
>
> I'm cc'ing Ewen for answering your question as well, but here are my two
> cents:
>
> 1. One benefit of piping the end result from KS to KC, rather than using
> .foreach() in KS directly, is that you get loose coupling between data
> processing and data copying. For example, with the .foreach() approach the
> number of JDBC connections is tied to the number of your KS instances,
> while in practice you may want a different number.
>
> 2. We are working on end-to-end exactly-once semantics right now, which is
> a big project involving both KS and KC. From the KS point of view, any
> logic inside the foreach() call is a "black box", and any side effects it
> causes are not covered by KS's part of the exactly-once semantics; KC, by
> contrast, has full knowledge of the connector and hence can achieve
> exactly-once as well when copying data to your RDBMS.
>
>
> Guozhang
>
> On Thu, Jul 21, 2016 at 6:49 AM, Mathieu Fenniak <
> mathieu.fenn...@replicon.com> wrote:
>
>> Hello again, Kafka users,
>>
>> My end goal is to get stream-processed data into a PostgreSQL database.
>>
>> I really like the architecture that Kafka Streams takes; it's "just" a
>> library, I can build a normal Java application around it and deal with
>> configuration and orchestration myself.  To persist my data, it's easy to
>> add a .foreach() to the end of my topology and upsert data into my DB with
>> JDBC.
>>
>> My reading of the docs is that the recommended approach would be to send
>> my final data back to a Kafka topic, and use Connect with a sink to
>> persist that data.  That seems really interesting, but it's another complex
>> moving part that I could do without.
>>
>> What advantages does Kafka Connect provide that I would be missing out on
>> by persisting my data directly from my Kafka Streams application?
>>
>> Thanks,
>>
>> Mathieu
>>
>
>
>
> --
> -- Guozhang
