Hi Mathieu,

I'm cc'ing Ewen, who can answer your question as well, but here are my two
cents:

1. One benefit of piping the end result from KS to KC, rather than using
.foreach() in KS directly, is that you get loose coupling between data
processing and data copying. For example, with the .foreach() approach the
number of JDBC connections is tied to the number of your KS instances,
while in practice you may want a different number (see the first sketch
below).

2. We are working on end-to-end exactly-once semantics right now, which is
a big project involving both KS and KC. From the KS point of view, any
logic inside the foreach() call is a "black box" to it, and any side
effects that logic produces are not covered by KS's part of the
exactly-once semantics; KC, on the other hand, has full knowledge of the
connector and hence can achieve exactly-once for copying data to your
RDBMS as well (see the second sketch below).


Guozhang

On Thu, Jul 21, 2016 at 6:49 AM, Mathieu Fenniak <
mathieu.fenn...@replicon.com> wrote:

> Hello again, Kafka users,
>
> My end goal is to get stream-processed data into a PostgreSQL database.
>
> I really like the architecture that Kafka Streams takes; it's "just" a
> library, I can build a normal Java application around it and deal with
> configuration and orchestration myself.  To persist my data, it's easy to
> add a .foreach() to the end of my topology and upsert data into my DB with
> JDBC.
>
> My reading of the docs is that the recommended approach would be
> to send my final data back to a Kafka topic, and use Connect with a sink to
> persist that data.  That seems really interesting, but it's another complex
> moving part that I could do without.
>
> What advantages does Kafka Connect provide that I would be missing out on
> by persisting my data directly from my Kafka Streams application?
>
> Thanks,
>
> Mathieu
>



-- 
-- Guozhang
