Gwen and Guozhang are very convincing, so I don't have much to add here :)
The only other thing I can think of is that it's less code for you to
write! Once one person writes the connector, we don't end up with a bunch
of people reimplementing the logic for copying data to a JDBC sink in
their KS apps.
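For example, once the sink is public, hooking it up should mostly be a
matter of configuration rather than code. A rough sketch of what a
standalone connector config might look like (property names are
illustrative and may change before release):

    # Hypothetical connector config -- property names are illustrative
    # and may change before the connector is released.
    name=postgres-sink
    connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
    tasks.max=2
    topics=processed-output
    connection.url=jdbc:postgresql://localhost:5432/mydb
    # idempotent writes; this is what enables exactly-once into the DB
    insert.mode=upsert
    # derive the primary key for upserts from the Kafka record key
    pk.mode=record_key
    # create the table if needed, add columns as the value schema evolves
    auto.create=true
    auto.evolve=true
    # batch inserts for throughput
    batch.size=3000

Run under Connect, the number of JDBC connections scales with tasks.max,
independently of how many KS instances you have -- which is Guozhang's
loose-coupling point.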
Re: dynamic schema, this is generally configurable -- it's nice to have
everything propagate automatically, but understood that it can be a bit
scary.

The JDBC sink will be available in the same repo as our JDBC source
(https://github.com/confluentinc/kafka-connect-jdbc) as soon as we can
make it public, hopefully as early as next week.

-Ewen

On Fri, Jul 22, 2016 at 7:47 AM, Mathieu Fenniak <
mathieu.fenn...@replicon.com> wrote:

> Hm, cool. Thanks Gwen and Guozhang.
>
> Loose coupling (especially with regard to the number of instances
> running), batch inserts, and exactly-once are very convincing. Dynamic
> schema is interesting / scary, but I'd need a dynamic app on the other
> side, which I don't have. :-)
>
> I'll plod along with KS-foreach until the JDBC sink connector is
> available, but I'd definitely pick up the JDBC sink connector and give
> it a try when available.
>
> Thanks,
>
> Mathieu
>
>
> On Thu, Jul 21, 2016 at 7:07 PM, Gwen Shapira <g...@confluent.io> wrote:
>
>> In addition, our soon-to-be-released JDBC sink connector uses the
>> Connect framework to do things that are kind of annoying to do
>> yourself:
>> * convert data types
>> * create tables if needed, and add columns to tables if needed, based
>>   on the data in Kafka
>> * support for both insert and upsert
>> * configurable batch inserts
>> * exactly-once from Kafka to the DB (using upserts)
>>
>> We'll notify you when we open the repository. Just a bit of cleanup
>> left :)
>>
>>
>> On Thu, Jul 21, 2016 at 1:45 PM, Guozhang Wang <wangg...@gmail.com>
>> wrote:
>> > Hi Mathieu,
>> >
>> > I'm cc'ing Ewen to answer your question as well, but here are my two
>> > cents:
>> >
>> > 1. One benefit of piping the end result from KS to KC, rather than
>> > using .foreach() in KS directly, is that you get loose coupling
>> > between data processing and data copying. For example, with the
>> > latter approach the number of JDBC connections is tied to the number
>> > of your KS instances, while in practice you may want a different
>> > number.
>> >
>> > 2. We are working on end-to-end exactly-once semantics right now,
>> > which is a big project involving both KS and KC. From the KS point of
>> > view, any logic inside the foreach() call is a "black box", and any
>> > side-effects it may produce are not covered by KS's part of the
>> > exactly-once semantics; whereas KC has full knowledge of the
>> > connector and hence can achieve exactly-once for copying data to your
>> > RDBMS as well.
>> >
>> >
>> > Guozhang
>> >
>> > On Thu, Jul 21, 2016 at 6:49 AM, Mathieu Fenniak <
>> > mathieu.fenn...@replicon.com> wrote:
>> >
>> >> Hello again, Kafka users,
>> >>
>> >> My end goal is to get stream-processed data into a PostgreSQL
>> >> database.
>> >>
>> >> I really like the architecture that Kafka Streams takes; it's "just"
>> >> a library, so I can build a normal Java application around it and
>> >> handle configuration and orchestration myself. To persist my data,
>> >> it's easy to add a .foreach() to the end of my topology and upsert
>> >> data into my DB with JDBC [see the sketch at the end of this
>> >> thread].
>> >>
>> >> I'm interpreting from the docs that the recommended approach would
>> >> be to send my final data back to a Kafka topic and use Connect with
>> >> a sink to persist that data. That seems really interesting, but it's
>> >> another complex moving part that I could do without.
>> >>
>> >> What advantages does Kafka Connect provide that I would be missing
>> >> out on by persisting my data directly from my Kafka Streams
>> >> application?
>> >>
>> >> Thanks,
>> >>
>> >> Mathieu
>> >>
>> >
>> >
>> > --
>> > -- Guozhang
>>
>
>

--
Thanks,
Ewen
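For readers following along: here is a minimal sketch of the interim
KS-foreach approach Mathieu describes. It assumes the 0.10.0 Streams API,
string serdes, the default single stream thread, and PostgreSQL 9.5+ for
ON CONFLICT; the topic, table, and column names are made up for
illustration.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.Properties;

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KStreamBuilder;

    // Hypothetical sketch: topic/table/column names are illustrative.
    public class ForeachJdbcSink {
        public static void main(String[] args) throws Exception {
            Properties config = new Properties();
            config.put(StreamsConfig.APPLICATION_ID_CONFIG, "foreach-jdbc-sink");
            config.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            config.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG,
                Serdes.String().getClass().getName());
            config.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG,
                Serdes.String().getClass().getName());

            // One shared connection/statement: fine with the default single
            // stream thread, but not thread-safe if num.stream.threads > 1.
            Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/mydb", "app", "secret");
            PreparedStatement upsert = conn.prepareStatement(
                "INSERT INTO results (id, payload) VALUES (?, ?) "
                + "ON CONFLICT (id) DO UPDATE SET payload = EXCLUDED.payload");

            KStreamBuilder builder = new KStreamBuilder();
            KStream<String, String> processed = builder.stream("processed-output");
            processed.foreach((key, value) -> {
                try {
                    upsert.setString(1, key);
                    upsert.setString(2, value);
                    // The upsert makes the write idempotent: if Kafka replays
                    // a record after a failure, it overwrites the same row.
                    upsert.executeUpdate();
                } catch (SQLException e) {
                    throw new RuntimeException(e);
                }
            });

            new KafkaStreams(builder, config).start();
        }
    }

This works, but it illustrates the trade-offs from the thread: the number
of DB connections tracks the number of KS instances, and the hand-written
upsert is doing the idempotency work that the Connect sink's
insert.mode=upsert would manage for you.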