Gwen and Guozhang are very convincing, so I don't have much to add here :)
The only other thing I can think of is that it's less code for you to
write! Once one person writes the connector, we don't end up with a bunch
of people reimplementing the logic for copying data to a JDBC sink in
their KS apps.
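For example, once the sink is public, hooking it up should mostly be a
matter of configuration rather than code. A rough sketch of what a
standalone connector config might look like (property names are
illustrative and may change before release):

    # Hypothetical connector config -- property names are illustrative
    # and may change before the connector is released.
    name=postgres-sink
    connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
    tasks.max=2
    topics=processed-output
    connection.url=jdbc:postgresql://localhost:5432/mydb
    # idempotent writes; this is what enables exactly-once into the DB
    insert.mode=upsert
    # derive the primary key for upserts from the Kafka record key
    pk.mode=record_key
    # create the table if needed, add columns as the value schema evolves
    auto.create=true
    auto.evolve=true
    # batch inserts for throughput
    batch.size=3000

Run under Connect, the number of JDBC connections scales with tasks.max,
independently of how many KS instances you have -- which is Guozhang's
loose-coupling point.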
Re: dynamic schema, this is generally configurable -- it's nice to have
everything propagate automatically, but understood that it can be a bit
scary.

The JDBC sink will be available in the same repo as our JDBC source
(https://github.com/confluentinc/kafka-connect-jdbc) as soon as we can
make it public, hopefully as early as next week.

-Ewen

On Fri, Jul 22, 2016 at 7:47 AM, Mathieu Fenniak <
mathieu.fenn...@replicon.com> wrote:

> Hm, cool. Thanks Gwen and Guozhang.
>
> Loose coupling (especially with regard to the number of instances
> running), batch inserts, and exactly-once are very convincing. Dynamic
> schema is interesting / scary, but I'd need a dynamic app on the other
> side, which I don't have. :-)
>
> I'll plod along with KS-foreach until the JDBC sink connector is
> available, but I'd definitely pick up the JDBC sink connector and give
> it a try when available.
>
> Thanks,
>
> Mathieu
>
>
> On Thu, Jul 21, 2016 at 7:07 PM, Gwen Shapira <g...@confluent.io> wrote:
>
>> In addition, our soon-to-be-released JDBC sink connector uses the
>> Connect framework to do things that are kind of annoying to do
>> yourself:
>> * convert data types
>> * create tables if needed, and add columns to tables if needed, based
>>   on the data in Kafka
>> * support for both insert and upsert
>> * configurable batch inserts
>> * exactly-once from Kafka to the DB (using upserts)
>>
>> We'll notify you when we open the repository. Just a bit of cleanup
>> left :)
>>
>>
>> On Thu, Jul 21, 2016 at 1:45 PM, Guozhang Wang <wangg...@gmail.com>
>> wrote:
>> > Hi Mathieu,
>> >
>> > I'm cc'ing Ewen to answer your question as well, but here are my two
>> > cents:
>> >
>> > 1. One benefit of piping the end result from KS to KC, rather than
>> > using .foreach() in KS directly, is that you get loose coupling
>> > between data processing and data copying. For example, with the
>> > latter approach the number of JDBC connections is tied to the number
>> > of your KS instances, while in practice you may want a different
>> > number.
>> >
>> > 2. We are working on end-to-end exactly-once semantics right now,
>> > which is a big project involving both KS and KC. From the KS point of
>> > view, any logic inside the foreach() call is a "black box", and any
>> > side-effects it may produce are not covered by KS's part of the
>> > exactly-once semantics; whereas KC has full knowledge of the
>> > connector and hence can achieve exactly-once for copying data to your
>> > RDBMS as well.
>> >
>> >
>> > Guozhang
>> >
>> > On Thu, Jul 21, 2016 at 6:49 AM, Mathieu Fenniak <
>> > mathieu.fenn...@replicon.com> wrote:
>> >
>> >> Hello again, Kafka users,
>> >>
>> >> My end goal is to get stream-processed data into a PostgreSQL
>> >> database.
>> >>
>> >> I really like the architecture that Kafka Streams takes; it's "just"
>> >> a library, so I can build a normal Java application around it and
>> >> handle configuration and orchestration myself. To persist my data,
>> >> it's easy to add a .foreach() to the end of my topology and upsert
>> >> data into my DB with JDBC [see the sketch at the end of this
>> >> thread].
>> >>
>> >> I'm interpreting from the docs that the recommended approach would
>> >> be to send my final data back to a Kafka topic and use Connect with
>> >> a sink to persist that data. That seems really interesting, but it's
>> >> another complex moving part that I could do without.
>> >>
>> >> What advantages does Kafka Connect provide that I would be missing
>> >> out on by persisting my data directly from my Kafka Streams
>> >> application?
>> >>
>> >> Thanks,
>> >>
>> >> Mathieu
>> >>
>> >
>> >
>> > --
>> > -- Guozhang
>>
>
>

--
Thanks,
Ewen
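For readers following along: here is a minimal sketch of the interim
KS-foreach approach Mathieu describes. It assumes the 0.10.0 Streams API,
string serdes, the default single stream thread, and PostgreSQL 9.5+ for
ON CONFLICT; the topic, table, and column names are made up for
illustration.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.Properties;

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KStreamBuilder;

    // Hypothetical sketch: topic/table/column names are illustrative.
    public class ForeachJdbcSink {
        public static void main(String[] args) throws Exception {
            Properties config = new Properties();
            config.put(StreamsConfig.APPLICATION_ID_CONFIG, "foreach-jdbc-sink");
            config.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            config.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG,
                Serdes.String().getClass().getName());
            config.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG,
                Serdes.String().getClass().getName());

            // One shared connection/statement: fine with the default single
            // stream thread, but not thread-safe if num.stream.threads > 1.
            Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/mydb", "app", "secret");
            PreparedStatement upsert = conn.prepareStatement(
                "INSERT INTO results (id, payload) VALUES (?, ?) "
                + "ON CONFLICT (id) DO UPDATE SET payload = EXCLUDED.payload");

            KStreamBuilder builder = new KStreamBuilder();
            KStream<String, String> processed = builder.stream("processed-output");
            processed.foreach((key, value) -> {
                try {
                    upsert.setString(1, key);
                    upsert.setString(2, value);
                    // The upsert makes the write idempotent: if Kafka replays
                    // a record after a failure, it overwrites the same row.
                    upsert.executeUpdate();
                } catch (SQLException e) {
                    throw new RuntimeException(e);
                }
            });

            new KafkaStreams(builder, config).start();
        }
    }

This works, but it illustrates the trade-offs from the thread: the number
of DB connections tracks the number of KS instances, and the hand-written
upsert is doing the idempotency work that the Connect sink's
insert.mode=upsert would manage for you.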