In addition, our soon-to-be-released JDBC sink connector uses the Connect
framework to do things that are kind of annoying to do yourself:

* Convert data types
* Create tables if needed, add columns to tables if needed based on the
  data in Kafka
* Support for both insert and upsert
* Configurable batch inserts
* Exactly-once from Kafka to DB (using upserts)
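
To make the last point concrete, here is a minimal sketch of why upserts make redelivery harmless. It is not the connector's code: a Map stands in for a table keyed by primary key, and the names UpsertSketch, Row, upsert and applyAll are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: a Map stands in for a DB table keyed by primary key.
public class UpsertSketch {

    // A record as it might arrive from a Kafka topic: primary key plus value.
    public static final class Row {
        final String key;
        final String value;

        public Row(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Upsert: insert the row if the key is new, otherwise overwrite the existing row.
    public static void upsert(Map<String, String> table, Row row) {
        table.put(row.key, row.value);
    }

    // Apply a batch in order, as a sink would after reading it from Kafka.
    public static Map<String, String> applyAll(Map<String, String> table, List<Row> batch) {
        for (Row row : batch) {
            upsert(table, row);
        }
        return table;
    }

    public static void main(String[] args) {
        List<Row> batch = Arrays.asList(
                new Row("a", "1"), new Row("b", "2"), new Row("a", "3"));

        // The batch is written exactly once.
        Map<String, String> once = applyAll(new LinkedHashMap<>(), batch);

        // Simulated failure before offsets were committed: the same batch is written again.
        Map<String, String> twice = applyAll(applyAll(new LinkedHashMap<>(), batch), batch);

        // The table ends up in the same state either way.
        System.out.println(once.equals(twice)); // prints "true"
    }
}
```

With plain inserts the second delivery would either fail on the primary key or leave duplicate rows. Against PostgreSQL (9.5 or later), the real counterpart of upsert() would be an INSERT ... ON CONFLICT ... DO UPDATE statement.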

We'll notify you when we open the repository. Just a bit of cleanup left :)

On Thu, Jul 21, 2016 at 1:45 PM, Guozhang Wang <wangg...@gmail.com> wrote:

> Hi Mathieu,
>
> I'm cc'ing Ewen for answering your question as well, but here are my two
> cents:
>
> 1. One benefit of piping end result from KS to KC rather than using
> .foreach() in KS directly is that you can have a loose coupling between
> data processing and data copying. For example, for the latter approach the
> number of JDBC connections is tied to the number of your KS instances,
> while in practice you may want to have a different number.
>
> 2. We are working on end-to-end exactly-once semantics right now, which is
> a big project involving both KS and KC. From the KS point of view, any
> logic inside the foreach() call is a "black-box" to it and any side-effects
> it may result in is not considered in its part of the exactly-once
> semantics; whereas with KC it has full knowledge about the connector and
> hence can achieve exactly-once as well for copying data to your RDBMS.
>
>
> Guozhang
>
> On Thu, Jul 21, 2016 at 6:49 AM, Mathieu Fenniak <
> mathieu.fenn...@replicon.com> wrote:
>
>> Hello again, Kafka users,
>>
>> My end goal is to get stream-processed data into a PostgreSQL database.
>>
>> I really like the architecture that Kafka Streams takes; it's "just" a
>> library, I can build a normal Java application around it and deal with
>> configuration and orchestration myself. To persist my data, it's easy to
>> add a .foreach() to the end of my topology and upsert data into my DB with
>> jdbc.
>>
>> I'm interpreting based upon the docs that the recommended approach would be
>> to send my final data back to a Kafka topic, and use Connect with a sink to
>> persist that data. That seems really interesting, but it's another complex
>> moving part that I could do without.
>>
>> What advantages does Kafka Connect provide that I would be missing out on
>> by persisting my data directly from my Kafka Streams application?
>>
>> Thanks,
>>
>> Mathieu
>>
>
>
>
> --
> -- Guozhang