I'd use option 2 (Kafka Connect). Advantages of #2:
- The code is decoupled from the processing code and easier to refactor in the future. (same as #4)
- The runtime/uptime/scalability of your Kafka Streams app (processing) is decoupled from the runtime/uptime/scalability of the data ingestion into your remote database.

"Remove the need for additional kafka topic." isn't a big win typically -- even though topics aren't free, they're still quite cheap. ;-) YMMV of course. :-)

(I've put rough sketches of both the Connect-based flow and the interactive-queries alternative at the bottom of this mail, below the quoted thread.)

On Sun, Mar 5, 2017 at 7:55 PM, Shimi Kiviti <shim...@gmail.com> wrote:

> Thanks Eno,
>
> Yes, I am aware of that. It indeed looks like a very useful feature.
>
> The result of the processing in Kafka Streams is only a small amount of
> data that is required by our service.
> Currently it makes more sense for us to update the remote database, where
> we have more data that our application requires.
> Also, the data should be available in case of failures. The remote
> database data is replicated. AFAIK, although the RocksDB changelog is
> backed by Kafka, if a node fails the data will be unavailable until it is
> replicated to a different node.
>
> On Sun, Mar 5, 2017 at 4:38 PM, Eno Thereska <eno.there...@gmail.com>
> wrote:
>
> > Hi Shimi,
> >
> > Could you tell us more about your scenario? Kafka Streams uses embedded
> > databases (RocksDB) to store its state, so often you don't need to write
> > anything to an external database and you can query your streams state
> > directly from Streams. Have a look at this blog if that matches your
> > scenario: https://www.confluent.io/blog/unifying-stream-processing-and-interactive-queries-in-apache-kafka/
> >
> > Cheers
> > Eno
> >
> > > On 5 Mar 2017, at 10:48, Shimi Kiviti <shim...@gmail.com> wrote:
> > >
> > > Hi Everyone,
> > >
> > > I was wondering about writing data to a remote database.
> > > I see 4 possible options:
> > >
> > > 1. Read from a topic and write to the database.
> > > 2. Use Kafka Connect.
> > > 3. Write from anywhere in Kafka Streams.
> > > 4. Register a CachedStateStore FlushListener that will send a batch of
> > > records when the store flushes the records.
> > >
> > > Advantages of #4:
> > >
> > > - The code is decoupled from the processing code and easier to
> > > refactor in the future.
> > > - Removes the need for an additional Kafka topic.
> > >
> > > Thanks,
> > >
> > > Shimi
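
P.S. Here is roughly what the Streams side of option 2 could look like. Everything concrete below (topic names, types, the toUpperCase() step, the class name) is made up for illustration, and it's written against the 0.10.x API; the point is just that the Streams app ends at a result topic, and a separately deployed Connect sink connector for your database (the Confluent JDBC sink is one example) takes that topic into the database:

    import java.util.Properties;

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KStreamBuilder;

    public class Option2Sketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "processing-app");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            KStreamBuilder builder = new KStreamBuilder();

            // Stand-in for the real processing logic.
            KStream<String, String> events =
                builder.stream(Serdes.String(), Serdes.String(), "events");
            events.mapValues(v -> v.toUpperCase())
                  // The Streams app stops here: it only produces the result topic.
                  // A Kafka Connect sink connector, running in its own workers,
                  // consumes "processing-results" and writes it to the remote database.
                  .to(Serdes.String(), Serdes.String(), "processing-results");

            new KafkaStreams(builder, props).start();
        }
    }

The extra topic is exactly the trade-off above: you can restart, rescale, or swap the ingestion side without touching the processing app.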
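
And for comparison, a sketch of the interactive-queries route Eno linked. It assumes the topology materialises a key-value store called "results-store" with String keys and Long values -- both the store name and the types are invented here:

    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.state.QueryableStoreTypes;
    import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

    public class InteractiveQuerySketch {

        // 'streams' is the running KafkaStreams instance whose topology
        // materialised a key-value store named "results-store" (hypothetical name).
        static Long lookup(KafkaStreams streams, String key) {
            ReadOnlyKeyValueStore<String, Long> store =
                streams.store("results-store",
                              QueryableStoreTypes.<String, Long>keyValueStore());
            // Reads the locally held RocksDB state directly -- no external database involved.
            return store.get(key);
        }
    }

For keys that live on another instance you would also need KafkaStreams#metadataForKey plus some RPC layer, which the blog post above walks through; and as Shimi notes, a local store is unavailable while its changelog is being restored onto another node.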