I'd use option 2 (Kafka Connect).

Advantages of #2:

- The code is decoupled from the processing code and easier to refactor in
the future. (same as #4)
- The runtime/uptime/scalability of your Kafka Streams app (processing) is
decoupled from the runtime/uptime/scalability of the data ingestion into
your remote database.
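
For illustration only, option 2 could look roughly like the sink connector
config below. This is a sketch under assumptions that are not from this
thread: that the remote database is reachable over JDBC, that Confluent's
JDBC sink connector is installed on the Connect workers, and that the
Streams app writes its results to an output topic. The connector name,
topic name, connection URL, credentials and key column are all
hypothetical placeholders.

```properties
# Hypothetical sink connector config; every name and value is a placeholder.
name=streams-results-jdbc-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1

# Output topic the Kafka Streams app writes its results to (assumed name).
topics=streams-results

# Assumed JDBC endpoint and credentials for the remote database.
connection.url=jdbc:postgresql://db.example.com:5432/mydb
connection.user=connect
connection.password=change-me

# Upsert on the record key so redelivered records overwrite existing rows
# instead of creating duplicates.
insert.mode=upsert
pk.mode=record_key
pk.fields=id
```

Note that the JDBC sink needs records that carry a schema (for example
Avro, or JSON with schemas enabled), so the Streams app's output serdes
would have to match the converters configured in Connect.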

"Remove the need for additional kafka topic." isn't a big win typically --
even though topics aren't free, they're still quite cheap. ;-)

YMMV of course. :-)


On Sun, Mar 5, 2017 at 7:55 PM, Shimi Kiviti <shim...@gmail.com> wrote:

> Thanks Eno,
>
> Yes, I am aware of that. It indeed looks like a very useful feature.
>
> The result of the processing in kafka streams is only a small amount of
> data that is required by our service.
> Currently it makes more sense for us to update the remote database, where
> we have more of the data that our application requires.
> Also, the data should be available in case of failures. The remote database
> data is replicated. AFAIK although the RocksDb changelog is backed by kafka,
> if a node fails, the data will be unavailable until it is replicated to a
> different node.
>
> On Sun, Mar 5, 2017 at 4:38 PM, Eno Thereska <eno.there...@gmail.com>
> wrote:
>
> > Hi Shimi,
> >
> > Could you tell us more about your scenario? Kafka Streams uses embedded
> > databases (RocksDb) to store its state, so often you don't need to write
> > anything to an external database and you can query your streams state
> > directly from streams. Have a look at this blog if that matches your
> > scenario:
> > https://www.confluent.io/blog/unifying-stream-processing-and-interactive-queries-in-apache-kafka/
> >
> > Cheers
> > Eno
> >
> > > On 5 Mar 2017, at 10:48, Shimi Kiviti <shim...@gmail.com> wrote:
> > >
> > > Hi Everyone,
> > >
> > > > I was wondering about writing data to a remote database.
> > > I see 4 possible options:
> > >
> > >   1. Read from a topic and write to the database.
> > >   2. Use kafka connect
> > >   3. Write from anywhere in kafka streams.
> > > >   4. Register a CachedStateStore FlushListener that will send a batch of
> > > >   records when the store flushes the records.
> > >
> > > Advantages of #4:
> > >
> > > >   - The code is decoupled from the processing code and easier to refactor
> > > >   in the future.
> > >   - Remove the need for additional kafka topic.
> > >
> > >
> > > Thanks,
> > >
> > > Shimi
> >
> >
>
