This is definitely an interesting use case. However, be aware that changing
the broker topology won't automatically rebalance preexisting data: partitions
stay on the brokers they were originally assigned to until you move them
yourself. If you ever retire the old brokers without reassigning their
partitions first, you risk losing data.
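
If you do grow the cluster later, existing partitions have to be moved
explicitly with the kafka-reassign-partitions.sh tool that ships with Kafka.
From memory, the procedure is roughly as follows (the "events" topic, the
ZooKeeper address, and the broker ids are just placeholders):

    # topics.json: which topics we want a new assignment generated for
    {"version": 1, "topics": [{"topic": "events"}]}

    # Have Kafka propose an assignment across brokers 0-3:
    bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 \
        --topics-to-move-json-file topics.json \
        --broker-list "0,1,2,3" --generate

    # Save the proposed plan as reassignment.json, then apply and verify it:
    bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 \
        --reassignment-json-file reassignment.json --execute
    bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 \
        --reassignment-json-file reassignment.json --verify

Note that --generate only prints a proposed plan to stdout; you save that
output as reassignment.json yourself before running --execute.

As for the infinite retention itself, as far as I know that is just topic
configuration: setting retention.ms to -1 disables time-based deletion, and
retention.bytes already defaults to -1 (unlimited).

Replaying everything from offset 0 into a new materialized view should then
just be a matter of starting a fresh consumer group with
auto.offset.reset=earliest. A minimal sketch against the new consumer API
(0.9+), again with placeholder names:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class ReplayFromZero {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // placeholder
            props.put("group.id", "search-index-builder");    // fresh group => no committed offsets
            props.put("auto.offset.reset", "earliest");       // so we start from offset 0
            props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
            consumer.subscribe(Collections.singletonList("events")); // placeholder topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                for (ConsumerRecord<String, String> record : records) {
                    // feed record.value() into the materialized view
                }
            }
        }
    }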

Cheers,
Jens

On Wed, Mar 9, 2016 at 2:10 PM Daniel Schierbeck <da...@zendesk.com.invalid>
wrote:

> I'm considering an architecture where Kafka acts as the primary datastore,
> with infinite retention of messages. The messages in this case will be
> domain events that must not be lost. Different downstream consumers would
> ingest the events and build up various views on them, e.g. aggregated
> stats, indexes by various properties, full text search, etc.
>
> The important bit is that I'd like to avoid having a separate datastore for
> long-term archival of events, since:
>
> 1) I want to make it easy to spin up new materialized views based on past
> events, and only having to deal with Kafka is simpler.
> 2) Instead of having some sort of two-phased import process where I need to
> first import historical data and then do a switchover to the Kafka topics,
> I'd rather just start from offset 0 in the Kafka topics.
> 3) I'd like to be able to use standard tooling where possible, and most
> tools for ingesting events into e.g. Spark Streaming would be difficult to
> use unless all the data was in Kafka.
>
> I'd like to know if anyone here has tried this use case. Based on the
> presentations by Jay Kreps and Martin Kleppmann I would expect that someone
> has actually implemented some of the ideas they've been pushing. I'd also
> like to know what sort of problems Kafka would pose for long-term storage –
> would I need special storage nodes, or would replication be sufficient to
> ensure durability?
>
> Daniel Schierbeck
> Senior Staff Engineer, Zendesk
>
-- 

Jens Rantil
Backend Developer @ Tink

Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
For urgent matters you can reach me at +46-708-84 18 32.
