A couple of things:

- Compacted topics are a useful way to retain meaningful datasets inside
the broker without letting them grow indefinitely. If you have an
update-in-place use case, where the event-sourced approach doesn’t buy you
much, they will keep the reload time down when you regenerate materialised
views.
- When going down the master data store route, a few different problems may
get conflated: disaster recovery, historic backups, and regenerating data in
non-production environments.
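
To make the compaction point concrete, here is a minimal sketch in plain Python (no Kafka client involved) of what a compacted topic conceptually retains: the latest record per key, with a null value acting as a delete marker. The `compact` function and the record layout are hypothetical names for illustration only. Real brokers compact in the background, never touch the active segment, and keep tombstones around for a configurable period, so treat this as a simplified model rather than actual broker behaviour.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical record shape: (key, value). A value of None models a tombstone.
Record = Tuple[str, Optional[str]]


def compact(log: Iterable[Record]) -> List[Record]:
    """Return the latest record per key, in offset order, dropping deleted keys."""
    latest: Dict[str, Tuple[int, Optional[str]]] = {}
    for offset, (key, value) in enumerate(log):
        # A later record for the same key supersedes any earlier one.
        latest[key] = (offset, value)

    # Keep only keys whose most recent record is not a tombstone.
    survivors = sorted(
        (offset, key, value)
        for key, (offset, value) in latest.items()
        if value is not None
    )
    return [(key, value) for _, key, value in survivors]


log = [
    ("user-1", "a"),
    ("user-2", "b"),
    ("user-1", "c"),   # supersedes ("user-1", "a")
    ("user-2", None),  # tombstone: user-2 is deleted
    ("user-3", "d"),
]
compacted = compact(log)
print(compacted)
```

A consumer rebuilding a materialised view from the compacted log reads two records here instead of five, which is where the shorter reload time comes from.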

B


> On 14 Mar 2016, at 09:56, Jens Rantil <jens.ran...@tink.se> wrote:
> 
> This is definitely an interesting use case. However, you need to be aware
> that changing the broker topology won't rebalance the preexisting data from
> the previous brokers. That is, you risk losing data.
> 
> Cheers,
> Jens
> 
> On Wed, Mar 9, 2016 at 2:10 PM Daniel Schierbeck <da...@zendesk.com.invalid>
> wrote:
> 
>> I'm considering an architecture where Kafka acts as the primary datastore,
>> with infinite retention of messages. The messages in this case will be
>> domain events that must not be lost. Different downstream consumers would
>> ingest the events and build up various views on them, e.g. aggregated
>> stats, indexes by various properties, full text search, etc.
>> 
>> The important bit is that I'd like to avoid having a separate datastore for
>> long-term archival of events, since:
>> 
>> 1) I want to make it easy to spin up new materialized views based on past
>> events, and only having to deal with Kafka is simpler.
>> 2) Instead of having some sort of two-phased import process where I need to
>> first import historical data and then do a switchover to the Kafka topics,
>> I'd rather just start from offset 0 in the Kafka topics.
>> 3) I'd like to be able to use standard tooling where possible, and most
>> tools for ingesting events into e.g. Spark Streaming would be difficult to
>> use unless all the data was in Kafka.
>> 
>> I'd like to know if anyone here has tried this use case. Based on the
>> presentations by Jay Kreps and Martin Kleppmann I would expect that someone
>> had actually implemented some of the ideas they've been pushing. I'd also
>> like to know what sort of problems Kafka would pose for long-term storage –
>> would I need special storage nodes, or would replication be sufficient to
>> ensure durability?
>> 
>> Daniel Schierbeck
>> Senior Staff Engineer, Zendesk
>> 
> -- 
> 
> Jens Rantil
> Backend Developer @ Tink
> 
> Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
> For urgent matters you can reach me at +46-708-84 18 32.
