Hi Ted

This is an interesting question. 

Kafka has resilience properties similar to those of other distributed stores,
such as Cassandra, that are used as master data stores (though obviously
without the query functionality). To get good resiliency you’d need to set
unclean.leader.election.enable=false and configure sufficient replication.
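
As a rough sketch of what such a setup might look like: only the unclean
leader election flag comes from the advice above. The replication factor of 3,
min.insync.replicas=2, acks=all and the retention values are illustrative
assumptions that you would need to tune for your own cluster.

```properties
# --- Broker settings (server.properties) ---
# Never elect an out-of-sync replica as leader; prefer unavailability to data loss.
unclean.leader.election.enable=false
# Assumed values: three copies of each partition, at least two in sync per write.
default.replication.factor=3
min.insync.replicas=2
# Assumed for the infinite-retention case: disable time- and size-based deletion.
log.retention.ms=-1
log.retention.bytes=-1

# --- Producer settings (client side, not server.properties) ---
# Wait for all in-sync replicas to acknowledge each write.
acks=all
```

With values like these, a write that has been acknowledged exists on at least
two brokers, so losing a single broker should not lose acknowledged data.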

One objection to doing this would be that the majority of Kafka usage is for
transitory data. This is fair, and I’ve not seen Kafka used as a master data
store per se. I have seen it used for reliable messaging, which means not
losing data and hence requires similar properties. Certainly there is nothing I
can think of that would suggest Kafka would be any worse than other distributed
data stores, but to further mitigate concerns you could use Kafka Connect to
create a backup in HDFS, on a SAN, etc.
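
For the HDFS case, a sink connector configuration might look roughly like the
following. This is a sketch that assumes the separately distributed Confluent
HDFS sink connector is installed. The connector name, topic name, HDFS URL and
flush size are placeholders, not recommendations.

```properties
# Hypothetical connector name
name=master-data-hdfs-backup
connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
tasks.max=1
# Placeholder topic to back up
topics=master-data
# Placeholder HDFS namenode address
hdfs.url=hdfs://localhost:9000
# Number of records written to a file before it is committed to HDFS (placeholder)
flush.size=1000
```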

All the best

B 



> On 15 Feb 2016, at 08:56, Ted Swerve <ted.swe...@gmail.com> wrote:
> 
> Hello,
> 
> Is it viable to use infinite-retention Kafka topics as a master data
> store?  I'm not talking massive volumes of data here, but still potentially
> extending into tens of terabytes.
> 
> Are there any drawbacks or pitfalls to such an approach?  It seems like a
> compelling design, but there seem to be mixed messages about its
> suitability for this kind of role.
> 
> Regards,
> Ted
