Navinder Brar created KAFKA-6643:
------------------------------------
Summary: Warm up new replicas from scratch when changelog topic has retention time
Key: KAFKA-6643
URL: https://issues.apache.org/jira/browse/KAFKA-6643
Project: Kafka
Issue Type: New Feature
Components: streams
Reporter: Navinder Brar
Currently, Kafka Streams uses changelog topics (internal Kafka topics that
hold all the data for a store) to build the state of replicas. So even with
the number of standby replicas set to 1, persistent state stores get
additional availability, because the changelog topics are themselves
replicated according to the broker replication policy. But that also means we
are using at least 4 times the space (1 master store, 1 replica store, 1
changelog, 1 changelog replica).
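
To make the 4x figure concrete, here is a rough back-of-the-envelope sketch.
It assumes the changelog holds about as much data as the store itself and
that the changelog has a replication factor of 2. The function name and its
parameters are hypothetical, chosen only for illustration.

```python
def total_footprint(store_size_tb, standby_replicas=1, changelog_replication=2):
    """Approximate total storage across Streams instances and brokers.

    Assumption: each changelog copy is roughly as large as the store.
    """
    # Active (master) store plus its standby replicas on the Streams side.
    local_copies = 1 + standby_replicas
    # Changelog partition plus its broker-side replicas.
    changelog_copies = changelog_replication
    return store_size_tb * (local_copies + changelog_copies)


# 1 master store + 1 replica store + 1 changelog + 1 changelog replica
print(total_footprint(1.0))  # 4.0
```

Under these assumptions, every additional standby replica or changelog
replica adds one more full copy of the data.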
Now, if we have a year's data in persistent stores (RocksDB), we don't want
the changelog topics to hold a year's data as well, since that puts an
unnecessary burden on the brokers in terms of space. If we have to scale our
Kafka Streams application (holding 200-300 TB of data), we have to scale the
Kafka brokers as well. We want to reduce this dependency and find a way to
use the changelog topic purely as a queue holding just 2 or 3 days of data,
and warm up the replicas from scratch in some other way.
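
As an illustration of the "changelog as a queue" idea, the sketch below
builds the topic-level overrides such a changelog might carry. The keys
`cleanup.policy` and `retention.ms` are standard Kafka topic configs; the
helper function itself is hypothetical. This only shows the retention side of
the request: with these settings alone, a new replica could not rebuild
anything older than the retention window, which is the gap this issue asks to
close.

```python
from datetime import timedelta


def changelog_overrides(retention_days):
    """Build topic config overrides for a time-bounded changelog (hypothetical helper)."""
    # Kafka expects retention.ms in milliseconds.
    retention_ms = int(timedelta(days=retention_days).total_seconds() * 1000)
    return {
        "cleanup.policy": "delete",        # time-based deletion instead of compaction
        "retention.ms": str(retention_ms),
    }


print(changelog_overrides(3))
# {'cleanup.policy': 'delete', 'retention.ms': '259200000'}
```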