[ https://issues.apache.org/jira/browse/KAFKA-19225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Henry Cai updated KAFKA-19225:
------------------------------
    Description: 
This is the Jira for [KIP-1176|https://cwiki.apache.org/confluence/display/KAFKA/KIP-1176%3A+Tiered+Storage+for+Active+Log+Segment]

In KIP-405, the community proposed and implemented tiered storage for old Kafka log segment files: when a log segment is older than {_}local.retention.ms{_}, it becomes eligible to be uploaded to the cloud's object storage and removed from local storage, thus reducing local storage cost. KIP-405 only uploads older log segments, not the most recent active log segments (write-ahead logs). Thus in a typical 3-way replicated Kafka cluster, the 2 follower brokers still need to replicate the active log segments from the leader broker. It is common practice to place the 3 brokers in three different AZs to improve the availability of the cluster. As a result, replication between the leader and follower brokers crosses AZs, which is a significant cost ([various studies|https://www.confluent.io/blog/understanding-and-optimizing-your-kafka-costs-part-1-infrastructure/] show that cross-AZ transfer cost typically comprises 50%-60% of the total cluster cost). Since all the active log segments are physically present on three Kafka brokers, they still account for significant resource usage on the brokers. The broker state is still quite large during node replacement, leading to longer node replacement times.

[KIP-1150|https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics] recently proposed diskless Kafka topics, but that approach leads to increased latency and requires a significant redesign. In comparison, this KIP maintains identical performance for the acks=1 producer path, minimizes design changes to Kafka, and still cuts cost by an estimated 43%.
> Tiered Storage Support for Active Log Segment
> ---------------------------------------------
>
>                 Key: KAFKA-19225
>                 URL: https://issues.apache.org/jira/browse/KAFKA-19225
>             Project: Kafka
>          Issue Type: New Feature
>          Components: Tiered-Storage
>    Affects Versions: 4.0.0
>            Reporter: Henry Cai
>            Assignee: Henry Cai
>            Priority: Major
>             Fix For: 4.0.1