Viktor Somogyi-Vass created KAFKA-14281:
-------------------------------------------

             Summary: Multi-level rack awareness
                 Key: KAFKA-14281
                 URL: https://issues.apache.org/jira/browse/KAFKA-14281
             Project: Kafka
          Issue Type: Improvement
          Components: core
    Affects Versions: 3.4.0
            Reporter: Viktor Somogyi-Vass
            Assignee: Viktor Somogyi-Vass


h1. Motivation

With replication services, data can be replicated across independent Kafka 
clusters in multiple data centers. In addition, many customers need "stretch 
clusters" - a single Kafka cluster that spans multiple data centers. 
This architecture has the following useful characteristics:
 - Data is natively replicated into all data centers by Kafka topic replication.
 - No data is lost when one DC is lost, and no configuration change is 
required, since the design implicitly relies on native Kafka replication.
 - From an operational point of view, it is much easier to configure and 
operate such a topology than a replication scenario via MM2.

Kafka should provide "native" support for stretch clusters, covering any 
special aspects of operating a stretch cluster.

h2. Multi-level rack awareness

Additionally, stretch clusters are implemented using the rack awareness 
feature, where each DC is represented as a rack. This ensures that replicas are 
spread across DCs evenly. Unfortunately, there are cases where this is too 
limiting: if there are actual racks inside the DCs, we cannot specify 
those. Consider having 3 DCs with 2 racks each:

/DC1/R1, /DC1/R2
/DC2/R1, /DC2/R2
/DC3/R1, /DC3/R2

If we were to use DC1, DC2, DC3 as the rack IDs, we would lose the rack-level 
information of the setup. With RF=6, it is then possible that the 2 replicas 
assigned to DC1 both end up in the same rack.

If we were to use /DC1/R1, /DC1/R2, etc. as the rack IDs, then with RF=3 it is 
possible that 2 replicas end up in the same DC, e.g. /DC1/R1, /DC1/R2, /DC2/R1.
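
To make the RF=3 case concrete, the following is a small counting sketch, not 
a model of Kafka's actual assignment algorithm. It assumes only that a flat 
rack-aware placement picks 3 distinct rack IDs; the helper name dc_of is 
hypothetical. It counts how many such picks cover all 3 DCs.

```python
from itertools import combinations

# The six flat rack IDs from the example above.
racks = ["/DC1/R1", "/DC1/R2", "/DC2/R1", "/DC2/R2", "/DC3/R1", "/DC3/R2"]


def dc_of(rack_id):
    """Return the top-level component, e.g. 'DC1' for '/DC1/R1'."""
    return rack_id.split("/")[1]


# Every way of choosing 3 distinct racks for RF=3.
choices = list(combinations(racks, 3))

# Choices in which each replica lands in a different DC.
covering_all_dcs = [c for c in choices if len({dc_of(r) for r in c}) == 3]

print(len(choices), len(covering_all_dcs))  # prints "20 8"
```

Only 8 of the 20 distinct-rack choices place one replica in every DC, so 
distinct racks alone do not guarantee DC-level spread.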

Because of this, Kafka should support "multi-level" racks, which means that 
rack IDs should be able to describe a hierarchy. With this feature, brokers 
should be able to:
 # spread replicas evenly based on the top level of the hierarchy (i.e. first, 
between DCs)
 # then inside a top-level unit (DC), if there are multiple replicas, they 
should be spread evenly among lower-level units (i.e. between racks, then 
between physical hosts, and so on)
 ## repeat for all levels
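
The steps above can be sketched as a recursive placement for a single 
partition. This is a minimal illustration under stated assumptions, not the 
proposed implementation: the function names (parse_rack, pick_replicas, 
assign_replicas) are hypothetical, all rack IDs are assumed to have the same 
depth, and concerns such as leader balance or rotating the starting unit 
across partitions are ignored.

```python
from collections import defaultdict


def parse_rack(rack_id):
    """Split '/DC1/R1' into ('DC1', 'R1'). Hypothetical helper."""
    return tuple(part for part in rack_id.split("/") if part)


def pick_replicas(brokers, count, level=0):
    """Choose `count` brokers, spreading evenly at each hierarchy level.

    `brokers` is a list of (broker_id, rack_path) pairs; all paths are
    assumed to have the same length.
    """
    if count > len(brokers):
        raise ValueError("not enough brokers for the requested replica count")
    # Below the last level, brokers are treated as interchangeable.
    if not brokers or level >= len(brokers[0][1]):
        return [broker_id for broker_id, _ in brokers[:count]]

    # Group brokers by their unit at this level (DC first, then rack, ...).
    groups = defaultdict(list)
    for broker in brokers:
        groups[broker[1][level]].append(broker)

    # Hand out replica slots one at a time, round-robin over the units,
    # skipping units that have no spare brokers left.
    quota = {name: 0 for name in groups}
    remaining = count
    while remaining > 0:
        for name in sorted(groups):
            if remaining > 0 and quota[name] < len(groups[name]):
                quota[name] += 1
                remaining -= 1

    # Repeat the same spreading inside each unit for the next level.
    chosen = []
    for name in sorted(groups):
        chosen.extend(pick_replicas(groups[name], quota[name], level + 1))
    return chosen


def assign_replicas(rack_by_broker, replication_factor):
    """Entry point: `rack_by_broker` maps broker id -> rack ID string."""
    brokers = [(b, parse_rack(r)) for b, r in sorted(rack_by_broker.items())]
    return pick_replicas(brokers, replication_factor)


# 3 DCs x 2 racks x 2 brokers per rack, matching the layout above.
rack_by_broker = {
    0: "/DC1/R1", 1: "/DC1/R1", 2: "/DC1/R2", 3: "/DC1/R2",
    4: "/DC2/R1", 5: "/DC2/R1", 6: "/DC2/R2", 7: "/DC2/R2",
    8: "/DC3/R1", 9: "/DC3/R1", 10: "/DC3/R2", 11: "/DC3/R2",
}
print(assign_replicas(rack_by_broker, 3))
print(assign_replicas(rack_by_broker, 6))
```

In this sketch, RF=3 yields one replica per DC and RF=6 yields one replica 
per rack, which are the two cases the flat rack IDs above cannot both satisfy.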



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
