This JSON file seemed to work:

cat reduce_replication_factor.json
{
  "version": 1,
  "partitions": [
    {"topic": "md", "partition": 0, "replicas": [12,10,8,2,9,11,1,7,3]},
    {"topic": "md", "partition": 1, "replicas": [9,8,2,12,11,1,7,3,10]},
    {"topic": "md", "partition": 2, "replicas": [11,2,12,9,1,7,3,10,8]},
    {"topic": "md", "partition": 3, "replicas": [1,12,9,11,7,3,10,8,2]},
    {"topic": "md", "partition": 4, "replicas": [7,9,11,1,3,10,8,2,12]},
    {"topic": "md", "partition": 5, "replicas": [3,11,1,7,10,8,2,12,9]}
  ]
}
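One caveat worth noting: each replicas list in the file above still names nine broker ids (the same sets shown by --describe), so as written this reassignment re-submits the existing assignment rather than shrinking it. To actually bring the replication factor down to 3, each list needs truncating to three brokers. A minimal sketch of generating such a file (plain Python, using the broker ids from this thread and covering only partitions 0-5, as the file above does; the helper and file name are just illustrative):

```python
import json

# Current 9-replica assignments for topic "md", as listed in this thread.
current = {
    0: [12, 10, 8, 2, 9, 11, 1, 7, 3],
    1: [9, 8, 2, 12, 11, 1, 7, 3, 10],
    2: [11, 2, 12, 9, 1, 7, 3, 10, 8],
    3: [1, 12, 9, 11, 7, 3, 10, 8, 2],
    4: [7, 9, 11, 1, 3, 10, 8, 2, 12],
    5: [3, 11, 1, 7, 10, 8, 2, 12, 9],
}

TARGET_RF = 3

# Keep only the first TARGET_RF replicas of each partition. The current
# leader is the first entry in each list, so keeping the prefix avoids
# forcing any leadership moves.
plan = {
    "version": 1,
    "partitions": [
        {"topic": "md", "partition": p, "replicas": reps[:TARGET_RF]}
        for p, reps in sorted(current.items())
    ],
}

with open("reduce_replication_factor.json", "w") as f:
    json.dump(plan, f, indent=2)
```

The resulting file can then be fed to kafka-reassign-partitions.sh --execute exactly as below.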
kafka-reassign-partitions.sh --bootstrap-server
rhes75:9092,rhes75:9093,rhes75:9094,rhes76:9092,rhes76:9093,rhes76:9094,rhes76:9095,rhes76:9096,rhes76:9097
--reassignment-json-file ./reduce_replication_factor.json --execute

The output:

Successfully started partition reassignments for md-0,md-1,md-2,md-3,md-4,md-5

I guess it is going to take some time before it is completed.

Thanks

On Fri, 12 May 2023 at 20:16, Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Thanks Matthias.
>
> With regard to your point below:
>
>     A replication factor of 9 sounds very high. For production, a
>     replication factor of 3 is recommended.
>
> Is it possible to dynamically reduce this number to 3 while the topic is
> actively consumed?
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising
> from such loss, damage or destruction.
>
> On Fri, 12 May 2023 at 19:38, Matthias J. Sax <mj...@apache.org> wrote:
>
>> > Does having 9 partitions with 9 replication factors make sense here?
>>
>> A replication factor of 9 sounds very high. For production, a
>> replication factor of 3 is recommended.
>>
>> How many partitions you want/need is a different question, and cannot
>> be answered in a general way.
>>
>> "Yes" to all other questions.
>>
>> -Matthias
>>
>> On 5/12/23 9:50 AM, Mich Talebzadeh wrote:
>> > Hi,
>> >
>> > I have used Apache Kafka in conjunction with Spark as a messaging
>> > source. This rather dated diagram describes it.
>> >
>> > I have two physical hosts, each with 64 GB, running RHES 7.6; these
>> > are called rhes75 and rhes76 respectively.
>> > The Zookeeper version is 3.7.1 and the Kafka version is 3.4.0.
>> >
>> > [inline image: diagram of the Kafka/Spark setup]
>> >
>> > I have a topic md -> MarketData that has been defined as below:
>> >
>> > kafka-topics.sh --create --bootstrap-server
>> > rhes75:9092,rhes75:9093,rhes75:9094,rhes76:9092,rhes76:9093,rhes76:9094,rhes76:9095,rhes76:9096,rhes76:9097
>> > --replication-factor 9 --partitions 9 --topic md
>> >
>> > kafka-topics.sh --describe --bootstrap-server
>> > rhes75:9092,rhes75:9093,rhes75:9094,rhes76:9092,rhes76:9093,rhes76:9094,rhes76:9095,rhes76:9096,rhes76:9097
>> > --topic md
>> >
>> > This is working fine:
>> >
>> > Topic: md  TopicId: UfQly87bQPCbVKoH-PQheg  PartitionCount: 9  ReplicationFactor: 9  Configs: segment.bytes=1073741824
>> >   Topic: md  Partition: 0  Leader: 12  Replicas: 12,10,8,2,9,11,1,7,3  Isr: 10,1,9,2,12,7,3,11,8
>> >   Topic: md  Partition: 1  Leader: 9   Replicas: 9,8,2,12,11,1,7,3,10  Isr: 10,1,9,2,12,7,3,11,8
>> >   Topic: md  Partition: 2  Leader: 11  Replicas: 11,2,12,9,1,7,3,10,8  Isr: 10,1,9,2,12,7,3,11,8
>> >   Topic: md  Partition: 3  Leader: 1   Replicas: 1,12,9,11,7,3,10,8,2  Isr: 10,1,9,2,12,7,3,11,8
>> >   Topic: md  Partition: 4  Leader: 7   Replicas: 7,9,11,1,3,10,8,2,12  Isr: 10,1,9,2,12,7,3,11,8
>> >   Topic: md  Partition: 5  Leader: 3   Replicas: 3,11,1,7,10,8,2,12,9  Isr: 10,1,9,2,12,7,3,11,8
>> >   Topic: md  Partition: 6  Leader: 10  Replicas: 10,1,7,3,8,2,12,9,11  Isr: 10,1,9,2,12,7,3,11,8
>> >   Topic: md  Partition: 7  Leader: 8   Replicas: 8,7,3,10,2,12,9,11,1  Isr: 10,1,9,2,12,7,3,11,8
>> >   Topic: md  Partition: 8  Leader: 2   Replicas: 2,3,10,8,12,9,11,1,7  Isr: 10,1,9,2,12,7,3,11,8
>> >
>> > However, I have a number of questions:
>> >
>> > 1. Does having 9 partitions with a replication factor of 9 make
>> >    sense here?
>> >
>> > 2. As I understand it, the parallelism is equal to the number of
>> >    partitions for a topic.
>> >
>> > 3.
Kafka only provides a total order over messages *within a
>> >    partition*, not between different partitions in a topic, and in
>> >    this case I have one topic.
>> >
>> > 4. Data within a partition will be stored in the order in which it
>> >    is written; therefore, data read from a partition will be read
>> >    back in order for that partition?
>> >
>> > 5. Finally, if I want to get messages in order across all 9
>> >    partitions, then I need to group messages with a key, so that
>> >    messages with the same key go to the same partition, and within
>> >    that partition the messages are ordered.
>> >
>> > Thanks
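Point 5 above matches how Kafka's default partitioner behaves: records with the same key deterministically hash to the same partition, so per-partition ordering becomes per-key ordering across the topic. A minimal sketch of the idea (plain Python; `partition_for` is a toy stand-in using a simple byte-wise hash, not Kafka's actual murmur2, so the partition numbers will not match what a real producer computes):

```python
def partition_for(key: bytes, num_partitions: int) -> int:
    """Toy stand-in for Kafka's default partitioner: maps a key to a
    partition deterministically, so the same key always lands in the
    same partition. Kafka itself uses murmur2 over the key bytes; any
    deterministic hash illustrates the ordering guarantee."""
    h = 0
    for b in key:  # simple deterministic byte-wise hash (not murmur2)
        h = (h * 31 + b) & 0x7FFFFFFF
    return h % num_partitions

NUM_PARTITIONS = 9  # as in the md topic above

# Every record keyed "GOOG" goes to one partition, so consumers see
# all "GOOG" updates in the order they were produced, regardless of
# what happens in the other eight partitions.
p1 = partition_for(b"GOOG", NUM_PARTITIONS)
p2 = partition_for(b"GOOG", NUM_PARTITIONS)
assert p1 == p2  # same key, same partition, hence ordered per key
```

Records with a null key, by contrast, are spread across partitions, which is why unkeyed topics give no cross-partition ordering.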