[ https://issues.apache.org/jira/browse/KAFKA-8796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907672#comment-16907672 ]

David Jacot commented on KAFKA-8796:
------------------------------------

[~rmarou] Have you tried throttling the replication traffic? 
There is documentation available here: 
[https://kafka.apache.org/documentation/#rep-throttle]
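
For reference, the throttle can be set dynamically with kafka-configs.sh. A minimal sketch, assuming ZooKeeper at localhost:2181 and the broker.id=11 from the config below; the ~50 MB/s rate and the topic name my-topic are illustrative placeholders, not recommendations:

{code}
# Cap replication traffic on broker 11 (leader and follower side) to ~50 MB/s.
bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
  --entity-type brokers --entity-name 11 \
  --add-config 'leader.replication.throttled.rate=50000000,follower.replication.throttled.rate=50000000'

# Mark which replicas the throttle applies to; '*' covers all replicas of the topic.
bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
  --entity-type topics --entity-name my-topic \
  --add-config 'leader.replication.throttled.replicas=*,follower.replication.throttled.replicas=*'

# Remove the throttle once the replaced broker is back in sync, otherwise it
# keeps limiting replication indefinitely.
bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
  --entity-type brokers --entity-name 11 \
  --delete-config 'leader.replication.throttled.rate,follower.replication.throttled.rate'
{code}

Pick a rate that leaves the replacement broker enough headroom to catch up while keeping produce/fetch latency acceptable; the doc above discusses the trade-offs.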

> A broker joining the cluster should be able to replicate without impacting 
> the cluster
> --------------------------------------------------------------------------------------
>
>                 Key: KAFKA-8796
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8796
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: Marouane RAJI
>            Priority: Major
>         Attachments: image-2019-08-13-10-26-19-282.png, 
> image-2019-08-13-10-28-42-337.png
>
>
> Hi, 
> We run a cluster of 50 brokers on AWS, peaking at 1.4M msgs/sec. We were using 
> m4.2xlarge instances and are now moving to m5.2xlarge. Every time we replace a 
> broker from scratch (the EBS volumes are tied to the EC2 instance..), the bytes 
> sent to the replaced broker increase significantly, and that seems to impact 
> the cluster, increasing produce and fetch times.
> This is our configuration per broker:
>  
>  
> {code:java}
> broker.id=11
> ############################# Socket Server Settings 
> #############################
> # The port the socket server listens on
> port=9092
> advertised.host.name=ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com
> # The number of threads handling network requests
> num.network.threads=32
> # The number of threads doing disk I/O
> num.io.threads=16
> # Socket server settings
> socket.receive.buffer.bytes=1048576 
> socket.request.max.bytes=104857600 
> # The max time a connection can be idle 
> connections.max.idle.ms=60000 
> num.partitions=2 
> default.replication.factor=2 
> auto.leader.rebalance.enable=true 
> delete.topic.enable=true 
> compression.type=producer 
> log.message.format.version=0.9.0.1
> message.max.bytes=8000000 
> # The minimum age of a log file to be eligible for deletion 
> log.retention.hours=48 
> log.retention.bytes=3000000000 
> log.segment.bytes=268435456 
> log.retention.check.interval.ms=60000  
> log.cleaner.enable=true 
> log.cleaner.dedupe.buffer.size=268435456
> replica.fetch.max.bytes=8388608 
> replica.fetch.wait.max.ms=500 
> replica.lag.time.max.ms=10000 
> num.replica.fetchers = 3 
> # Auto creation of topics on the server 
> auto.create.topics.enable=true 
> controlled.shutdown.enable=true 
> inter.broker.protocol.version=0.10.2 
> unclean.leader.election.enable=true
> {code}
>  
> This is what we notice on replication: a high increase in bytes received on the 
> replaced broker.
>  
> !image-2019-08-13-10-26-19-282.png!
> !image-2019-08-13-10-28-42-337.png!
> You can't see it in the graphs above, but the increase in produce time stayed 
> high for 20 minutes.
> We didn't see anything out of the ordinary in the logs.
> Please let us know if there is anything wrong in our config or if this is a 
> potential issue in Kafka that needs fixing. 
> Thanks.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
