Hi, I am trying to tune mirrormaker configurations based on this doc <https://cwiki.apache.org/confluence/display/KAFKA/Kafka+mirroring+(MirrorMaker)#Kafkamirroring%28MirrorMaker%29-Consumerandsourceclustersocketbuffersizes> and would like to know your recommendations.

Our configuration: We are doing inter-datacenter replication with 5 brokers in the source and destination DCs and 2 mirrormakers doing the replication. We have about 4 topics with 4 partitions each. I have been using ConsumerOffsetChecker to analyze lag after each tuning change.

1. num.streams: We have set num.streams=2 so that the 4 partitions are shared between the 2 mirrormakers. Increasing num.streams beyond this did not improve performance. Is this expected?

2. num.producers: We initially set num.producers=4 (assuming one producer thread per topic), then bumped it to num.producers=16, but did not see any improvement in performance. Is this expected? How do we determine the optimum value for num.producers?

3. socket.buffersize: We initially had the default values for these. I then changed socket.send.buffer.bytes on the source brokers; socket.receive.buffer.bytes and fetch.message.max.bytes in the mirrormaker consumer properties; and socket.receive.buffer.bytes and socket.request.max.bytes on the destination brokers, all to 1024*1024*1024 (1073741824). This did improve performance, but I could not get the lag below 100.
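
To put the 1024*1024*1024 value in context, here is a rough sketch (Python) of the usual TCP rule of thumb that a socket buffer needs to hold about one bandwidth-delay product to keep a single connection busy. This rule is general TCP guidance, not something taken from the doc linked above, and the link speed and round-trip time below are made-up placeholders, not measurements from our DCs.

```python
def bandwidth_delay_product_bytes(bandwidth_bits_per_sec, rtt_ms):
    """Bytes that can be in flight on one TCP connection at full link rate.

    Integer arithmetic is used so the result is exact.
    """
    return bandwidth_bits_per_sec * rtt_ms // (8 * 1000)


# Hypothetical inter-DC link: 1 Gbit/s with an 80 ms round trip (assumptions).
LINK_BITS_PER_SEC = 1_000_000_000
RTT_MS = 80

# Value we set for the socket buffer and fetch size settings.
CONFIGURED_BYTES = 1024 * 1024 * 1024

if __name__ == "__main__":
    bdp = bandwidth_delay_product_bytes(LINK_BITS_PER_SEC, RTT_MS)
    print("bandwidth-delay product:", bdp, "bytes")
    print("configured buffer size: ", CONFIGURED_BYTES, "bytes")
    print("configured / BDP ratio:  %.1f" % (CONFIGURED_BYTES / bdp))
```

With our real link speed and RTT plugged in, this would show whether the configured size is already far above what one connection can use, in which case the remaining lag probably has to come from somewhere other than the socket buffers.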

Here is how our lag looks after the above changes:

Group            Topic             Pid  Offset      logSize     Lag    Owner
mirrormakerProd  FunnelProto       0    554704539   554717088   12549  mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1377192412490-38a53dc9-0
mirrormakerProd  FunnelProto       1    547370573   547383136   12563  mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1377192412490-38a53dc9-1
mirrormakerProd  FunnelProto       2    553124930   553125742   812    mirrormakerProd_ops-mmrs1-2-asg.ops.sfdc.net-1377193322178-7262ed87-0
mirrormakerProd  FunnelProto       3    552990834   552991650   816    mirrormakerProd_ops-mmrs1-2-asg.ops.sfdc.net-1377193322178-7262ed87-1
mirrormakerProd  agent             0    35438       35440       2      mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1377192412490-38a53dc9-0
mirrormakerProd  agent             1    35447       35448       1      mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1377192412490-38a53dc9-1
mirrormakerProd  agent             2    35375       35375       0      mirrormakerProd_ops-mmrs1-2-asg.ops.sfdc.net-1377193322178-7262ed87-0
mirrormakerProd  agent             3    35336       35336       0      mirrormakerProd_ops-mmrs1-2-asg.ops.sfdc.net-1377193322178-7262ed87-1
mirrormakerProd  internal_metrics  0    1930852823  1930917418  64595  mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1377192412490-38a53dc9-0
mirrormakerProd  internal_metrics  1    1937237324  1937301841  64517  mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1377192412490-38a53dc9-1
mirrormakerProd  internal_metrics  2    1945894901  1945904067  9166   mirrormakerProd_ops-mmrs1-2-asg.ops.sfdc.net-1377193322178-7262ed87-0
mirrormakerProd  internal_metrics  3    1946906932  1946915928  8996   mirrormakerProd_ops-mmrs1-2-asg.ops.sfdc.net-1377193322178-7262ed87-1
mirrormakerProd  jmx               0    485270038   485280882   10844  mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1377192412490-38a53dc9-0
mirrormakerProd  jmx               1    486363914   486374759   10845  mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1377192412490-38a53dc9-1
mirrormakerProd  jmx               2    491783842   491784826   984    mirrormakerProd_ops-mmrs1-2-asg.ops.sfdc.net-1377193322178-7262ed87-0
mirrormakerProd  jmx               3    485675629   485676643   1014   mirrormakerProd_ops-mmrs1-2-asg.ops.sfdc.net-1377193322178-7262ed87-1

In the mirrormaker logs, I see that topic metadata is fetched every 10 minutes and the producer connections are re-established for producing. Is this normal? If it is continuously producing, why does it need to reconnect to the destination brokers?

What else can we tune to bring the lag below 100? This is just a small set of data we are currently testing with; the real production traffic will be very large. How can we compute the optimum configuration as data traffic increases?

Thanks for the help,
Raja.

On Thu, Aug 22, 2013 at 4:44 PM, Rajasekar Elango <rela...@salesforce.com> wrote:

> I am trying to tune mirrormaker configurations based on this doc
> <https://cwiki.apache.org/confluence/display/KAFKA/Kafka+mirroring+(MirrorMaker)#Kafkamirroring%28MirrorMaker%29-Consumerandsourceclustersocketbuffersizes>
> and would like to know your recommendations.
> [...]
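
For reference, the lag figures above can be totalled per mirrormaker host and per topic with a short script. This is only a sketch in Python: it recomputes lag as logSize minus offset, and it assumes the Owner column always has the shape <group>_<host>-<timestamp>-<id>-<thread>, which is inferred from the rows above rather than from any documented format.

```python
from collections import defaultdict

# Rows copied from the ConsumerOffsetChecker output above.
# Columns: group topic partition offset logSize lag owner
ROWS = """\
mirrormakerProd FunnelProto 0 554704539 554717088 12549 mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1377192412490-38a53dc9-0
mirrormakerProd FunnelProto 1 547370573 547383136 12563 mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1377192412490-38a53dc9-1
mirrormakerProd FunnelProto 2 553124930 553125742 812 mirrormakerProd_ops-mmrs1-2-asg.ops.sfdc.net-1377193322178-7262ed87-0
mirrormakerProd FunnelProto 3 552990834 552991650 816 mirrormakerProd_ops-mmrs1-2-asg.ops.sfdc.net-1377193322178-7262ed87-1
mirrormakerProd agent 0 35438 35440 2 mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1377192412490-38a53dc9-0
mirrormakerProd agent 1 35447 35448 1 mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1377192412490-38a53dc9-1
mirrormakerProd agent 2 35375 35375 0 mirrormakerProd_ops-mmrs1-2-asg.ops.sfdc.net-1377193322178-7262ed87-0
mirrormakerProd agent 3 35336 35336 0 mirrormakerProd_ops-mmrs1-2-asg.ops.sfdc.net-1377193322178-7262ed87-1
mirrormakerProd internal_metrics 0 1930852823 1930917418 64595 mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1377192412490-38a53dc9-0
mirrormakerProd internal_metrics 1 1937237324 1937301841 64517 mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1377192412490-38a53dc9-1
mirrormakerProd internal_metrics 2 1945894901 1945904067 9166 mirrormakerProd_ops-mmrs1-2-asg.ops.sfdc.net-1377193322178-7262ed87-0
mirrormakerProd internal_metrics 3 1946906932 1946915928 8996 mirrormakerProd_ops-mmrs1-2-asg.ops.sfdc.net-1377193322178-7262ed87-1
mirrormakerProd jmx 0 485270038 485280882 10844 mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1377192412490-38a53dc9-0
mirrormakerProd jmx 1 486363914 486374759 10845 mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1377192412490-38a53dc9-1
mirrormakerProd jmx 2 491783842 491784826 984 mirrormakerProd_ops-mmrs1-2-asg.ops.sfdc.net-1377193322178-7262ed87-0
mirrormakerProd jmx 3 485675629 485676643 1014 mirrormakerProd_ops-mmrs1-2-asg.ops.sfdc.net-1377193322178-7262ed87-1
"""


def host_of(fields):
    """Extract the host from the owner id (format assumed from the sample rows)."""
    owner = fields[6]
    # Drop the "<group>_" prefix, then the trailing -<timestamp>-<id>-<thread>.
    return owner.split("_", 1)[1].rsplit("-", 3)[0]


def topic_of(fields):
    return fields[1]


def lag_by(rows, key):
    """Sum (logSize - offset) over all rows, grouped by key(fields)."""
    totals = defaultdict(int)
    for line in rows.strip().splitlines():
        fields = line.split()
        offset, log_size = int(fields[3]), int(fields[4])
        totals[key(fields)] += log_size - offset
    return dict(totals)


if __name__ == "__main__":
    for label, key in (("host", host_of), ("topic", topic_of)):
        print("lag by", label)
        for name, lag in sorted(lag_by(ROWS, key).items()):
            print("  %-32s %d" % (name, lag))
```

On these rows the partitions owned by ops-mmrs1-1-asg.ops.sfdc.net add up to a lag of 175916, against 21788 for ops-mmrs1-2-asg.ops.sfdc.net, so most of the lag sits with one of the two mirrormakers.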