We've been stable all weekend with the following settings:

ExecStart=/usr/bin/kafka-mirror-maker --abort.on.send.failure true --new.consumer --num.streams 6 --offset.commit.interval.ms 60000 --consumer.config /etc/kafka/mirrormaker/telem_mm/consumer.properties --producer.config /etc/kafka/mirrormaker/telem_mm/producer.properties --whitelist

Consumer properties:
bootstrap
session.timeout.ms=55000
heartbeat.interval.ms=15000
request.timeout.ms=60000

Producer properties:
Bootstrap

Any other combination of compression/buffer memory/linger/etc. on the 0.9 producer producing to 0.11/1.0 wasn't reliable - it might work for an hour and then die, or it might never work. Once I landed on stable producer settings (which were just the defaults), the consumer started timing out on heartbeats (because again, 0.9), so I had to increase the heartbeat, session and request timeouts to stabilize the consumer group.

Fortunately, our target cluster for most of our mirrormakers is the last one we will upgrade to 1.x, at which point we can just upgrade the mirrormakers to 1.x as well.

On 4/6/18, 1:09 PM, "Jeff Field" <jvfi...@blizzard.com> wrote:

I'm hitting the same problem, even with the new consumer, on MirrorMaker 0.9 reading from a 0.9 Kafka cluster and producing to a 0.11 Kafka cluster.

On 3/30/18, 3:56 PM, "Andrew Otto" <o...@wikimedia.org> wrote:

I’m currently stuck on MirrorMaker version 0.9, and I’m not sure when the new consumer client became the default. Does your 0.10 version have a --new.consumer option listed in the help message? If so, then the new consumer client is not the default.

I haven’t seen the problem you are describing (I’m still having plenty of others though) since I’ve switched to using the new consumer.

Another thought, what is the value of your partition.assignment.strategy? I’ve found round robin (default in later versions of MirrorMaker) to be a lot more consistent than whatever the default is in 0.9. Not sure what the default in 0.10 is.

On Fri, Mar 30, 2018 at 11:40 AM, Siva A <siva9940261...@gmail.com> wrote:

> Any other update on this?
>
> On Mon, Mar 26, 2018, 7:42 PM Andrew Otto <o...@wikimedia.org> wrote:
>
> > I’ve had similar problems, but I don’t have an explanation for ya :/
> >
> > On Sun, Mar 25, 2018 at 12:19 PM, Siva A <siva9940261...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > We have 3 nodes Kafka cluster(0.10.0.1) and its mirroring the data from
> > > another 3 node cluster of same Kafka version.
> > > Both the clusters are Kerberized and we are running the Mirrormaker on the
> > > target cluster using the single principal/keytab with the one way trust on
> > > the KDC.
> > >
> > > At times, the mirrormaker stops functioning(Doesn't mirror the data) but
> > > the process is still running. If we restart the service then it works fine
> > > for a day or so.
> > >
> > > I don't see any error on the Kafka logs as well.
> > > Is there anyone seen this kind of issue?
> > >
> > > Thanks
> > > Siva
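
For anyone adjusting the consumer timeouts quoted at the top of this thread, here is a rough sanity check in Python. It is only a sketch: the function name check_consumer_timeouts is made up for illustration, and the two rules it encodes are my reading of the Kafka consumer configuration docs (heartbeat.interval.ms should be lower than session.timeout.ms and is typically kept at or below a third of it, and request.timeout.ms should be larger than session.timeout.ms). It does not look at broker-side limits on session timeouts.

```python
def check_consumer_timeouts(props):
    """Return a list of warnings for a dict of consumer timeout settings (in ms).

    Hypothetical helper for illustration; not part of Kafka or MirrorMaker.
    """
    session = int(props["session.timeout.ms"])
    heartbeat = int(props["heartbeat.interval.ms"])
    request = int(props["request.timeout.ms"])

    warnings = []
    if heartbeat >= session:
        warnings.append("heartbeat.interval.ms must be lower than session.timeout.ms")
    elif heartbeat * 3 > session:
        warnings.append("heartbeat.interval.ms is above 1/3 of session.timeout.ms")
    if request <= session:
        warnings.append("request.timeout.ms should be greater than session.timeout.ms")
    return warnings


# The values from the consumer properties at the top of this thread.
settings = {
    "session.timeout.ms": "55000",
    "heartbeat.interval.ms": "15000",
    "request.timeout.ms": "60000",
}

for warning in check_consumer_timeouts(settings):
    print(warning)
```

With the three values from the message above, neither rule is violated, so nothing is printed.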