I tried again after increasing max.poll.interval.ms to a very high value
(12 hours), and since then replication has been running fine.
What I observed was that each time max.poll.interval.ms elapsed, the
consumer got stuck and replication stopped.
With a high value like 12 hours, it has worked fine for a few days.
It's not a full solution, but it is better than before.
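
For anyone trying the same workaround, here is a minimal sketch of the
override. It assumes the settings go into the consumer properties file that
is passed to MirrorMaker; the file path below is hypothetical, and
max.poll.records=250 is simply the value I had already tried. The 12 hours
has to be expressed in milliseconds:

```shell
#!/bin/sh
# Convert 12 hours to milliseconds, the unit max.poll.interval.ms expects.
hours=12
max_poll_interval_ms=$((hours * 60 * 60 * 1000))

# Hypothetical path; point this at the consumer config your MirrorMaker uses.
consumer_props="${TMPDIR:-/tmp}/mm-consumer-overrides.properties"

# Write the two consumer settings discussed in this thread.
cat > "$consumer_props" <<EOF
# Workaround only: a long interval hides a slow or hung processing loop.
max.poll.interval.ms=${max_poll_interval_ms}
max.poll.records=250
EOF

# Show the generated file for a quick sanity check.
cat "$consumer_props"
```

The consumer default for max.poll.interval.ms is 300000 (5 minutes), so this
raises the limit by a large factor and does not explain why the poll loop
stalls in the first place.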

On Wed, Feb 8, 2023, 23:48 Greg Harris <greg.har...@aiven.io.invalid> wrote:

> Arpit,
>
> I am not very familiar with MirrorMaker unfortunately so I won't be able to
> give you any specific advice.
> I also don't see any MirrorMaker-specific changes that would be relevant,
> except for some minor argument changes and the deprecation landing in 3.0.
>
> > It's very random. It replicates fine for a couple of hours and then
> > stops for a day.
>
> Hopefully logging will help you to understand why the replication flow
> starts and stops.
> Do you have any very long timeouts which would correspond to the day-long
> downtime?
> Looking at the MirrorMaker options, there's an `abort.on.send.failure`
> configuration that may cause MirrorMaker to fail fast in some cases,
> which would let you debug it and possibly auto-restart it.
>
> > Could you tell me how I can enable more logging?
>
> I believe you can configure the logging by changing the KAFKA_LOG4J_OPTS
> environment variable before running the mirror maker script.
> For example, you could copy and modify the existing tools config:
> https://github.com/apache/kafka/blob/trunk/config/tools-log4j.properties
> and provide the copy to the MirrorMaker tool with:
>
> export KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:/path/to/your/tools-log4j.properties"
>
> One additional thing you might be able to do if you catch it when it's
> stalled is to take a heap dump or a thread dump (stack traces) of the
> containing JVM.
> This will let you see what the process is doing, and maybe see if there are
> stuck threads, excess memory, or other things preventing the replication
> from progressing.
>
> Good luck with your investigation!
> Greg
>
>
> On Wed, Feb 8, 2023 at 2:20 PM Arpit Jain <jain.arp...@gmail.com> wrote:
>
> > Hi Greg,
> >
> > Thanks for getting back to me. Please find more details below:
> >
> > 1. Are you using MirrorMaker, or MirrorMaker 2.0?
> > Mirror maker
> > 2. What version of MM or MM2 are you using, and with what Kafka broker
> > version?
> > 3.2.3
> > 3. How is your replication flow configured?
> > We have upstream brokers (a 3-node Kafka cluster) and one Kafka consumer
> > for each lower environment, which produces messages to that lower
> > environment's Kafka cluster.
> > 4. What is the frequency and duration of these interruptions?
> > It's very random. It replicates fine for a couple of hours and then
> > stops for a day.
> > 5. When did the interruptions start?
> > Not sure about that. It could be after we moved to 3.2.3
> > 6. Has anything changed in your environment recently, such as new
> > partitions or an upgrade?
> > No
> > 7. Are you seeing any ERROR logs or other unique logs from the
> > replication flow?
> > Only the warning to increase max.poll.interval.ms or decrease
> > max.poll.records.
> > 8. Have you tried enabling more detailed logs and watched the progress of
> > the replication flow around the time it stops replicating?
> > Could you tell me how I can enable more logging?
> >
> > Thanks,
> > Arpit
> >
> > On Wed, Feb 8, 2023, 18:16 Greg Harris <greg.har...@aiven.io.invalid>
> > wrote:
> >
> > > Arpit,
> > >
> > > Unfortunately from that description nothing specific is coming to mind.
> > > The max.poll.interval indicates that the consumer is losing contact
> > > with the Kafka cluster, but that may be caused by the replication
> > > application hanging somewhere else.
> > >
> > > Some clarifying questions, and things you can look into:
> > > 1. Are you using MirrorMaker, or MirrorMaker 2.0?
> > > 2. What version of MM or MM2 are you using, and with what Kafka broker
> > > version?
> > > 3. How is your replication flow configured?
> > > 4. What is the frequency and duration of these interruptions?
> > > 5. When did the interruptions start?
> > > 6. Has anything changed in your environment recently, such as new
> > > partitions or an upgrade?
> > > 7. Are you seeing any ERROR logs or other unique logs from the
> > > replication flow?
> > > 8. Have you tried enabling more detailed logs and watched the progress
> > > of the replication flow around the time it stops replicating?
> > >
> > > Thanks,
> > > Greg Harris
> > >
> > >
> > > On Tue, Feb 7, 2023 at 6:27 AM Arpit Jain <jain.arp...@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I hope this is the right forum for Kafka MirrorMaker issues.
> > > > We are facing an issue where MirrorMaker replicates the trades, then
> > > > stops working for a long time, and then replicates again.
> > > > We are also seeing the warning message to increase the poll interval
> > > > or decrease the maximum batch size (max.poll.records).
> > > >
> > > > I have tried reducing max.poll.records to 250, but the issue persists.
> > > >
> > > > Could anyone suggest what could be wrong?
> > > >
> > > > Thanks
> > > >
> > >
> >
>
