Iftach,

I think you should check whether this happens with other topics as well. Something unrelated may already have happened to the topic that currently holds ~3TB of data, which would make this even harder to troubleshoot.

I would recommend creating a new topic with only a few partitions and adding that topic to the whitelist. Then observe whether the same behavior occurs. If it does, something is probably wrong with MM2, likely a bug or a misconfiguration. If it does not, you can eliminate MM2 as the cause and work at a smaller scale to see whether something went south with the original topic. The cause may not even be related to MM2: for example, network failures could have forced MM2's internal producer to retry multiple times and hence produce more data than it should.
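
As a rough sanity check on the numbers in Iftach's original message, here is a minimal Python sketch of the size one might expect on the destination. It assumes a steady produce rate and that mirroring does not change the on-disk size of each record; both are assumptions, and the function name is made up for illustration. Kafka also deletes whole log segments rather than individual records, so real sizes will overshoot this estimate somewhat.

```python
def expected_size_tb(source_size_tb, source_retention_ms, dest_retention_ms):
    """Scale the source size by the ratio of the two retention windows.

    Assumes a steady produce rate and size-neutral mirroring (hypothetical helper).
    """
    return source_size_tb * dest_retention_ms / source_retention_ms


source_tb = 2.85    # midpoint of the reported 2.8-2.9TB on the source
observed_tb = 4.6   # reported size of the destination topic

# retention.ms=86400000 (24h) on the source, 57600000 (16h) on the destination
expected_tb = expected_size_tb(source_tb, 86_400_000, 57_600_000)
ratio = observed_tb / expected_tb

print(f"expected ~{expected_tb:.2f}TB, observed {observed_tb}TB, ratio {ratio:.2f}x")
```

Under those assumptions the destination would hold roughly two thirds of the source size, so the reported 4.6TB is well over twice the estimate. Running the same comparison on a small test topic should show quickly whether the gap reproduces.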

The bottom line is that some problems are hard, or sometimes impossible, to diagnose from a single case that might be an outlier.

-- Ricardo

On 7/1/20 10:02 AM, Iftach Ben-Yosef wrote:
Hi Ryanne, thanks for the quick reply.

I also thought it might be compression. I see that the topics have the
following config: "compression.type=producer". This is the case for both the
source and destination topics. Should I check something else regarding
compression?

Also, the destination topics are larger than the same topic being mirrored
using mm1 - the sum of the 3 topics mirrored by mm2 is much larger than the
1 topic that mm1 produced (they have the same 3 source topics, only mm1
aggregates to 1 destination topic). Retention is again the same between the
mm1 destination topic and the mm2 destination topics.

Thanks,
Iftach


On Wed, Jul 1, 2020 at 4:54 PM Ryanne Dolan <ryannedo...@gmail.com> wrote:

Iftach, is it possible the source topic is compressed?

Ryanne

On Wed, Jul 1, 2020, 8:39 AM Iftach Ben-Yosef <iben-yo...@outbrain.com> wrote:

Hello everyone.

I'm testing mm2 for our cross dc topic replication. We used to do it using
mm1 but faced various issues.

So far, mm2 is working well, but I have 1 issue which I can't really
explain; the destination topic is larger than the source topic.

For example, we have 1 topic which on the source cluster is around
2.8-2.9TB with retention.ms=86400000.

I added to our mm2 cluster the "sync.topic.configs.enabled=false" config,
and edited the retention.ms of the destination topic to be 57600000. Other
than that, I haven't touched the topic created by mm2 on the destination
cluster.

By logic I'd say that if I shortened the retention on the destination, the
topic size should decrease, but in practice, I see that it is larger than
the source topic (it's about 4.6TB).
This same behaviour is seen on all 3 topics which I am currently mirroring
(all 3 from different source clusters, into the same destination clusters).
Does anyone have any idea as to why mm2 acts this way for me?

Thanks,
Iftach

