Iftach,
This is a very useful finding. While I don't know the answer to your
question below, I would like to take this opportunity to encourage you
to write a blog about this finding =)
Thanks,
-- Ricardo
On 7/7/20 2:48 AM, Iftach Ben-Yosef wrote:
I believe I got it to work with
"source->dest.producer.compression.type = gzip"
Is there a way to set this globally for the mm2 process and not to do
it per mirroring flow?
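For reference, a minimal connect-mirror-maker.properties sketch of the per-flow form that worked, repeated for each mirroring flow as a stop-gap while a confirmed global/top-level form is unknown (the 'other' cluster alias and the topic names below are hypothetical, added only to illustrate multiple flows):

    # cluster aliases
    clusters = source, dest, other

    # flow 1
    source->dest.enabled = true
    source->dest.topics = some-topic                  # hypothetical topic name
    source->dest.producer.compression.type = gzip

    # flow 2: the same producer override is repeated per flow
    other->dest.enabled = true
    other->dest.topics = another-topic                # hypothetical topic name
    other->dest.producer.compression.type = gzip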
Thanks,
Iftach
On Tue, Jul 7, 2020 at 9:34 AM Iftach Ben-Yosef
<iben-yo...@outbrain.com> wrote:
Upon further investigation, the issue is indeed compression, as
in the logs I see 'compression.type = none'.
Does anyone know how to configure gzip compression for
the connect-mirror-maker.properties file?
I tried "producer.override.compression.type = gzip" but that
doesnt seem to work.
Thanks,
Iftach
On Mon, Jul 6, 2020 at 8:03 AM Iftach Ben-Yosef
<iben-yo...@outbrain.com> wrote:
Ricardo,
Thanks for the reply. I did some more testing. I tried
mirroring a different topic from 1 of the 3 source clusters
used in the previous test, into the same destination
cluster. Again, the resulting topic on the dest cluster is about
2 times larger than the source, with the same config and retention
(both have compression.type=producer).
Regarding my configuration, other than the clusters and
mirroring direction/topic whitelist configs I have the
following (I changed all the prefixes to '..' to make it shorter):
..tasks.max = 128
..fetch.max.wait.ms = 150
..fetch.min.bytes = 10485760
..fetch.max.bytes = 52428800
..max.request.size = 10485760
..enable.idempotence = true
..sync.topic.configs.enabled=false (I tried this as both true
and false)
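One way the shortened list above might expand with full prefixes, assuming a single source->dest flow and assuming the fetch.* settings are consumer overrides while max.request.size and enable.idempotence are producer overrides (both assumptions, since the actual prefixes were shortened out above):

    source->dest.tasks.max = 128
    # consumer-side overrides for the flow (assumed placement)
    source->dest.consumer.fetch.max.wait.ms = 150
    source->dest.consumer.fetch.min.bytes = 10485760
    source->dest.consumer.fetch.max.bytes = 52428800
    # producer-side overrides for the flow (assumed placement)
    source->dest.producer.max.request.size = 10485760
    source->dest.producer.enable.idempotence = true
    # topic-config syncing toggled during testing
    source->dest.sync.topic.configs.enabled = false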
I don't see how anything other than perhaps the idempotence
setting could affect the topic size. I have also tried without
the idempotence config, but it looks the same - and in any case
I'd expect idempotence to decrease the topic size if anything,
not increase it...
Thanks,
Iftach
On Thu, Jul 2, 2020 at 5:30 PM Ricardo Ferreira
<rifer...@riferrei.com> wrote:
Iftach,
I think you should try to observe whether this happens with other
topics. Maybe something unrelated has already happened
in the case of the topic that currently has ~3TB
of data -- making things even harder to troubleshoot.
I would recommend creating a new topic with a few partitions
and configuring that topic in the whitelist. Then observe
whether the same behavior occurs. If it does, then it might be
something wrong with MM2 -- likely a bug or
misconfiguration. If not, then you can eliminate MM2 as the
cause and work at a smaller scale to see if something went
south with the topic. Maybe it is something not
even related to MM2, such as network failures that forced
the internal producer of MM2 to retry multiple times and
hence produce more data than it should.
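A rough sketch of what such an isolated test flow could look like in connect-mirror-maker.properties (the cluster aliases and the topic name are hypothetical; the small test topic itself would be created on the source cluster beforehand):

    clusters = source, dest
    source->dest.enabled = true
    # mirror only the freshly created, small test topic (hypothetical name)
    source->dest.topics = mm2-size-test
    # keep everything else at defaults so the size comparison stays clean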
The bottom line is that certain troubleshooting exercises
are hard or sometimes impossible to diagnose when the case
at hand might have been an outlier.
-- Ricardo
On 7/1/20 10:02 AM, Iftach Ben-Yosef wrote:
Hi Ryanne, thanks for the quick reply.
I had the thought it might be compression. I see that the topics
have the following config: "compression.type=producer". This is for
both the source and destination topics. Should I check something
else regarding compression?
Also, the destination topics are larger than the same topics being
mirrored using mm1 - the sum of the 3 topics mirrored by mm2 is much
larger than the 1 topic that mm1 produced (they have the same 3
source topics; mm1 just aggregates them into 1 destination topic).
Retention is again the same between the mm1 destination topic and
the mm2 destination topics.
Thanks,
Iftach
On Wed, Jul 1, 2020 at 4:54 PM Ryanne Dolan
<ryannedo...@gmail.com> wrote:
Iftach, is it possible the source topic is compressed?
Ryanne
On Wed, Jul 1, 2020, 8:39 AM Iftach Ben-Yosef
<iben-yo...@outbrain.com> wrote:
Hello everyone.
I'm testing mm2 for our cross-dc topic replication. We used to do it
using mm1 but faced various issues.
So far, mm2 is working well, but I have 1 issue which I can't really
explain: the destination topic is larger than the source topic.
For example, we have 1 topic which on the source cluster is around
2.8-2.9TB with retention.ms=86400000.
I added the "sync.topic.configs.enabled=false" config to our mm2
cluster, and edited the retention.ms of the destination topic to be
57600000. Other than that, I haven't touched the topic created by mm2
on the destination cluster.
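As a sketch, the relevant pieces of that setup might look like this (the cluster aliases and topic name are hypothetical; the retention values are the ones quoted above, with the destination topic's retention.ms edited directly on the destination cluster rather than in this file):

    clusters = source, dest
    source->dest.enabled = true
    source->dest.topics = big-topic                   # hypothetical name for the ~2.8-2.9TB topic
    # stop MM2 from re-syncing topic configs, so the destination retention override sticks
    source->dest.sync.topic.configs.enabled = false
    # source topic:      retention.ms = 86400000  (set on the source cluster)
    # destination topic: retention.ms = 57600000  (edited manually on the destination cluster)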
By logic I'd say that if I shortened the retention on the
destination, the topic size should decrease, but in practice I see
that it is larger than the source topic (it's about 4.6TB).
This same behaviour is seen on all 3 topics which I am currently
mirroring (all 3 from different source clusters, into the same
destination cluster).
Does anyone have any idea as to why mm2 acts this way for me?
Thanks,
Iftach