When you state the numbers, are they the same across the instances in the
cluster, meaning that Topic-0 would have 910*5 GB in total in the source
cluster and 25*5 GB in total in the target cluster?
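
If you want to double-check, something like this on each broker would give
the per-partition-directory size (just a rough sketch, assuming the same
data dir layout as in your example):

    du -sh ~/data/Topic-0
    # sum the result across the 5 brokers in each cluster and compare totals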

Another possibility is that MirrorMaker uses compression on the producer
side, but I would be surprised if the compression ratio could be 25/910.
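
If compression were the cause, it would show up in the MirrorMaker producer
config. A rough sketch of what to look for (assuming 0.8-era producer
properties; the exact property name may differ in your version):

    # producer.properties passed to MirrorMaker via --producer.config
    compression.codec=gzip   # none | gzip | snappy

If the target producer compresses but the source data was written
uncompressed, the on-disk sizes would differ, though I doubt that alone
explains 25/910.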

Guozhang


On Thu, Aug 22, 2013 at 3:48 PM, Rajasekar Elango <rela...@salesforce.com> wrote:

> Yes, both source and target clusters have 5 brokers in cluster.
>
> Sent from my iPhone
>
> On Aug 22, 2013, at 6:11 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>
> > Hello Rajasekar,
> >
> > Are the size of the source cluster and target cluster the same?
> >
> > Guozhang
> >
> >
> > On Thu, Aug 22, 2013 at 2:14 PM, Rajasekar Elango <rela...@salesforce.com> wrote:
> >
> >> Hi,
> >>
> >> We are using MirrorMaker to replicate data between two Kafka clusters. I am
> >> seeing a huge difference in the size of the log data dir between a broker in
> >> the source cluster and a broker in the destination cluster:
> >>
> >> For example, ~/data/Topic-0/ is about 910 G on the source broker, but it's
> >> only 25 G on the destination broker. Segment log files (~500 M) are created
> >> roughly every 2 or 3 mins on the source brokers, but only about every 25
> >> mins on the destination broker.
> >>
> >> I verified MirrorMaker is doing fine using the consumer offset checker: not
> >> much lag, and offsets are incrementing. I also verified that topics/partitions
> >> are not under-replicated in either the source or the target cluster. What is
> >> the reason for this difference in disk usage?
> >>
> >>
> >> --
> >> Thanks,
> >> Raja.
> >
> >
> >
> > --
> > -- Guozhang
>



-- 
-- Guozhang
