Re: options for kafka cluster backup?

Liam Clarke-Hutchinson Sun, 07 Mar 2021 22:13:34 -0800

As Ryanne said,

MM2 "syncs" offsets - in that it maintains a mapping of "cluster A offsets"
to "cluster B offsets" in cluster B, so if you have to move a consumer
group from A to B, you can relatively easily point the consumer group at
the offsets on B that map to its offsets on A.
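
To make the mapping idea concrete, here is a minimal sketch in Python. It
is a simplified model and not MM2's actual algorithm: the function name
translate_offset and the sample sync points are hypothetical, and it
assumes we already hold a sorted list of (offset on A, offset on B) pairs
for one topic-partition. It picks the nearest known pair at or before the
group's committed offset on A, so the consumer never skips records on B
but may re-read some.

```python
from bisect import bisect_right


def translate_offset(syncs, committed_upstream):
    """Map a committed offset on cluster A to a safe starting offset on B.

    syncs: list of (upstream_offset, downstream_offset) pairs for a single
    topic-partition, sorted by upstream_offset (hypothetical input shape).
    Returns None when no sync point at or before the committed offset exists.
    """
    upstream_points = [upstream for upstream, _ in syncs]
    # Index of the last sync point whose upstream offset <= committed offset.
    i = bisect_right(upstream_points, committed_upstream) - 1
    if i < 0:
        return None
    # Conservative choice: resume from the mapped sync point itself.
    return syncs[i][1]


# Hypothetical sync points: offset 100 on A corresponds to offset 40 on B, etc.
syncs = [(0, 0), (100, 40), (200, 140)]
print(translate_offset(syncs, 150))  # 40
```

In a real failover you would not hand-roll this. As far as I know, MM2's
client library has a helper for it (RemoteClusterUtils.translateOffsets),
which the sketch above does not call.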


On Mon, Mar 8, 2021 at 7:00 PM Ryanne Dolan <ryannedo...@gmail.com> wrote:

> MirrorMaker v1 does not sync offsets, but MM2 does!
>
> Ryanne
>
> On Sun, Mar 7, 2021, 10:02 PM Pushkar Deole <pdeole2...@gmail.com> wrote:
>
> > Thank you all!
> >
> > Blake, for your comment:
> >
> > It'll require having a HA cluster running in another region, of course.
> > One other caveat is that it doesn't preserve the offsets of the records
> >
> > -> I believe I can't afford to keep another cluster running due to cost
> > reasons. Can you elaborate on the offset part? If offsets are not
> > preserved, how does the backup cluster know where to start processing
> > for each topic?
> >
> > For example, you could use a Kafka Connect s3 sink. You'd have to write
> > some disaster-recovery code to restore lost data from s3 into Kafka.
> >
> > -> Again, the same question here: does s3 also store the offset for each
> > topic as it is modified in kafka? If not, then when the backup is
> > restored into the kafka cluster, how will it know where to process each
> > topic from?
> >
> > On Sat, Mar 6, 2021 at 4:44 PM Himanshu Shukla <
> > himanshushukla...@gmail.com>
> > wrote:
> >
> > > Hi Pushkar,
> > >
> > > You could also look at the available Kafka Connect plugins. Kafka
> > > Connect provides many connectors that can be used to move data into
> > > and out of Kafka.
> > >
> > > On Sat, Mar 6, 2021 at 10:18 AM Blake Miller <blak3mil...@gmail.com>
> > > wrote:
> > >
> > > > MirrorMaker is one reasonable way to do this. It can certainly
> > > > replicate to another region, with most of the latency being the
> > > > unavoidable kind, if you give it enough resources.
> > > >
> > > > It'll require having an HA cluster running in another region, of
> > > > course. One other caveat is that it doesn't preserve the offsets of
> > > > the records. That's probably okay for your use-case, but you should
> > > > be aware of it.
> > > >
> > > > Since what you want is a backup, there are many ways to do that
> > > > which might be cheaper than another Kafka cluster.
> > > >
> > > > For example, you could use a Kafka Connect s3 sink. You'd have to
> > > > write some disaster-recovery code to restore lost data from s3 into
> > > > Kafka.
> > > >
> > > > https://www.confluent.io/blog/apache-kafka-to-amazon-s3-exactly-once/
> > > >
> > > > There are many other sinks available, but s3 might be a reasonable
> > > > choice for backup. It's inexpensive and reliable.
> > > >
> > > > On Fri, Mar 5, 2021, 2:48 AM Pushkar Deole <pdeole2...@gmail.com>
> > > > wrote:
> > > >
> > > > > Yes, so my requirement is to have the data backed up or replicated
> > > > > in a different 'region' to cater for disaster scenarios and to
> > > > > recover from them.
> > > > >
> > > > > On Fri, Mar 5, 2021 at 3:01 PM Ran Lupovich <ranlupov...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > I guess that to avoid data loss you would need to use 3 replicas
> > > > > > with rack/site awareness across different racks or sites.
> > > > > > Confluent's Replicator or MirrorMaker are for copying data from
> > > > > > one cluster to another, usually in a different DC / region, if I
> > > > > > am not mistaken.
> > > > > >
> > > > > > On Fri, Mar 5, 2021, 11:21, Pushkar Deole <pdeole2...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks Luke. Is MirrorMaker asynchronous? What will be the
> > > > > > > typical lag between the replicated cluster and the running
> > > > > > > cluster, and in case of disaster, what are the chances of data
> > > > > > > loss?
> > > > > > >
> > > > > > > On Fri, Mar 5, 2021 at 11:37 AM Luke Chen <show...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Pushkar,
> > > > > > > > MirrorMaker is what you're looking for.
> > > > > > > > ref:
> > > > > > > > https://kafka.apache.org/documentation/#georeplication-mirrormaker
> > > > > > > >
> > > > > > > > Thanks.
> > > > > > > > Luke
> > > > > > > >
> > > > > > > > On Fri, Mar 5, 2021 at 1:50 PM Pushkar Deole
> > > > > > > > <pdeole2...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Hi All,
> > > > > > > > >
> > > > > > > > > I was looking for options to back up a running kafka
> > > > > > > > > cluster for disaster recovery requirements. Can someone
> > > > > > > > > describe the available options to back up and restore a
> > > > > > > > > running cluster in case the entire cluster goes down?
> > > > > > > > >
> > > > > > > > > Thanks..
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Regards,
> > > Himanshu Shukla
> > >
> >
>
