Hi there,

Can anyone please answer my follow-up questions below on Ewen's responses?
Thanks

On Tue, 17 Jan 2017 at 00:28 Greenhorn Techie <greenhorntec...@gmail.com> wrote:

> Thanks Ewen for the detailed response. This is quite helpful and cleared
> some of my doubts. However, I do have some follow-up queries. Can you
> please let me know your thoughts on the same?
>
> [Query] Are non-compacted topics a prerequisite for this mechanism to work
> as expected? What are the challenges to look out for in the case of
> compacted topics?
>
> 1. Use MM normally to replicate your data. Be *very* sure you construct
> your setup to ensure *everything* is mirrored (proper # of partitions,
> replication factor, topic level configs, etc). (Note that this is something
> the Confluent replication solution is addressing that's a significant gap
> in MM.)
>
> [Query] Here we are planning to use MirrorMaker to do the job for us, and
> hence topics are expected to be created implicitly (by setting
> auto.create.topics.enable=true). Will this work, or will setting
> auto.create.topics.enable=true create the topics with default settings?
>
> 2. During replication, be sure to record offset deltas for every topic
> partition. These are needed to reverse the direction of replication
> correctly. Make sure to store them in the backup DC and somewhere very
> reliable.
>
> [Query] Is there any recommended approach for doing this? As I am new to
> Kafka, I am wondering if there is a good way of doing it.
>
> 4. Decide to do failover. Ensure replication has actually stopped (via
> your own tooling, or probably better, by using ACLs to ensure no new data
> can be produced from original DC to backup DC).
>
> [Query] Does stopping replication mean killing the MirrorMaker process, or
> is more needed here? Using ACLs, we can presumably ensure the MirrorMaker
> service account doesn't have read access on the source cluster or write
> access on the DR cluster. Is there anything else to be done here?
>
> 5. Record all the high watermarks for every topic partition so you know
> which data was replicated from the original DC (vs which is new after
> failover).
>
> [Query] Is there any best practice around this? In the presentation, Jun
> Rao talks about timestamp-based offset recording. As I understand it, that
> would probably help our case, where we could then produce messages to the
> DR cluster from the point of failover.
>
> 7. Once the original DC is back alive, you want to reverse replication
> and make it the backup. Look up the offset deltas, use them to initialize
> offsets for the consumer group you'll use to do replication.
>
> [Query] In order to look up the offset deltas before initiating the
> consumers on the original cluster, is there any recommended
> mechanism/tooling that can be leveraged?
>
> Best Regards
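For queries 2 and 7 above, this is roughly the kind of bookkeeping I have in
mind. It is only a sketch under my own assumptions: both clusters are
reachable from one place, the snapshot is taken at a moment when the mirror
is fully caught up, and the topic list, bootstrap addresses and the
persistence step are placeholders.

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class OffsetDeltaRecorder {

    // Offsets-only consumer; it never subscribes to topics or polls records.
    private static KafkaConsumer<byte[], byte[]> offsetsConsumer(String bootstrap) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrap);
        props.put("key.deserializer", ByteArrayDeserializer.class.getName());
        props.put("value.deserializer", ByteArrayDeserializer.class.getName());
        return new KafkaConsumer<>(props);
    }

    public static void main(String[] args) {
        List<String> topics = Arrays.asList("orders", "payments");          // placeholder topic list

        try (KafkaConsumer<byte[], byte[]> source = offsetsConsumer("site1-kafka:9092");   // placeholder
             KafkaConsumer<byte[], byte[]> backup = offsetsConsumer("site2-kafka:9092")) { // placeholder

            List<TopicPartition> partitions = new ArrayList<>();
            for (String topic : topics) {
                source.partitionsFor(topic)
                      .forEach(p -> partitions.add(new TopicPartition(p.topic(), p.partition())));
            }

            // Log-end offsets on both clusters; the per-partition difference is the delta,
            // which is only meaningful while MirrorMaker is fully caught up.
            Map<TopicPartition, Long> sourceEnd = source.endOffsets(partitions);
            Map<TopicPartition, Long> backupEnd = backup.endOffsets(partitions);

            for (TopicPartition tp : partitions) {
                long delta = sourceEnd.get(tp) - backupEnd.get(tp);
                // Persist (topic, partition, delta) somewhere durable in the backup DC,
                // e.g. a small compacted topic or an external store (placeholder).
                System.out.printf("%s delta=%d%n", tp, delta);
            }
        }
    }
}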
>
> On Fri, 6 Jan 2017 at 03:31 Ewen Cheslack-Postava <e...@confluent.io> wrote:
>
> > On Thu, Jan 5, 2017 at 3:07 AM, Greenhorn Techie <greenhorntec...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > We are planning to set up MirrorMaker-based Kafka replication for DR
> > > purposes. The base requirement is to have DR replication from the
> > > primary site (site1) to the DR site (site2) using MirrorMaker.
> > >
> > > However, we need the solution to work in case of failover as well,
> > > i.e. in the event of the site1 Kafka cluster failing, the site2 Kafka
> > > cluster would be made primary. Later, when the site1 cluster
> > > eventually comes back up online, the direction of replication would
> > > be from site2 to site1.
> > >
> > > But as I understand, the offsets on each of the clusters are
> > > different, so I am wondering how to design the solution given this
> > > constraint and these requirements.
> >
> > It turns out this is tricky. And once you start digging in you'll find
> > it's way more complicated than you might originally think.
> >
> > Before going down the rabbit hole, I'd suggest taking a look at this
> > great talk by Jun Rao (one of the original authors of Kafka) about
> > multi-DC Kafka setups: https://www.youtube.com/watch?v=Dvk0cwqGgws
> >
> > Additionally, I want to mention that while it is tempting to want to
> > treat multi-DC DR cases in a way that we get really convenient, strongly
> > consistent, highly available behavior, because that makes it easier to
> > reason about and avoids pushing much of the burden down to applications,
> > that's not realistic or practical. And honestly, it's rarely even
> > necessary. DR cases really are DR. Usually it is possible to make some
> > tradeoffs you might not make under normal circumstances (the most
> > important one being the tradeoff between possibly seeing duplicates vs
> > exactly once). The tension here is often that one team is responsible
> > for maintaining the infrastructure and handling this DR failover
> > scenario, and others are responsible for the behavior of the
> > applications. The infrastructure team is responsible for figuring out
> > the DR failover story, but if they don't solve it at the infrastructure
> > layer then they get stuck having to understand all the current (and
> > future) applications built on that infrastructure.
> >
> > That said, here are the details I think you're looking for:
> >
> > The short answer right now is that doing DR failover like that is not
> > going to be easy with MM. Confluent is building additional tools to deal
> > with multi-DC setups because of a bunch of these challenges:
> > https://www.confluent.io/product/multi-datacenter/
> >
> > For your specific concern about reversing the direction of replication,
> > you'd need to build additional tooling to support this. The basic list
> > of steps would be something like this (assuming non-compacted topics):
> >
> > 1. Use MM normally to replicate your data. Be *very* sure you construct
> > your setup to ensure *everything* is mirrored (proper # of partitions,
> > replication factor, topic level configs, etc). (Note that this is
> > something the Confluent replication solution is addressing that's a
> > significant gap in MM.)
> > 2. During replication, be sure to record offset deltas for every topic
> > partition. These are needed to reverse the direction of replication
> > correctly. Make sure to store them in the backup DC and somewhere very
> > reliable.
> > 3. Observe DC failure.
> > 4. Decide to do failover. Ensure replication has actually stopped (via
> > your own tooling, or probably better, by using ACLs to ensure no new
> > data can be produced from original DC to backup DC).
> > 5. Record all the high watermarks for every topic partition so you know
> > which data was replicated from the original DC (vs which is new after
> > failover).
> > 6. Allow failover to proceed. Make the backup DC primary.
> > 7. Once the original DC is back alive, you want to reverse replication
> > and make it the backup. Look up the offset deltas, use them to
> > initialize offsets for the consumer group you'll use to do replication.
> > 8. Go back to the original DC and make sure there isn't any "extra"
> > data, i.e. stuff that didn't get replicated but was successfully written
> > to the original DC's cluster. For topic partitions where there is data
> > beyond the expected offsets, you currently would need to just delete the
> > entire set of data, or at least to before the offset we expect to start
> > at. (A truncate operation might be a nice way to avoid having to dump
> > *all* the data, but doesn't currently exist.)
> > 9. Once you've got the two clusters back in a reasonably synced state
> > with appropriate starting offsets committed, start up MM again in the
> > reverse direction.
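On step 7 above, this is roughly how I picture seeding the replication
consumer group's starting offsets before MirrorMaker is restarted in the
reverse direction. Again only a sketch under my own assumptions: MirrorMaker
is run with the new Java consumer (so its offsets live in Kafka), and the
group id, partitions and offset values are placeholders; how the recorded
deltas are loaded and translated is left out.

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class SeedReplicationOffsets {

    public static void main(String[] args) {
        // Starting offsets in the new source (old DR) cluster, derived from the recorded
        // deltas / failover watermarks. Loading and translating them is a placeholder.
        Map<TopicPartition, Long> startingOffsets = new HashMap<>();
        startingOffsets.put(new TopicPartition("orders", 0), 123456L);      // placeholder value

        Properties props = new Properties();
        props.put("bootstrap.servers", "site2-kafka:9092");                 // placeholder: new source cluster
        props.put("group.id", "mm-site2-to-site1");                         // placeholder: MM's consumer group
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer", ByteArrayDeserializer.class.getName());
        props.put("value.deserializer", ByteArrayDeserializer.class.getName());

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.assign(startingOffsets.keySet());

            Map<TopicPartition, OffsetAndMetadata> commits = new HashMap<>();
            startingOffsets.forEach((tp, offset) -> commits.put(tp, new OffsetAndMetadata(offset)));

            // Commit the computed starting positions under the group id MirrorMaker will use,
            // so that when it is started in the reverse direction it resumes from them.
            consumer.commitSync(commits);
        }
    }
}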
> >
> > If this sounds tricky, it turns out that when you add compacted topics,
> > things get quite a bit messier....
> >
> > -Ewen
> >
> > > Thanks
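And on step 8, a minimal way I can think of to spot partitions holding
"extra" unreplicated data on the recovered cluster. This assumes the high
watermarks recorded in step 5 have already been translated back into the
original cluster's offset numbering using the step-2 deltas; the bootstrap
address and the hard-coded expected offsets are placeholders.

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class FindExtraData {

    public static void main(String[] args) {
        // Last offsets known to have been replicated, expressed in the original cluster's
        // numbering (step-5 watermarks translated via the step-2 deltas). Placeholder values.
        Map<TopicPartition, Long> expectedEnd = new HashMap<>();
        expectedEnd.put(new TopicPartition("orders", 0), 123456L);

        Properties props = new Properties();
        props.put("bootstrap.servers", "site1-kafka:9092");                 // placeholder: recovered cluster
        props.put("key.deserializer", ByteArrayDeserializer.class.getName());
        props.put("value.deserializer", ByteArrayDeserializer.class.getName());

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            Map<TopicPartition, Long> actualEnd = consumer.endOffsets(expectedEnd.keySet());

            for (Map.Entry<TopicPartition, Long> e : expectedEnd.entrySet()) {
                long extra = actualEnd.get(e.getKey()) - e.getValue();
                if (extra > 0) {
                    // These messages were never mirrored; per step 8 they have to be removed
                    // (e.g. by recreating the topic) before replication is reversed.
                    System.out.printf("%s: %d unreplicated messages%n", e.getKey(), extra);
                }
            }
        }
    }
}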