Hi, Joong,

Please check the following two links:

- https://cwiki.apache.org/confluence/display/KAFKA/KIP-3+-+Mirror+Maker+Enhancement
- https://cwiki.apache.org/confluence/display/KAFKA/KIP-8+-+Add+a+flush+method+to+the+producer+API

They might help you understand the problem.

Cheers,

Xiao Li

2015-05-01 6:28 GMT-07:00 Joe Stein <joe.st...@stealth.ly>:

> If you want 0 data loss you should also look into the min.insync.replicas
> setting in 0.8.2.1 as it guarantees data in multiple racks.
>
> If you don't have that set then you have this scenario as possible.
>
> Let's say 1 topic, 1 partition, replication 3. You are producing with ACK=-1.
>
> b1, b2, b3 (where b=broker and b1 is leader, b2, b3 replicas).
>
> b1, b2 die, b3 is leader. So far all is well.
>
> 10 minutes go by and b3 dies.
>
> 1 minute later b1 comes back online; it will truncate essentially 45
> minutes of data that upstream thought was saved.
>
> But now, you can have ACK=-1 get a failure if you don't have enough
> replicas to survive data loss guarantees. min.isr=2 or min.isr=3 // depends
> on data
>
> Also take a look at
> https://github.com/stealthly/go_kafka_client/tree/master/mirrormaker it
> might be helpful for what you are looking for.
>
> ~ Joe Stein
> - - - - - - - - - - - - - - - - -
>
> http://www.stealth.ly
> - - - - - - - - - - - - - - - - -
>
> On Fri, May 1, 2015 at 7:43 AM, Joong Lee <jo...@me.com> wrote:
>
> > It is based on our understanding from reading the documents.
> >
> > We aren't concerned about data duplication as that is going to be
> > handled by Elasticsearch.
> >
> > > On May 1, 2015, at 12:15 AM, Daniel Compton <daniel.compton.li...@gmail.com> wrote:
> > >
> > > When we evaluated MirrorMaker last year we didn't find any risk of data
> > > loss, only duplicate messages in the case of a network partition.
> > >
> > > Did you discover data loss in your tests, or were you just looking at
> > > the docs?
> > >
> > > On Fri, 1 May 2015 at 4:31 pm Jiangjie Qin <j...@linkedin.com.invalid>
> > > wrote:
> > >
> > >> Which mirror maker version did you look at? The MirrorMaker in trunk
> > >> should not have data loss if you just use the default setting.
> > >>
> > >>> On 4/30/15, 7:53 PM, "Joong Lee" <jo...@me.com> wrote:
> > >>>
> > >>> Hi,
> > >>> We are exploring Kafka to keep two data centers (primary and DR)
> > >>> running hosts of Elasticsearch nodes in sync. One key requirement is
> > >>> that we can't lose any data. We POC'd use of MirrorMaker and felt it
> > >>> may not meet our data loss requirement.
> > >>>
> > >>> I would like to ask the community if we should look for another
> > >>> solution or would Kafka be the right solution considering zero data
> > >>> loss requirement.
> > >>>
> > >>> Thanks
> > >>
> > >>
> > >
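
For anyone following the thread, the b1/b2/b3 scenario Joe describes can be walked through with a small toy model. This is only a sketch in Python, not Kafka code: the `Partition` class, its methods, and `run_scenario` are hypothetical names made up for illustration, messages stand in for the minutes of traffic, and the model assumes a stale replica is allowed to become leader when no in-sync replica is left (what Kafka calls unclean leader election).

```python
class Partition:
    """Toy model of one partition's replicas. Hypothetical, not a Kafka API."""

    def __init__(self, brokers, min_insync_replicas=1):
        self.logs = {b: [] for b in brokers}  # each broker's copy of the log
        self.isr = set(brokers)               # in-sync replica set
        self.leader = brokers[0]
        self.min_isr = min_insync_replicas

    def produce(self, msg):
        """Model ACK=-1: write to every in-sync replica, or reject the write."""
        if self.leader is None or len(self.isr) < self.min_isr:
            return False                      # producer sees a failure
        for b in self.isr:
            self.logs[b].append(msg)
        return True

    def kill(self, broker):
        self.isr.discard(broker)
        if self.leader == broker:
            # Prefer a surviving in-sync replica as the new leader.
            self.leader = min(self.isr) if self.isr else None

    def restart(self, broker):
        if self.leader is None:
            # Assumption: unclean election is allowed, so the stale
            # replica becomes leader and its log becomes the truth.
            self.leader = broker
            self.isr = {broker}
        else:
            # Follower catches up from the leader and rejoins the ISR.
            self.logs[broker] = list(self.logs[self.leader])
            self.isr.add(broker)


def run_scenario(min_insync_replicas):
    """Replay the scenario; return (acked messages, acked-but-lost messages)."""
    p = Partition(["b1", "b2", "b3"], min_insync_replicas)
    acked = []

    def send(msg):
        if p.produce(msg):
            acked.append(msg)

    send("m1")          # all three replicas in sync
    p.kill("b1")
    p.kill("b2")        # b3 is now leader and the only ISR member
    send("m2")
    send("m3")          # traffic while only b3 is alive
    p.kill("b3")
    p.restart("b1")     # b1 returns without m2 and m3
    survived = p.logs[p.leader]
    lost = [m for m in acked if m not in survived]
    return acked, lost


if __name__ == "__main__":
    for n in (1, 2):
        acked, lost = run_scenario(n)
        print(f"min_insync_replicas={n}: acked={acked} lost={lost}")
```

With `min_insync_replicas=1` the model acknowledges m2 and m3 and then loses them when b1 comes back. With `min_insync_replicas=2` those two writes are rejected up front, so the producer knows to retry and nothing that was acknowledged is lost. In a real cluster the corresponding settings are the producer's `acks` and the `min.insync.replicas` config Joe mentions; everything else here is a simplification.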