I agree with you. If we include that knob, applications can choose their consistency vs. availability tradeoff according to their respective requirements. I will file a JIRA for this.
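To make the idea concrete, the knob I have in mind is a topic-level override of a broker-wide default. The property name below is only illustrative; nothing like it exists today, and the exact name and semantics would be worked out in the JIRA:

    # broker-wide default: keep today's behavior, i.e. prefer availability
    # and allow a replica that is not in the ISR to become leader
    unclean.leader.election.enable=true

    # hypothetical per-topic override for topics that prefer consistency:
    # the controller would mark the partition offline and reject writes
    # rather than elect a replica that is not in the ISR
    unclean.leader.election.enable=false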
Thanks,
Neha

On Thu, Aug 22, 2013 at 2:10 PM, Scott Clasen <sc...@heroku.com> wrote:

> +1 for that knob on a per-topic basis; choosing consistency over
> availability would open Kafka to more use cases, no?
>
> Sent from my iPhone
>
> On Aug 22, 2013, at 1:59 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>
> > Scott,
> >
> > Kafka replication aims to guarantee that committed writes are not lost.
> > In other words, as long as the leader can be transitioned to a broker
> > that was in the ISR, no data will be lost. For increased availability,
> > if there are no other brokers in the ISR, we fall back to electing a
> > broker that is not caught up with the current leader as the new leader.
> > IMO, this is the real problem that the post is complaining about.
> >
> > Let me explain his test in more detail:
> >
> > 1. The first part of the test partitions the leader (n1) from the other
> > brokers (n2-n5). The leader shrinks the ISR to just itself and ends up
> > taking n writes. This is not a problem all by itself. Once the partition
> > is resolved, n2-n5 would catch up from the leader and no writes would be
> > lost, since n1 would continue to serve as the leader.
> > 2. The problem starts in the second part of the test, which partitions
> > the leader (n1) from ZooKeeper. This causes the unclean leader election
> > (mentioned above), which causes Kafka to lose data.
> >
> > We thought about this while designing replication, but never ended up
> > including the feature that would allow some applications to pick
> > consistency over availability. Basically, we could let applications pick
> > some topics for which the controller will never attempt unclean leader
> > election. The result is that Kafka would reject writes and mark the
> > partition offline, instead of moving leadership to a broker that is not
> > in the ISR and losing the writes.
> >
> > I think if we included this knob, the tests that aphyr (Jepsen) ran
> > would make more sense.
> >
> > Thanks,
> > Neha
> >
> >
> > On Thu, Aug 22, 2013 at 12:50 PM, Scott Clasen <sc...@heroku.com> wrote:
> >
> >> So it looks like there is a Jepsen post coming on Kafka 0.8 replication,
> >> based on this that's circulating on Twitter:
> >> https://www.refheap.com/17932/raw
> >>
> >> Understanding that Kafka isn't designed particularly to be partition
> >> tolerant, the result is not completely surprising.
> >>
> >> But my question is, is there something that can be done about the lost
> >> messages?
> >>
> >> From my understanding, when broker n1 comes back online, currently what
> >> will happen is that the messages that were only on n1 will be
> >> truncated/tossed while n1 is coming back into the ISR. Please correct me
> >> if this is not accurate.
> >>
> >> Would it instead be possible to do something else with them, like
> >> sending them to an internal lost-messages topic, or a log file where
> >> some manual intervention could be done on them, or a configuration
> >> property like replay.truncated.messages=true could be set, where the
> >> broker would send the lost messages back onto the topic after rejoining
> >> the ISR?
> >>
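(A side note on the "committed writes are not lost" guarantee quoted above: it also assumes the producer asks for acknowledgement from the full ISR. With the 0.8 producer that is roughly the setting below, shown only as a sketch:)

    # 0.8 producer config sketch: -1 makes the broker acknowledge a write
    # only after every replica in the ISR has it, so an acknowledged write
    # survives any clean leader change; 1 would ack after the leader alone
    request.required.acks=-1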