+1 for that knob on a per-topic basis. Choosing consistency over availability would open Kafka to more use cases, no?
Sent from my iPhone

On Aug 22, 2013, at 1:59 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:

> Scott,
>
> Kafka replication aims to guarantee that committed writes are not lost. In other words, as long as leadership can be transitioned to a broker that was in the ISR, no data will be lost. For increased availability, if there are no other brokers in the ISR, we fall back to electing a broker that is not caught up with the current leader as the new leader. IMO, this is the real problem that the post is complaining about.
>
> Let me explain his test in more detail:
>
> 1. The first part of the test partitions the leader (n1) from the other brokers (n2-n5). The leader shrinks the ISR to just itself and ends up taking n writes. This is not a problem all by itself. Once the partition is resolved, n2-n5 would catch up from the leader and no writes would be lost, since n1 would continue to serve as the leader.
> 2. The problem starts in the second part of the test, which partitions the leader (n1) from ZooKeeper. This causes the unclean leader election (mentioned above), which causes Kafka to lose data.
>
> We thought about this while designing replication, but never ended up including the feature that would allow some applications to pick consistency over availability. Basically, we could let applications pick some topics for which the controller will never attempt unclean leader election. The result is that Kafka would reject writes and mark the partition offline, instead of moving leadership to a broker that is not in the ISR and losing the writes.
>
> I think if we included this knob, the tests that aphyr (jepsen) ran would make more sense.
>
> Thanks,
> Neha
>
>
> On Thu, Aug 22, 2013 at 12:50 PM, Scott Clasen <sc...@heroku.com> wrote:
>
>> So it looks like there is a jepsen post coming on Kafka 0.8 replication, based on this, which is circulating on Twitter: https://www.refheap.com/17932/raw
>>
>> Understanding that Kafka isn't designed particularly to be partition tolerant, the result is not completely surprising.
>>
>> But my question is: is there something that can be done about the lost messages?
>>
>> From my understanding, when broker n1 comes back online, what currently happens is that the messages that were only on n1 will be truncated/tossed while n1 is coming back into the ISR. Please correct me if this is not accurate.
>>
>> Would it instead be possible to do something else with them, like sending them to an internal lost-messages topic or log file where some manual intervention could be done on them? Or could a configuration property like replay.truncated.messages=true be set, where the broker would send the lost messages back onto the topic after rejoining the ISR?
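For concreteness, here is a minimal sketch (against the Kafka 0.8 producer API) of the client-side half of this trade-off: setting request.required.acks=-1 makes the leader wait for acknowledgement from every replica currently in the ISR before a write is considered committed. Note that this alone does not prevent the loss in the jepsen test, because the ISR can legally shrink to just the leader; the per-topic broker-side knob Neha describes (forbidding unclean leader election for chosen topics) would be the missing piece, and since that knob does not exist yet, nothing below refers to it by name.

```java
import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class ConsistentProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Brokers used to bootstrap topic metadata (0.8 producer config);
        // host names here are placeholders.
        props.put("metadata.broker.list", "broker1:9092,broker2:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        // -1: the leader waits for acks from all replicas in the current ISR
        // before acknowledging the write. This favors consistency, but offers
        // no protection once the ISR has shrunk to the leader alone.
        props.put("request.required.acks", "-1");

        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));
        producer.send(new KeyedMessage<String, String>(
                "my-topic", "key", "a write we would prefer not to lose"));
        producer.close();
    }
}
```

With the proposed knob enabled for a topic, the controller would mark the partition offline rather than elect an out-of-sync leader, so a producer configured like this would see send failures instead of silent data loss.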