Thinking about it some more, I guess you are really talking about monitoring UnderReplicatedPartitionCount during a restart?
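[Editor's note: the wait-for-zero approach discussed below in this thread can be scripted. This is a minimal sketch, not code from the thread; the `get_under_replicated_count` callback is a hypothetical stand-in for however you read the brokers' under-replicated-partitions JMX metric in your deployment (jmxtrans, a JMX HTTP bridge, etc.):]

```python
import time

def wait_until_fully_replicated(get_under_replicated_count,
                                timeout_s=600, poll_s=5):
    """Block until the cluster reports zero under-replicated partitions.

    get_under_replicated_count: callable returning the current value of
    the brokers' under-replicated-partitions metric. How you fetch it
    (JMX, an HTTP bridge, ...) is deployment-specific and not shown here.
    Returns True once the count hits 0, False if timeout_s elapses first.
    """
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        if get_under_replicated_count() == 0:
            return True
        time.sleep(poll_s)
    return False
```

Run between each broker restart in a rolling bounce: only proceed to the next broker once this returns True, which also covers waiting for the just-restarted broker to rejoin the ISR.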
/Sam

On Sep 6, 2013, at 5:46 PM, Sam Meder <sam.me...@jivesoftware.com> wrote:

> On Aug 29, 2013, at 11:12 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>
>>> How do you automate waiting for the broker to come up? Just keep
>>> monitoring the process and keep trying to connect to the port?
>>
>> Every leader in a Kafka cluster exposes the UnderReplicatedPartitionCount
>> metric. The safest way to issue a controlled shutdown is to wait until that
>> metric reports 0 on the brokers.
>
> Maybe I am missing something, but won't the topics for which I have
> partitions on the broker I am shutting down always report as under-replicated
> (unless I manually reassign the partition to another broker)? I thought that
> the shutdown logic really only dealt with transferring the leader status for
> a partition.
>
> As a side note, it would be great to have a minimum replication factor in
> addition to the regular replication factor, so one can enforce durability
> guarantees (fail the producer when the message can't be sufficiently
> replicated).
>
>> If you try to shut down the last broker in
>> the ISR, the controlled shutdown cannot succeed since there is no other
>> broker to move the leader to. Waiting until the under-replicated partition
>> count hits 0 prevents you from hitting this issue.
>>
>> This also solves the problem of waiting until the broker comes up, since you
>> will automatically wait until the broker comes up and joins the ISR.
>
> Not sure I follow, but one start-up situation I am concerned about is what
> happens on abnormal termination (whether through a kill -9, OOM, HW failure -
> whatever floats your boat). For this scenario it would be great if there was
> a way to wait for the recovery process to finish. For now we can just wait
> for the server port to become available, but something more explicit would be
> great.
>
> /Sam
>
>> Thanks,
>> Neha
>>
>> On Thu, Aug 29, 2013 at 12:59 PM, Sam Meder <sam.me...@jivesoftware.com> wrote:
>>
>>> Ok, I spent some more time staring at our logs and figured out that it was
>>> our fault. We were not waiting around for the Kafka broker to fully
>>> initialize before moving on to the next broker, and loading the data logs
>>> can take quite some time (~7 minutes in one case), so we ended up with no
>>> replicas online at some point, and the replica that came back first was a
>>> little short on data...
>>>
>>> How do you automate waiting for the broker to come up? Just keep
>>> monitoring the process and keep trying to connect to the port?
>>>
>>> /Sam
>>>
>>> On Aug 29, 2013, at 6:40 PM, Sam Meder <sam.me...@jivesoftware.com> wrote:
>>>
>>>> On Aug 29, 2013, at 5:50 PM, Sriram Subramanian <srsubraman...@linkedin.com> wrote:
>>>>
>>>>> Do you know why you timed out on a regular shutdown?
>>>>
>>>> No, though I think it may just have been that the timeout we put in was
>>>> too short.
>>>>
>>>>> If the replica had
>>>>> fallen off of the ISR and shutdown was forced on the leader, this could
>>>>> happen.
>>>>
>>>> Hmm, but it shouldn't really be made leader if it isn't even in the ISR,
>>>> should it?
>>>>
>>>> /Sam
>>>>
>>>>> With ack = -1, we guarantee that all the replicas in the in-sync
>>>>> set have received the message before exposing the message to the consumer.
>>>>>
>>>>> On 8/29/13 8:32 AM, "Sam Meder" <sam.me...@jivesoftware.com> wrote:
>>>>>
>>>>>> We've recently come across a scenario where we see consumers resetting
>>>>>> their offsets to earliest, and which as far as I can tell may also lead to
>>>>>> data loss (we're running with ack = -1 to avoid loss). This seems to
>>>>>> happen when we time out on doing a regular shutdown and instead kill -9
>>>>>> the Kafka broker, but it does obviously apply to any scenario that involves
>>>>>> an unclean exit.
>>>>>> As far as I can tell, what happens is:
>>>>>>
>>>>>> 1. On restart the broker truncates the data for the affected partitions,
>>>>>> i.e. not all data was written to disk.
>>>>>> 2. The new broker then becomes a leader for the affected partitions and
>>>>>> consumers get confused because they've already consumed beyond the now
>>>>>> available offset.
>>>>>>
>>>>>> Does that seem like a possible failure scenario?
>>>>>>
>>>>>> /Sam
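[Editor's note: the stopgap Sam mentions, waiting for the server port to become available, could look like the sketch below. Host and port are placeholders for your broker's listener. As he points out, this only proves the socket server is accepting connections, not that log recovery has finished:]

```python
import socket
import time

def wait_for_port(host, port, timeout_s=600, poll_s=2):
    """Retry TCP connects until the broker's socket server accepts one.

    Returns True once a connection succeeds, False if timeout_s elapses.
    Caveat: an open port does not mean log recovery is complete.
    """
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        try:
            with socket.create_connection((host, port), timeout=max(poll_s, 1)):
                return True
        except OSError:
            time.sleep(poll_s)
    return False
```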
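[Editor's note: the two-step failure described above can be illustrated with a toy model of the consumer's offset-out-of-range handling. This is not Kafka code, just the offset arithmetic, with invented numbers:]

```python
def next_fetch_offset(consumer_offset, log_start, log_end, reset="earliest"):
    """Where a consumer's next fetch lands relative to the leader's log.

    After an unclean restart the new leader may have truncated its log, so
    log_end can be *smaller* than an offset the consumer already consumed.
    The out-of-range handling then snaps the consumer back to log_start
    (reset to earliest), which shows up as re-consumption and reflects the
    data lost to truncation.
    """
    if log_start <= consumer_offset <= log_end:
        return consumer_offset
    return log_start if reset == "earliest" else log_end
```

For example, a consumer at offset 950 against a leader whose log was truncated to end at 900 gets an out-of-range error and restarts from the log's beginning.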