IIUC it is a pseudo-automation: you set the retry interval for controlled shutdown (controlled.shutdown.retry.backoff.ms) and the number of retries (controlled.shutdown.max.retries) high enough that a controlled shutdown is unlikely to fail during a rolling bounce, because by the time the retries are exhausted the previously bounced broker should be back up and UnderReplicatedPartitionCount should have returned to zero. What would complete the procedure is an external hook into whatever deployment system is in use that waits for UnderReplicatedPartitionCount to return to zero before issuing a controlled shutdown to the next broker in the bounce sequence.
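To make that concrete, here is a rough sketch (in Python) of what such a hook could look like. Everything in it is illustrative: the broker list, the helper names, how the under-replicated count is actually read (e.g. over JMX) and how a controlled shutdown is triggered are placeholders for whatever your deployment and monitoring systems provide.

import time

BROKERS = ["broker1", "broker2", "broker3"]  # hypothetical broker identifiers

def under_replicated_count(broker):
    """Placeholder: read this broker's UnderReplicatedPartitionCount
    metric, e.g. over JMX via your monitoring tooling."""
    raise NotImplementedError("wire this up to your monitoring system")

def controlled_shutdown_and_restart(broker):
    """Placeholder: ask the deployment system to stop the broker
    (letting controlled shutdown run) and then start it again."""
    raise NotImplementedError("wire this up to your deployment system")

def wait_until_fully_replicated(poll_secs=10):
    # Block until every broker reports zero under-replicated partitions,
    # i.e. all replicas have rejoined the ISR.
    while any(under_replicated_count(b) != 0 for b in BROKERS):
        time.sleep(poll_secs)

def rolling_bounce():
    for broker in BROKERS:
        # Only bounce the next broker once the cluster is fully replicated,
        # so controlled shutdown always has another in-sync replica to move
        # leadership to.
        wait_until_fully_replicated()
        controlled_shutdown_and_restart(broker)
    # Also wait for the last broker to rejoin the ISR before finishing.
    wait_until_fully_replicated()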
On Thu, Aug 29, 2013 at 5:11 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> Can't he get this automatically though with Sriram's controlled
> shutdown stuff?
>
> -Jay
>
> On Thu, Aug 29, 2013 at 2:12 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>
>> How do you automate waiting for the broker to come up? Just keep
>> monitoring the process and keep trying to connect to the port?
>>
>> Every leader in a Kafka cluster exposes the UnderReplicatedPartitionCount
>> metric. The safest way to issue a controlled shutdown is to wait until that
>> metric reports 0 on the brokers. If you try to shut down the last broker in
>> the ISR, the controlled shutdown cannot succeed since there is no other
>> broker to move the leader to. Waiting until the under-replicated partition
>> count hits 0 prevents you from hitting this issue.
>>
>> This also solves the problem of waiting until the broker comes up, since you
>> will automatically wait until the broker comes up and joins the ISR.
>>
>> Thanks,
>> Neha
>>
>> On Thu, Aug 29, 2013 at 12:59 PM, Sam Meder <sam.me...@jivesoftware.com> wrote:
>>
>> > Ok, I spent some more time staring at our logs and figured out that it
>> > was our fault. We were not waiting around for the Kafka broker to fully
>> > initialize before moving on to the next broker, and loading the data logs
>> > can take quite some time (~7 minutes in one case), so we ended up with
>> > no replicas online at some point and the replica that came back first
>> > was a little short on data...
>> >
>> > How do you automate waiting for the broker to come up? Just keep
>> > monitoring the process and keep trying to connect to the port?
>> >
>> > /Sam
>> >
>> > On Aug 29, 2013, at 6:40 PM, Sam Meder <sam.me...@jivesoftware.com> wrote:
>> >
>> > > On Aug 29, 2013, at 5:50 PM, Sriram Subramanian
>> > > <srsubraman...@linkedin.com> wrote:
>> > >
>> > >> Do you know why you timed out on a regular shutdown?
>> > >
>> > > No, though I think it may just have been that the timeout we put in
>> > > was too short.
>> > >
>> > >> If the replica had
>> > >> fallen off of the ISR and shutdown was forced on the leader this could
>> > >> happen.
>> > >
>> > > Hmm, but it shouldn't really be made leader if it isn't even in the ISR,
>> > > should it?
>> > >
>> > > /Sam
>> > >
>> > >> With ack = -1, we guarantee that all the replicas in the in-sync
>> > >> set have received the message before exposing the message to the
>> > >> consumer.
>> > >>
>> > >> On 8/29/13 8:32 AM, "Sam Meder" <sam.me...@jivesoftware.com> wrote:
>> > >>
>> > >>> We've recently come across a scenario where we see consumers resetting
>> > >>> their offsets to earliest, which as far as I can tell may also lead to
>> > >>> data loss (we're running with ack = -1 to avoid loss). This seems to
>> > >>> happen when we time out on doing a regular shutdown and instead kill -9
>> > >>> the kafka broker, but it obviously applies to any scenario that involves
>> > >>> an unclean exit. As far as I can tell, what happens is:
>> > >>>
>> > >>> 1. On restart the broker truncates the data for the affected partitions,
>> > >>> i.e. not all data was written to disk.
>> > >>> 2. The new broker then becomes a leader for the affected partitions and
>> > >>> consumers get confused because they've already consumed beyond the now
>> > >>> available offset.
>> > >>>
>> > >>> Does that seem like a possible failure scenario?
>> > >>>
>> > >>> /Sam
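For reference, a broker-side sketch of the controlled shutdown settings mentioned at the top of this message, as they might appear in server.properties. The values below are only illustrative examples of "high enough", not recommendations:

# allow the broker to attempt a controlled shutdown when it is stopped
controlled.shutdown.enable=true
# retry long enough for a rolling bounce to catch up
controlled.shutdown.max.retries=10
controlled.shutdown.retry.backoff.ms=10000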