Great that it's working. Yes, you need retries to avoid dropping messages during broker restarts.
Ismael

On Tue, Sep 26, 2017 at 3:33 PM, Yogesh Sangvikar <yogesh.sangvi...@gmail.com> wrote:

> Hi Team,
>
> Thanks a lot for the suggestion Ismael.
>
> We have tried kafka cluster rolling upgrade by doing the version changes
> (CURRENT_KAFKA_VERSION - 0.10.0, CURRENT_MESSAGE_FORMAT_VERSION - 0.10.0
> and upgraded respective version 0.10.2) in upgraded confluent package 3.2.2
> and observed the in-sync replicas are coming up immediately & also, the
> preferred leaders are coming up after version bump post sync.
>
> As per my understanding, the in-sync replicas & leader election happening
> quickly as the new data getting published while upgrade is getting written
> and synced using upgraded package libraries (0.10.2).
>
> Also, observed some records failed to produce due to error,
>
> kafka-rest error response -
>
> {"offsets":[{"partition":null,"offset":null,"error_code":50003,"error":"This server is not the leader for that topic-partition."}],"key_schema_id":1542,"value_schema_id":1541}
>
> Exception in log file -
> org.apache.kafka.common.errors.NotLeaderForPartitionException: This server
> is not the leader for that topic-partition.
>
> To resolve the above error, we have override properties *acks=-1 (default,
> 1) retries=3 (default, 0) *for kafka rest producer config
> (kafka-rest.properties) and getting some duplicate events in topic.
>
> Thanks,
> Yogesh
>
> On Thu, Sep 21, 2017 at 7:09 AM, yogesh sangvikar <yogesh.sangvi...@gmail.com> wrote:
>
> > Thanks Ismael.
> > I will try the solution and update all.
> >
> > Thanks,
> > Yogesh
> > ------------------------------
> > From: Ismael Juma <ism...@juma.me.uk>
> > Sent: 20-09-2017 11:57 PM
> > To: Kafka Users <users@kafka.apache.org>
> > Subject: Re: Data loss while upgrading confluent 3.0.0 kafka cluster to confluent 3.2.2
> >
> > One clarification below:
> >
> > On Wed, Sep 20, 2017 at 3:50 PM, Ismael Juma <ism...@juma.me.uk> wrote:
> >
> > > Comments inline.
> > >
> > > On Wed, Sep 20, 2017 at 11:56 AM, Yogesh Sangvikar <yogesh.sangvi...@gmail.com> wrote:
> > >
> > >> 2. At which point in the sequence below was the code for the brokers
> > >> updated to 0.10.2?
> > >>
> > >> [Comment: On the kafka servers, we have confluent-3.0.0 and confluent-3.2.2
> > >> packages deployed separately. So, first for protocol and message version to
> > >> 0.10.0 we have updated server.properties file in running confluent-3.0.0
> > >> package and restarted the service for the same.
> > >>
> > >> And, for protocol and message version to 0.10.2 bump, we have modified
> > >> server.properties file in confluent-3.2.2 & stopped the old package
> > >> services and started the kafka services using new one. All restarts are
> > >> done rolling fashion and random broker.id sequence (4,3,2,1).]
> > >
> > > You have to set version 0.10.0 in the server.properties of the 0.10.2/3.2
> > > brokers. This is probably the source of your issue. After all running
> > > brokers are version 0.10.2/3.2, then you can switch the version to 0.10.2.
> >
> > The last sentence may be clearer with the following change:
> >
> > "After all running brokers are version 0.10.2/3.2, then you can switch the
> > inter.broker.protocol.version to 0.10.2 in server.properties."
> >
> > Ismael
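
For reference, the ordering Ismael describes can be sketched as two successive states of server.properties on the brokers running the new 0.10.2 (Confluent 3.2.2) code. This is only a sketch, assuming the cluster starts on 0.10.0 as in this thread. The log.message.format.version lines are an assumption based on Yogesh's mention of the message format version; Ismael's clarification names only inter.broker.protocol.version.

```properties
# Phase 1: start each broker on the 0.10.2 / Confluent 3.2.2 code, one at a
# time, while still pinned to the OLD versions.
inter.broker.protocol.version=0.10.0
log.message.format.version=0.10.0

# Phase 2: only after ALL running brokers are on 0.10.2 / 3.2.2, replace the
# two lines above with the ones below and do a second rolling restart.
# inter.broker.protocol.version=0.10.2
# log.message.format.version=0.10.2
```

The point of the two phases is that a broker should not be switched to the 0.10.2 protocol while any broker in the cluster is still running the old code.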