OK. Thanks, Cody!

On Fri, Apr 29, 2016 at 12:41 PM, Cody Koeninger <c...@koeninger.org> wrote:
> If worker-to-broker communication breaks down, the worker will sleep
> for refresh.leader.backoff.ms before throwing an error, at which point
> normal Spark task retry (spark.task.maxFailures) comes into play.
>
> If driver-to-broker communication breaks down, the driver will sleep
> for refresh.leader.backoff.ms before retrying the attempt to get
> offsets, up to spark.streaming.kafka.maxRetries number of times.
>
> The actual leader rebalancing process is entirely up to Kafka, which
> is why I'm saying that if you're losing leaders, you should look at Kafka.
>
> On Fri, Apr 29, 2016 at 11:21 AM, swetha kasireddy
> <swethakasire...@gmail.com> wrote:
> > OK. So on the Kafka side they use rebalance.backoff.ms of 2000, which is
> > the default for rebalancing, and they say that refresh.leader.backoff.ms
> > of 200 to refresh the leader is very aggressive, and they suggested we
> > increase it to 2000. Even after increasing it to 2500, I still get
> > LeaderLost errors.
> >
> > Is refresh.leader.backoff.ms the right setting in the app for it to wait
> > until leader election and rebalancing are done on the Kafka side, assuming
> > that Kafka has rebalance.backoff.ms of 2000?
> >
> > Also, does Spark Kafka Direct try to restart the app when the leader is
> > lost, or will it just wait for refresh.leader.backoff.ms and then retry
> > again, depending on the number of retries?
> >
> > On Fri, Apr 29, 2016 at 8:14 AM, swetha kasireddy
> > <swethakasire...@gmail.com> wrote:
> >>
> >> OK. So on the Kafka side they use rebalance.backoff.ms of 2000, which is
> >> the default for rebalancing, and they say that refresh.leader.backoff.ms
> >> of 200 to refresh the leader is very aggressive, and they suggested we
> >> increase it to 2000. Even after increasing it to 2500, I still get
> >> LeaderLost errors.
> >>
> >> Is refresh.leader.backoff.ms the right setting in the app for it to wait
> >> until leader election and rebalancing are done on the Kafka side, assuming
> >> that Kafka has rebalance.backoff.ms of 2000?
> >>
> >> On Wed, Apr 27, 2016 at 11:05 AM, Cody Koeninger <c...@koeninger.org>
> >> wrote:
> >>>
> >>> It seems like it'd be better to look into the Kafka side of things to
> >>> determine why you're losing leaders frequently, as opposed to trying
> >>> to put a band-aid on it.
> >>>
> >>> On Wed, Apr 27, 2016 at 11:49 AM, SRK <swethakasire...@gmail.com> wrote:
> >>> > Hi,
> >>> >
> >>> > We seem to be getting a lot of LeaderLostExceptions, and our source
> >>> > stream is working with the default value of rebalance.backoff.ms,
> >>> > which is 2000. I was thinking of increasing this value to 5000. Any
> >>> > suggestions on this?
> >>> >
> >>> > Thanks!
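
To make the above concrete, here is a minimal sketch of where each of those settings is applied, assuming the Spark 1.x direct stream API (spark-streaming-kafka for Kafka 0.8); the broker addresses, topic name, and retry values below are placeholders, not recommendations:

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object DirectStreamBackoffExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("kafka-direct-backoff-example")
      // executor side: how many times a failed task (e.g. a fetch that hit a lost leader) is retried
      .set("spark.task.maxFailures", "8")
      // driver side: how many times the driver retries fetching offsets from the brokers
      .set("spark.streaming.kafka.maxRetries", "3")

    val ssc = new StreamingContext(conf, Seconds(10))

    val kafkaParams = Map[String, String](
      "metadata.broker.list" -> "broker1:9092,broker2:9092", // placeholder broker list
      // how long to sleep before re-resolving the leader after a failure
      "refresh.leader.backoff.ms" -> "2500"
    )

    val topics = Set("events") // placeholder topic

    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    // trivial action so the stream is actually consumed
    stream.map(_._2).count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}

With this in place, refresh.leader.backoff.ms (passed through kafkaParams) governs the sleep Cody describes on both the driver and the executors, spark.streaming.kafka.maxRetries caps the driver-side offset retries, and spark.task.maxFailures caps the executor-side task retries.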