It is possible; there is some discussion of a similar issue in KIP-53:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-53+-+Add+custom+policies+for+reconnect+attempts+to+NetworkdClient
and in this mailing thread:
https://www.mail-archive.com/dev@kafka.apache.org/msg46868.html

Guozhang
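A client-side knob that already exists for spacing out reconnect attempts (short of the pluggable policy proposed in KIP-53) is reconnect.backoff.ms on the Java producer; the value below is illustrative only, not a recommendation, and newer Java clients additionally support reconnect.backoff.max.ms for exponential backoff:

# Java producer client properties -- illustrative value only, tune per environment
# fixed wait, in milliseconds, before retrying a connection to a given broker
reconnect.backoff.ms=1000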
On Tue, Apr 5, 2016 at 2:34 PM, Yifan Ying <nafan...@gmail.com> wrote:

Some updates:

Yesterday, right after a release (producers and consumers reconnected to Kafka/Zookeeper, but no code change in our producers and consumers), all under-replication issues were resolved automatically and there was no more high latency in either Kafka or Zookeeper. But right after today's release (producers and consumers re-connected again), the under-replication and high-latency issues happened again. So could the all-at-once reconnecting of producers and consumers be causing the problem? And all of this has only happened since I deleted a deprecated topic in production.

Yifan

On Tue, Apr 5, 2016 at 9:04 AM, Guozhang Wang <wangg...@gmail.com> wrote:

These configs mainly depend on your publish throughput, since the replication throughput is upper-bounded by the publish throughput. If the publish throughput is not high, then setting lower threshold values in these two configs will cause churn in shrinking / expanding ISRs.

Guozhang

On Mon, Apr 4, 2016 at 11:55 PM, Yifan Ying <nafan...@gmail.com> wrote:

Thanks for replying, Guozhang. We did increase both settings:

replica.lag.max.messages=20000
replica.lag.time.max.ms=20000

But I'm not sure if these are good enough. And yes, that's a good suggestion to monitor ZK performance.

Thanks.

On Mon, Apr 4, 2016 at 8:58 PM, Guozhang Wang <wangg...@gmail.com> wrote:

Hmm, it seems like your broker configs "replica.lag.max.messages" and "replica.lag.time.max.ms" are misconfigured relative to your replication traffic, and the deletion of the topic actually pushed it below the threshold. What are the config values for these two? And could you try to increase these configs and see if that helps?

In 0.8.2.1, Kafka-consumer-offset-checker.sh accesses ZK to query the consumer offsets one by one, and hence if your ZK read latency is high it could take a long time. You may want to monitor your ZK cluster's performance to check its read / write latencies.

Guozhang

On Mon, Apr 4, 2016 at 10:59 AM, Yifan Ying <nafan...@gmail.com> wrote:

Hi Guozhang,

It's 0.8.2.1. So it should be fixed? We also tried to start from scratch by wiping out the data directory on both Kafka and Zookeeper. And it's odd that the constant shrinking and expanding happened after a fresh restart, and the high request latency as well. The brokers are using the same config as before the topic deletion.

Another observation is that running Kafka-consumer-offset-checker.sh is extremely slow. Any suggestion would be appreciated! Thanks.

On Sun, Apr 3, 2016 at 2:29 PM, Guozhang Wang <wangg...@gmail.com> wrote:

Yifan,

Are you on 0.8.0 or 0.8.1/2? There are some issues with zkVersion checking in 0.8.0 that are fixed in later minor releases of 0.8.

Guozhang

On Fri, Apr 1, 2016 at 7:46 PM, Yifan Ying <nafan...@gmail.com> wrote:

Hi All,

We deleted a deprecated topic on a Kafka cluster (0.8) and started observing constant 'Expanding ISR for partition' and 'Shrinking ISR for partition' messages for other topics.
As a result, we saw a huge number of under-replicated partitions and very high request latency from Kafka, and the cluster doesn't seem able to recover by itself.

Does anyone know what caused this issue and how to resolve it?
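For reference, the two broker settings discussed in this thread live in each broker's server.properties; a minimal sketch using the values mentioned above, which should be tuned against actual replication traffic (and note that later Kafka releases dropped replica.lag.max.messages in favour of the time-based check alone):

# broker server.properties -- values taken from this thread, tune against actual replication traffic
# maximum number of messages a follower may fall behind before being dropped from the ISR (0.8.x only)
replica.lag.max.messages=20000
# maximum time (ms) a follower may go without fetching / catching up before being dropped from the ISR
replica.lag.time.max.ms=20000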