It is possible; there is some discussion of a similar issue in this KIP:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-53+-+Add+custom+policies+for+reconnect+attempts+to+NetworkdClient

and in the mailing list thread:

https://www.mail-archive.com/dev@kafka.apache.org/msg46868.html



Guozhang

On Tue, Apr 5, 2016 at 2:34 PM, Yifan Ying <nafan...@gmail.com> wrote:

> Some updates:
>
> Yesterday, right after a release (producers and consumers reconnected to
> Kafka/ZooKeeper, but there were no code changes in our producers and
> consumers), all of the under-replication issues resolved themselves and
> there was no more high latency in either Kafka or ZooKeeper. But right
> after today's release (producers and consumers reconnected again), the
> under-replication and high-latency issues came back. So could the
> all-at-once reconnection of producers and consumers be causing the
> problem? All of this has only happened since I deleted a deprecated topic
> in production.
>
> Yifan
>
> On Tue, Apr 5, 2016 at 9:04 AM, Guozhang Wang <wangg...@gmail.com> wrote:
>
>> These configs mainly depend on your publish throughput, since the
>> replication throughput is upper-bounded by the publish throughput. If
>> the publish throughput is not high, then setting lower threshold values
>> in these two configs will cause churn in ISR shrinking / expanding.
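>>
>> To make the threshold / traffic relationship concrete with a rough,
>> made-up example (the numbers are for illustration only, not
>> measurements from your cluster): whatever the average publish rate, if
>> a leader takes a burst of ~5,000 messages in one second and a
>> follower's fetch happens to be one cycle behind at that moment, the
>> follower is briefly more than 4,000 messages behind. With
>> replica.lag.max.messages set near that value, the leader drops it from
>> the ISR and re-adds it a moment later once it catches up, which shows
>> up as the constant shrink / expand log messages. Sizing the threshold
>> well above the largest momentary lag you expect (e.g. 20000) avoids
>> that flapping.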
>>
>> Guozhang
>>
>> On Mon, Apr 4, 2016 at 11:55 PM, Yifan Ying <nafan...@gmail.com> wrote:
>>
>>> Thanks for replying, Guozhang. We did increase both settings:
>>>
>>> replica.lag.max.messages=20000
>>>
>>> replica.lag.time.max.ms=20000
>>>
>>>
>>> But we're not sure if these are good enough. And yes, that's a good
>>> suggestion to monitor ZK performance.
>>>
>>>
>>> Thanks.
>>>
>>> On Mon, Apr 4, 2016 at 8:58 PM, Guozhang Wang <wangg...@gmail.com>
>>> wrote:
>>>
>>>> Hmm, it seems like your broker configs "replica.lag.max.messages" and
>>>> "replica.lag.time.max.ms" are misconfigured relative to your replication
>>>> traffic, and the deletion of the topic actually pushed it below the
>>>> threshold. What are the values of these two configs? And could you try
>>>> increasing them and see if that helps?
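>>>>
>>>> For reference (assuming the 0.8.x defaults), these are broker-side
>>>> settings in server.properties, so a rolling restart of the brokers is
>>>> needed for changes to take effect:
>>>>
>>>>   # a follower is dropped from the ISR if it falls more than this many
>>>>   # messages behind the leader (0.8 default shown)
>>>>   replica.lag.max.messages=4000
>>>>   # ...or if it has not sent a fetch request for this long, in ms
>>>>   # (0.8 default shown)
>>>>   replica.lag.time.max.ms=10000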
>>>>
>>>> In 0.8.2.1, kafka-consumer-offset-checker.sh accesses ZK to query the
>>>> consumer offsets one by one, so if your ZK read latency is high it can
>>>> take a long time. You may want to monitor your ZK cluster's performance
>>>> to check its read / write latencies.
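>>>>
>>>> One quick way to check those latencies, assuming the ZooKeeper
>>>> four-letter-word commands are reachable on the standard client port
>>>> (zk-host is a placeholder for one of your ZK nodes):
>>>>
>>>>   echo stat | nc zk-host 2181   # prints Latency min/avg/max and outstanding requests
>>>>   echo mntr | nc zk-host 2181   # ZK 3.4+, prints zk_avg_latency, zk_max_latency, etc.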
>>>>
>>>>
>>>> Guozhang
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Apr 4, 2016 at 10:59 AM, Yifan Ying <nafan...@gmail.com> wrote:
>>>>
>>>>> Hi Guozhang,
>>>>>
>>>>> It's 0.8.2.1, so it should be fixed? We also tried to start from
>>>>> scratch by wiping out the data directories on both Kafka and ZooKeeper.
>>>>> It's odd that the constant shrinking and expanding, as well as the high
>>>>> request latency, happened even after a fresh restart. The brokers are
>>>>> using the same config as before the topic deletion.
>>>>>
>>>>> Another observation is that running kafka-consumer-offset-checker.sh is
>>>>> extremely slow. Any suggestion would be appreciated! Thanks.
>>>>>
>>>>> On Sun, Apr 3, 2016 at 2:29 PM, Guozhang Wang <wangg...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Yifan,
>>>>>>
>>>>>> Are you on 0.8.0 or 0.8.1/2? There are some issues with zkVersion
>>>>>> checking
>>>>>> in 0.8.0 that are fixed in later minor releases of 0.8.
>>>>>>
>>>>>> Guozhang
>>>>>>
>>>>>> On Fri, Apr 1, 2016 at 7:46 PM, Yifan Ying <nafan...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> > Hi All,
>>>>>> >
>>>>>> > We deleted a deprecated topic on a Kafka cluster (0.8) and started
>>>>>> > observing constant 'Expanding ISR for partition' and 'Shrinking ISR
>>>>>> > for partition' messages for other topics. As a result we saw a huge
>>>>>> > number of under-replicated partitions and very high request latency
>>>>>> > from Kafka, and it doesn't seem able to recover on its own.
>>>>>> >
>>>>>> > Does anyone know what caused this issue and how to resolve it?
>>>>>> >
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> -- Guozhang
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Yifan
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> -- Guozhang
>>>>
>>>
>>>
>>>
>>> --
>>> Yifan
>>>
>>>
>>>
>>
>>
>> --
>> -- Guozhang
>>
>
>
>
> --
> Yifan
>
>
>


-- 
-- Guozhang
