[ 
https://issues.apache.org/jira/browse/KAFKA-13102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justine Olshan updated KAFKA-13102:
-----------------------------------
    Description: 
Currently, the fetch path for replicas relies on the topic IDs in the metadata 
cache. However, the propagation of topic ID information is done through the 
UpdateMetadata request and is too slow. At first, the topic has no ID in the 
metadata cache, so we send an older version of the request; once the ID 
arrives, we have to close the session. This is likely to happen on broker 
startup and with new topics. This has led to more partitions in error and 
frequent session closures, and has made tests like 
ConsumerBounceTest#testCloseDuringRebalance extremely flaky.

A quick test that stored topic IDs in the replica manager while handling 
LeaderAndIsr (LISR) requests showed significantly fewer errors and made 
ConsumerBounceTest#testCloseDuringRebalance much less flaky (passing 50/50 runs 
vs. 11/50 runs).

The task now is figuring out the best strategy to store topic IDs for the fetch 
path using the IDs from the LISR request.
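
As a rough illustration of the idea (not the actual ReplicaManager change), 
the sketch below keeps a local topic-name-to-ID map that is filled while 
handling LISR requests, so the fetch path does not have to wait for 
UpdateMetadata. The class LocalTopicIdStore and its methods are hypothetical 
names, and java.util.UUID stands in for Kafka's own Uuid type to keep the 
example self-contained.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only: this is not Kafka's ReplicaManager code.
class LocalTopicIdStore {
    // Topic name -> topic ID, populated from LISR requests (assumed source).
    private final Map<String, UUID> topicIds = new ConcurrentHashMap<>();

    // Record the ID carried by a LeaderAndIsr (LISR) request for this topic.
    void onLeaderAndIsr(String topic, UUID topicId) {
        topicIds.put(topic, topicId);
    }

    // Look up the locally known ID, if any.
    Optional<UUID> topicId(String topic) {
        return Optional.ofNullable(topicIds.get(topic));
    }

    // True only when every topic in the fetch already has a known ID, in
    // which case the fetcher would not need to fall back to an older request.
    boolean canUseTopicIds(Iterable<String> topics) {
        for (String topic : topics) {
            if (!topicIds.containsKey(topic)) {
                return false;
            }
        }
        return true;
    }
}
```

With a store like this, a new topic would become usable by ID as soon as its 
LISR request is handled, which is the behavior the quick test above 
approximated. How the IDs should actually be stored remains the open question 
of this ticket.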

  was:
Currently, the fetch path for replicas relies on the topic IDs in the metadata 
cache. However, the propagation of topic ID information is done through the 
UpdateMetadata request and is too slow. This has resulted in increased 
partitions in error, frequent closing of sessions and made tests like 
ConsumerBounceTest#testCloseDuringRebalance extremely flaky.

A quick test with topic IDs stored in the replica manager during the handling 
of LISR requests showed that significantly fewer errors and made 
ConsumerBounceTest#testCloseDuringRebalance much less flaky (passing 50/50 runs 
vs. 11/50 runs).

The task now is figuring out the best strategy to store topic IDs for the fetch 
path using the IDs from the LISR request.


> Topic IDs not propagated to metadata cache quickly enough for Fetch path
> ------------------------------------------------------------------------
>
>                 Key: KAFKA-13102
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13102
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Justine Olshan
>            Assignee: Justine Olshan
>            Priority: Major
>
> Currently, the fetch path for replicas relies on the topic IDs in the 
> metadata cache. However, the propagation of topic ID information is done 
> through the UpdateMetadata request and is too slow. At first, the topic has 
> no ID in the metadata cache, so we send an older version of the request; 
> once the ID arrives, we have to close the session. This is likely to happen 
> on broker startup and with new topics. This has led to more partitions in 
> error and frequent session closures, and has made tests like 
> ConsumerBounceTest#testCloseDuringRebalance extremely flaky.
> A quick test that stored topic IDs in the replica manager while handling 
> LeaderAndIsr (LISR) requests showed significantly fewer errors and made 
> ConsumerBounceTest#testCloseDuringRebalance much less flaky (passing 50/50 
> runs vs. 11/50 runs).
> The task now is figuring out the best strategy to store topic IDs for the 
> fetch path using the IDs from the LISR request.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
