Steven Schlansker created KAFKA-19165:
-----------------------------------------
             Summary: PartitionLeaderStrategy has very high error rate during topic initialization
                 Key: KAFKA-19165
                 URL: https://issues.apache.org/jira/browse/KAFKA-19165
             Project: Kafka
          Issue Type: Improvement
          Components: clients, streams
    Affects Versions: 3.9.0
            Reporter: Steven Schlansker

We implemented a Kafka Streams app. Some integration tests run a Kafka broker and then connect the Streams app to it, to ensure our application functions as desired.

When initializing each test case, all the Streams topics must be created. This is expected as each integration test expects to run its own "copy" of the app (different `application.id`).

The application is *very* chatty about this process. We see hundreds of thousands of errors like:

{code:java}
2025-04-16T17:17:33.841Z [kafka-admin-client-thread | search-indexing-2025-04-16-r67a-a4975cf6-622a-4f36-93bf-990df7b34a4e-admin] ERROR o.a.k.c.a.i.PartitionLeaderStrategy - [AdminClient clientId=search-indexing-2025-04-16-r67a-a4975cf6-622a-4f36-93bf-990df7b34a4e-admin] Received unknown topic error for topic search-indexing-2025-04-16-r67a-notification-group-eoc-merge-changelog
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition.
2025-04-16T17:17:33.841Z [kafka-admin-client-thread | search-indexing-2025-04-16-r67a-a4975cf6-622a-4f36-93bf-990df7b34a4e-admin] ERROR o.a.k.c.a.i.PartitionLeaderStrategy - [AdminClient clientId=search-indexing-2025-04-16-r67a-a4975cf6-622a-4f36-93bf-990df7b34a4e-admin] Received unknown topic error for topic search-indexing-2025-04-16-r67a-current-time-store-changelog
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition.
2025-04-16T17:17:33.841Z [kafka-admin-client-thread | search-indexing-2025-04-16-r67a-a4975cf6-622a-4f36-93bf-990df7b34a4e-admin] ERROR o.a.k.c.a.i.PartitionLeaderStrategy - [AdminClient clientId=search-indexing-2025-04-16-r67a-a4975cf6-622a-4f36-93bf-990df7b34a4e-admin] Received unknown topic error for topic search-indexing-2025-04-16-r67a-notification-group-eoc-merge-changelog
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition.
2025-04-16T17:17:33.841Z [kafka-admin-client-thread | search-indexing-2025-04-16-r67a-a4975cf6-622a-4f36-93bf-990df7b34a4e-admin] ERROR o.a.k.c.a.i.PartitionLeaderStrategy - [AdminClient clientId=search-indexing-2025-04-16-r67a-a4975cf6-622a-4f36-93bf-990df7b34a4e-admin] Received unknown topic error for topic search-indexing-2025-04-16-r67a-current-time-store-changelog
{code}

For a single topic, we get > 6000 errors in just a few seconds.

The log file ends up being many megabytes of this, to the point where some less-powerful text editors struggle to even render the file.

Having so many errors that are in fact expected and non-actionable harms observability of the Kafka Streams platform.

Would it be sensible to suppress "expected" exceptions of this type while topics are being created? Or at least rate-limit the logging, for example by printing every 10 seconds "Waiting for topics [..., ...] to be created for 30s..."

I also wonder if the admin client should rate-limit how often it pings the broker, to reduce broker load in this case.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
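
To make the rate-limiting suggestion concrete, here is a minimal sketch of how repeated unknown-topic errors could be collapsed into one periodic summary line. Everything in it is an assumption for illustration: `PendingTopicReporter`, its method names, the injected millisecond clock, and the in-memory `emitted` list (a stand-in for a real logger) are hypothetical and not part of Kafka or of `PartitionLeaderStrategy`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not a Kafka class): collapses repeated
 * "unknown topic" errors into at most one summary per interval.
 */
final class PendingTopicReporter {
    private final long intervalMs;
    private final LongSupplier clockMs;                      // injectable so tests can control time
    private final Set<String> pending = new TreeSet<>();     // sorted for stable output
    private final List<String> emitted = new ArrayList<>();  // stand-in for a real logger
    private long firstSeenMs = -1;
    private long lastLogMs = -1;
    private long suppressed = 0;

    PendingTopicReporter(long intervalMs, LongSupplier clockMs) {
        this.intervalMs = intervalMs;
        this.clockMs = clockMs;
    }

    /** Call at the point where the per-response error would otherwise be logged. */
    void onUnknownTopic(String topic) {
        long now = clockMs.getAsLong();
        if (firstSeenMs < 0) {
            firstSeenMs = now;
        }
        pending.add(topic);
        if (lastLogMs < 0 || now - lastLogMs >= intervalMs) {
            emitted.add(String.format(
                "Waiting for topics %s to be created for %ds (%d repeats suppressed)",
                pending, (now - firstSeenMs) / 1000, suppressed));
            lastLogMs = now;
            suppressed = 0;
        } else {
            suppressed++;  // counted, not logged
        }
    }

    /** Call once a topic's metadata becomes available. */
    void onTopicResolved(String topic) {
        pending.remove(topic);
        if (pending.isEmpty()) {
            firstSeenMs = -1;  // restart the wait timer for the next batch of topics
        }
    }

    List<String> emitted() {
        return emitted;
    }
}
```

With a 10-second interval, the > 6000 errors per topic described above would reduce to one line at the start of the wait plus one line every 10 seconds until the topics exist. The suppressed count keeps the volume visible without printing each occurrence.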