Re: [PR] KAFKA-18706: Move AclPublisher to metadata module [kafka]

2025-02-27 Thread via GitHub


FrankYang0529 commented on code in PR #18802:
URL: https://github.com/apache/kafka/pull/18802#discussion_r1973058360


##
metadata/src/main/java/org/apache/kafka/metadata/publisher/AclPublisher.java:
##
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.metadata.publisher;
+
+import org.apache.kafka.common.utils.LogContext;
+import org.apache.kafka.image.MetadataDelta;
+import org.apache.kafka.image.MetadataImage;
+import org.apache.kafka.image.loader.LoaderManifest;
+import org.apache.kafka.image.loader.LoaderManifestType;
+import org.apache.kafka.image.publisher.MetadataPublisher;
+import org.apache.kafka.metadata.authorizer.ClusterMetadataAuthorizer;
+import org.apache.kafka.server.authorizer.Authorizer;
+import org.apache.kafka.server.fault.FaultHandler;
+
+import org.slf4j.Logger;
+
+import java.util.Optional;
+import java.util.concurrent.TimeoutException;
+
+public class AclPublisher implements MetadataPublisher {
+private final Logger log;
+private final int nodeId;
+private final FaultHandler faultHandler;
+private final String nodeType;
+private final Optional authorizer;
+private boolean completedInitialLoad = false;
+
+public AclPublisher(int nodeId, FaultHandler faultHandler, String 
nodeType, Optional authorizer) {
+this.nodeId = nodeId;
+this.faultHandler = faultHandler;
+this.nodeType = nodeType;
+this.authorizer = 
authorizer.filter(ClusterMetadataAuthorizer.class::isInstance).map(ClusterMetadataAuthorizer.class::cast);
+this.log = new LogContext(String.format("[%s %s id=%d] ", 
AclPublisher.class.getSimpleName(), nodeType, 
nodeId)).logger(AclPublisher.class);

Review Comment:
   Updated it. Thanks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18868: add the "default value" explanation to the docs of num.replica.alter.log.dirs.threads [kafka]

2025-02-27 Thread via GitHub


chia7712 commented on code in PR #19038:
URL: https://github.com/apache/kafka/pull/19038#discussion_r1973070151


##
server/src/main/java/org/apache/kafka/server/config/ServerConfigs.java:
##
@@ -52,7 +52,8 @@ public class ServerConfigs {
 public static final String BACKGROUND_THREADS_DOC = "The number of threads 
to use for various background processing tasks";
 
 public static final String NUM_REPLICA_ALTER_LOG_DIRS_THREADS_CONFIG = 
"num.replica.alter.log.dirs.threads";
-public static final String NUM_REPLICA_ALTER_LOG_DIRS_THREADS_DOC = "The 
number of threads that can move replicas between log directories, which may 
include disk I/O";
+public static final String NUM_REPLICA_ALTER_LOG_DIRS_THREADS_DOC = "The 
number of threads that can move replicas between log directories, which may 
include disk I/O. " +
+"The default value is equal to the number of directories specified 
in the " + ServerLogConfigs.LOG_DIRS_CONFIG + " configuration 
property.";

Review Comment:
   the logs are defined by either `LOG_DIRS_CONFIG` or `LOG_DIR_CONFIG`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18706: Move AclPublisher to metadata module [kafka]

2025-02-27 Thread via GitHub


FrankYang0529 commented on code in PR #18802:
URL: https://github.com/apache/kafka/pull/18802#discussion_r1973100776


##
metadata/src/main/java/org/apache/kafka/metadata/publisher/AclPublisher.java:
##
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.metadata.publisher;
+
+import org.apache.kafka.common.utils.LogContext;
+import org.apache.kafka.image.MetadataDelta;
+import org.apache.kafka.image.MetadataImage;
+import org.apache.kafka.image.loader.LoaderManifest;
+import org.apache.kafka.image.loader.LoaderManifestType;
+import org.apache.kafka.image.publisher.MetadataPublisher;
+import org.apache.kafka.metadata.authorizer.ClusterMetadataAuthorizer;
+import org.apache.kafka.server.authorizer.Authorizer;
+import org.apache.kafka.server.fault.FaultHandler;
+
+import org.slf4j.Logger;
+
+import java.util.Optional;
+import java.util.concurrent.TimeoutException;
+
+public class AclPublisher implements MetadataPublisher {
+private final Logger log;
+private final int nodeId;
+private final FaultHandler faultHandler;
+private final String nodeType;
+private final Optional authorizer;
+private boolean completedInitialLoad = false;
+
+public AclPublisher(int nodeId, FaultHandler faultHandler, String 
nodeType, Optional authorizer) {
+this.nodeId = nodeId;
+this.faultHandler = faultHandler;
+this.nodeType = nodeType;
+this.authorizer = 
authorizer.filter(ClusterMetadataAuthorizer.class::isInstance).map(ClusterMetadataAuthorizer.class::cast);
+this.log = new LogContext(name()).logger(AclPublisher.class);
+}
+
+@Override
+public String name() {

Review Comment:
   Sorry, I missed this part. Updated it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18849: Add "strict min ISR" to the docs of "min.insync.replicas" [kafka]

2025-02-27 Thread via GitHub


chia7712 commented on PR #19016:
URL: https://github.com/apache/kafka/pull/19016#issuecomment-2687190842

   cherry-pick to 4.0


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18849: Add "strict min ISR" to the docs of "min.insync.replicas" [kafka]

2025-02-27 Thread via GitHub


chia7712 merged PR #19016:
URL: https://github.com/apache/kafka/pull/19016


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Resolved] (KAFKA-18849) add "strict min ISR" to the docs of "min.insync.replicas"

2025-02-27 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-18849.

Resolution: Fixed

trunk: 
[https://github.com/apache/kafka/commit/269e2d898b76b5c9c58232c35bee805ceacc2ead]

 

4.0: 
https://github.com/apache/kafka/commit/2f9da4691fdda9745cd7bee9bb9d8c8d3dfba3b1

> add "strict min ISR" to the docs of "min.insync.replicas"
> -
>
> Key: KAFKA-18849
> URL: https://issues.apache.org/jira/browse/KAFKA-18849
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: Wei-Ting Chen
>Priority: Minor
> Fix For: 4.0.0
>
>
> see https://github.com/apache/kafka/pull/18880



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] KAFKA-18706: Move AclPublisher to metadata module [kafka]

2025-02-27 Thread via GitHub


chia7712 commented on code in PR #18802:
URL: https://github.com/apache/kafka/pull/18802#discussion_r1973092684


##
metadata/src/main/java/org/apache/kafka/metadata/publisher/AclPublisher.java:
##
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.metadata.publisher;
+
+import org.apache.kafka.common.utils.LogContext;
+import org.apache.kafka.image.MetadataDelta;
+import org.apache.kafka.image.MetadataImage;
+import org.apache.kafka.image.loader.LoaderManifest;
+import org.apache.kafka.image.loader.LoaderManifestType;
+import org.apache.kafka.image.publisher.MetadataPublisher;
+import org.apache.kafka.metadata.authorizer.ClusterMetadataAuthorizer;
+import org.apache.kafka.server.authorizer.Authorizer;
+import org.apache.kafka.server.fault.FaultHandler;
+
+import org.slf4j.Logger;
+
+import java.util.Optional;
+import java.util.concurrent.TimeoutException;
+
+public class AclPublisher implements MetadataPublisher {
+private final Logger log;
+private final int nodeId;
+private final FaultHandler faultHandler;
+private final String nodeType;
+private final Optional authorizer;
+private boolean completedInitialLoad = false;
+
+public AclPublisher(int nodeId, FaultHandler faultHandler, String 
nodeType, Optional authorizer) {
+this.nodeId = nodeId;
+this.faultHandler = faultHandler;
+this.nodeType = nodeType;
+this.authorizer = 
authorizer.filter(ClusterMetadataAuthorizer.class::isInstance).map(ClusterMetadataAuthorizer.class::cast);
+this.log = new LogContext(name()).logger(AclPublisher.class);
+}
+
+@Override
+public String name() {

Review Comment:
   you need to add `final` - otherwise, you will see failed build



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (KAFKA-18870) implement describeDelegationToken for controller

2025-02-27 Thread Szu-Yung Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szu-Yung Wang reassigned KAFKA-18870:
-

Assignee: Szu-Yung Wang  (was: TaiJuWu)

> implement describeDelegationToken for controller
> 
>
> Key: KAFKA-18870
> URL: https://issues.apache.org/jira/browse/KAFKA-18870
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Chia-Ping Tsai
>Assignee: Szu-Yung Wang
>Priority: Major
>
> as title



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] KAFKA-18276 Migrate RebootstrapTest to new test infra [kafka]

2025-02-27 Thread via GitHub


m1a2st commented on code in PR #19046:
URL: https://github.com/apache/kafka/pull/19046#discussion_r1973122449


##
core/src/test/java/kafka/test/api/ProducerRebootstrapTest.java:
##
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package kafka.test.api;
+
+import kafka.server.KafkaBroker;
+import org.apache.kafka.clients.admin.Admin;
+import org.apache.kafka.clients.admin.NewTopic;
+import org.apache.kafka.clients.producer.ProducerConfig;
+import org.apache.kafka.clients.producer.ProducerRecord;
+import org.apache.kafka.clients.producer.RecordMetadata;
+import org.apache.kafka.common.config.TopicConfig;
+import org.apache.kafka.common.test.ClusterInstance;
+import org.apache.kafka.common.test.api.ClusterConfig;
+import org.apache.kafka.common.test.api.ClusterTemplate;
+import org.apache.kafka.common.test.api.Type;
+import org.apache.kafka.coordinator.group.GroupCoordinatorConfig;
+
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.ExecutionException;
+import java.util.stream.Stream;
+
+import static org.junit.jupiter.api.Assertions.assertEquals;
+
+public class ProducerRebootstrapTest {
+private static final int BROKER_COUNT = 2;
+
+static List generator() {
+Map serverProperties = new HashMap<>();
+// Enable unclean leader election for the test topic
+
serverProperties.put(TopicConfig.UNCLEAN_LEADER_ELECTION_ENABLE_CONFIG, "true");
+
serverProperties.put(GroupCoordinatorConfig.OFFSETS_TOPIC_REPLICATION_FACTOR_CONFIG,
 String.valueOf(BROKER_COUNT));
+
+return Stream.of(false, true)
+.map(ProducerRebootstrapTest::getRebootstrapConfig)
+.map(rebootstrapProperties -> 
ProducerRebootstrapTest.buildConfig(serverProperties, rebootstrapProperties))
+.toList();
+}
+
+static Map getRebootstrapConfig(boolean 
useRebootstrapTriggerMs) {
+Map properties = new HashMap<>();
+if (useRebootstrapTriggerMs) {
+properties.put("metadata.recovery.rebootstrap.trigger.ms", "5000");
+} else {
+properties.put("metadata.recovery.rebootstrap.trigger.ms", 
"360");
+properties.put("socket.connection.setup.timeout.ms", "5000");
+properties.put("socket.connection.setup.timeout.max.ms", "5000");
+properties.put("reconnect.backoff.ms", "1000");
+properties.put("reconnect.backoff.max.ms", "1000");
+}
+properties.put("metadata.recovery.strategy", "rebootstrap");
+properties.putIfAbsent(ProducerConfig.ACKS_CONFIG, "-1");
+return properties;
+}
+
+static ClusterConfig buildConfig(Map serverProperties, 
Map rebootstrapProperties) {
+return ClusterConfig.defaultBuilder()
+.setTypes(Set.of(Type.KRAFT))
+.setBrokers(BROKER_COUNT)
+.setProducerProperties(rebootstrapProperties)
+.setServerProperties(serverProperties).build();
+}
+
+@ClusterTemplate(value = "generator")
+public void testRebootstrap(ClusterInstance clusterInstance) throws 
ExecutionException, InterruptedException {
+String topic = "topic";
+try (Admin admin = clusterInstance.admin()) {
+admin.createTopics(Collections.singletonList(new NewTopic(topic, 
BROKER_COUNT, (short) 2))).all().get();

Review Comment:
   Please use `List.of()` instead `Collections.singletonList`, and  I don't 
think we need to get the future value.
   ```
   admin.createTopics(List.of(new NewTopic(topic, BROKER_COUNT, (short) 2)));
   ```



##
core/src/test/java/kafka/test/api/ProducerRebootstrapTest.java:
##
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+

Re: [PR] [MINOR] Clean up coordinator-common and server modules [kafka]

2025-02-27 Thread via GitHub


chia7712 merged PR #19009:
URL: https://github.com/apache/kafka/pull/19009


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-17565: Move MetadataCache interface to metadata module [kafka]

2025-02-27 Thread via GitHub


chia7712 commented on code in PR #18801:
URL: https://github.com/apache/kafka/pull/18801#discussion_r1973109296


##
core/src/main/scala/kafka/server/metadata/KRaftMetadataCache.scala:
##
@@ -345,68 +352,75 @@ class KRaftMetadataCache(
 
Option(_currentImage.cluster.broker(brokerId)).count(_.inControlledShutdown) == 
1
   }
 
-  override def getAliveBrokers(): Iterable[BrokerMetadata] = 
getAliveBrokers(_currentImage)
+  override def getAliveBrokers(): util.List[BrokerMetadata] = 
getAliveBrokers(_currentImage)
 
-  private def getAliveBrokers(image: MetadataImage): Iterable[BrokerMetadata] 
= {
-image.cluster().brokers().values().asScala.filterNot(_.fenced()).
-  map(b => new BrokerMetadata(b.id, b.rack))
+  private def getAliveBrokers(image: MetadataImage): util.List[BrokerMetadata] 
= {
+_currentImage.cluster().brokers().values().stream()

Review Comment:
   `_currentImage` -> `image`



##
metadata/src/main/java/org/apache/kafka/metadata/MetadataCache.java:
##
@@ -0,0 +1,243 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.metadata;
+
+import org.apache.kafka.admin.BrokerMetadata;
+import org.apache.kafka.common.Cluster;
+import org.apache.kafka.common.Node;
+import org.apache.kafka.common.PartitionInfo;
+import org.apache.kafka.common.TopicPartition;
+import org.apache.kafka.common.Uuid;
+import org.apache.kafka.common.internals.Topic;
+import org.apache.kafka.common.message.DescribeClientQuotasRequestData;
+import org.apache.kafka.common.message.DescribeClientQuotasResponseData;
+import org.apache.kafka.common.message.DescribeTopicPartitionsResponseData;
+import org.apache.kafka.common.message.DescribeUserScramCredentialsRequestData;
+import 
org.apache.kafka.common.message.DescribeUserScramCredentialsResponseData;
+import org.apache.kafka.common.message.MetadataResponseData;
+import org.apache.kafka.common.network.ListenerName;
+import org.apache.kafka.image.MetadataImage;
+import org.apache.kafka.server.common.FinalizedFeatures;
+import org.apache.kafka.server.common.MetadataVersion;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
+import java.util.concurrent.ThreadLocalRandom;
+import java.util.function.Function;
+import java.util.stream.Collectors;
+
+public interface MetadataCache extends ConfigRepository {
+
+/**
+ * Return topic metadata for a given set of topics and listener. See 
KafkaApis#handleTopicMetadataRequest for details
+ * on the use of the two boolean flags.
+ *
+ * @param topics  The set of topics.
+ * @param listenerNameThe listener name.
+ * @param errorUnavailableEndpoints   If true, we return an error on 
unavailable brokers. This is used to support
+ *MetadataResponse version 0.
+ * @param errorUnavailableListeners   If true, return LEADER_NOT_AVAILABLE 
if the listener is not found on the leader.
+ *This is used for MetadataResponse 
versions 0-5.
+ * @returnA collection of topic metadata.
+ */
+List getTopicMetadata(
+Set topics,
+ListenerName listenerName,
+boolean errorUnavailableEndpoints,
+boolean errorUnavailableListeners);
+
+Set getAllTopics();
+
+Set getTopicPartitions(String topicName);
+
+boolean hasAliveBroker(int brokerId);
+
+List getAliveBrokers();
+
+Optional getAliveBrokerEpoch(int brokerId);
+
+boolean isBrokerFenced(int brokerId);
+
+boolean isBrokerShuttingDown(int brokerId);
+
+Uuid getTopicId(String topicName);
+
+Optional getTopicName(Uuid topicId);
+
+Optional getAliveBrokerNode(int brokerId, ListenerName listenerName);
+
+List getAliveBrokerNodes(ListenerName listenerName);
+
+List getBrokerNodes(ListenerName listenerName);
+
+Optional getLeaderAndIsr(String topic, int partitionId);
+
+/**
+ * Return t

[jira] [Created] (KAFKA-18880) Remove kafka.cluster.Broker and BrokerEndPointNotAvailableException

2025-02-27 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-18880:
--

 Summary: Remove kafka.cluster.Broker and 
BrokerEndPointNotAvailableException
 Key: KAFKA-18880
 URL: https://issues.apache.org/jira/browse/KAFKA-18880
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai


they are not used anymore after removing zk code



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18886) add behavior change of CreateTopicPolicy and AlterConfigPolicy to zk2kraft

2025-02-27 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-18886:
--

 Summary: add behavior change of CreateTopicPolicy and 
AlterConfigPolicy to zk2kraft
 Key: KAFKA-18886
 URL: https://issues.apache.org/jira/browse/KAFKA-18886
 Project: Kafka
  Issue Type: Sub-task
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


as title. they are not running on broker under kraft.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-18882) Move BaseKey, TxnKey, and UnknownKey to transaction-coordinator module

2025-02-27 Thread Dmitry Werner (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-18882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931118#comment-17931118
 ] 

Dmitry Werner commented on KAFKA-18882:
---

[~chia7712] Hello, can I work on this task?

> Move BaseKey, TxnKey, and UnknownKey to transaction-coordinator module
> --
>
> Key: KAFKA-18882
> URL: https://issues.apache.org/jira/browse/KAFKA-18882
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Minor
>
> as title



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-18871) KRaft migration rollback causes downtime

2025-02-27 Thread Daniel Urban (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Urban updated KAFKA-18871:
-
Attachment: kraft-rollback-bug.zip

> KRaft migration rollback causes downtime
> 
>
> Key: KAFKA-18871
> URL: https://issues.apache.org/jira/browse/KAFKA-18871
> Project: Kafka
>  Issue Type: Bug
>  Components: kraft, migration
>Affects Versions: 3.9.0
>Reporter: Daniel Urban
>Priority: Critical
> Attachments: kraft-rollback-bug.zip
>
>
> When testing the KRaft migration rollback feature, found the following 
> scenario:
>  # Execute KRaft migration on a 3 broker 3 ZK node cluster to the last step, 
> but do not finalize the migration.
>  ## In the test, we put a slow but continuous produce+consume load on the 
> cluster, with a topic (partitions=3, RF=3, min ISR=2)
>  # Start the rollback procedure
>  # First we roll back the brokers from KRaft mode to migration mode (both 
> controller and ZK configs are set, process roles are removed, 
> {{zookeeper.metadata.migration.enable}} is true)
>  # Then we delete the KRaft controllers, delete the /controller znode
>  # Then we immediately start rolling the brokers 1 by 1 to ZK mode by 
> removing the {{zookeeper.metadata.migration.enable}} flag and the 
> controller.* configurations.
>  # At this point, when we restart the 1st broker (let's call it broker 0) 
> into ZK mode, we find an issue which occurs ~1 out of 20 times:
> If broker 0 is not in the ISR for one of the partitions at this point, it can 
> simply never become part of the ISR. Since we are aiming for zero downtime, 
> we check the ISR states of partitions between broker restarts, and our 
> process gets blocked at this point. We have tried multiple workarounds at 
> this point, but it seems that there is no workaround which still ensures zero 
> downtime.
> Some more details about the process:
>  * We are using Strimzi to drive this process, but have verified that Strimzi 
> follows the documented steps precisely.
>  * When we reach the error state, it doesn't matter which broker became the 
> controller through the ZK node, the brokers still in migration mode get 
> stuck, and they flood the logs with the following error:
> {code:java}
> 2025-02-26 10:55:21,985 WARN [RaftManager id=0] Error connecting to node 
> kraft-rollback-kafka-controller-pool-5.kraft-rollback-kafka-kafka-brokers.csm-op-test-kraft-rollback-e7798bef.svc.cluster.local:9090
>  (id: 5 rack: null) (org.apache.kafka.clients.NetworkClient) 
> [kafka-raft-outbound-request-thread]
> java.net.UnknownHostException: 
> kraft-rollback-kafka-controller-pool-5.kraft-rollback-kafka-kafka-brokers.csm-op-test-kraft-rollback-e7798bef.svc.cluster.local
>         at 
> java.base/java.net.InetAddress$CachedAddresses.get(InetAddress.java:801)
>         at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1533)
>         at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1385)
>         at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1306)
>         at 
> org.apache.kafka.clients.DefaultHostResolver.resolve(DefaultHostResolver.java:27)
>         at org.apache.kafka.clients.ClientUtils.resolve(ClientUtils.java:125)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.resolveAddresses(ClusterConnectionStates.java:536)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.currentAddress(ClusterConnectionStates.java:511)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.access$200(ClusterConnectionStates.java:466)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates.currentAddress(ClusterConnectionStates.java:173)
>         at 
> org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:1075)
>         at 
> org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:321)
>         at 
> org.apache.kafka.server.util.InterBrokerSendThread.sendRequests(InterBrokerSendThread.java:146)
>         at 
> org.apache.kafka.server.util.InterBrokerSendThread.pollOnce(InterBrokerSendThread.java:109)
>         at 
> org.apache.kafka.server.util.InterBrokerSendThread.doWork(InterBrokerSendThread.java:137)
>         at 
> org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:136)
>  {code}
>  * Manually verified the last offsets of the replicas, and broker 0 is caught 
> up in the partition.
>  * Even after stopping the produce load, the issue persists.
>  * Even after removing the /controller node manually (to retrigger election), 
> regardless of which broker becomes the controller, the issue persists.
> Based on the above, it seems that during the rollback, brokers in migration 
> mode cannot handle the KRaft controllers being removed from th

[PR] KAFKA-18860: Remove Missing Features [kafka]

2025-02-27 Thread via GitHub


frankvicky opened a new pull request, #19048:
URL: https://github.com/apache/kafka/pull/19048

   JIRA: KAFKA-18860
   The content of Missing Features has been documented in zk2kraft.html. Hence, 
we could remove it
   
   Reviewers:
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (KAFKA-18860) Clarify the KRaft missing feature

2025-02-27 Thread Chia-Ping Tsai (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-18860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931107#comment-17931107
 ] 

Chia-Ping Tsai commented on KAFKA-18860:


in the zk2kraft, we have write down the "non-dynamic configurations" 
("advertised.listeners"), and hence we can remove the description. Also, we can 
move the section directly.

> Clarify the KRaft missing feature
> -
>
> Key: KAFKA-18860
> URL: https://issues.apache.org/jira/browse/KAFKA-18860
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Luke Chen
>Assignee: TengYao Chi
>Priority: Major
>
> In the KRaft [missing 
> feature|https://kafka.apache.org/documentation/#kraft_missing] section, 
> there's one remaining:
>  
> Modifying certain dynamic configurations on the standalone KRaft controller
>  
> We should clarify what it means and make it clear to users. To me, I don't 
> think there's anything we can't update to KRaft controller. But not 100% sure 
> what the original author mean.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-18860) Clarify the KRaft missing feature

2025-02-27 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai updated KAFKA-18860:
---
Fix Version/s: 4.0.0

> Clarify the KRaft missing feature
> -
>
> Key: KAFKA-18860
> URL: https://issues.apache.org/jira/browse/KAFKA-18860
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Luke Chen
>Assignee: TengYao Chi
>Priority: Major
> Fix For: 4.0.0
>
>
> In the KRaft [missing 
> feature|https://kafka.apache.org/documentation/#kraft_missing] section, 
> there's one remaining:
>  
> Modifying certain dynamic configurations on the standalone KRaft controller
>  
> We should clarify what it means and make it clear to users. To me, I don't 
> think there's anything we can't update to KRaft controller. But not 100% sure 
> what the original author mean.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18885) document the behavioral differences between ZooKeeper and KRaft

2025-02-27 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-18885:
--

 Summary: document the behavioral differences between ZooKeeper and 
KRaft
 Key: KAFKA-18885
 URL: https://issues.apache.org/jira/browse/KAFKA-18885
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


as title



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-18860) Clarify the KRaft missing feature

2025-02-27 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai updated KAFKA-18860:
---
Parent: KAFKA-18885
Issue Type: Sub-task  (was: Bug)

> Clarify the KRaft missing feature
> -
>
> Key: KAFKA-18860
> URL: https://issues.apache.org/jira/browse/KAFKA-18860
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Luke Chen
>Assignee: TengYao Chi
>Priority: Major
>
> In the KRaft [missing 
> feature|https://kafka.apache.org/documentation/#kraft_missing] section, 
> there's one remaining:
>  
> Modifying certain dynamic configurations on the standalone KRaft controller
>  
> We should clarify what it means and make it clear to users. To me, I don't 
> think there's anything we can't update to KRaft controller. But not 100% sure 
> what the original author mean.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-18886) add behavior change of CreateTopicPolicy and AlterConfigPolicy to zk2kraft

2025-02-27 Thread Chia-Ping Tsai (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-18886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931109#comment-17931109
 ] 

Chia-Ping Tsai commented on KAFKA-18886:


Additionally, we should update the documentation for 
create.topic.policy.class.name and alter.config.policy.class.name to remind 
users about the behavior change.

> add behavior change of CreateTopicPolicy and AlterConfigPolicy to zk2kraft
> --
>
> Key: KAFKA-18886
> URL: https://issues.apache.org/jira/browse/KAFKA-18886
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Minor
>
> as title. they are not running on broker under kraft.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-18871) KRaft migration rollback causes downtime

2025-02-27 Thread Daniel Urban (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931121#comment-17931121
 ] 

Daniel Urban commented on KAFKA-18871:
--

[~showuon] Thank you for checking this!

I uploaded the strimzi + kafka broker logs. Some more info:
 * The topic partition with the ouf-of-sync replica is kraft-test-topic-2
 * We had multiple tests running in parallel, managed by the same strimzi 
installation. The namespace of the Kafka cluster which got stuck in rollback is 
csm-op-test-kraft-rollback-f19bca6a
 * We have the logs of each incarnation of the broker pods (i.e. logs for each 
restart). default-pool contains regular brokers, controller-pool contains 
dedicated KRaft brokers. For each pod, we have a separate log file where the 
name of the file is the pod ID.
 * We found an issue with Strimzi itself - since the reconciliation got stuck 
with the rolling of the brokers (due to the at-min-isr state of a partition), 
the same steps were repeated again and again, which included the removal of the 
/controller znode. This was repeated every 2 minutes, causing frequent 
controller changes. With that said, I scaled down the cluster operator 
deployment, and verified that even if the controller stabilizes (there is no 
/controller removal every 2 minutes), the ISR never gets fixed.

> KRaft migration rollback causes downtime
> 
>
> Key: KAFKA-18871
> URL: https://issues.apache.org/jira/browse/KAFKA-18871
> Project: Kafka
>  Issue Type: Bug
>  Components: kraft, migration
>Affects Versions: 3.9.0
>Reporter: Daniel Urban
>Priority: Critical
> Attachments: kraft-rollback-bug.zip
>
>
> When testing the KRaft migration rollback feature, found the following 
> scenario:
>  # Execute KRaft migration on a 3 broker 3 ZK node cluster to the last step, 
> but do not finalize the migration.
>  ## In the test, we put a slow but continuous produce+consume load on the 
> cluster, with a topic (partitions=3, RF=3, min ISR=2)
>  # Start the rollback procedure
>  # First we roll back the brokers from KRaft mode to migration mode (both 
> controller and ZK configs are set, process roles are removed, 
> {{zookeeper.metadata.migration.enable}} is true)
>  # Then we delete the KRaft controllers, delete the /controller znode
>  # Then we immediately start rolling the brokers 1 by 1 to ZK mode by 
> removing the {{zookeeper.metadata.migration.enable}} flag and the 
> controller.* configurations.
>  # At this point, when we restart the 1st broker (let's call it broker 0) 
> into ZK mode, we find an issue which occurs ~1 out of 20 times:
> If broker 0 is not in the ISR for one of the partitions at this point, it can 
> simply never become part of the ISR. Since we are aiming for zero downtime, 
> we check the ISR states of partitions between broker restarts, and our 
> process gets blocked at this point. We have tried multiple workarounds at 
> this point, but it seems that there is no workaround which still ensures zero 
> downtime.
> Some more details about the process:
>  * We are using Strimzi to drive this process, but have verified that Strimzi 
> follows the documented steps precisely.
>  * When we reach the error state, it doesn't matter which broker became the 
> controller through the ZK node, the brokers still in migration mode get 
> stuck, and they flood the logs with the following error:
> {code:java}
> 2025-02-26 10:55:21,985 WARN [RaftManager id=0] Error connecting to node 
> kraft-rollback-kafka-controller-pool-5.kraft-rollback-kafka-kafka-brokers.csm-op-test-kraft-rollback-e7798bef.svc.cluster.local:9090
>  (id: 5 rack: null) (org.apache.kafka.clients.NetworkClient) 
> [kafka-raft-outbound-request-thread]
> java.net.UnknownHostException: 
> kraft-rollback-kafka-controller-pool-5.kraft-rollback-kafka-kafka-brokers.csm-op-test-kraft-rollback-e7798bef.svc.cluster.local
>         at 
> java.base/java.net.InetAddress$CachedAddresses.get(InetAddress.java:801)
>         at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1533)
>         at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1385)
>         at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1306)
>         at 
> org.apache.kafka.clients.DefaultHostResolver.resolve(DefaultHostResolver.java:27)
>         at org.apache.kafka.clients.ClientUtils.resolve(ClientUtils.java:125)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.resolveAddresses(ClusterConnectionStates.java:536)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.currentAddress(ClusterConnectionStates.java:511)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.access$200(ClusterConnectionStates.java:466)
>         at 
> 

Re: [PR] KAFKA-18734: Implemented share partition metrics (KIP-1103) [kafka]

2025-02-27 Thread via GitHub


AndrewJSchofield commented on code in PR #19045:
URL: https://github.com/apache/kafka/pull/19045#discussion_r1973396006


##
core/src/main/java/kafka/server/share/SharePartition.java:
##
@@ -1265,19 +1289,7 @@ boolean canAcquireRecords() {
 if (nextFetchOffset() != endOffset() + 1) {
 return true;
 }
-
-lock.readLock().lock();
-long numRecords;
-try {
-if (cachedState.isEmpty()) {
-numRecords = 0;
-} else {
-numRecords = this.endOffset - this.startOffset + 1;
-}
-} finally {
-lock.readLock().unlock();
-}
-return numRecords < maxInFlightMessages;
+return numInflightRecords() < maxInFlightMessages;

Review Comment:
   nit: `numInFlightRecords` for consistency.



##
server/src/main/java/org/apache/kafka/server/share/metrics/SharePartitionMetrics.java:
##
@@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.server.share.metrics;
+
+import org.apache.kafka.common.utils.Utils;
+import org.apache.kafka.server.metrics.KafkaMetricsGroup;
+
+import com.yammer.metrics.core.Histogram;
+import com.yammer.metrics.core.Meter;
+
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.TimeUnit;
+import java.util.function.Supplier;
+
+/**
+ * SharePartitionMetrics is used to track the broker-side metrics for the 
SharePartition.
+ */
+public class SharePartitionMetrics implements AutoCloseable {
+
+public static final String INFLIGHT_MESSAGE_COUNT = "InFlightMessageCount";

Review Comment:
   nit: Ought to be `IN_FLIGHT_` given the capitalization in the string version.



##
server/src/main/java/org/apache/kafka/server/share/metrics/SharePartitionMetrics.java:
##
@@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.server.share.metrics;
+
+import org.apache.kafka.common.utils.Utils;
+import org.apache.kafka.server.metrics.KafkaMetricsGroup;
+
+import com.yammer.metrics.core.Histogram;
+import com.yammer.metrics.core.Meter;
+
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.TimeUnit;
+import java.util.function.Supplier;
+
+/**
+ * SharePartitionMetrics is used to track the broker-side metrics for the 
SharePartition.
+ */
+public class SharePartitionMetrics implements AutoCloseable {
+
+public static final String INFLIGHT_MESSAGE_COUNT = "InFlightMessageCount";
+public static final String INFLIGHT_BATCH_COUNT = "InFlightBatchCount";
+
+private static final String ACQUISITION_LOCK_TIMEOUT_PER_SEC = 
"AcquisitionLockTimeoutPerSec";
+private static final String INFLIGHT_BATCH_MESSAGE_COUNT = 
"InFlightBatchMessageCount";
+private static final String FETCH_LOCK_TIME_MS = "FetchLockTimeMs";
+private static final String FETCH_LOCK_RATIO = "FetchLockRatio";
+
+/**
+ * Metric for the rate of acquisition lock timeouts for records.
+ */
+private final Meter acquisitionLockTimeoutPerSec;
+/**
+ * Metric for the number of in-flight messages for the batch.
+ */
+private final Histogram inFlightBatchMessageCount;
+/**
+ * Metric for the time the fetch lock is held.
+ */
+private final Histogram fetchLockTimeMs;
+/**
+ * Metric for the ratio of fetch lock time to the

Re: [PR] KAFKA-16718-2/n: KafkaAdminClient and GroupCoordinator implementation for DeleteShareGroupOffsets RPC [kafka]

2025-02-27 Thread via GitHub


AndrewJSchofield commented on code in PR #18976:
URL: https://github.com/apache/kafka/pull/18976#discussion_r1973409686


##
clients/src/main/java/org/apache/kafka/clients/admin/DeleteShareGroupOffsetsResult.java:
##
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.clients.admin;
+
+import org.apache.kafka.common.KafkaFuture;
+import org.apache.kafka.common.TopicPartition;
+import org.apache.kafka.common.annotation.InterfaceStability;
+import org.apache.kafka.common.internals.KafkaFutureImpl;
+import org.apache.kafka.common.protocol.Errors;
+
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * The result of the {@link Admin#deleteShareGroupOffsets(String, Set, 
DeleteShareGroupOffsetsOptions)} call.
+ * 
+ * The API of this class is evolving, see {@link Admin} for details.
+ */
+@InterfaceStability.Evolving
+public class DeleteShareGroupOffsetsResult {
+
+private final KafkaFuture> future;
+
+DeleteShareGroupOffsetsResult(KafkaFuture> 
future) {
+this.future = future;
+}
+
+/**
+ * Return a future which succeeds only if all the deletions succeed.
+ */
+public KafkaFuture all() {
+return this.future.thenApply(topicPartitionErrorsMap ->  {
+List partitionsFailed = 
topicPartitionErrorsMap.entrySet()
+.stream()
+.filter(e -> e.getValue() != Errors.NONE)
+.map(Map.Entry::getKey)
+.collect(Collectors.toList());
+for (Errors error : topicPartitionErrorsMap.values()) {
+if (error != Errors.NONE) {
+throw error.exception(
+"Failed deleting share group offsets for the following 
partitions: " + partitionsFailed);
+}
+}
+return null;
+});
+}
+
+/**
+ * Return a future which can be used to check the result for a given 
partition.
+ */
+public KafkaFuture partitionResult(final TopicPartition partition) {
+final KafkaFutureImpl result = new KafkaFutureImpl<>();
+
+this.future.whenComplete((topicPartitions, throwable) -> {
+if (throwable != null) {
+result.completeExceptionally(throwable);
+} else if (!topicPartitions.containsKey(partition)) {
+result.completeExceptionally(new IllegalArgumentException(
+"Delete offset for partition \"" + partition + "\" was not 
attempted"));
+} else {
+final Errors error = topicPartitions.get(partition);
+if (error == Errors.NONE) {
+result.complete(null);
+} else {
+result.completeExceptionally(error.exception());
+}
+}
+});
+
+return result;
+}
+
+/**

Review Comment:
   I wonder why this method exists. It's not in the KIP (which could perhaps be 
an omission) and it's not in `DeleteConsumerGroupOffsetsResult`.



##
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/GroupCoordinatorService.java:
##
@@ -1206,6 +1210,93 @@ public 
CompletableFuture 
deleteShareGroupOffsets(
+RequestContext context,
+DeleteShareGroupOffsetsRequestData requestData
+) {
+if (!isActive.get()) {
+return CompletableFuture.completedFuture(
+
DeleteShareGroupOffsetsRequest.getErrorDeleteResponseData(Errors.COORDINATOR_NOT_AVAILABLE));
+}
+
+if (metadataImage == null) {
+return CompletableFuture.completedFuture(
+
DeleteShareGroupOffsetsRequest.getErrorDeleteResponseData(Errors.COORDINATOR_NOT_AVAILABLE));
+}
+
+Map requestTopicIdToNameMapping = new HashMap<>();
+List 
deleteShareGroupStateRequestTopicsData = new 
ArrayList<>(requestData.topics().size());
+
List 
deleteShareGroupOffsetsResponseTopicList = new 
ArrayList<>(requestData.topics().size());
+
+requestData.topics().forEach(topic -> {
+Uuid topicId = 
metadataImage.topics().topicNameToIdView().get(top

Re: [PR] KAFKA-18617 Allow use of ClusterInstance inside BeforeEach [kafka]

2025-02-27 Thread via GitHub


chia7712 commented on code in PR #18662:
URL: https://github.com/apache/kafka/pull/18662#discussion_r1973443282


##
test-common/test-common-runtime/src/test/java/org/apache/kafka/common/test/junit/ClusterTestBeforeEachTest.java:
##
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kafka.common.test.junit;
+
+import org.apache.kafka.common.test.ClusterInstance;
+import org.apache.kafka.common.test.api.AutoStart;
+import org.apache.kafka.common.test.api.ClusterTest;
+
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.extension.ExtendWith;
+
+import static org.junit.jupiter.api.Assertions.assertDoesNotThrow;
+import static org.junit.jupiter.api.Assertions.assertNotNull;
+import static org.junit.jupiter.api.Assertions.assertTrue;
+
+@ExtendWith(ClusterTestExtensions.class)
+public class ClusterTestBeforeEachTest {
+private final ClusterInstance clusterInstance;
+
+ClusterTestBeforeEachTest(ClusterInstance clusterInstance) { // 
Constructor injections
+this.clusterInstance = clusterInstance;
+}
+
+@BeforeEach
+void before() {
+assertNotNull(clusterInstance);
+if (!clusterInstance.started()) {
+clusterInstance.start();
+}
+assertDoesNotThrow(clusterInstance::waitForReadyBrokers);
+}
+
+@ClusterTest
+public void testDefault() {
+assertTrue(true);

Review Comment:
   why to add this assert?



##
test-common/test-common-runtime/src/test/java/org/apache/kafka/common/test/junit/ClusterTestBeforeEachTest.java:
##
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kafka.common.test.junit;
+
+import org.apache.kafka.common.test.ClusterInstance;
+import org.apache.kafka.common.test.api.AutoStart;
+import org.apache.kafka.common.test.api.ClusterTest;
+
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.extension.ExtendWith;
+
+import static org.junit.jupiter.api.Assertions.assertDoesNotThrow;
+import static org.junit.jupiter.api.Assertions.assertNotNull;
+import static org.junit.jupiter.api.Assertions.assertTrue;
+
+@ExtendWith(ClusterTestExtensions.class)
+public class ClusterTestBeforeEachTest {
+private final ClusterInstance clusterInstance;
+
+ClusterTestBeforeEachTest(ClusterInstance clusterInstance) { // 
Constructor injections
+this.clusterInstance = clusterInstance;
+}
+
+@BeforeEach
+void before() {
+assertNotNull(clusterInstance);
+if (!clusterInstance.started()) {
+clusterInstance.start();
+}
+assertDoesNotThrow(clusterInstance::waitForReadyBrokers);
+}
+
+@ClusterTest
+public void testDefault() {
+assertTrue(true);
+assertNotNull(clusterInstance);
+}
+
+@ClusterTest(autoStart = AutoStart.NO)
+public void testNoAutoStart() {
+assertTrue(true);

Review Comment:
   ditto



##
test-common/test-common-runtime/src/test/java/org/apache/kafka/common/test/junit/ClusterTestBeforeEachTest.java:
##
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may no

Re: [PR] KAFKA-18804: Remove slf4j warning when using tool script [kafka]

2025-02-27 Thread via GitHub


Yunyung commented on code in PR #18918:
URL: https://github.com/apache/kafka/pull/18918#discussion_r1973445909


##
build.gradle:
##
@@ -2485,7 +2485,9 @@ project(':tools') {
 from (configurations.runtimeClasspath) {
   exclude('kafka-clients*')
 }
-from (configurations.releaseOnly)
+from (configurations.releaseOnly) {
+  exclude('log4j-slf4j-impl-*.jar')

Review Comment:
   Done. Also re-verify it works.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (KAFKA-18871) KRaft migration rollback causes downtime

2025-02-27 Thread Daniel Urban (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931133#comment-17931133
 ] 

Daniel Urban commented on KAFKA-18871:
--

[~ppatierno] Thanks for checking the logs. As you said, cluster operator not 
rolling due to the ISR is completely normal and works as expected.

The issue is that for the cluster in namespace 
csm-op-test-kraft-rollback-f19bca6a broker 1 is never added to the ISR.

I've mentioned that there were multiple kafka clusters managed by this 
operator, all of them being rolled back from the KRaft migration, and some 
passed successfully (that's probably the one you found when you said "At some 
point AFAICS, the broker 0 is rolled because broker 1 seems to catch up."), but 
for the specific cluster in ns csm-op-test-kraft-rollback-f19bca6a, the last 
roll does not happen, it gets stuck after broker 1 was rolled, and then the 
roll is blocked due to the at-min-isr state of 1 partition.

> KRaft migration rollback causes downtime
> 
>
> Key: KAFKA-18871
> URL: https://issues.apache.org/jira/browse/KAFKA-18871
> Project: Kafka
>  Issue Type: Bug
>  Components: kraft, migration
>Affects Versions: 3.9.0
>Reporter: Daniel Urban
>Priority: Critical
> Attachments: kraft-rollback-bug.zip
>
>
> When testing the KRaft migration rollback feature, found the following 
> scenario:
>  # Execute KRaft migration on a 3 broker 3 ZK node cluster to the last step, 
> but do not finalize the migration.
>  ## In the test, we put a slow but continuous produce+consume load on the 
> cluster, with a topic (partitions=3, RF=3, min ISR=2)
>  # Start the rollback procedure
>  # First we roll back the brokers from KRaft mode to migration mode (both 
> controller and ZK configs are set, process roles are removed, 
> {{zookeeper.metadata.migration.enable}} is true)
>  # Then we delete the KRaft controllers, delete the /controller znode
>  # Then we immediately start rolling the brokers 1 by 1 to ZK mode by 
> removing the {{zookeeper.metadata.migration.enable}} flag and the 
> controller.* configurations.
>  # At this point, when we restart the 1st broker (let's call it broker 0) 
> into ZK mode, we find an issue which occurs ~1 out of 20 times:
> If broker 0 is not in the ISR for one of the partitions at this point, it can 
> simply never become part of the ISR. Since we are aiming for zero downtime, 
> we check the ISR states of partitions between broker restarts, and our 
> process gets blocked at this point. We have tried multiple workarounds at 
> this point, but it seems that there is no workaround which still ensures zero 
> downtime.
> Some more details about the process:
>  * We are using Strimzi to drive this process, but have verified that Strimzi 
> follows the documented steps precisely.
>  * When we reach the error state, it doesn't matter which broker became the 
> controller through the ZK node, the brokers still in migration mode get 
> stuck, and they flood the logs with the following error:
> {code:java}
> 2025-02-26 10:55:21,985 WARN [RaftManager id=0] Error connecting to node 
> kraft-rollback-kafka-controller-pool-5.kraft-rollback-kafka-kafka-brokers.csm-op-test-kraft-rollback-e7798bef.svc.cluster.local:9090
>  (id: 5 rack: null) (org.apache.kafka.clients.NetworkClient) 
> [kafka-raft-outbound-request-thread]
> java.net.UnknownHostException: 
> kraft-rollback-kafka-controller-pool-5.kraft-rollback-kafka-kafka-brokers.csm-op-test-kraft-rollback-e7798bef.svc.cluster.local
>         at 
> java.base/java.net.InetAddress$CachedAddresses.get(InetAddress.java:801)
>         at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1533)
>         at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1385)
>         at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1306)
>         at 
> org.apache.kafka.clients.DefaultHostResolver.resolve(DefaultHostResolver.java:27)
>         at org.apache.kafka.clients.ClientUtils.resolve(ClientUtils.java:125)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.resolveAddresses(ClusterConnectionStates.java:536)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.currentAddress(ClusterConnectionStates.java:511)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.access$200(ClusterConnectionStates.java:466)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates.currentAddress(ClusterConnectionStates.java:173)
>         at 
> org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:1075)
>         at 
> org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:321)
>         at 
> org.apache.kafka.server.util.InterBrokerSendThread.sendRequests(InterBro

[jira] [Created] (KAFKA-18887) Implement Admin API extensions beyond list/describe group (delete group, offset-related APIs)

2025-02-27 Thread Alieh Saeedi (Jira)
Alieh Saeedi created KAFKA-18887:


 Summary: Implement Admin API extensions beyond list/describe group 
(delete group, offset-related APIs)
 Key: KAFKA-18887
 URL: https://issues.apache.org/jira/browse/KAFKA-18887
 Project: Kafka
  Issue Type: Sub-task
Reporter: Alieh Saeedi






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-18887) Implement Admin API extensions beyond list/describe group (delete group, offset-related APIs)

2025-02-27 Thread Alieh Saeedi (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alieh Saeedi updated KAFKA-18887:
-
Description: 
* add methods for describing and manipulating offsets as described in KIP-1071

 * add corresponding unit tests

> Implement Admin API extensions beyond list/describe group (delete group, 
> offset-related APIs)
> -
>
> Key: KAFKA-18887
> URL: https://issues.apache.org/jira/browse/KAFKA-18887
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Alieh Saeedi
>Assignee: Alieh Saeedi
>Priority: Major
>
> * add methods for describing and manipulating offsets as described in KIP-1071
>  * add corresponding unit tests



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-18887) Implement Admin API extensions beyond list/describe group (delete group, offset-related APIs)

2025-02-27 Thread Alieh Saeedi (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alieh Saeedi reassigned KAFKA-18887:


Assignee: Alieh Saeedi

> Implement Admin API extensions beyond list/describe group (delete group, 
> offset-related APIs)
> -
>
> Key: KAFKA-18887
> URL: https://issues.apache.org/jira/browse/KAFKA-18887
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Alieh Saeedi
>Assignee: Alieh Saeedi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] KAFKA-18887 Implement Streams Admin APIs [kafka]

2025-02-27 Thread via GitHub


aliehsaeedii opened a new pull request, #19049:
URL: https://github.com/apache/kafka/pull/19049

   https://issues.apache.org/jira/browse/KAFKA-18887
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINOR: Remove unused system test code and avoid misleading `quorum.zk` references [kafka]

2025-02-27 Thread via GitHub


m1a2st commented on code in PR #19034:
URL: https://github.com/apache/kafka/pull/19034#discussion_r1973418908


##
tests/kafkatest/services/kafka/quorum.py:
##
@@ -14,14 +14,10 @@
 # limitations under the License.
 
 # the types of metadata quorums we support
-zk = 'ZK' # ZooKeeper, used before/during the KIP-500 bridge release(s)
-combined_kraft = 'COMBINED_KRAFT' # combined Controllers in KRaft mode, used 
during/after the KIP-500 bridge release(s)
-isolated_kraft = 'ISOLATED_KRAFT' # isolated Controllers in KRaft mode, used 
during/after the KIP-500 bridge release(s)
-
-# How we will parameterize tests that exercise all quorum styles
-#   [“ZK”, “ISOLATED_KRAFT”, "COMBINED_KRAFT"] during the KIP-500 bridge 
release(s)
-#   [“ISOLATED_KRAFT”, "COMBINED_KRAFT”] after the KIP-500 bridge release(s)
-all = [zk, isolated_kraft, combined_kraft]

Review Comment:
   `all = [zk, isolated_kraft, combined_kraft]` should not be removed, as it 
represents this variable in quorum.py at line 32."
   
   see the error log
   ```
   TypeError("argument of type 'builtin_function_or_method' is not 
iterable")
   Traceback (most recent call last):
 File 
"/usr/local/lib/python3.7/dist-packages/ducktape/tests/runner_client.py", line 
351, in _do_run
   data = self.run_test()
 File 
"/usr/local/lib/python3.7/dist-packages/ducktape/tests/runner_client.py", line 
411, in run_test
   return self.test_context.function(self.test)
 File "/usr/local/lib/python3.7/dist-packages/ducktape/mark/_mark.py", line 
438, in wrapper
   return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
 File 
"/opt/kafka-dev/tests/kafkatest/sanity_checks/test_performance_services.py", 
line 43, in test_version
   None, topics={self.topic: {'partitions': 1, 'replication-factor': 1}}, 
version=DEV_BRANCH)
 File "/opt/kafka-dev/tests/kafkatest/services/kafka/kafka.py", line 282, 
in __init__
   self.quorum_info = quorum.ServiceQuorumInfo.from_test_context(self, 
context)
 File "/opt/kafka-dev/tests/kafkatest/services/kafka/quorum.py", line 109, 
in from_test_context
   quorum_type = for_test(context)
 File "/opt/kafka-dev/tests/kafkatest/services/kafka/quorum.py", line 32, 
in for_test
   if retval not in all:
   TypeError: argument of type 'builtin_function_or_method' is not iterable
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (KAFKA-18888) Add support for Authorizer

2025-02-27 Thread Mickael Maison (Jira)
Mickael Maison created KAFKA-1:
--

 Summary: Add support for Authorizer
 Key: KAFKA-1
 URL: https://issues.apache.org/jira/browse/KAFKA-1
 Project: Kafka
  Issue Type: Sub-task
Reporter: Mickael Maison
Assignee: Mickael Maison






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-18871) KRaft migration rollback causes downtime

2025-02-27 Thread Paolo Patierno (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931126#comment-17931126
 ] 

Paolo Patierno commented on KAFKA-18871:


I had a look at the operator log and it looks like it's doing what has to be 
done.

The brokers need to be rolled and of course it's trying to start from broker 0. 
The issue is that the broker 1 is out of sync. As you can see here:
{code:java}
2025-02-25 11:45:31 DEBUG KafkaAvailability:101 - Reconciliation #10971(watch) 
Kafka(csm-op-test-kraft-rollback-f19bca6a/kraft-rollback-kafka): 
kraft-test-topic has min.insync.replicas=2.
2025-02-25 11:45:31 INFO  KafkaAvailability:135 - Reconciliation #10971(watch) 
Kafka(csm-op-test-kraft-rollback-f19bca6a/kraft-rollback-kafka): 
kraft-test-topic/2 will be under-replicated (ISR={0,2}, replicas=[0,1,2], 
min.insync.replicas=2) if broker 0 is restarted.
2025-02-25 11:45:31 DEBUG KafkaAvailability:86 - Reconciliation #10971(watch) 
Kafka(csm-op-test-kraft-rollback-f19bca6a/kraft-rollback-kafka): Restart pod 0 
would remove it from ISR, stalling producers with acks=all {code}
so the operator is preventing from rolling broker 0 because otherwise you would 
have under replicated partition.

This can happen in general and not strictly related to a rollback migration 
operation. It could even happen when you apply a broker config change which 
needs a rolling of all brokers but the operator can't roll because of the 
possibility of under replicated partitions.

At some point AFAICS, the broker 0 is rolled because broker 1 seems to catch 
up. All three brokers have ZooKeeper connect configuration as they are using 
the ZooKeeper ensemble again.

> KRaft migration rollback causes downtime
> 
>
> Key: KAFKA-18871
> URL: https://issues.apache.org/jira/browse/KAFKA-18871
> Project: Kafka
>  Issue Type: Bug
>  Components: kraft, migration
>Affects Versions: 3.9.0
>Reporter: Daniel Urban
>Priority: Critical
> Attachments: kraft-rollback-bug.zip
>
>
> When testing the KRaft migration rollback feature, found the following 
> scenario:
>  # Execute KRaft migration on a 3 broker 3 ZK node cluster to the last step, 
> but do not finalize the migration.
>  ## In the test, we put a slow but continuous produce+consume load on the 
> cluster, with a topic (partitions=3, RF=3, min ISR=2)
>  # Start the rollback procedure
>  # First we roll back the brokers from KRaft mode to migration mode (both 
> controller and ZK configs are set, process roles are removed, 
> {{zookeeper.metadata.migration.enable}} is true)
>  # Then we delete the KRaft controllers, delete the /controller znode
>  # Then we immediately start rolling the brokers 1 by 1 to ZK mode by 
> removing the {{zookeeper.metadata.migration.enable}} flag and the 
> controller.* configurations.
>  # At this point, when we restart the 1st broker (let's call it broker 0) 
> into ZK mode, we find an issue which occurs ~1 out of 20 times:
> If broker 0 is not in the ISR for one of the partitions at this point, it can 
> simply never become part of the ISR. Since we are aiming for zero downtime, 
> we check the ISR states of partitions between broker restarts, and our 
> process gets blocked at this point. We have tried multiple workarounds at 
> this point, but it seems that there is no workaround which still ensures zero 
> downtime.
> Some more details about the process:
>  * We are using Strimzi to drive this process, but have verified that Strimzi 
> follows the documented steps precisely.
>  * When we reach the error state, it doesn't matter which broker became the 
> controller through the ZK node, the brokers still in migration mode get 
> stuck, and they flood the logs with the following error:
> {code:java}
> 2025-02-26 10:55:21,985 WARN [RaftManager id=0] Error connecting to node 
> kraft-rollback-kafka-controller-pool-5.kraft-rollback-kafka-kafka-brokers.csm-op-test-kraft-rollback-e7798bef.svc.cluster.local:9090
>  (id: 5 rack: null) (org.apache.kafka.clients.NetworkClient) 
> [kafka-raft-outbound-request-thread]
> java.net.UnknownHostException: 
> kraft-rollback-kafka-controller-pool-5.kraft-rollback-kafka-kafka-brokers.csm-op-test-kraft-rollback-e7798bef.svc.cluster.local
>         at 
> java.base/java.net.InetAddress$CachedAddresses.get(InetAddress.java:801)
>         at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1533)
>         at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1385)
>         at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1306)
>         at 
> org.apache.kafka.clients.DefaultHostResolver.resolve(DefaultHostResolver.java:27)
>         at org.apache.kafka.clients.ClientUtils.resolve(ClientUtils.java:125)
>         at 
> org.apache.kafka.clients.ClusterConnectionState

[jira] [Created] (KAFKA-18889) Make records in ShareFetchResponse non-nullable

2025-02-27 Thread Apoorv Mittal (Jira)
Apoorv Mittal created KAFKA-18889:
-

 Summary: Make records in ShareFetchResponse non-nullable
 Key: KAFKA-18889
 URL: https://issues.apache.org/jira/browse/KAFKA-18889
 Project: Kafka
  Issue Type: Sub-task
Reporter: Apoorv Mittal
Assignee: Apoorv Mittal


The records should be made non-nullable in ShareFetchResponse to match with 
recent changes for FetchResponse. Context: 
https://github.com/apache/kafka/pull/18726



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-18886) add behavior change of CreateTopicPolicy and AlterConfigPolicy to zk2kraft

2025-02-27 Thread Logan Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Logan Zhu reassigned KAFKA-18886:
-

Assignee: Logan Zhu  (was: Chia-Ping Tsai)

> add behavior change of CreateTopicPolicy and AlterConfigPolicy to zk2kraft
> --
>
> Key: KAFKA-18886
> URL: https://issues.apache.org/jira/browse/KAFKA-18886
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Chia-Ping Tsai
>Assignee: Logan Zhu
>Priority: Minor
>
> as title. they are not running on broker under kraft.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] KAFKA-18827: Initialize share group state group coordinator impl. [3/N] [kafka]

2025-02-27 Thread via GitHub


AndrewJSchofield commented on code in PR #19026:
URL: https://github.com/apache/kafka/pull/19026#discussion_r1973313677


##
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/GroupMetadataManager.java:
##
@@ -463,6 +472,19 @@ GroupMetadataManager build() {
  */
 private final ShareGroupPartitionAssignor shareGroupAssignor;
 
+/**
+ * A record class to hold the value representing 
ShareGroupStatePartitionMetadata for the TimelineHashmap
+ * keyed on share group id.
+ *
+ * @param initializedTopics Map of set of partition ids keyed on the topic 
id.
+ * @param deletingTopicsMap of topic names keyed on topic id.
+ */
+private record ShareGroupStatePartitionMetadataInfo(

Review Comment:
   I don't understand why we need topic name in one case, but not the other.
   
   Option 1:
   ```
   Map intializedTopics
   Set deletingTopics
   ```
   
   Option 2:
   ```
   Map> intializedTopics
   Map deletingTopics
   ```
   
   I suspect that option 1 would be sufficient. What am I missing?



##
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/GroupMetadataManager.java:
##
@@ -2190,10 +2213,10 @@ private CoordinatorResult 
classicGroupJoinToConsumerGro
  * @param clientHostThe client host.
  * @param subscribedTopicNames  The list of subscribed topic names from 
the request or null.
  *
- * @return A Result containing the ShareGroupHeartbeat response and
- * a list of records to update the state machine.
+ * @return A Result containing a pair of ShareGroupHeartbeat response and 
maybe InitializeShareGroupStateParameters
+ * and a list of records to update the state machine.
  */
-private CoordinatorResult shareGroupHeartbeat(
+private CoordinatorResult>, CoordinatorRecord> 
shareGroupHeartbeat(

Review Comment:
   I am tempted to create a `Pair` in the common module. There are other 
places I've seen where we need a pair. Absolutely no need for you to address 
that in this PR. I understand what you're doing here.



##
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/GroupMetadataManager.java:
##
@@ -4741,6 +4889,50 @@ public void replay(
 }
 }
 
+/**
+ * Replays ShareGroupStatePartitionMetadataKey/Value to update the hard 
state of
+ * the share group.
+ *
+ * @param key   A ShareGroupStatePartitionMetadataKey key.
+ * @param value A ShareGroupStatePartitionMetadataValue record.
+ */
+public void replay(
+ShareGroupStatePartitionMetadataKey key,
+ShareGroupStatePartitionMetadataValue value
+) {
+String groupId = key.groupId();
+
+getOrMaybeCreatePersistedShareGroup(groupId, false);
+
+// Update timeline structures with info about initialized/deleted 
topics.
+if (value == null) {
+// Tombstone!
+shareGroupPartitionMetadata.remove(groupId);
+} else {
+// Init java record.
+ShareGroupStatePartitionMetadataInfo record = 
shareGroupPartitionMetadata.computeIfAbsent(
+groupId, k -> new ShareGroupStatePartitionMetadataInfo(new 
HashMap<>(), new HashMap<>())
+);
+
+// Remove all topicIds in deleting state from java record.
+Map deleting = new HashMap<>();
+for (ShareGroupStatePartitionMetadataValue.TopicInfo deletingTopic 
: value.deletingTopics()) {
+deleting.put(deletingTopic.topicId(), 
deletingTopic.topicName());
+}
+deleting.forEach((tId, tName) -> 
record.initializedTopics.remove(tId));

Review Comment:
   Looks to me like the initialized topics list is not initialized yet.



##
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/GroupCoordinatorShard.java:
##
@@ -1066,6 +1086,13 @@ public void replay(
 );
 break;
 
+case SHARE_GROUP_STATE_PARTITION_METADATA:

Review Comment:
   Please move this case to follow `SHARE_GROUP_CURRENT_MEMBER_ASSIGNMENT`.



##
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/GroupMetadataManager.java:
##
@@ -3981,6 +4105,30 @@ public 
CoordinatorResult sha
 request.subscribedTopicNames());
 }
 
+/**
+ * Handles an initialize share group state request. This is usually part of
+ * shareGroupHeartbeat code flow.
+ * @param groupId The group id corresponding to the share group whose 
share partitions have been initialized.
+ * @param topicPartitionMap Map representing topic partition data to be 
added to the share state partition metadata.
+ *
+ * @return A Result containing ShareGroupStatePartitionMetadata records 
and Void response.
+ */
+public CoordinatorResult 
initializeShareGroupState(
+String groupId,
+Map>> topicPartitionMap
+) {
+/

Re: [PR] KAFKA-18860: Remove Missing Features [kafka]

2025-02-27 Thread via GitHub


frankvicky commented on PR #19048:
URL: https://github.com/apache/kafka/pull/19048#issuecomment-2687668866

   It was between "Deploying Considerations" and "ZooKeeper to KRaft 
Migration". Now it's gone.
   Preview:
   
![未命名設計](https://github.com/user-attachments/assets/1ecdf6a6-204c-48a1-9d24-8797ae688b8b)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (KAFKA-18882) Move BaseKey, TxnKey, and UnknownKey to transaction-coordinator module

2025-02-27 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai reassigned KAFKA-18882:
--

Assignee: xuanzhang gong  (was: Chia-Ping Tsai)

> Move BaseKey, TxnKey, and UnknownKey to transaction-coordinator module
> --
>
> Key: KAFKA-18882
> URL: https://issues.apache.org/jira/browse/KAFKA-18882
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Chia-Ping Tsai
>Assignee: xuanzhang gong
>Priority: Minor
>
> as title



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-18882) Move BaseKey, TxnKey, and UnknownKey to transaction-coordinator module

2025-02-27 Thread Chia-Ping Tsai (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-18882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931130#comment-17931130
 ] 

Chia-Ping Tsai commented on KAFKA-18882:


[~javakillah] sorry that we are working on it :(

> Move BaseKey, TxnKey, and UnknownKey to transaction-coordinator module
> --
>
> Key: KAFKA-18882
> URL: https://issues.apache.org/jira/browse/KAFKA-18882
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Minor
>
> as title



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] KAFKA-18734: Implemented share partition metrics (KIP-1103) [kafka]

2025-02-27 Thread via GitHub


apoorvmittal10 commented on code in PR #19045:
URL: https://github.com/apache/kafka/pull/19045#discussion_r1973509905


##
server/src/main/java/org/apache/kafka/server/share/metrics/SharePartitionMetrics.java:
##
@@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.server.share.metrics;
+
+import org.apache.kafka.common.utils.Utils;
+import org.apache.kafka.server.metrics.KafkaMetricsGroup;
+
+import com.yammer.metrics.core.Histogram;
+import com.yammer.metrics.core.Meter;
+
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.TimeUnit;
+import java.util.function.Supplier;
+
+/**
+ * SharePartitionMetrics is used to track the broker-side metrics for the 
SharePartition.
+ */
+public class SharePartitionMetrics implements AutoCloseable {
+
+public static final String INFLIGHT_MESSAGE_COUNT = "InFlightMessageCount";

Review Comment:
   Done.



##
core/src/main/java/kafka/server/share/SharePartition.java:
##
@@ -1265,19 +1289,7 @@ boolean canAcquireRecords() {
 if (nextFetchOffset() != endOffset() + 1) {
 return true;
 }
-
-lock.readLock().lock();
-long numRecords;
-try {
-if (cachedState.isEmpty()) {
-numRecords = 0;
-} else {
-numRecords = this.endOffset - this.startOffset + 1;
-}
-} finally {
-lock.readLock().unlock();
-}
-return numRecords < maxInFlightMessages;
+return numInflightRecords() < maxInFlightMessages;

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18371 TopicBasedRemoteLogMetadataManagerConfig exposes sensitive configuration data in logs [kafka]

2025-02-27 Thread via GitHub


Wadimz commented on PR #18349:
URL: https://github.com/apache/kafka/pull/18349#issuecomment-2687852370

   thanks for reminder! I've updated the branch from trunk. @chia7712 could you 
please check when have time? thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18734: Implemented share partition metrics (KIP-1103) [kafka]

2025-02-27 Thread via GitHub


apoorvmittal10 commented on code in PR #19045:
URL: https://github.com/apache/kafka/pull/19045#discussion_r1973510264


##
server/src/main/java/org/apache/kafka/server/share/metrics/SharePartitionMetrics.java:
##
@@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.server.share.metrics;
+
+import org.apache.kafka.common.utils.Utils;
+import org.apache.kafka.server.metrics.KafkaMetricsGroup;
+
+import com.yammer.metrics.core.Histogram;
+import com.yammer.metrics.core.Meter;
+
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.TimeUnit;
+import java.util.function.Supplier;
+
+/**
+ * SharePartitionMetrics is used to track the broker-side metrics for the 
SharePartition.
+ */
+public class SharePartitionMetrics implements AutoCloseable {
+
+public static final String INFLIGHT_MESSAGE_COUNT = "InFlightMessageCount";
+public static final String INFLIGHT_BATCH_COUNT = "InFlightBatchCount";
+
+private static final String ACQUISITION_LOCK_TIMEOUT_PER_SEC = 
"AcquisitionLockTimeoutPerSec";
+private static final String INFLIGHT_BATCH_MESSAGE_COUNT = 
"InFlightBatchMessageCount";
+private static final String FETCH_LOCK_TIME_MS = "FetchLockTimeMs";
+private static final String FETCH_LOCK_RATIO = "FetchLockRatio";
+
+/**
+ * Metric for the rate of acquisition lock timeouts for records.
+ */
+private final Meter acquisitionLockTimeoutPerSec;
+/**
+ * Metric for the number of in-flight messages for the batch.
+ */
+private final Histogram inFlightBatchMessageCount;
+/**
+ * Metric for the time the fetch lock is held.
+ */
+private final Histogram fetchLockTimeMs;
+/**
+ * Metric for the ratio of fetch lock time to the total time.
+ */
+private final Histogram fetchLockRatio;
+
+private final Map tags;
+private final KafkaMetricsGroup metricsGroup;
+
+public SharePartitionMetrics(String groupId, String topic, int partition) {
+this.tags = Utils.mkMap(
+Utils.mkEntry("group", Objects.requireNonNull(groupId)),
+Utils.mkEntry("topic", Objects.requireNonNull(topic)),
+Utils.mkEntry("partition", String.valueOf(partition))
+);
+this.metricsGroup = new KafkaMetricsGroup("kafka.server", 
"SharePartitionMetrics");
+
+this.acquisitionLockTimeoutPerSec = metricsGroup.newMeter(
+ACQUISITION_LOCK_TIMEOUT_PER_SEC,
+"acquisition lock timeout",
+TimeUnit.SECONDS,
+this.tags);
+
+this.inFlightBatchMessageCount = metricsGroup.newHistogram(
+INFLIGHT_BATCH_MESSAGE_COUNT,
+true,
+this.tags);
+
+this.fetchLockTimeMs = metricsGroup.newHistogram(
+FETCH_LOCK_TIME_MS,
+true,
+this.tags);
+
+this.fetchLockRatio = metricsGroup.newHistogram(
+FETCH_LOCK_RATIO,
+true,
+this.tags);
+}
+
+/**
+ * Register a gauge for the in-flight message count.
+ *
+ * @param messageCountSupplier The supplier for the in-flight message 
count.
+ */
+public void registerInflightMessageCount(Supplier 
messageCountSupplier) {

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18827: Initialize share group state group coordinator impl. [3/N] [kafka]

2025-02-27 Thread via GitHub


smjn commented on PR #19026:
URL: https://github.com/apache/kafka/pull/19026#issuecomment-2687877409

   @AndrewJSchofield 
   Thanks for the review, incorporated changes


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] WIP Investigate OOM [kafka]

2025-02-27 Thread via GitHub


ijuma commented on code in PR #19031:
URL: https://github.com/apache/kafka/pull/19031#discussion_r1973623540


##
build.gradle:
##
@@ -54,7 +54,7 @@ ext {
   buildVersionFileName = "kafka-version.properties"
 
   defaultMaxHeapSize = "2g"
-  defaultJvmArgs = ["-Xss4m", "-XX:+UseParallelGC"]
+  defaultJvmArgs = ["-Xss4m", "-XX:+UseParallelGC", "-XX:-UseGCOverheadLimit"]

Review Comment:
   As you said, the build is unlikely to succeed in either case. The GC 
overhead thing at least gives a hint that there is a memory leak or the heap is 
too small. Isn't that better than a timeout with no information?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (KAFKA-18871) KRaft migration rollback causes downtime

2025-02-27 Thread Daniel Urban (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931172#comment-17931172
 ] 

Daniel Urban commented on KAFKA-18871:
--

[~ppatierno] I agree that this issue does not happen because of Strimzi - the 
only minor issue I found is that Strimzi keeps deleting the /controller znode 
on each recon in this state (every 2 minutes), which is unnecessary.

I think this bug is related to the migration rollback - it seems to me that no 
matter which broker is the controller, the brokers in migration mode still want 
to communicate with the KRaft nodes, and they are stuck in a loop retrying to 
connect to them. This means that when broker 1 successfully catches up in 
partition kraft-test-topic-2, the leader (which is still in migration mode) 
does not update the ISR in the metadata. I will do another pass on the broker 
logs of 0 and 2 (migration mode brokers) to see if I can find something related 
to the ISR update, but I did manually verify through the last offsets that 
broker 1 did catch up in the partition.

> KRaft migration rollback causes downtime
> 
>
> Key: KAFKA-18871
> URL: https://issues.apache.org/jira/browse/KAFKA-18871
> Project: Kafka
>  Issue Type: Bug
>  Components: kraft, migration
>Affects Versions: 3.9.0
>Reporter: Daniel Urban
>Priority: Critical
> Attachments: kraft-rollback-bug.zip
>
>
> When testing the KRaft migration rollback feature, found the following 
> scenario:
>  # Execute KRaft migration on a 3 broker 3 ZK node cluster to the last step, 
> but do not finalize the migration.
>  ## In the test, we put a slow but continuous produce+consume load on the 
> cluster, with a topic (partitions=3, RF=3, min ISR=2)
>  # Start the rollback procedure
>  # First we roll back the brokers from KRaft mode to migration mode (both 
> controller and ZK configs are set, process roles are removed, 
> {{zookeeper.metadata.migration.enable}} is true)
>  # Then we delete the KRaft controllers, delete the /controller znode
>  # Then we immediately start rolling the brokers 1 by 1 to ZK mode by 
> removing the {{zookeeper.metadata.migration.enable}} flag and the 
> controller.* configurations.
>  # At this point, when we restart the 1st broker (let's call it broker 0) 
> into ZK mode, we find an issue which occurs ~1 out of 20 times:
> If broker 0 is not in the ISR for one of the partitions at this point, it can 
> simply never become part of the ISR. Since we are aiming for zero downtime, 
> we check the ISR states of partitions between broker restarts, and our 
> process gets blocked at this point. We have tried multiple workarounds at 
> this point, but it seems that there is no workaround which still ensures zero 
> downtime.
> Some more details about the process:
>  * We are using Strimzi to drive this process, but have verified that Strimzi 
> follows the documented steps precisely.
>  * When we reach the error state, it doesn't matter which broker became the 
> controller through the ZK node, the brokers still in migration mode get 
> stuck, and they flood the logs with the following error:
> {code:java}
> 2025-02-26 10:55:21,985 WARN [RaftManager id=0] Error connecting to node 
> kraft-rollback-kafka-controller-pool-5.kraft-rollback-kafka-kafka-brokers.csm-op-test-kraft-rollback-e7798bef.svc.cluster.local:9090
>  (id: 5 rack: null) (org.apache.kafka.clients.NetworkClient) 
> [kafka-raft-outbound-request-thread]
> java.net.UnknownHostException: 
> kraft-rollback-kafka-controller-pool-5.kraft-rollback-kafka-kafka-brokers.csm-op-test-kraft-rollback-e7798bef.svc.cluster.local
>         at 
> java.base/java.net.InetAddress$CachedAddresses.get(InetAddress.java:801)
>         at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1533)
>         at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1385)
>         at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1306)
>         at 
> org.apache.kafka.clients.DefaultHostResolver.resolve(DefaultHostResolver.java:27)
>         at org.apache.kafka.clients.ClientUtils.resolve(ClientUtils.java:125)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.resolveAddresses(ClusterConnectionStates.java:536)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.currentAddress(ClusterConnectionStates.java:511)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.access$200(ClusterConnectionStates.java:466)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates.currentAddress(ClusterConnectionStates.java:173)
>         at 
> org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:1075)
>         at 
> org.apache.kafka.clients.NetworkClient.ready(Networ

Re: [PR] KAFKA-6829: Retry commits on unknown topic or partition [kafka]

2025-02-27 Thread via GitHub


aminaaddd commented on PR #4948:
URL: https://github.com/apache/kafka/pull/4948#issuecomment-2689167200

   | ERROR:root:Error: 
KafkaError{code=UNKNOWN_TOPIC_OR_PART,val=3,str="Subscribed topic not 
available: housing_topic: Broker: Unknown topic or partition"}
   
   i have this problem and i don't know how to solve it.
   
 broker:
   image: confluentinc/cp-kafka:latest
   container_name: broker1
   hostname: broker1
   ports:
 - "9092:9092"   # Port interne du broker
 - "29092:29092" # Port pour accès externe
   environment:
 KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
 KAFKA_NODE_ID: 1
 KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: 
CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
 KAFKA_ADVERTISED_LISTENERS: 
PLAINTEXT://broker1:9092,PLAINTEXT_HOST://localhost:29092
 KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
 KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
 KAFKA_PROCESS_ROLES: broker,controller
 KAFKA_CONTROLLER_QUORUM_VOTERS: 1@broker1:29093
 KAFKA_LISTENERS: 
PLAINTEXT://broker1:9092,CONTROLLER://broker1:29093,PLAINTEXT_HOST://0.0.0.0:29092
 KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
 KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
 KAFKA_LOG_DIRS: /tmp/kraft-combined-logs
 CLUSTER_ID: "MkU3OEVBNTcwNTJENDM2Qk"
   healthcheck:
 test: ["CMD", "kafka-broker-api-versions", "--bootstrap-server", 
"broker1:9092"]
 interval: 10s
 timeout: 5s
 retries: 5
 
 topic-creator:
   image: confluentinc/cp-kafka:latest
   container_name: topic-creator
   depends_on:
 - broker
   entrypoint: ["/bin/sh", "-c", "sleep 20 && kafka-topics --create --topic 
housing_topic --bootstrap-server broker1:9092 --replication-factor 1 
--partitions 1"]
   restart: "no"
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (KAFKA-18894) Add support for ConfigProvider

2025-02-27 Thread Jhen-Yung Hsu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jhen-Yung Hsu reassigned KAFKA-18894:
-

Assignee: Jhen-Yung Hsu

> Add support for ConfigProvider
> --
>
> Key: KAFKA-18894
> URL: https://issues.apache.org/jira/browse/KAFKA-18894
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Mickael Maison
>Assignee: Jhen-Yung Hsu
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] KAFKA-18864:remove the Evolving tag from stable public interfaces [kafka]

2025-02-27 Thread via GitHub


junrao commented on code in PR #19036:
URL: https://github.com/apache/kafka/pull/19036#discussion_r1974060328


##
clients/src/main/java/org/apache/kafka/clients/admin/AbortTransactionOptions.java:
##
@@ -16,9 +16,7 @@
  */
 package org.apache.kafka.clients.admin;
 
-import org.apache.kafka.common.annotation.InterfaceStability;
 

Review Comment:
   extra new line. Ditto in quite a few other places.



##
clients/src/main/java/org/apache/kafka/clients/admin/AbortTransactionResult.java:
##
@@ -27,7 +26,6 @@
  *
  * The API of this class is evolving, see {@link Admin} for details.
  */
-@InterfaceStability.Evolving

Review Comment:
   Could we adjust the comment above accordingly? Ditto in quite a few other 
places.



##
clients/src/main/java/org/apache/kafka/common/acl/AccessControlEntryFilter.java:
##
@@ -26,7 +25,6 @@
  *
  * The API for this class is still evolving and we may break compatibility in 
minor releases, if necessary.
  */
-@InterfaceStability.Evolving

Review Comment:
   Could we adjust the comment above accordingly?



##
clients/src/main/java/org/apache/kafka/common/acl/AccessControlEntry.java:
##
@@ -17,7 +17,6 @@
 
 package org.apache.kafka.common.acl;
 
-import org.apache.kafka.common.annotation.InterfaceStability;
 

Review Comment:
   extra new line



##
clients/src/main/java/org/apache/kafka/common/acl/AccessControlEntryFilter.java:
##
@@ -17,7 +17,6 @@
 
 package org.apache.kafka.common.acl;
 
-import org.apache.kafka.common.annotation.InterfaceStability;
 

Review Comment:
   extra newline



##
clients/src/main/java/org/apache/kafka/common/acl/AccessControlEntry.java:
##
@@ -26,7 +25,6 @@
  *
  * The API for this class is still evolving and we may break compatibility in 
minor releases, if necessary.
  */
-@InterfaceStability.Evolving

Review Comment:
   Could we adjust the comment above accordingly?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (KAFKA-18891) Add support for RemoteLogMetadataManager and RemoteStorageManager

2025-02-27 Thread TaiJuWu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

TaiJuWu reassigned KAFKA-18891:
---

Assignee: TaiJuWu

> Add support for RemoteLogMetadataManager and RemoteStorageManager
> -
>
> Key: KAFKA-18891
> URL: https://issues.apache.org/jira/browse/KAFKA-18891
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Mickael Maison
>Assignee: TaiJuWu
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] HOTFIX: remove PageView example to support Java11 for :streams:examples module [kafka]

2025-02-27 Thread via GitHub


gharris1727 commented on code in PR #19052:
URL: https://github.com/apache/kafka/pull/19052#discussion_r1974314924


##
docs/streams/developer-guide/datatypes.html:
##
@@ -156,11 +156,9 @@ Primitive and basic types
 JSON
-The Kafka Streams code examples also include a basic serde 
implementation for JSON:
-
-  https://github.com/apache/kafka/blob/{{dotVersion}}/streams/examples/src/main/java/org/apache/kafka/streams/examples/pageview/PageViewTypedDemo.java#L83";>PageViewTypedDemo
-
-As shown in the example, you can use JSONSerdes inner classes Serdes.serdeFrom(, 
) to construct JSON compatible 
serializers and deserializers.
+You can use JsonSerializer and 
JsonDeerializer from Kafka Connect to construct JSON compatible 
serializers and deserializers

Review Comment:
   ```suggestion
   You can use JsonSerializer and 
JsonDeserializer from Kafka Connect to construct JSON compatible 
serializers and deserializers
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (KAFKA-14488) Move log layer tests to storage module

2025-02-27 Thread TaiJuWu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

TaiJuWu reassigned KAFKA-14488:
---

Assignee: TaiJuWu

> Move log layer tests to storage module
> --
>
> Key: KAFKA-14488
> URL: https://issues.apache.org/jira/browse/KAFKA-14488
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ismael Juma
>Assignee: TaiJuWu
>Priority: Major
>
> This should be split into multiple tasks.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] MINOR: Add README.md test command [kafka]

2025-02-27 Thread via GitHub


chia7712 commented on code in PR #19025:
URL: https://github.com/apache/kafka/pull/19025#discussion_r1974442226


##
README.md:
##
@@ -75,6 +76,7 @@ Retries are disabled by default, but you can set 
maxTestRetryFailures and maxTes
 The following example declares -PmaxTestRetries=1 and -PmaxTestRetryFailures=3 
to enable a failed test to be retried once, with a total retry limit of 3.
 
 ./gradlew test -PmaxTestRetries=1 -PmaxTestRetryFailures=3
+./gradlew test -Pkafka.test.run.flaky=true -PmaxTestRetries=1 
-PmaxTestRetryFailures=3

Review Comment:
   @mingdaoy could you please address @m1a2st comment?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (KAFKA-18891) Add support for RemoteLogMetadataManager and RemoteStorageManager

2025-02-27 Thread Mickael Maison (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-18891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931280#comment-17931280
 ] 

Mickael Maison commented on KAFKA-18891:


Sure, it's unassigned so feel free to grab it

> Add support for RemoteLogMetadataManager and RemoteStorageManager
> -
>
> Key: KAFKA-18891
> URL: https://issues.apache.org/jira/browse/KAFKA-18891
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Mickael Maison
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] HOTFIX: remove PageView example to support Java11 for :streams:examples module [kafka]

2025-02-27 Thread via GitHub


mjsax opened a new pull request, #19052:
URL: https://github.com/apache/kafka/pull/19052

   The PageView example depends on Connect to pull in Json (de)serializers, but 
Connect does not support Java11 any longer.
   
   To allow supporting Java11 for the Kafka Streams examples, this PR removes 
the PageView examples and Connect dependency.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (KAFKA-18892) Add support for ClientQuotaCallback

2025-02-27 Thread Jira


 [ 
https://issues.apache.org/jira/browse/KAFKA-18892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

黃竣陽 reassigned KAFKA-18892:
---

Assignee: 黃竣陽

> Add support for ClientQuotaCallback
> ---
>
> Key: KAFKA-18892
> URL: https://issues.apache.org/jira/browse/KAFKA-18892
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Mickael Maison
>Assignee: 黃竣陽
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-18896) Add support for Login

2025-02-27 Thread Jira


 [ 
https://issues.apache.org/jira/browse/KAFKA-18896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

黃竣陽 reassigned KAFKA-18896:
---

Assignee: 黃竣陽

> Add support for Login
> -
>
> Key: KAFKA-18896
> URL: https://issues.apache.org/jira/browse/KAFKA-18896
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Mickael Maison
>Assignee: 黃竣陽
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] MINOR: Extract HeartbeatRequestState from heartbeat request managers [kafka]

2025-02-27 Thread via GitHub


cadonna commented on code in PR #19043:
URL: https://github.com/apache/kafka/pull/19043#discussion_r1974015099


##
clients/src/main/java/org/apache/kafka/clients/consumer/internals/HeartbeatRequestState.java:
##
@@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.clients.consumer.internals;
+
+import org.apache.kafka.common.utils.LogContext;
+import org.apache.kafka.common.utils.Time;
+import org.apache.kafka.common.utils.Timer;
+
+/**
+ * Represents the state of a heartbeat request, including logic for timing, 
retries, and exponential backoff. The object extends
+ * {@link org.apache.kafka.clients.consumer.internals.RequestState} to enable 
exponential backoff and duplicated request handling. The two fields that it 
holds are:

Review Comment:
   Oh, I did not notice that! Thanks!
   
   I do not have an answer to your question. Also the old code had that. I will 
remove the last sentence.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (KAFKA-18898) 4.0 Upgrade docs rendering below other versions

2025-02-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/KAFKA-18898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931318#comment-17931318
 ] 

黃竣陽 commented on KAFKA-18898:
-

Hello [~davidarthur] ,The {{zk2kraft}} page has been moved to a separate 
section and now has its own page in this PR: 
[https://github.com/apache/kafka/pull/18961].

 

I test in my local, and the document is below:
!image-2025-02-28-06-28-28-356.png|width=585,height=299!
!image-2025-02-28-06-29-11-204.png|width=684,height=385!

> 4.0 Upgrade docs rendering below other versions
> ---
>
> Key: KAFKA-18898
> URL: https://issues.apache.org/jira/browse/KAFKA-18898
> Project: Kafka
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 4.0.0
>Reporter: David Arthur
>Priority: Critical
> Attachments: image-2025-02-27-15-16-27-927.png, 
> image-2025-02-27-15-18-37-080.png, image-2025-02-27-15-19-26-207.png, 
> image-2025-02-28-06-28-28-356.png, image-2025-02-28-06-29-11-204.png
>
>
> In section 1.5, we have the "Significant Changes" text which comes from 
> 40/zk2kraft.html
> !image-2025-02-27-15-16-27-927.png!
> But then just below this section we see the "Upgrading to 3.9.0 from any 
> version 0.8.x" section
>  
> !image-2025-02-27-15-18-37-080.png!
>  
> The "Upgrading to 4.0.0 from any version 0.8.x through 3.9.x" section appears 
> at the very end of the other "Upgrading to ..." sections.
>  
> !image-2025-02-27-15-19-26-207.png!
>  
> The "Upgrading to 4.0.0" section should be at the very top of the upgrading 
> section, not the bottom.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] KAFKA-18860: Remove Missing Features section [kafka]

2025-02-27 Thread via GitHub


chia7712 commented on PR #19048:
URL: https://github.com/apache/kafka/pull/19048#issuecomment-2689265249

   cherry-pick to 4.0
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18860: Remove Missing Features section [kafka]

2025-02-27 Thread via GitHub


chia7712 merged PR #19048:
URL: https://github.com/apache/kafka/pull/19048


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (KAFKA-18371) TopicBasedRemoteLogMetadataManagerConfig exposes sensitive configuration data in logs

2025-02-27 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai updated KAFKA-18371:
---
Fix Version/s: 4.1.0
   (was: 4.1)

> TopicBasedRemoteLogMetadataManagerConfig exposes sensitive configuration data 
> in logs
> -
>
> Key: KAFKA-18371
> URL: https://issues.apache.org/jira/browse/KAFKA-18371
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 3.1.0
>Reporter: Vadym Zhytkevych
>Assignee: Vadym Zhytkevych
>Priority: Major
>  Labels: storage, tiered-storage
> Fix For: 4.1.0
>
>
> {code:java}
> [2024-12-20 14:52:56,805] INFO Successfully configured topic-based RLMM with 
> config: 
> TopicBasedRemoteLogMetadataManagerConfig{clientIdPrefix='__remote_log_metadata_client_6',
>  metadataTopicPartitionsCount=50, consumeWaitMs=12, 
> metadataTopicRetentionMs=-1, metadataTopicReplicationFactor=3, 
> initializationRetryMaxTimeoutMs=12, initializationRetryIntervalMs=100, 
> commonProps={request.timeout.ms=1, ssl.client.auth=none, 
> ssl.keystore.location=/etc/kafka/ssl/keystore.p12, 
> bootstrap.servers=server1:9094, security.protocol=SASL_SSL, 
> password=CLEARTEXT, ssl.truststore.location=/etc/pki/java/cacerts, 
> ssl.keystore.password=CLEARTEXT, sasl.mechanism=SCRAM-SHA-512, 
> ssl.key.password=CLEARTEXT, 
> sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule 
> required username="username" password="CLEARTEXT";, 
> ssl.truststore.password=CLEARTEXT, …{code}
>  
> Issue is related to using toString() method of 
> TopicBasedRemoteLogMetadataManagerConfig, that prints maps of consumerProps 
> and producerProps without masking.
>  
> Current workaround: logger for class TopicBasedRemoteLogMetadataManagerConfig 
> can be disabled to not expose sensitive data.
> Expected behavior:  sensitive configuration data masked automatically in logs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] KAFKA-18371 TopicBasedRemoteLogMetadataManagerConfig exposes sensitive configuration data in logs [kafka]

2025-02-27 Thread via GitHub


chia7712 merged PR #18349:
URL: https://github.com/apache/kafka/pull/18349


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18860: Remove Missing Features section [kafka]

2025-02-27 Thread via GitHub


chia7712 commented on PR #19048:
URL: https://github.com/apache/kafka/pull/19048#issuecomment-2689263012

   for future reviewers, I have opened 
https://issues.apache.org/jira/browse/KAFKA-18885 to add the "behavioral 
differences between ZooKeeper and KRaft" - that will include the missing 
features under kraft, such as dynamic configs


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Resolved] (KAFKA-18860) Clarify the KRaft missing feature

2025-02-27 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-18860.

Resolution: Fixed

trunk: 
[https://github.com/apache/kafka/commit/e2d9ced098c35068900746afe608820ae967d5a7]

 

4.0: 
https://github.com/chia7712/kafka/commit/e7a13f759ee3509f108b146aa7d3c9c6d3161255

> Clarify the KRaft missing feature
> -
>
> Key: KAFKA-18860
> URL: https://issues.apache.org/jira/browse/KAFKA-18860
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Luke Chen
>Assignee: TengYao Chi
>Priority: Major
> Fix For: 4.0.0
>
>
> In the KRaft [missing 
> feature|https://kafka.apache.org/documentation/#kraft_missing] section, 
> there's one remaining:
>  
> Modifying certain dynamic configurations on the standalone KRaft controller
>  
> We should clarify what it means and make it clear to users. To me, I don't 
> think there's anything we can't update to KRaft controller. But not 100% sure 
> what the original author mean.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] MINOR: correct an ELR test name in ActivationRecordsGeneratorTest [kafka]

2025-02-27 Thread via GitHub


junrao merged PR #19044:
URL: https://github.com/apache/kafka/pull/19044


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (KAFKA-18817) ShareGroupHeartbeat and ShareGroupDescribe API must check topic describe

2025-02-27 Thread Lan Ding (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lan Ding reassigned KAFKA-18817:


Assignee: Lan Ding  (was: Shivsundar R)

> ShareGroupHeartbeat and ShareGroupDescribe API must check topic describe
> 
>
> Key: KAFKA-18817
> URL: https://issues.apache.org/jira/browse/KAFKA-18817
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Andrew Schofield
>Assignee: Lan Ding
>Priority: Major
> Fix For: 4.1.0
>
>
> ShareGroupHeartbeat and ShareGroupDescribe RPCs must check topic describe 
> authorisation to ensure we don't leak information to clients without the 
> required permission. This is the share group equivalent of 
> https://issues.apache.org/jira/browse/KAFKA-18813.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] MINOR: Removing share partition manager flaky annotation [kafka]

2025-02-27 Thread via GitHub


apoorvmittal10 opened a new pull request, #19053:
URL: https://github.com/apache/kafka/pull/19053

   There isn't any flaky test for SharePartitionManager in last 7 days, 
removing flaky annotation.
   
   
https://develocity.apache.org/scans/tests?search.timeZoneId=Europe%2FLondon&tests.container=kafka.server.share.SharePartitionManagerTest


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (KAFKA-18883) Move TransactionLog to transaction-coordinator module

2025-02-27 Thread PoAn Yang (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

PoAn Yang reassigned KAFKA-18883:
-

Assignee: TaiJuWu  (was: PoAn Yang)

> Move TransactionLog to transaction-coordinator module
> -
>
> Key: KAFKA-18883
> URL: https://issues.apache.org/jira/browse/KAFKA-18883
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: PoAn Yang
>Assignee: TaiJuWu
>Priority: Minor
>
> as title



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18898) 4.0 Upgrade docs rendering below other versions

2025-02-27 Thread David Arthur (Jira)
David Arthur created KAFKA-18898:


 Summary: 4.0 Upgrade docs rendering below other versions
 Key: KAFKA-18898
 URL: https://issues.apache.org/jira/browse/KAFKA-18898
 Project: Kafka
  Issue Type: Bug
  Components: documentation
Reporter: David Arthur
 Attachments: image-2025-02-27-15-16-27-927.png, 
image-2025-02-27-15-18-37-080.png, image-2025-02-27-15-19-26-207.png

In section 1.5, we have the "Significant Changes" text which comes from 
40/zk2kraft.html

!image-2025-02-27-15-16-27-927.png!

But then just below this section we see the "Upgrading to 3.9.0 from any 
version 0.8.x" section

 

!image-2025-02-27-15-18-37-080.png!

 

The "Upgrading to 4.0.0 from any version 0.8.x through 3.9.x" section appears 
at the very end of the other "Upgrading to ..." sections.

 

!image-2025-02-27-15-19-26-207.png!

 

The "Upgrading to 4.0.0" section should be at the very top of the upgrading 
section, not the bottom.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] KAFKA-18734: Implemented share partition metrics (KIP-1103) [kafka]

2025-02-27 Thread via GitHub


apoorvmittal10 commented on code in PR #19045:
URL: https://github.com/apache/kafka/pull/19045#discussion_r1974546851


##
core/src/main/java/kafka/server/share/SharePartition.java:
##
@@ -1326,6 +1355,42 @@ int leaderEpoch() {
 return leaderEpoch;
 }
 
+// Visible for testing
+void recordFetchLockRatioMetric(long acquiredTime) {
+// Update the total fetch lock acquired time.
+double fetchLockToTotalTime;
+if (acquiredTime + timeSinceLastLockAcquisitionMs == 0) {
+// If the total time is 0 then the ratio is 1 i.e. the fetch lock 
was acquired for the complete time.
+fetchLockToTotalTime = 1;

Review Comment:
   Done.



##
core/src/main/java/kafka/server/share/SharePartition.java:
##
@@ -1326,6 +1355,42 @@ int leaderEpoch() {
 return leaderEpoch;
 }
 
+// Visible for testing
+void recordFetchLockRatioMetric(long acquiredTime) {

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-18734: Implemented share partition metrics (KIP-1103) [kafka]

2025-02-27 Thread via GitHub


apoorvmittal10 commented on code in PR #19045:
URL: https://github.com/apache/kafka/pull/19045#discussion_r1974552557


##
core/src/main/java/kafka/server/share/SharePartition.java:
##
@@ -1291,13 +1303,30 @@ public boolean maybeAcquireFetchLock() {
 if (stateNotActive()) {
 return false;
 }
-return fetchLock.compareAndSet(false, true);
+boolean acquired = fetchLock.compareAndSet(false, true);
+if (acquired) {
+long currentTime = time.hiResClockMs();
+fetchLockAcquiredTimeMs = currentTime;
+timeSinceLastLockAcquisitionMs = timeSinceLastLockAcquisitionMs != 
0 ? currentTime - timeSinceLastLockAcquisitionMs : 0;

Review Comment:
   Though this should not happen in practice, but I have added a safe check 
with comment.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (KAFKA-18897) Add support for SslEngineFactory

2025-02-27 Thread Mingdao Yang (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingdao Yang reassigned KAFKA-18897:


Assignee: Mingdao Yang

> Add support for SslEngineFactory
> 
>
> Key: KAFKA-18897
> URL: https://issues.apache.org/jira/browse/KAFKA-18897
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Mickael Maison
>Assignee: Mingdao Yang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-18894) Add support for ConfigProvider

2025-02-27 Thread Jhen-Yung Hsu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-18894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931297#comment-17931297
 ] 

Jhen-Yung Hsu commented on KAFKA-18894:
---

I'm working on this, thanks :)

> Add support for ConfigProvider
> --
>
> Key: KAFKA-18894
> URL: https://issues.apache.org/jira/browse/KAFKA-18894
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Mickael Maison
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-18890) Add support for CreateTopicPolicy and AlterConfigPolicy

2025-02-27 Thread Jhen-Yung Hsu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jhen-Yung Hsu reassigned KAFKA-18890:
-

Assignee: Jhen-Yung Hsu

> Add support for CreateTopicPolicy and AlterConfigPolicy
> ---
>
> Key: KAFKA-18890
> URL: https://issues.apache.org/jira/browse/KAFKA-18890
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Mickael Maison
>Assignee: Jhen-Yung Hsu
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-18895) Add support for AuthenticateCallbackHandler

2025-02-27 Thread Mingdao Yang (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingdao Yang reassigned KAFKA-18895:


Assignee: Mingdao Yang

> Add support for AuthenticateCallbackHandler
> ---
>
> Key: KAFKA-18895
> URL: https://issues.apache.org/jira/browse/KAFKA-18895
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Mickael Maison
>Assignee: Mingdao Yang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-18890) Add support for CreateTopicPolicy and AlterConfigPolicy

2025-02-27 Thread Jhen-Yung Hsu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-18890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931296#comment-17931296
 ] 

Jhen-Yung Hsu commented on KAFKA-18890:
---

I'm working on this, thanks :)

> Add support for CreateTopicPolicy and AlterConfigPolicy
> ---
>
> Key: KAFKA-18890
> URL: https://issues.apache.org/jira/browse/KAFKA-18890
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Mickael Maison
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-18898) 4.0 Upgrade docs rendering below other versions

2025-02-27 Thread Chia-Ping Tsai (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-18898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931317#comment-17931317
 ] 

Chia-Ping Tsai commented on KAFKA-18898:


[~davidarthur] Are you testing the 4.0 RC or 4.0 branch? IIRC, the stale 
upgrade docs is removed by 
https://github.com/apache/kafka/commit/da8f390c4599d7199c4cdf2bb85441146e859b17

> 4.0 Upgrade docs rendering below other versions
> ---
>
> Key: KAFKA-18898
> URL: https://issues.apache.org/jira/browse/KAFKA-18898
> Project: Kafka
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 4.0.0
>Reporter: David Arthur
>Priority: Critical
> Attachments: image-2025-02-27-15-16-27-927.png, 
> image-2025-02-27-15-18-37-080.png, image-2025-02-27-15-19-26-207.png, 
> image-2025-02-28-06-28-28-356.png, image-2025-02-28-06-29-11-204.png
>
>
> In section 1.5, we have the "Significant Changes" text which comes from 
> 40/zk2kraft.html
> !image-2025-02-27-15-16-27-927.png!
> But then just below this section we see the "Upgrading to 3.9.0 from any 
> version 0.8.x" section
>  
> !image-2025-02-27-15-18-37-080.png!
>  
> The "Upgrading to 4.0.0 from any version 0.8.x through 3.9.x" section appears 
> at the very end of the other "Upgrading to ..." sections.
>  
> !image-2025-02-27-15-19-26-207.png!
>  
> The "Upgrading to 4.0.0" section should be at the very top of the upgrading 
> section, not the bottom.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-18898) 4.0 Upgrade docs rendering below other versions

2025-02-27 Thread Jira


 [ 
https://issues.apache.org/jira/browse/KAFKA-18898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

黃竣陽 updated KAFKA-18898:

Attachment: image-2025-02-28-06-28-28-356.png

> 4.0 Upgrade docs rendering below other versions
> ---
>
> Key: KAFKA-18898
> URL: https://issues.apache.org/jira/browse/KAFKA-18898
> Project: Kafka
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 4.0.0
>Reporter: David Arthur
>Priority: Critical
> Attachments: image-2025-02-27-15-16-27-927.png, 
> image-2025-02-27-15-18-37-080.png, image-2025-02-27-15-19-26-207.png, 
> image-2025-02-28-06-28-28-356.png
>
>
> In section 1.5, we have the "Significant Changes" text which comes from 
> 40/zk2kraft.html
> !image-2025-02-27-15-16-27-927.png!
> But then just below this section we see the "Upgrading to 3.9.0 from any 
> version 0.8.x" section
>  
> !image-2025-02-27-15-18-37-080.png!
>  
> The "Upgrading to 4.0.0 from any version 0.8.x through 3.9.x" section appears 
> at the very end of the other "Upgrading to ..." sections.
>  
> !image-2025-02-27-15-19-26-207.png!
>  
> The "Upgrading to 4.0.0" section should be at the very top of the upgrading 
> section, not the bottom.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-18898) 4.0 Upgrade docs rendering below other versions

2025-02-27 Thread Jira


 [ 
https://issues.apache.org/jira/browse/KAFKA-18898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

黃竣陽 updated KAFKA-18898:

Attachment: image-2025-02-28-06-29-11-204.png

> 4.0 Upgrade docs rendering below other versions
> ---
>
> Key: KAFKA-18898
> URL: https://issues.apache.org/jira/browse/KAFKA-18898
> Project: Kafka
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 4.0.0
>Reporter: David Arthur
>Priority: Critical
> Attachments: image-2025-02-27-15-16-27-927.png, 
> image-2025-02-27-15-18-37-080.png, image-2025-02-27-15-19-26-207.png, 
> image-2025-02-28-06-28-28-356.png, image-2025-02-28-06-29-11-204.png
>
>
> In section 1.5, we have the "Significant Changes" text which comes from 
> 40/zk2kraft.html
> !image-2025-02-27-15-16-27-927.png!
> But then just below this section we see the "Upgrading to 3.9.0 from any 
> version 0.8.x" section
>  
> !image-2025-02-27-15-18-37-080.png!
>  
> The "Upgrading to 4.0.0 from any version 0.8.x through 3.9.x" section appears 
> at the very end of the other "Upgrading to ..." sections.
>  
> !image-2025-02-27-15-19-26-207.png!
>  
> The "Upgrading to 4.0.0" section should be at the very top of the upgrading 
> section, not the bottom.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18890) Add support for CreateTopicPolicy and AlterConfigPolicy

2025-02-27 Thread Mickael Maison (Jira)
Mickael Maison created KAFKA-18890:
--

 Summary: Add support for CreateTopicPolicy and AlterConfigPolicy
 Key: KAFKA-18890
 URL: https://issues.apache.org/jira/browse/KAFKA-18890
 Project: Kafka
  Issue Type: Sub-task
Reporter: Mickael Maison






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] KAFKA-16580: Enable dynamic quorum reconfiguration for raft simulation tests pt 1 [kafka]

2025-02-27 Thread via GitHub


kevin-wu24 commented on PR #18987:
URL: https://github.com/apache/kafka/pull/18987#issuecomment-2688961782

   https://github.com/apache/kafka/pull/18987#discussion_r1971722138 contains 
the most recent observation on what is going on when we're checking the 
`MajorityReachedHighWatermark` invariant when adding/removing a voter.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (KAFKA-18891) Add support for RemoteLogMetadataManager and RemoteStorageManager

2025-02-27 Thread TaiJuWu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-18891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931278#comment-17931278
 ] 

TaiJuWu commented on KAFKA-18891:
-

Hi[~mimaison] , may I take this over if you are not working on this.

> Add support for RemoteLogMetadataManager and RemoteStorageManager
> -
>
> Key: KAFKA-18891
> URL: https://issues.apache.org/jira/browse/KAFKA-18891
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Mickael Maison
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] KAFKA-18864:remove the Evolving tag from stable public interfaces [kafka]

2025-02-27 Thread via GitHub


gongxuanzhang commented on code in PR #19036:
URL: https://github.com/apache/kafka/pull/19036#discussion_r1974595611


##
clients/src/main/java/org/apache/kafka/clients/admin/AbortTransactionOptions.java:
##
@@ -16,9 +16,7 @@
  */
 package org.apache.kafka.clients.admin;
 
-import org.apache.kafka.common.annotation.InterfaceStability;
 

Review Comment:
   I will update it 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINOR: Extract HeartbeatRequestState from heartbeat request managers [kafka]

2025-02-27 Thread via GitHub


cadonna commented on PR #19043:
URL: https://github.com/apache/kafka/pull/19043#issuecomment-2687387498

   Call for review: @aliehsaeedii 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Created] (KAFKA-18881) Document the ConsumerRecord as non-thread safe

2025-02-27 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-18881:
--

 Summary: Document the ConsumerRecord as non-thread safe
 Key: KAFKA-18881
 URL: https://issues.apache.org/jira/browse/KAFKA-18881
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


Due to concurrent issues encountered with ConsumerRecord's headers, it is 
necessary to add documentation to remind users.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-18471) Race conditions when accessing RecordHeader data

2025-02-27 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-18471.

Resolution: Duplicate

duplicate to KAFKA-18470

>  Race conditions when accessing RecordHeader data
> -
>
> Key: KAFKA-18471
> URL: https://issues.apache.org/jira/browse/KAFKA-18471
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 3.8.1
>Reporter: Vinicius Vieira dos Santos
>Priority: Major
>
> There is a race condition in the {{RecordHeader}} class of Kafka when an 
> instance is created using the [[constructor with 
> ByteBuffer|https://github.com/a0x8o/kafka/blob/master/clients/src/main/java/org/apache/kafka/common/header/internals/RecordHeader.java#L38]{{{}{}}}|https://github.com/a0x8o/kafka/blob/master/clients/src/main/java/org/apache/kafka/common/header/internals/RecordHeader.java#L38].
>  In this scenario, when attempting to access the {{key}} or {{{}value{}}}, a 
> process copies the {{ByteBuffer}} into a byte array.
> During this process, multiple threads may simultaneously invoke the method 
> responsible for the copying. This can lead to a situation where one thread 
> successfully completes the operation, while another abruptly has the buffer 
> set to {{null}} during the process.
>  
> Exception example:
>  
> {code:java}
> Exception in thread "pool-1-thread-3" java.lang.NullPointerException: Cannot 
> invoke "java.nio.ByteBuffer.remaining()" because "this.keyBuffer" is null
>     at 
> org.apache.kafka.common.header.internals.RecordHeader.key(RecordHeader.java:45)
>     at 
> br.com.autbank.workflow.TestMainExample.lambda$main$0(TestMainExample.java:36)
>     at java.base/java.lang.Iterable.forEach(Iterable.java:75)
>     at 
> br.com.autbank.workflow.TestMainExample.lambda$main$1(TestMainExample.java:32)
>     at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
>     at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
>     at java.base/java.lang.Thread.run(Thread.java:1583) {code}
>  
> Code example for error:
>  
> {code:java}
> public class TestMainExample {
> public static void main(String[] args) throws InterruptedException {
> ExecutorService executorService = Executors.newFixedThreadPool(5);
> for (int i = 0; i < 200_000; i++) {
> Charset charset = StandardCharsets.UTF_8;
> RecordHeaders headers = new RecordHeaders();
> headers.add(new RecordHeader(charset.encode("header-key-1"), 
> charset.encode("value-1")));
> headers.add(new RecordHeader(charset.encode("header-key-2"), 
> charset.encode("value-2")));
> headers.add(new RecordHeader(charset.encode("header-key-3"), 
> charset.encode("2025-01-06T00:00:00.0-00:00[UTC]")));
> headers.add(new RecordHeader(charset.encode("header-key-4"), 
> charset.encode("2025-01-06T00:00:00.0-00:00[UTC]")));
> headers.add(new RecordHeader(charset.encode("header-key-5"), 
> charset.encode("account-number")));
> headers.add(new RecordHeader(charset.encode("header-key-6"), 
> charset.encode("operation-id")));
> headers.add(new RecordHeader(charset.encode("header-key-7"), 
> charset.encode("agency-code")));
> headers.add(new RecordHeader(charset.encode("header-key-8"), 
> charset.encode("branch-code")));
> 
> CountDownLatch count = new CountDownLatch(5);
> for (int j = 0; j < 5; j++) {
> executorService.execute(() -> {
> try {
> headers.forEach((hdr) -> {
> if (hdr.value() == null) {
> throw new IllegalStateException("Bug find on 
> value");
> }
> if (hdr.key() == null) {
> throw new IllegalStateException("Bug find on 
> key");
> }
> });
> } finally {
> count.countDown();
> }
> });
> }
> count.await();
> }
> }
> } {code}
> I did a test synchronizing the method I use to access the headers and this 
> resolved the problem in the context of my application, but I believe the 
> ideal would be to either mark that the class is not thread safe or 
> synchronize access to the bytebuffer. Thank you in advance to the team.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (KAFKA-18871) KRaft migration rollback causes downtime

2025-02-27 Thread Luke Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931070#comment-17931070
 ] 

Luke Chen edited comment on KAFKA-18871 at 2/27/25 9:34 AM:


[~durban] , thanks for reporting. Could you share the kafka all brokers logs 
and strimzi operator logs? Thanks.


was (Author: showuon):
[~durban] , thanks for reporting. Could you share the kafka brokers logs and 
strimzi operator logs? Thanks.

> KRaft migration rollback causes downtime
> 
>
> Key: KAFKA-18871
> URL: https://issues.apache.org/jira/browse/KAFKA-18871
> Project: Kafka
>  Issue Type: Bug
>  Components: kraft, migration
>Affects Versions: 3.9.0
>Reporter: Daniel Urban
>Priority: Critical
>
> When testing the KRaft migration rollback feature, found the following 
> scenario:
>  # Execute KRaft migration on a 3 broker 3 ZK node cluster to the last step, 
> but do not finalize the migration.
>  ## In the test, we put a slow but continuous produce+consume load on the 
> cluster, with a topic (partitions=3, RF=3, min ISR=2)
>  # Start the rollback procedure
>  # First we roll back the brokers from KRaft mode to migration mode (both 
> controller and ZK configs are set, process roles are removed, 
> {{zookeeper.metadata.migration.enable}} is true)
>  # Then we delete the KRaft controllers, delete the /controller znode
>  # Then we immediately start rolling the brokers 1 by 1 to ZK mode by 
> removing the {{zookeeper.metadata.migration.enable}} flag and the 
> controller.* configurations.
>  # At this point, when we restart the 1st broker (let's call it broker 0) 
> into ZK mode, we find an issue which occurs ~1 out of 20 times:
> If broker 0 is not in the ISR for one of the partitions at this point, it can 
> simply never become part of the ISR. Since we are aiming for zero downtime, 
> we check the ISR states of partitions between broker restarts, and our 
> process gets blocked at this point. We have tried multiple workarounds at 
> this point, but it seems that there is no workaround which still ensures zero 
> downtime.
> Some more details about the process:
>  * We are using Strimzi to drive this process, but have verified that Strimzi 
> follows the documented steps precisely.
>  * When we reach the error state, it doesn't matter which broker became the 
> controller through the ZK node, the brokers still in migration mode get 
> stuck, and they flood the logs with the following error:
> {code:java}
> 2025-02-26 10:55:21,985 WARN [RaftManager id=0] Error connecting to node 
> kraft-rollback-kafka-controller-pool-5.kraft-rollback-kafka-kafka-brokers.csm-op-test-kraft-rollback-e7798bef.svc.cluster.local:9090
>  (id: 5 rack: null) (org.apache.kafka.clients.NetworkClient) 
> [kafka-raft-outbound-request-thread]
> java.net.UnknownHostException: 
> kraft-rollback-kafka-controller-pool-5.kraft-rollback-kafka-kafka-brokers.csm-op-test-kraft-rollback-e7798bef.svc.cluster.local
>         at 
> java.base/java.net.InetAddress$CachedAddresses.get(InetAddress.java:801)
>         at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1533)
>         at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1385)
>         at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1306)
>         at 
> org.apache.kafka.clients.DefaultHostResolver.resolve(DefaultHostResolver.java:27)
>         at org.apache.kafka.clients.ClientUtils.resolve(ClientUtils.java:125)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.resolveAddresses(ClusterConnectionStates.java:536)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.currentAddress(ClusterConnectionStates.java:511)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.access$200(ClusterConnectionStates.java:466)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates.currentAddress(ClusterConnectionStates.java:173)
>         at 
> org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:1075)
>         at 
> org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:321)
>         at 
> org.apache.kafka.server.util.InterBrokerSendThread.sendRequests(InterBrokerSendThread.java:146)
>         at 
> org.apache.kafka.server.util.InterBrokerSendThread.pollOnce(InterBrokerSendThread.java:109)
>         at 
> org.apache.kafka.server.util.InterBrokerSendThread.doWork(InterBrokerSendThread.java:137)
>         at 
> org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:136)
>  {code}
>  * Manually verified the last offsets of the replicas, and broker 0 is caught 
> up in the partition.
>  * Even after stopping the produce load, the issue persists.
>  * Even after removing the /contro

Re: [PR] [MINOR] Clean up group-coordinator [kafka]

2025-02-27 Thread via GitHub


sjhajharia commented on PR #19008:
URL: https://github.com/apache/kafka/pull/19008#issuecomment-2687360738

   hey @chia7712 
   Can you please take a look at this PR too. TIA 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (KAFKA-18880) Remove kafka.cluster.Broker and BrokerEndPointNotAvailableException

2025-02-27 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai reassigned KAFKA-18880:
--

Assignee: xuanzhang gong  (was: Lan Ding)

> Remove kafka.cluster.Broker and BrokerEndPointNotAvailableException
> ---
>
> Key: KAFKA-18880
> URL: https://issues.apache.org/jira/browse/KAFKA-18880
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: xuanzhang gong
>Priority: Minor
>
> they are not used anymore after removing zk code



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-18880) Remove kafka.cluster.Broker and BrokerEndPointNotAvailableException

2025-02-27 Thread xuanzhang gong (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-18880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931066#comment-17931066
 ] 

xuanzhang gong commented on KAFKA-18880:


Hello, I have sent PR to solve this issue

> Remove kafka.cluster.Broker and BrokerEndPointNotAvailableException
> ---
>
> Key: KAFKA-18880
> URL: https://issues.apache.org/jira/browse/KAFKA-18880
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: Lan Ding
>Priority: Minor
>
> they are not used anymore after removing zk code



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-18880) Remove kafka.cluster.Broker and BrokerEndPointNotAvailableException

2025-02-27 Thread Chia-Ping Tsai (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-18880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931067#comment-17931067
 ] 

Chia-Ping Tsai commented on KAFKA-18880:


[~isding_l] sorry that it seems [~gongxuanzhang] has filed the PR, so I will 
re-assign the issue to [~gongxuanzhang] 

> Remove kafka.cluster.Broker and BrokerEndPointNotAvailableException
> ---
>
> Key: KAFKA-18880
> URL: https://issues.apache.org/jira/browse/KAFKA-18880
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: Lan Ding
>Priority: Minor
>
> they are not used anymore after removing zk code



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-18880) Remove kafka.cluster.Broker and BrokerEndPointNotAvailableException

2025-02-27 Thread Lan Ding (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lan Ding reassigned KAFKA-18880:


Assignee: Lan Ding

> Remove kafka.cluster.Broker and BrokerEndPointNotAvailableException
> ---
>
> Key: KAFKA-18880
> URL: https://issues.apache.org/jira/browse/KAFKA-18880
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: Lan Ding
>Priority: Minor
>
> they are not used anymore after removing zk code



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] KAFKA-18880:Remove kafka.cluster.Broker and BrokerEndPointNotAvailableException [kafka]

2025-02-27 Thread via GitHub


gongxuanzhang opened a new pull request, #19047:
URL: https://github.com/apache/kafka/pull/19047

   Remove kafka.cluster.Broker and BrokerEndPointNotAvailableException


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINOR: Disallow unused local variables [kafka]

2025-02-27 Thread via GitHub


AndrewJSchofield commented on code in PR #18963:
URL: https://github.com/apache/kafka/pull/18963#discussion_r1973200080


##
tools/src/main/java/org/apache/kafka/tools/ShareConsumerPerformance.java:
##
@@ -79,7 +79,8 @@ public static void main(String[] args) {
 shareConsumers.forEach(shareConsumer -> 
shareConsumersMetrics.add(shareConsumer.metrics()));
 }
 shareConsumers.forEach(shareConsumer -> {
-Map> val = 
shareConsumer.commitSync();
+@SuppressWarnings("UnusedLocalVariable")
+Map> ignored = 
shareConsumer.commitSync();

Review Comment:
   This is OK with me in this case. It's just making sure that any in-flight 
acknowledgements are committed before the consumer is closed in a performance 
test.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (KAFKA-18881) Document the ConsumerRecord as non-thread safe

2025-02-27 Thread Logan Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-18881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Logan Zhu reassigned KAFKA-18881:
-

Assignee: Logan Zhu  (was: Chia-Ping Tsai)

> Document the ConsumerRecord as non-thread safe
> --
>
> Key: KAFKA-18881
> URL: https://issues.apache.org/jira/browse/KAFKA-18881
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: Logan Zhu
>Priority: Major
>
> Due to concurrent issues encountered with ConsumerRecord's headers, it is 
> necessary to add documentation to remind users.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-18871) KRaft migration rollback causes downtime

2025-02-27 Thread Luke Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931070#comment-17931070
 ] 

Luke Chen commented on KAFKA-18871:
---

[~durban] , thanks for reporting. Could you share the kafka brokers logs and 
strimzi operator logs? Thanks.

> KRaft migration rollback causes downtime
> 
>
> Key: KAFKA-18871
> URL: https://issues.apache.org/jira/browse/KAFKA-18871
> Project: Kafka
>  Issue Type: Bug
>  Components: kraft, migration
>Affects Versions: 3.9.0
>Reporter: Daniel Urban
>Priority: Critical
>
> When testing the KRaft migration rollback feature, found the following 
> scenario:
>  # Execute KRaft migration on a 3 broker 3 ZK node cluster to the last step, 
> but do not finalize the migration.
>  ## In the test, we put a slow but continuous produce+consume load on the 
> cluster, with a topic (partitions=3, RF=3, min ISR=2)
>  # Start the rollback procedure
>  # First we roll back the brokers from KRaft mode to migration mode (both 
> controller and ZK configs are set, process roles are removed, 
> {{zookeeper.metadata.migration.enable}} is true)
>  # Then we delete the KRaft controllers, delete the /controller znode
>  # Then we immediately start rolling the brokers 1 by 1 to ZK mode by 
> removing the {{zookeeper.metadata.migration.enable}} flag and the 
> controller.* configurations.
>  # At this point, when we restart the 1st broker (let's call it broker 0) 
> into ZK mode, we find an issue which occurs ~1 out of 20 times:
> If broker 0 is not in the ISR for one of the partitions at this point, it can 
> simply never become part of the ISR. Since we are aiming for zero downtime, 
> we check the ISR states of partitions between broker restarts, and our 
> process gets blocked at this point. We have tried multiple workarounds at 
> this point, but it seems that there is no workaround which still ensures zero 
> downtime.
> Some more details about the process:
>  * We are using Strimzi to drive this process, but have verified that Strimzi 
> follows the documented steps precisely.
>  * When we reach the error state, it doesn't matter which broker became the 
> controller through the ZK node, the brokers still in migration mode get 
> stuck, and they flood the logs with the following error:
> {code:java}
> 2025-02-26 10:55:21,985 WARN [RaftManager id=0] Error connecting to node 
> kraft-rollback-kafka-controller-pool-5.kraft-rollback-kafka-kafka-brokers.csm-op-test-kraft-rollback-e7798bef.svc.cluster.local:9090
>  (id: 5 rack: null) (org.apache.kafka.clients.NetworkClient) 
> [kafka-raft-outbound-request-thread]
> java.net.UnknownHostException: 
> kraft-rollback-kafka-controller-pool-5.kraft-rollback-kafka-kafka-brokers.csm-op-test-kraft-rollback-e7798bef.svc.cluster.local
>         at 
> java.base/java.net.InetAddress$CachedAddresses.get(InetAddress.java:801)
>         at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1533)
>         at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1385)
>         at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1306)
>         at 
> org.apache.kafka.clients.DefaultHostResolver.resolve(DefaultHostResolver.java:27)
>         at org.apache.kafka.clients.ClientUtils.resolve(ClientUtils.java:125)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.resolveAddresses(ClusterConnectionStates.java:536)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.currentAddress(ClusterConnectionStates.java:511)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates$NodeConnectionState.access$200(ClusterConnectionStates.java:466)
>         at 
> org.apache.kafka.clients.ClusterConnectionStates.currentAddress(ClusterConnectionStates.java:173)
>         at 
> org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:1075)
>         at 
> org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:321)
>         at 
> org.apache.kafka.server.util.InterBrokerSendThread.sendRequests(InterBrokerSendThread.java:146)
>         at 
> org.apache.kafka.server.util.InterBrokerSendThread.pollOnce(InterBrokerSendThread.java:109)
>         at 
> org.apache.kafka.server.util.InterBrokerSendThread.doWork(InterBrokerSendThread.java:137)
>         at 
> org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:136)
>  {code}
>  * Manually verified the last offsets of the replicas, and broker 0 is caught 
> up in the partition.
>  * Even after stopping the produce load, the issue persists.
>  * Even after removing the /controller node manually (to retrigger election), 
> regardless of which broker becomes the controller, the issue persists.
> Based on the above, it seems that during the rollback, brokers in m

[jira] [Created] (KAFKA-18882) Move BaseKey, TxnKey, and UnknownKey to transaction-coordinator module

2025-02-27 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-18882:
--

 Summary: Move BaseKey, TxnKey, and UnknownKey to 
transaction-coordinator module
 Key: KAFKA-18882
 URL: https://issues.apache.org/jira/browse/KAFKA-18882
 Project: Kafka
  Issue Type: Sub-task
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


as title



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18884) Move TransactionMetadata to transaction-coordinator module

2025-02-27 Thread PoAn Yang (Jira)
PoAn Yang created KAFKA-18884:
-

 Summary: Move TransactionMetadata to transaction-coordinator module
 Key: KAFKA-18884
 URL: https://issues.apache.org/jira/browse/KAFKA-18884
 Project: Kafka
  Issue Type: Sub-task
Reporter: PoAn Yang
Assignee: PoAn Yang


as title



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] KAFKA-18854:Move DynamicConfig to server module [kafka]

2025-02-27 Thread via GitHub


chia7712 commented on code in PR #19019:
URL: https://github.com/apache/kafka/pull/19019#discussion_r1973272253


##
core/src/main/scala/kafka/server/DynamicBrokerConfig.scala:
##
@@ -82,16 +83,6 @@ object DynamicBrokerConfig {
   private[server] val DynamicSecurityConfigs = 
SslConfigs.RECONFIGURABLE_CONFIGS.asScala
   private[server] val DynamicProducerStateManagerConfig = 
Set(TransactionLogConfig.PRODUCER_ID_EXPIRATION_MS_CONFIG, 
TransactionLogConfig.TRANSACTION_PARTITION_VERIFICATION_ENABLE_CONFIG)
 
-  val AllDynamicConfigs = DynamicSecurityConfigs ++
-LogCleaner.ReconfigurableConfigs ++

Review Comment:
   Could you please move `ReconfigurableConfigs` to `CleanerConfig`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



  1   2   >