[GitHub] [kafka] cadonna commented on pull request #11532: MINOR: Fix system test test_upgrade_to_cooperative_rebalance
cadonna commented on pull request #11532: URL: https://github.com/apache/kafka/pull/11532#issuecomment-978990529 @showuon It happened to all of us at least once.
[GitHub] [kafka] cadonna commented on pull request #11532: MINOR: Fix system test test_upgrade_to_cooperative_rebalance
cadonna commented on pull request #11532: URL: https://github.com/apache/kafka/pull/11532#issuecomment-978991581 > @dajac Yes, I will cherry-pick it!
[GitHub] [kafka] cadonna commented on pull request #11528: MINOR: update rocksDb memory management doc
cadonna commented on pull request #11528: URL: https://github.com/apache/kafka/pull/11528#issuecomment-979005047 Test failures are not relevant since the PR only changes docs.
[GitHub] [kafka] cadonna merged pull request #11528: MINOR: update rocksDb memory management doc
cadonna merged pull request #11528: URL: https://github.com/apache/kafka/pull/11528
[jira] [Updated] (KAFKA-13069) Add magic number to DefaultKafkaPrincipalBuilder.KafkaPrincipalSerde
[ https://issues.apache.org/jira/browse/KAFKA-13069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Jacot updated KAFKA-13069: Fix Version/s: (was: 3.1.0) > Add magic number to DefaultKafkaPrincipalBuilder.KafkaPrincipalSerde > > > Key: KAFKA-13069 > URL: https://issues.apache.org/jira/browse/KAFKA-13069 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.8.0, 3.0.0 >Reporter: Ron Dagostino >Assignee: Ron Dagostino >Priority: Critical >
[GitHub] [kafka] cadonna commented on pull request #11532: MINOR: Fix system test test_upgrade_to_cooperative_rebalance
cadonna commented on pull request #11532: URL: https://github.com/apache/kafka/pull/11532#issuecomment-979038595 Test failures are unrelated since only comments were added to the production code and the system tests are not executed during a build.
[GitHub] [kafka] dajac opened a new pull request #11538: KAFKA-10712; Update release scripts to Python3
dajac opened a new pull request #11538: URL: https://github.com/apache/kafka/pull/11538 ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes)
[GitHub] [kafka] cadonna merged pull request #11532: MINOR: Fix system test test_upgrade_to_cooperative_rebalance
cadonna merged pull request #11532: URL: https://github.com/apache/kafka/pull/11532
[GitHub] [kafka] cadonna commented on pull request #11532: MINOR: Fix system test test_upgrade_to_cooperative_rebalance
cadonna commented on pull request #11532: URL: https://github.com/apache/kafka/pull/11532#issuecomment-979042278 Cherry-picked to 3.1
[GitHub] [kafka] dajac commented on a change in pull request #11538: KAFKA-10712; Update release scripts to Python3
dajac commented on a change in pull request #11538: URL: https://github.com/apache/kafka/pull/11538#discussion_r756737962 ## File path: release.py ## @@ -211,13 +209,14 @@ def get_jdk(prefs, version): """ Get settings for the specified JDK version. """ -jdk_java_home = get_pref(prefs, 'jdk%d' % version, lambda: raw_input("Enter the path for JAVA_HOME for a JDK%d compiler (blank to use default JAVA_HOME): " % version)) +jdk_java_home = get_pref(prefs, 'jdk%d' % version, lambda: input("Enter the path for JAVA_HOME for a JDK%d compiler (blank to use default JAVA_HOME): " % version)) jdk_env = dict(os.environ) if jdk_java_home.strip() else None if jdk_env is not None: jdk_env['JAVA_HOME'] = jdk_java_home java_version = cmd_output("%s/bin/java -version" % jdk_java_home, env=jdk_env) -if version == 8 and "1.8.0" not in java_version: - fail("JDK 8 is required") -elif "%d.0" % version not in java_version or '"%d"' % version not in java_version: +if version == 8: + if "1.8.0" not in java_version: +fail("JDK 8 is required") +elif "%d.0" % version not in java_version and '"%d"' % version not in java_version: Review comment: We had a bug here. The script should fail if both conditions are not met.
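For readers following the diff above, here is a standalone sketch of the corrected check written as a plain function. It is not the actual release.py code, and the sample `java -version` banners are only illustrative. It shows why the original `or` was a bug: a version banner that matches one pattern but not the other would have failed the old check, whereas the script should only fail when neither pattern matches.

```python
# Standalone illustration of the corrected JDK check (not the actual release.py function).
def jdk_version_ok(version, java_version):
    """Return True when the `java -version` output matches the requested major version."""
    if version == 8:
        return "1.8.0" in java_version
    # Accept the version if *either* pattern matches; equivalently, fail only when
    # neither matches -- hence `and` in the negated form used by release.py.
    return "%d.0" % version in java_version or '"%d"' % version in java_version

# Hypothetical sample banners, for illustration only:
assert jdk_version_ok(8, 'openjdk version "1.8.0_292"')
assert jdk_version_ok(17, 'openjdk version "17" 2021-09-14')      # has '"17"' but no "17.0"
assert jdk_version_ok(17, 'openjdk version "17.0.1" 2021-10-19')  # has "17.0" but no '"17"'
```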
[GitHub] [kafka] dajac opened a new pull request #11539: MINOR: Update upgrade doc for 3.1
dajac opened a new pull request #11539: URL: https://github.com/apache/kafka/pull/11539 ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes)
[GitHub] [kafka] vijaykriishna closed pull request #10873: KAFKA-7360 Fixed code snippet
vijaykriishna closed pull request #10873: URL: https://github.com/apache/kafka/pull/10873
[jira] (KAFKA-7360) Code example in "Accessing Processor Context" misses a closing parenthesis
[ https://issues.apache.org/jira/browse/KAFKA-7360 ] Vijay deleted comment on KAFKA-7360: -- was (Author: vijaykriishna): [~seknop] Please review [https://github.com/apache/kafka/pull/10873] > Code example in "Accessing Processor Context" misses a closing parenthesis > -- > > Key: KAFKA-7360 > URL: https://issues.apache.org/jira/browse/KAFKA-7360 > Project: Kafka > Issue Type: Bug > Components: documentation >Affects Versions: 2.0.0 >Reporter: Sven Erik Knop >Assignee: Vijay >Priority: Minor > > https://kafka.apache.org/20/documentation/streams/developer-guide/processor-api.html#accessing-processor-context > Code example has some issues: > public void process(String key, String value) { > > // add a header to the elements > context().headers().add.("key", "key" > } > Should be > public void process(String key, String value) { > > // add a header to the elements > context().headers().add("key", "value") > }
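For reference, a minimal, self-contained sketch of the corrected docs snippet against the classic (pre-3.0) Processor API that the 2.0 documentation uses. The class name is illustrative, and note that `Headers.add` takes a `byte[]` value, so the string literal from the docs example is converted explicitly here.

```java
import java.nio.charset.StandardCharsets;
import org.apache.kafka.streams.processor.AbstractProcessor;

public class HeaderAddingProcessor extends AbstractProcessor<String, String> {

    @Override
    public void process(final String key, final String value) {
        // add a header to the record being processed (Headers.add expects a byte[] value)
        context().headers().add("key", "value".getBytes(StandardCharsets.UTF_8));
        // forward the record downstream unchanged
        context().forward(key, value);
    }
}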
[GitHub] [kafka] vijaykriishna opened a new pull request #11540: KAFKA-7360 Fixed code snippet
vijaykriishna opened a new pull request #11540: URL: https://github.com/apache/kafka/pull/11540 *More detailed description of your change, if necessary. The PR title and PR message become the squashed commit message, so use a separate comment to ping reviewers.* *Summary of testing strategy (including rationale) for the feature or bug fix. Unit and/or integration tests are expected for any behaviour change and system tests should be considered for larger changes.* ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes)
[GitHub] [kafka] dajac commented on pull request #11538: KAFKA-10712; Update release scripts to Python3
dajac commented on pull request #11538: URL: https://github.com/apache/kafka/pull/11538#issuecomment-979268883 I've run the script and everything seems to work correctly.
[GitHub] [kafka] dajac edited a comment on pull request #11538: KAFKA-10712; Update release scripts to Python3
dajac edited a comment on pull request #11538: URL: https://github.com/apache/kafka/pull/11538#issuecomment-979268883 I've run the script and everything seems to work correctly.
[GitHub] [kafka] patrickstuedi opened a new pull request #11541: chore: keeping track of latest seen position in different state stores
patrickstuedi opened a new pull request #11541: URL: https://github.com/apache/kafka/pull/11541 Previously, [KAFKA-13480](https://github.com/apache/kafka/pull/11514) added position tracking to the RocksDB and InMemory KV stores. This PR adds position tracking to other stores, such as the CachingKVStore and all Session and Windowed variants.
[jira] [Created] (KAFKA-13482) JRE: Duplicate Key: Multiple bootstrap server URLs
Michael Anstis created KAFKA-13482: -- Summary: JRE: Duplicate Key: Multiple bootstrap server URLs Key: KAFKA-13482 URL: https://issues.apache.org/jira/browse/KAFKA-13482 Project: Kafka Issue Type: Bug Components: clients Affects Versions: 2.8.1 Environment: Docker (Kafka server), Docker (client) and local client. Reporter: Michael Anstis I am running a Kafka server in a Docker Container. It needs to listen to both the Docker "internal" network and "external" local network. Some services run in other Docker containers and need Kafka. Some services run locally, not in Docker, and need the same Kafka instance too. Configuring bootstrap servers for the single Kafka instance: {code}kafka.bootstrap.servers=PLAINTEXT://localhost:49212,OUTSIDE://kafka-WLff1:9092{code} Leads to the following stack trace: {code} 2021-11-25 16:17:10,559 ERROR [org.apa.kaf.cli.pro.int.Sender] (kafka-producer-network-thread | kafka-producer-kogito-tracing-model) [Producer clientId=kafka-producer-kogito-tracing-model] Uncaught error in kafka producer I/O thread: : java.lang.IllegalStateException: Duplicate key 0 (attempted merging values localhost:49212 (id: 0 rack: null) and kafka-WLff1:9092 (id: 0 rack: null)) at java.base/java.util.stream.Collectors.duplicateKeyException(Collectors.java:133) at java.base/java.util.stream.Collectors.lambda$uniqKeysMapAccumulator$1(Collectors.java:180) at java.base/java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169) at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195) at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133) at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913) at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578) at org.apache.kafka.common.requests.MetadataResponse$Holder.createBrokers(MetadataResponse.java:414) at org.apache.kafka.common.requests.MetadataResponse$Holder.<init>(MetadataResponse.java:407) at org.apache.kafka.common.requests.MetadataResponse.holder(MetadataResponse.java:187) at org.apache.kafka.common.requests.MetadataResponse.topicMetadata(MetadataResponse.java:210) at org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater.handleSuccessfulResponse(NetworkClient.java:1086) at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:887) at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:570) at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:327) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:242) at java.base/java.lang.Thread.run(Thread.java:832) {code}
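The stack trace bottoms out in the JDK's duplicate-key check used by `Collectors.toMap` without a merge function: both advertised endpoints report the same broker id (0), so building a map of nodes keyed by id throws. A tiny JDK-only sketch of that behaviour, using just the endpoint strings from the report:

```java
import java.util.AbstractMap;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DuplicateBrokerIdDemo {
    public static void main(String[] args) {
        // Two advertised endpoints that resolve to the same broker id (0), as in the report.
        Map<Integer, String> nodesById = Stream.of(
                new AbstractMap.SimpleEntry<>(0, "localhost:49212"),
                new AbstractMap.SimpleEntry<>(0, "kafka-WLff1:9092"))
            // toMap without a merge function throws IllegalStateException on a duplicate key,
            // e.g. "Duplicate key 0 (attempted merging values ...)" on recent JDKs.
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
        System.out.println(nodesById);
    }
}
```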
[GitHub] [kafka] vijaykriishna closed pull request #11540: KAFKA-7360 Fixed code snippet
vijaykriishna closed pull request #11540: URL: https://github.com/apache/kafka/pull/11540
[GitHub] [kafka] vijaykriishna opened a new pull request #11542: KAFKA-7360 Fixed code snippet in v2.0
vijaykriishna opened a new pull request #11542: URL: https://github.com/apache/kafka/pull/11542 *More detailed description of your change, if necessary. The PR title and PR message become the squashed commit message, so use a separate comment to ping reviewers.* *Summary of testing strategy (including rationale) for the feature or bug fix. Unit and/or integration tests are expected for any behaviour change and system tests should be considered for larger changes.* ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes)
[jira] [Assigned] (KAFKA-13323) The words are ambiguous
[ https://issues.apache.org/jira/browse/KAFKA-13323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vijay reassigned KAFKA-13323: - Assignee: Vijay > The words are ambiguous > --- > > Key: KAFKA-13323 > URL: https://issues.apache.org/jira/browse/KAFKA-13323 > Project: Kafka > Issue Type: Improvement > Components: clients >Affects Versions: 2.0.0, 3.0.0 >Reporter: 李壮壮 >Assignee: Vijay >Priority: Trivial > > In the 'Fetcher' class, there is a field that records whether the cached subscriptions all have a fetch position, so that the consumer client knows whether it needs to run 'updateFetchPositions'. > However, the name 'cachedSubscriptionHashAllFetchPositions' is ambiguous: there is no hashing involved, so it should be renamed to 'cachedSubscriptionHasAllFetchPositions'. > > The code in question is: > -- > // to keep from repeatedly scanning subscriptions in poll(), cache the result > during metadata updates > private boolean cachedSubscriptionHashAllFetchPositions; > --
[GitHub] [kafka] jsancio merged pull request #11533: MINOR: reduce log cleaner offset memory usage in KRaftClusterTestKit
jsancio merged pull request #11533: URL: https://github.com/apache/kafka/pull/11533
[jira] [Created] (KAFKA-13483) Stale Missing ISR on a partition after a zookeeper timeout
F Méthot created KAFKA-13483: Summary: Stale Missing ISR on a partition after a zookeeper timeout Key: KAFKA-13483 URL: https://issues.apache.org/jira/browse/KAFKA-13483 Project: Kafka Issue Type: Bug Components: replication Affects Versions: 2.8.1 Reporter: F Méthot We hit a situation where we had a Stale Missing ISR on a single partition of an output changelog topic after a "broker to zookeeper" connection timed out in our production system. This ticket shows the logs of what happened and a workaround that got us out of this situation. *Cluster config* 7 Kafka Brokers v2.8.1 (k8s bitnami) 3 Zookeeper v3.6.2 (k8s bitnami) kubernetes v1.20.6 *Processing pattern:* {code:java} source topic -> KStream application: update 40 stores backed by -> data-##-changelog topics {code} All topics have {*}10 partitions{*}, {*}3 replicas{*}, *min.isr 2* After a broker-to-zookeeper connection timed out (see logs below), the ISRs of many topic partitions went missing. Almost all partitions recovered a few milliseconds later, as the connection to zk was re-established, except for partition number 3 of one of the 40 data-##-changelog topics. It stayed under-replicated overnight, preventing the kstream app from making any progress on the source topic's partition 3 and at the same time halting output of data for the 39 other changelog topics on partition 3. +*Successful Workaround*+ We ran kafka-reassign-partitions.sh on that partition, with the exact same replicas config, and the ISR came back normal in a matter of milliseconds. {code:java} kafka-reassign-partitions.sh --bootstrap-server kafka.kafka.svc.cluster.local:9092 --reassignment-json-file ./replicas-data-23-changlog.json {code} where replicas-data-23-changlog.json contains that original ISR config {code:java} {"version":1,"partitions":[{"topic":"data-23-changelog","partition":3,"replicas":[1007,1005,1003]}]} {code} +*Questions:*+ Would you be able to provide an explanation of why that specific partition did not recover like the others after the zk timeout failure? Or could it be a bug? We are glad the workaround worked, but is there an explanation of why it did? Otherwise, what should have been done to address this issue?
+*Observed summary of the logs*+ {code:java} {code} *[2021-11-20 20:21:42,577] WARN Client session timed out, have not heard from server in 26677ms for sessionid 0x286f5260006 (org.apache.zookeeper.ClientCnxn) [2021-11-20 20:21:42,582] INFO Client session timed out, have not heard from server in 26677ms for sessionid 0x286f5260006, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn) [2021-11-20 20:21:44,644] INFO Opening socket connection to server zookeeper.kafka.svc.cluster.local [2021-11-20 20:21:44,646] INFO Socket connection established, initiating session, client: , server: zookeeper.kafka.svc.cluster.local (org.apache.zookeeper.ClientCnxn) [2021-11-20 20:21:44,649] INFO Session establishment complete on server zookeeper.kafka.svc.cluster.local, sessionid = 0x286f5260006, negotiated timeout = 4 (org.apache.zookeeper.ClientCnxn) [2021-11-20 20:21:57,133] INFO [ReplicaFetcher replicaId=1007, leaderId=1001, fetcherId=0] Shutting down (kafka.server.ReplicaFetcherThread) [2021-11-20 20:21:57,137] INFO [ReplicaFetcher replicaId=1007, leaderId=1001, fetcherId=0] Error sending fetch request (sessionId=1896541533, epoch=50199) to node 1001: (org.apache.kafka.clients.FetchSessionHandler) java.io.IOException: Client was shutdown before response was read at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:109) at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:110) at kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:217) at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:325) at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:141) at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3$adapted(AbstractFetcherThread.scala:140) at scala.Option.foreach(Option.scala:407) at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:140) at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:123) at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96) [2021-11-20 20:21:57,141] INFO [ReplicaFetcher replicaId=1007, leaderId=1001, fetcherId=0] Stopped (kafka.server.ReplicaFetcherThread) [2021-11-20 20:21:57,141] INFO [ReplicaFetcher replicaId=1007, leaderId=1001, fetcherId=0] Shutdown completed (kafka.server.ReplicaFetcherThread) [2021-11-20 20:21:57,145] INFO [ReplicaFetcher replicaId=1007, lea
[jira] [Updated] (KAFKA-13483) Stale Missing ISR on a partition after a zookeeper timeout
[ https://issues.apache.org/jira/browse/KAFKA-13483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] F Méthot updated KAFKA-13483: - Description: We hit a situation where we had a Stale Missing ISR on a single partition of an output changelog topic after a "broker to zookeeper" connection timed out in our production system. This ticket shows the logs of what happened and a workaround that got us out of this situation. *Cluster config* 7 Kafka Brokers v2.8.1 (k8s bitnami) 3 Zookeeper v3.6.2 (k8s bitnami) kubernetes v1.20.6 *Processing pattern:* {code:java} source topic -> KStream application: update 40 stores backed by -> data-##-changelog topics {code} All topics have {*}10 partitions{*}, {*}3 replicas{*}, *min.isr 2* After a broker-to-zookeeper connection timed out (see logs below), the ISRs of many topic partitions went missing. Almost all partitions recovered a few milliseconds later, as the connection to zk was re-established, except for partition number 3 of one of the 40 data-##-changelog topics. It stayed under-replicated overnight, preventing the kstream app from making any progress on the source topic's partition 3 and at the same time halting output of data for the 39 other changelog topics on partition 3. +*Successful Workaround*+ We ran kafka-reassign-partitions.sh on that partition, with the exact same replicas config, and the ISR came back normal in a matter of milliseconds. {code:java} kafka-reassign-partitions.sh --bootstrap-server kafka.kafka.svc.cluster.local:9092 --reassignment-json-file ./replicas-data-23-changlog.json {code} where replicas-data-23-changlog.json contains that original ISR config {code:java} {"version":1,"partitions":[{"topic":"data-23-changelog","partition":3,"replicas":[1007,1005,1003]}]} {code} +*Questions:*+ Would you be able to provide an explanation of why that specific partition did not recover like the others after the zk timeout failure? Or could it be a bug? We are glad the workaround worked, but is there an explanation of why it did? Otherwise, what should have been done to address this issue?
+*Observed summary of the logs*+ {code:java} [2021-11-20 20:21:42,577] WARN Client session timed out, have not heard from server in 26677ms for sessionid 0x286f5260006 (org.apache.zookeeper.ClientCnxn) [2021-11-20 20:21:42,582] INFO Client session timed out, have not heard from server in 26677ms for sessionid 0x286f5260006, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn) [2021-11-20 20:21:44,644] INFO Opening socket connection to server zookeeper.kafka.svc.cluster.local [2021-11-20 20:21:44,646] INFO Socket connection established, initiating session, client: , server: zookeeper.kafka.svc.cluster.local (org.apache.zookeeper.ClientCnxn) [2021-11-20 20:21:44,649] INFO Session establishment complete on server zookeeper.kafka.svc.cluster.local, sessionid = 0x286f5260006, negotiated timeout = 4 (org.apache.zookeeper.ClientCnxn) [2021-11-20 20:21:57,133] INFO [ReplicaFetcher replicaId=1007, leaderId=1001, fetcherId=0] Shutting down (kafka.server.ReplicaFetcherThread) [2021-11-20 20:21:57,137] INFO [ReplicaFetcher replicaId=1007, leaderId=1001, fetcherId=0] Error sending fetch request (sessionId=1896541533, epoch=50199) to node 1001: (org.apache.kafka.clients.FetchSessionHandler) java.io.IOException: Client was shutdown before response was read at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:109) at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:110) at kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:217) at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:325) at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:141) at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3$adapted(AbstractFetcherThread.scala:140) at scala.Option.foreach(Option.scala:407) at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:140) at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:123) at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96) [2021-11-20 20:21:57,141] INFO [ReplicaFetcher replicaId=1007, leaderId=1001, fetcherId=0] Stopped (kafka.server.ReplicaFetcherThread) [2021-11-20 20:21:57,141] INFO [ReplicaFetcher replicaId=1007, leaderId=1001, fetcherId=0] Shutdown completed (kafka.server.ReplicaFetcherThread) [2021-11-20 20:21:57,145] INFO [ReplicaFetcher replicaId=1007, leaderId=1005, fetcherId=0] Shutting down (kafka.server.ReplicaFetcherThread) [2021-11-20 20:21:57,145] INFO [ReplicaFetcher replicaId=1007, leaderId=1005, fetcherId=0] Stopped (kafka.server.ReplicaFetcherThread) [2021-11-20
[jira] [Updated] (KAFKA-13483) Stale Missing ISR on a partition after a zookeeper timeout
[ https://issues.apache.org/jira/browse/KAFKA-13483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] F Méthot updated KAFKA-13483: - Description: We hit a situation where we had a Stale Missing ISR on a single partition of an output changelog topic after a "broker to zookeeper" connection timed out in our production system. This ticket shows the logs of what happened and a workaround that got us out of this situation. *Cluster config* 7 Kafka Brokers v2.8.1 (k8s bitnami) 3 Zookeeper v3.6.2 (k8s bitnami) kubernetes v1.20.6 *Processing pattern:* {code:java} source topic -> KStream application: update 40 stores backed by -> data-##-changelog topics {code} All topics have {*}10 partitions{*}, {*}3 replicas{*}, *min.isr 2* After a broker-to-zookeeper connection timed out (see logs below), the ISRs of many topic partitions went missing. Almost all partitions recovered a few milliseconds later, as the connection to zk was re-established, except for partition number 3 of *one* of the 40 data-##-changelog topics. It stayed under-replicated overnight, preventing the kstream app from making any progress on the source topic's partition 3 and at the same time halting output of data for the 39 other changelog topics on partition 3. +*Successful Workaround*+ We ran kafka-reassign-partitions.sh on that partition, with the exact same replicas config, and the ISR came back normal in a matter of milliseconds. {code:java} kafka-reassign-partitions.sh --bootstrap-server kafka.kafka.svc.cluster.local:9092 --reassignment-json-file ./replicas-data-23-changlog.json {code} where replicas-data-23-changlog.json contains that original ISR config {code:java} {"version":1,"partitions":[{"topic":"data-23-changelog","partition":3,"replicas":[1007,1005,1003]}]} {code} +*Questions:*+ Would you be able to provide an explanation of why that specific partition did not recover like the others after the zk timeout failure? Or could it be a bug? We are glad the workaround worked, but is there an explanation of why it did? Otherwise, what should have been done to address this issue?
+*Observed summary of the logs*+ {code:java} [2021-11-20 20:21:42,577] WARN Client session timed out, have not heard from server in 26677ms for sessionid 0x286f5260006 (org.apache.zookeeper.ClientCnxn) [2021-11-20 20:21:42,582] INFO Client session timed out, have not heard from server in 26677ms for sessionid 0x286f5260006, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn) [2021-11-20 20:21:44,644] INFO Opening socket connection to server zookeeper.kafka.svc.cluster.local [2021-11-20 20:21:44,646] INFO Socket connection established, initiating session, client: , server: zookeeper.kafka.svc.cluster.local (org.apache.zookeeper.ClientCnxn) [2021-11-20 20:21:44,649] INFO Session establishment complete on server zookeeper.kafka.svc.cluster.local, sessionid = 0x286f5260006, negotiated timeout = 4 (org.apache.zookeeper.ClientCnxn) [2021-11-20 20:21:57,133] INFO [ReplicaFetcher replicaId=1007, leaderId=1001, fetcherId=0] Shutting down (kafka.server.ReplicaFetcherThread) [2021-11-20 20:21:57,137] INFO [ReplicaFetcher replicaId=1007, leaderId=1001, fetcherId=0] Error sending fetch request (sessionId=1896541533, epoch=50199) to node 1001: (org.apache.kafka.clients.FetchSessionHandler) java.io.IOException: Client was shutdown before response was read at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:109) at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:110) at kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:217) at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:325) at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:141) at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3$adapted(AbstractFetcherThread.scala:140) at scala.Option.foreach(Option.scala:407) at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:140) at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:123) at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96) [2021-11-20 20:21:57,141] INFO [ReplicaFetcher replicaId=1007, leaderId=1001, fetcherId=0] Stopped (kafka.server.ReplicaFetcherThread) [2021-11-20 20:21:57,141] INFO [ReplicaFetcher replicaId=1007, leaderId=1001, fetcherId=0] Shutdown completed (kafka.server.ReplicaFetcherThread) [2021-11-20 20:21:57,145] INFO [ReplicaFetcher replicaId=1007, leaderId=1005, fetcherId=0] Shutting down (kafka.server.ReplicaFetcherThread) [2021-11-20 20:21:57,145] INFO [ReplicaFetcher replicaId=1007, leaderId=1005, fetcherId=0] Stopped (kafka.server.ReplicaFetcherThread) [2021-11-2
[jira] [Updated] (KAFKA-13483) Stale Missing ISR on a partition after a zookeeper timeout
[ https://issues.apache.org/jira/browse/KAFKA-13483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] F Méthot updated KAFKA-13483: - Description: We hit a situation where we had a Stale Missing ISR on a single partition of an output changelog topic after a "broker to zookeeper" connection timed out in our production system. This ticket shows the logs of what happened and a workaround that got us out of this situation. *Cluster config* 7 Kafka Brokers v2.8.1 (k8s bitnami) 3 Zookeeper v3.6.2 (k8s bitnami) kubernetes v1.20.6 *Processing pattern:* {code:java} source topic -> KStream application: update 40 stores backed by -> data-##-changelog topics {code} All topics have {*}10 partitions{*}, {*}3 replicas{*}, *min.isr 2* After a broker-to-zookeeper connection timed out (see logs below), the ISRs of many topic partitions went missing. Almost all partitions recovered a few milliseconds later, as the connection to zk was re-established, except for partition number 3 of *one* of the 40 data-##-changelog topics. It stayed under-replicated overnight, preventing the kstream app from making any progress on the source topic's partition 3 and at the same time halting production of data for the 39 other changelog topics on partition 3 (because they also rely on partition 3 of the input topic). +*Successful Workaround*+ We ran kafka-reassign-partitions.sh on that partition, with the exact same replicas config, and the ISR came back normal in a matter of milliseconds. {code:java} kafka-reassign-partitions.sh --bootstrap-server kafka.kafka.svc.cluster.local:9092 --reassignment-json-file ./replicas-data-23-changlog.json {code} where replicas-data-23-changlog.json contains that original ISR config {code:java} {"version":1,"partitions":[{"topic":"data-23-changelog","partition":3,"replicas":[1007,1005,1003]}]} {code} +*Questions:*+ Would you be able to provide an explanation of why that specific partition did not recover like the others after the zk timeout failure? Or could it be a bug? We are glad the workaround worked, but is there an explanation of why it did? Otherwise, what should have been done to address this issue?
+*Observed summary of the logs*+ {code:java} [2021-11-20 20:21:42,577] WARN Client session timed out, have not heard from server in 26677ms for sessionid 0x286f5260006 (org.apache.zookeeper.ClientCnxn) [2021-11-20 20:21:42,582] INFO Client session timed out, have not heard from server in 26677ms for sessionid 0x286f5260006, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn) [2021-11-20 20:21:44,644] INFO Opening socket connection to server zookeeper.kafka.svc.cluster.local [2021-11-20 20:21:44,646] INFO Socket connection established, initiating session, client: , server: zookeeper.kafka.svc.cluster.local (org.apache.zookeeper.ClientCnxn) [2021-11-20 20:21:44,649] INFO Session establishment complete on server zookeeper.kafka.svc.cluster.local, sessionid = 0x286f5260006, negotiated timeout = 4 (org.apache.zookeeper.ClientCnxn) [2021-11-20 20:21:57,133] INFO [ReplicaFetcher replicaId=1007, leaderId=1001, fetcherId=0] Shutting down (kafka.server.ReplicaFetcherThread) [2021-11-20 20:21:57,137] INFO [ReplicaFetcher replicaId=1007, leaderId=1001, fetcherId=0] Error sending fetch request (sessionId=1896541533, epoch=50199) to node 1001: (org.apache.kafka.clients.FetchSessionHandler) java.io.IOException: Client was shutdown before response was read at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:109) at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:110) at kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:217) at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:325) at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:141) at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3$adapted(AbstractFetcherThread.scala:140) at scala.Option.foreach(Option.scala:407) at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:140) at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:123) at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96) [2021-11-20 20:21:57,141] INFO [ReplicaFetcher replicaId=1007, leaderId=1001, fetcherId=0] Stopped (kafka.server.ReplicaFetcherThread) [2021-11-20 20:21:57,141] INFO [ReplicaFetcher replicaId=1007, leaderId=1001, fetcherId=0] Shutdown completed (kafka.server.ReplicaFetcherThread) [2021-11-20 20:21:57,145] INFO [ReplicaFetcher replicaId=1007, leaderId=1005, fetcherId=0] Shutting down (kafka.server.ReplicaFetcherThread) [2021-11-20 20:21:57,145] INFO [ReplicaFetcher replicaId=1007, leaderId=1005, fet
[jira] [Created] (KAFKA-13484) Tag Offline Partition Count by Topic Name
Mason Joseph Legere created KAFKA-13484: --- Summary: Tag Offline Partition Count by Topic Name Key: KAFKA-13484 URL: https://issues.apache.org/jira/browse/KAFKA-13484 Project: Kafka Issue Type: Improvement Reporter: Mason Joseph Legere Assignee: Mason Joseph Legere Attachments: Offline Partition Tagging by Topic.png *DoD:* * "OfflinePartitionsCount" metric tagged by the topic name only in the case where the topic has an offline partition. * Once a topic no longer has any offline partitions, the emitting gauge is removed from the metrics registry. * Gauges are removed following topic deletion.
[jira] [Updated] (KAFKA-13484) Tag Offline Partition Count by Topic Name
[ https://issues.apache.org/jira/browse/KAFKA-13484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mason Joseph Legere updated KAFKA-13484: Description: See [KIP|https://cwiki.apache.org/confluence/pages/resumedraft.action?draftId=195728141&draftShareId=0ba16789-309e-4aeb-8cac-0c06b13f86bc&] *DoD:* * "OfflinePartitionsCount" metric tagged by the topic name only in the case where the topic has an offline partition. * Once a topic no longer has any offline partitions, the emitting gauge is removed from the metrics registry. * Gauges are removed following topic deletion. was: *DoD:* * "OfflinePartitionsCount" metric tagged by the topic name only in the case where the topic has an offline partition. * Once a topic no longer has any offline partitions, the emitting gauge is removed from the metrics registry. * Gauges are removed following topic deletion. > Tag Offline Partition Count by Topic Name > -- > > Key: KAFKA-13484 > URL: https://issues.apache.org/jira/browse/KAFKA-13484 > Project: Kafka > Issue Type: Improvement >Reporter: Mason Joseph Legere >Assignee: Mason Joseph Legere >Priority: Minor > Labels: KIP > Attachments: Offline Partition Tagging by Topic.png > > > See > [KIP|https://cwiki.apache.org/confluence/pages/resumedraft.action?draftId=195728141&draftShareId=0ba16789-309e-4aeb-8cac-0c06b13f86bc&] > *DoD:* > * "OfflinePartitionsCount" metric tagged by the topic name only in the case > where the topic has an offline partition. > * Once a topic no longer has any offline partitions, the emitting gauge is > removed from the metrics registry. > * Gauges are removed following topic deletion.
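A rough sketch of the gauge lifecycle the DoD describes: register a topic-tagged gauge the first time a topic has an offline partition, and remove it when the topic recovers or is deleted. All names here (GaugeRegistry, OfflinePartitionsByTopic) are hypothetical illustrations, not Kafka's actual metrics classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class OfflinePartitionsByTopic {

    /** Hypothetical stand-in for a metrics registry that can register and remove tagged gauges. */
    public interface GaugeRegistry {
        void register(String name, Map<String, String> tags, Supplier<Integer> value);
        void remove(String name, Map<String, String> tags);
    }

    private final GaugeRegistry registry;
    private final Map<String, Integer> offlineByTopic = new ConcurrentHashMap<>();

    public OfflinePartitionsByTopic(GaugeRegistry registry) {
        this.registry = registry;
    }

    /** Called whenever the offline-partition count for a topic changes. */
    public void update(String topic, int offlinePartitions) {
        if (offlinePartitions > 0) {
            // Register the topic-tagged gauge only when the topic first has an offline partition.
            if (offlineByTopic.put(topic, offlinePartitions) == null) {
                registry.register("OfflinePartitionsCount", Map.of("topic", topic),
                        () -> offlineByTopic.getOrDefault(topic, 0));
            }
        } else {
            remove(topic);
        }
    }

    /** Called when the topic recovers or is deleted, so no stale gauge is left behind. */
    public void remove(String topic) {
        if (offlineByTopic.remove(topic) != null) {
            registry.remove("OfflinePartitionsCount", Map.of("topic", topic));
        }
    }
}
```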
[GitHub] [kafka] socutes commented on pull request #11530: MINOR:Update code comment information
socutes commented on pull request #11530: URL: https://github.com/apache/kafka/pull/11530#issuecomment-979689692 @showuon @dajac Thanks~. fixed
[GitHub] [kafka] socutes commented on a change in pull request #11529: KAFKA-12932: Interfaces for SnapshotReader and SnapshotWriter
socutes commented on a change in pull request #11529: URL: https://github.com/apache/kafka/pull/11529#discussion_r757224604 ## File path: core/src/test/scala/unit/kafka/security/authorizer/AclAuthorizerTest.scala ## @@ -123,8 +123,14 @@ class AclAuthorizerTest extends QuorumTestHarness with BaseAuthorizerTest { val user1 = new KafkaPrincipal(KafkaPrincipal.USER_TYPE, username) val user2 = new KafkaPrincipal(KafkaPrincipal.USER_TYPE, "rob") val user3 = new KafkaPrincipal(KafkaPrincipal.USER_TYPE, "batman") +val user4 = new KafkaPrincipal(KafkaPrincipal.USER_TYPE, "segment") Review comment: Yes, you're right!
[GitHub] [kafka] shharrnam opened a new pull request #11543: Update release.py
shharrnam opened a new pull request #11543: URL: https://github.com/apache/kafka/pull/11543 *More detailed description of your change, if necessary. The PR title and PR message become the squashed commit message, so use a separate comment to ping reviewers.* *Summary of testing strategy (including rationale) for the feature or bug fix. Unit and/or integration tests are expected for any behaviour change and system tests should be considered for larger changes.* ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes)
[GitHub] [kafka] ZoeyLiu commented on pull request #8575: KAFKA-8713 KIP-581 Add accept.optional.null to solve optional null
ZoeyLiu commented on pull request #8575: URL: https://github.com/apache/kafka/pull/8575#issuecomment-979753077 Any update or possible workaround ? @rhauch
[GitHub] [kafka] ZoeyLiu commented on pull request #8575: KAFKA-8713 KIP-581 Add accept.optional.null to solve optional null
ZoeyLiu commented on pull request #8575: URL: https://github.com/apache/kafka/pull/8575#issuecomment-979757582 Do you have any updates or possible workaround? @rhauch @ijuma @hachikuji @guozhangwang @mjsax @rajinisivaram @cmccabe
[GitHub] [kafka] ZoeyLiu removed a comment on pull request #8575: KAFKA-8713 KIP-581 Add accept.optional.null to solve optional null
ZoeyLiu removed a comment on pull request #8575: URL: https://github.com/apache/kafka/pull/8575#issuecomment-979753077 Any update or possible workaround ? @rhauch
[GitHub] [kafka] socutes commented on pull request #11529: KAFKA-12932: Interfaces for SnapshotReader and SnapshotWriter
socutes commented on pull request #11529: URL: https://github.com/apache/kafka/pull/11529#issuecomment-979758153 @jsancio Thank you very much for your review. The modifications have been made according to your suggestions; please take another look.
[GitHub] [kafka] ahamedmulaffer commented on pull request #8575: KAFKA-8713 KIP-581 Add accept.optional.null to solve optional null
ahamedmulaffer commented on pull request #8575: URL: https://github.com/apache/kafka/pull/8575#issuecomment-979761098 @ZoeyLiu A possible workaround for now is to handle it at your application level. For example, for fields that have a default value and are NOT NULL but still come through as NULL, you have to query that field again after you capture the CDC data.