[jira] [Commented] (KAFKA-8504) Suppressed do not emit with TimeWindows
[ https://issues.apache.org/jira/browse/KAFKA-8504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16859804#comment-16859804 ] Simone commented on KAFKA-8504: --- Yeah, I understand that because of backward compatibility this is not easy to fix, but I agree with you about the docs. Given that the workaround is quite simple (just set a custom grace period), good documentation is probably enough, for now at least. Thank you very much for the reply :) > Suppressed do not emit with TimeWindows > --- > > Key: KAFKA-8504 > URL: https://issues.apache.org/jira/browse/KAFKA-8504 > Project: Kafka > Issue Type: Improvement > Components: documentation, streams > Affects Versions: 2.2.1 > Reporter: Simone > Priority: Minor > > Hi, I'm playing a bit with Kafka Streams and the new suppress feature. I noticed that when using a {{TimeWindows}} without explicitly setting the grace period, {{suppress}} will not emit any message when used with {{Suppressed.untilWindowCloses}}. > I looked a bit into the code, and from what I understood, with this configuration {{suppress}} should use the {{grace}} setting of the {{TimeWindows}}. But since {{TimeWindows.of(Duration)}} defaults the grace to {{-1}}, and {{TimeWindows.gracePeriodMs()}} returns {{maintainMs() - size()}} when the grace equals -1, I think the end of the window is not calculated properly. > Of course it is possible to avoid this problem by forcing the {{grace}} to 0 when creating the TimeWindows, but I think this should be the default behaviour, at least when it comes to the suppress feature. > I hope I have not misunderstood the code in my analysis, thank you :) > Simone -- This message was sent by Atlassian JIRA (v7.6.3#76005)
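The arithmetic behind the report above is easy to check: with the 24-hour default retention in Kafka 2.2, an unset grace resolves to maintainMs() - size(), so a suppressed windowed aggregation holds results for nearly a day. A minimal plain-Java sketch of that fallback (the constant and method names here just mirror the description, not the real TimeWindows API):

```java
public class GraceDefaultDemo {
    // Default window retention in Kafka 2.2 is 24 hours.
    static final long DEFAULT_RETENTION_MS = 24 * 60 * 60 * 1000L;

    // Mirrors the gracePeriodMs() fallback described in the report:
    // an unset grace (-1) resolves to maintainMs() - size().
    static long effectiveGraceMs(long graceMs, long sizeMs) {
        return graceMs != -1 ? graceMs : DEFAULT_RETENTION_MS - sizeMs;
    }

    public static void main(String[] args) {
        long oneMinute = 60_000L;
        // With no explicit grace, a 1-minute window "closes" only after ~24h,
        // so Suppressed.untilWindowCloses emits nothing during a short test run.
        System.out.println(effectiveGraceMs(-1, oneMinute)); // 86340000
        // The workaround: set the grace explicitly, e.g. to 0.
        System.out.println(effectiveGraceMs(0, oneMinute));  // 0
    }
}
```

In a real topology the workaround amounts to setting the grace explicitly, e.g. `TimeWindows.of(Duration.ofMinutes(1)).grace(Duration.ZERO)`, before applying `suppress(Suppressed.untilWindowCloses(...))`.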
[jira] [Created] (KAFKA-8517) A lot of WARN messages in kafka log "Received a PartitionLeaderEpoch assignment for an epoch < latestEpoch:
Jacek Żoch created KAFKA-8517: - Summary: A lot of WARN messages in kafka log "Received a PartitionLeaderEpoch assignment for an epoch < latestEpoch: Key: KAFKA-8517 URL: https://issues.apache.org/jira/browse/KAFKA-8517 Project: Kafka Issue Type: Bug Components: logging Affects Versions: 0.11.0.1 Environment: PRD Reporter: Jacek Żoch We are on version 2.0, but this was already happening in version 0.11. The Kafka log contains a lot of messages saying "Received a PartitionLeaderEpoch assignment for an epoch < latestEpoch. This implies messages have arrived out of order." On 23.05 we had: Received a PartitionLeaderEpoch assignment for an epoch < latestEpoch. This implies messages have arrived out of order. New: {epoch:181, offset:23562380995}, Current: {epoch:362, offset:10365488611} for Partition: __consumer_offsets-30 (kafka.server.epoch.LeaderEpochFileCache) Currently we have: Received a PartitionLeaderEpoch assignment for an epoch < latestEpoch. This implies messages have arrived out of order. New: {epoch:199, offset:24588072027}, Current: {epoch:362, offset:10365488611} for Partition: __consumer_offsets-30 (kafka.server.epoch.LeaderEpochFileCache) There is no information on how dangerous this is or how to fix it. I think Kafka should either fix it "under the hood" or document how to fix it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-5876) IQ should throw different exceptions for different errors
[ https://issues.apache.org/jira/browse/KAFKA-5876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16859868#comment-16859868 ] ASF GitHub Bot commented on KAFKA-5876: --- vitojeng commented on pull request #5814: KAFKA-5876: IQ should throw different exceptions for different errors (KIP-216) URL: https://github.com/apache/kafka/pull/5814 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > IQ should throw different exceptions for different errors > - > > Key: KAFKA-5876 > URL: https://issues.apache.org/jira/browse/KAFKA-5876 > Project: Kafka > Issue Type: Task > Components: streams > Reporter: Matthias J. Sax > Assignee: Vito Jeng > Priority: Major > Labels: needs-kip, newbie++ > > Currently, IQ only throws {{InvalidStateStoreException}} for all errors that occur. However, we have different types of errors and should throw different exceptions for those types. > For example, if a store was migrated, it must be rediscovered, while if a store cannot be queried yet because it is still being re-created after a rebalance, the user just needs to wait until store recreation is finished. > There might be other examples, too. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
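The split the ticket asks for amounts to giving callers distinguishable subtypes of the one exception IQ throws today. A hedged sketch of such a hierarchy (the subclass names follow KIP-216's proposal but should be treated as illustrative, not the shipped API):

```java
// Sketch of the exception split KIP-216 proposes: callers can then
// distinguish "rediscover the store elsewhere" from "wait and retry".
class InvalidStateStoreException extends RuntimeException {
    InvalidStateStoreException(String msg) { super(msg); }
}

// The store migrated to another instance: the user must rediscover it.
class StateStoreMigratedException extends InvalidStateStoreException {
    StateStoreMigratedException(String msg) { super(msg); }
}

// The store is being rebuilt after a rebalance: the user just retries later.
class StateStoreNotAvailableException extends InvalidStateStoreException {
    StateStoreNotAvailableException(String msg) { super(msg); }
}

public class IqExceptionDemo {
    // With distinct subtypes, error handling becomes a simple dispatch.
    static String adviseOn(InvalidStateStoreException e) {
        if (e instanceof StateStoreMigratedException) return "rediscover";
        if (e instanceof StateStoreNotAvailableException) return "retry";
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(adviseOn(new StateStoreMigratedException("moved")));
        System.out.println(adviseOn(new StateStoreNotAvailableException("rebalancing")));
    }
}
```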
[jira] [Commented] (KAFKA-8450) Augment processed in MockProcessor as KeyValueAndTimestamp
[ https://issues.apache.org/jira/browse/KAFKA-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16859958#comment-16859958 ] SuryaTeja Duggi commented on KAFKA-8450: [~guozhang] [~mjsax] Can someone comment on this ticket? > Augment processed in MockProcessor as KeyValueAndTimestamp > -- > > Key: KAFKA-8450 > URL: https://issues.apache.org/jira/browse/KAFKA-8450 > Project: Kafka > Issue Type: Improvement > Components: streams, unit tests > Reporter: Guozhang Wang > Assignee: SuryaTeja Duggi > Priority: Major > Labels: newbie > > Today the book-keeping list of `processed` records in MockProcessor is in the form of Strings: we just call the key / value types' toString functions in order to book-keep. This loses the type information and does not keep the timestamp associated with the record. > It would be better to change its type to `KeyValueAndTimestamp` and refactor the impacted unit tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
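The refactoring described above is essentially book-keeping typed (key, value, timestamp) triples instead of concatenated strings. A minimal sketch of such a container (the class and field names here are illustrative, mirroring the `KeyValueAndTimestamp` idea rather than any existing Kafka class):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: book-keep processed records as typed triples instead of
// toString'd strings, so tests keep type information and the timestamp.
public class KeyValueTimestampDemo {
    static final class KeyValueTimestamp<K, V> {
        final K key;
        final V value;
        final long timestamp;

        KeyValueTimestamp(K key, V value, long timestamp) {
            this.key = key;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    public static void main(String[] args) {
        List<KeyValueTimestamp<String, Integer>> processed = new ArrayList<>();
        processed.add(new KeyValueTimestamp<>("a", 1, 100L));
        // The types and the timestamp survive, unlike a flat string "a:1".
        System.out.println(processed.get(0).timestamp); // 100
    }
}
```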
[jira] [Created] (KAFKA-8518) Update GitHub repo description to make it obvious that it is not a "mirror" anymore
Etienne Neveu created KAFKA-8518: Summary: Update GitHub repo description to make it obvious that it is not a "mirror" anymore Key: KAFKA-8518 URL: https://issues.apache.org/jira/browse/KAFKA-8518 Project: Kafka Issue Type: Improvement Components: documentation Reporter: Etienne Neveu When I go to [https://github.com/apache/kafka], the description at the top is "Mirror of Apache Kafka", which makes me think that the development is done elsewhere and this is just a mirrored GitHub repo. But I think the main development has now moved to GitHub, so it would be nice to change this description, to make it more obvious. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (KAFKA-8514) Kafka clients should not include Scala's Java 8 compatibility lib
[ https://issues.apache.org/jira/browse/KAFKA-8514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Almog Gavra reassigned KAFKA-8514: -- Assignee: Almog Gavra > Kafka clients should not include Scala's Java 8 compatibility lib > - > > Key: KAFKA-8514 > URL: https://issues.apache.org/jira/browse/KAFKA-8514 > Project: Kafka > Issue Type: Bug > Components: clients > Reporter: Enno Runne > Assignee: Almog Gavra > Priority: Major > > The work on KAFKA-8305 brought in > "org.scala-lang.modules:scala-java8-compat_2.12" > as a dependency of the client lib. This gives Scala users an extra headache, as it needs to be excluded when working with another Scala version. > Instead, it should be moved to be a dependency of *"core"* for the convenience of converting between Java and Scala Option instances. > See > [https://github.com/apache/kafka/commit/8e161580b859b2fcd54c59625e232b99f3bb48d0#diff-c197962302397baf3a4cc36463dce5eaR942] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8514) Kafka clients should not include Scala's Java 8 compatibility lib
[ https://issues.apache.org/jira/browse/KAFKA-8514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860099#comment-16860099 ] Almog Gavra commented on KAFKA-8514: ack. Thanks for pointing this out [~ennru]! I'll change this and have a PR out soon, after making sure everything compiles. > Kafka clients should not include Scala's Java 8 compatibility lib > - > > Key: KAFKA-8514 > URL: https://issues.apache.org/jira/browse/KAFKA-8514 > Project: Kafka > Issue Type: Bug > Components: clients > Reporter: Enno Runne > Assignee: Almog Gavra > Priority: Major > > The work on KAFKA-8305 brought in > "org.scala-lang.modules:scala-java8-compat_2.12" > as a dependency of the client lib. This gives Scala users an extra headache, as it needs to be excluded when working with another Scala version. > Instead, it should be moved to be a dependency of *"core"* for the convenience of converting between Java and Scala Option instances. > See > [https://github.com/apache/kafka/commit/8e161580b859b2fcd54c59625e232b99f3bb48d0#diff-c197962302397baf3a4cc36463dce5eaR942] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-8504) Suppressed do not emit with TimeWindows
[ https://issues.apache.org/jira/browse/KAFKA-8504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax updated KAFKA-8504: --- Labels: newbie (was: ) > Suppressed do not emit with TimeWindows > --- > > Key: KAFKA-8504 > URL: https://issues.apache.org/jira/browse/KAFKA-8504 > Project: Kafka > Issue Type: Improvement > Components: documentation, streams > Affects Versions: 2.2.1 > Reporter: Simone > Priority: Minor > Labels: newbie > > Hi, I'm playing a bit with Kafka Streams and the new suppress feature. I noticed that when using a {{TimeWindows}} without explicitly setting the grace period, {{suppress}} will not emit any message when used with {{Suppressed.untilWindowCloses}}. > I looked a bit into the code, and from what I understood, with this configuration {{suppress}} should use the {{grace}} setting of the {{TimeWindows}}. But since {{TimeWindows.of(Duration)}} defaults the grace to {{-1}}, and {{TimeWindows.gracePeriodMs()}} returns {{maintainMs() - size()}} when the grace equals -1, I think the end of the window is not calculated properly. > Of course it is possible to avoid this problem by forcing the {{grace}} to 0 when creating the TimeWindows, but I think this should be the default behaviour, at least when it comes to the suppress feature. > I hope I have not misunderstood the code in my analysis, thank you :) > Simone -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-8504) Update Docs to explain how to use suppress() in more details
[ https://issues.apache.org/jira/browse/KAFKA-8504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax updated KAFKA-8504: --- Summary: Update Docs to explain how to use suppress() in more details (was: Suppressed do not emit with TimeWindows) > Update Docs to explain how to use suppress() in more details > > > Key: KAFKA-8504 > URL: https://issues.apache.org/jira/browse/KAFKA-8504 > Project: Kafka > Issue Type: Improvement > Components: documentation, streams > Affects Versions: 2.2.1 > Reporter: Simone > Priority: Minor > Labels: newbie > > Hi, I'm playing a bit with Kafka Streams and the new suppress feature. I noticed that when using a {{TimeWindows}} without explicitly setting the grace period, {{suppress}} will not emit any message when used with {{Suppressed.untilWindowCloses}}. > I looked a bit into the code, and from what I understood, with this configuration {{suppress}} should use the {{grace}} setting of the {{TimeWindows}}. But since {{TimeWindows.of(Duration)}} defaults the grace to {{-1}}, and {{TimeWindows.gracePeriodMs()}} returns {{maintainMs() - size()}} when the grace equals -1, I think the end of the window is not calculated properly. > Of course it is possible to avoid this problem by forcing the {{grace}} to 0 when creating the TimeWindows, but I think this should be the default behaviour, at least when it comes to the suppress feature. > I hope I have not misunderstood the code in my analysis, thank you :) > Simone -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7315) Streams serialization docs contain a broken link for Avro
[ https://issues.apache.org/jira/browse/KAFKA-7315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860163#comment-16860163 ] ASF GitHub Bot commented on KAFKA-7315: --- mjsax commented on pull request #212: KAFKA-7315 update TOC internal links serdes all versions URL: https://github.com/apache/kafka-site/pull/212 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Streams serialization docs contain a broken link for Avro > - > > Key: KAFKA-7315 > URL: https://issues.apache.org/jira/browse/KAFKA-7315 > Project: Kafka > Issue Type: Improvement > Components: streams > Reporter: John Roesler > Assignee: Victoria Bialas > Priority: Major > Labels: documentation, newbie > > https://kafka.apache.org/documentation/streams/developer-guide/datatypes.html#avro -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7315) Streams serialization docs contain a broken link for Avro
[ https://issues.apache.org/jira/browse/KAFKA-7315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860167#comment-16860167 ] ASF GitHub Bot commented on KAFKA-7315: --- mjsax commented on pull request #6875: KAFKA-7315 DOCS update TOC internal links serdes all versions URL: https://github.com/apache/kafka/pull/6875 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Streams serialization docs contain a broken link for Avro > - > > Key: KAFKA-7315 > URL: https://issues.apache.org/jira/browse/KAFKA-7315 > Project: Kafka > Issue Type: Improvement > Components: streams > Reporter: John Roesler > Assignee: Victoria Bialas > Priority: Major > Labels: documentation, newbie > > https://kafka.apache.org/documentation/streams/developer-guide/datatypes.html#avro -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (KAFKA-7315) Streams serialization docs contain a broken link for Avro
[ https://issues.apache.org/jira/browse/KAFKA-7315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-7315. Resolution: Fixed > Streams serialization docs contain a broken link for Avro > - > > Key: KAFKA-7315 > URL: https://issues.apache.org/jira/browse/KAFKA-7315 > Project: Kafka > Issue Type: Improvement > Components: streams > Reporter: John Roesler > Assignee: Victoria Bialas > Priority: Major > Labels: documentation, newbie > > https://kafka.apache.org/documentation/streams/developer-guide/datatypes.html#avro -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8516) Consider allowing all replicas to have read/write permissions
[ https://issues.apache.org/jira/browse/KAFKA-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860225#comment-16860225 ] Matthias J. Sax commented on KAFKA-8516: [~Yohan123] Are you aware of [https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica]? This ticket might be a duplicate of that KIP. > Consider allowing all replicas to have read/write permissions > - > > Key: KAFKA-8516 > URL: https://issues.apache.org/jira/browse/KAFKA-8516 > Project: Kafka > Issue Type: Improvement > Reporter: Richard Yu > Priority: Major > > Currently, in Kafka internals, a leader is responsible for all the read and write operations requested by the user. This naturally incurs a bottleneck, since one replica, as the leader, experiences a significantly heavier workload than the other replicas, and it also means that all client commands must pass through a chokepoint. If a leader fails, all processing effectively comes to a halt until another leader is elected. To help solve this problem, we could think about redesigning Kafka core so that any replica is able to handle read and write operations. That is, the system would be changed so that _all_ replicas have read/write permissions. > > This has multiple positives. Notably the following: > * Workload can be distributed more evenly, since leader replicas currently carry more weight than follower replicas (in this new design, all replicas are equal) > * Some failures would not be as catastrophic as in the leader-follower paradigm. There is no single "leader". If one replica goes down, others are still able to read/write as needed, and processing could continue without interruption. > The implementation of such a change will be very extensive, and discussion would be needed to decide whether an improvement as described above warrants such a drastic redesign of Kafka internals.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8514) Kafka clients should not include Scala's Java 8 compatibility lib
[ https://issues.apache.org/jira/browse/KAFKA-8514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860241#comment-16860241 ] Almog Gavra commented on KAFKA-8514: [https://github.com/apache/kafka/pull/6910] - everything compiles and passes locally > Kafka clients should not include Scala's Java 8 compatibility lib > - > > Key: KAFKA-8514 > URL: https://issues.apache.org/jira/browse/KAFKA-8514 > Project: Kafka > Issue Type: Bug > Components: clients > Reporter: Enno Runne > Assignee: Almog Gavra > Priority: Major > > The work on KAFKA-8305 brought in > "org.scala-lang.modules:scala-java8-compat_2.12" > as a dependency of the client lib. This gives Scala users an extra headache, as it needs to be excluded when working with another Scala version. > Instead, it should be moved to be a dependency of *"core"* for the convenience of converting between Java and Scala Option instances. > See > [https://github.com/apache/kafka/commit/8e161580b859b2fcd54c59625e232b99f3bb48d0#diff-c197962302397baf3a4cc36463dce5eaR942] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8514) Kafka clients should not include Scala's Java 8 compatibility lib
[ https://issues.apache.org/jira/browse/KAFKA-8514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860240#comment-16860240 ] ASF GitHub Bot commented on KAFKA-8514: --- agavra commented on pull request #6910: KAFKA-8514: move the scala-java8-compat import to the :core project instead of :clients URL: https://github.com/apache/kafka/pull/6910 See https://issues.apache.org/jira/browse/KAFKA-8514 - this makes it so that only the `:core` module depends on the newly added Java 8 converters. ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Kafka clients should not include Scala's Java 8 compatibility lib > - > > Key: KAFKA-8514 > URL: https://issues.apache.org/jira/browse/KAFKA-8514 > Project: Kafka > Issue Type: Bug > Components: clients > Reporter: Enno Runne > Assignee: Almog Gavra > Priority: Major > > The work on KAFKA-8305 brought in > "org.scala-lang.modules:scala-java8-compat_2.12" > as a dependency of the client lib. This gives Scala users an extra headache, as it needs to be excluded when working with another Scala version. > Instead, it should be moved to be a dependency of *"core"* for the convenience of converting between Java and Scala Option instances. > See > [https://github.com/apache/kafka/commit/8e161580b859b2fcd54c59625e232b99f3bb48d0#diff-c197962302397baf3a4cc36463dce5eaR942] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-8519) Trogdor should support network degradation
David Arthur created KAFKA-8519: --- Summary: Trogdor should support network degradation Key: KAFKA-8519 URL: https://issues.apache.org/jira/browse/KAFKA-8519 Project: Kafka Issue Type: Improvement Components: system tests Reporter: David Arthur Trogdor should allow us to simulate degraded networks, similar to the network partition spec. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (KAFKA-8519) Trogdor should support network degradation
[ https://issues.apache.org/jira/browse/KAFKA-8519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Arthur reassigned KAFKA-8519: --- Assignee: David Arthur > Trogdor should support network degradation > -- > > Key: KAFKA-8519 > URL: https://issues.apache.org/jira/browse/KAFKA-8519 > Project: Kafka > Issue Type: Improvement > Components: system tests >Reporter: David Arthur >Assignee: David Arthur >Priority: Minor > > Trogdor should allow us to simulate degraded networks, similar to the network > partition spec. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7760) Add broker configuration to set minimum value for segment.bytes and segment.ms
[ https://issues.apache.org/jira/browse/KAFKA-7760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860287#comment-16860287 ] ASF GitHub Bot commented on KAFKA-7760: --- dulvinw commented on pull request #6911: KAFKA-7760 URL: https://github.com/apache/kafka/pull/6911 *Add broker configuration to set minimum value for segment.bytes and segment.ms.* This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add broker configuration to set minimum value for segment.bytes and segment.ms > -- > > Key: KAFKA-7760 > URL: https://issues.apache.org/jira/browse/KAFKA-7760 > Project: Kafka > Issue Type: Improvement > Reporter: Badai Aqrandista > Assignee: Dulvin Witharane > Priority: Major > Labels: kip, newbie > > If someone sets segment.bytes or segment.ms at the topic level to a very small value (e.g. segment.bytes=1000 or segment.ms=1000), Kafka will generate a very high number of segment files. This can bring down the whole broker by hitting the maximum number of open files (for logs) or the maximum number of mmap-ed files (for indexes). > To prevent that from happening, I would like to suggest adding two new items to the broker configuration: > * min.topic.segment.bytes, defaults to 1048576: The minimum value for segment.bytes. When someone sets the topic configuration segment.bytes to a value lower than this, Kafka throws an INVALID VALUE error. > * min.topic.segment.ms, defaults to 360: The minimum value for segment.ms. When someone sets the topic configuration segment.ms to a value lower than this, Kafka throws an INVALID VALUE error. > Thanks > Badai -- This message was sent by Atlassian JIRA (v7.6.3#76005)
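The validation the proposal above describes is a simple lower-bound check at config time. A sketch in plain Java (the config name and the 1 MiB minimum come from the ticket's proposal; they are not existing Kafka configs):

```java
public class SegmentConfigValidator {
    // Proposed broker-side minimum from the ticket: min.topic.segment.bytes.
    static final long MIN_TOPIC_SEGMENT_BYTES = 1_048_576L; // 1 MiB

    // Reject topic-level segment.bytes below the broker-configured minimum,
    // as the ticket suggests the broker should do.
    static void validateSegmentBytes(long segmentBytes) {
        if (segmentBytes < MIN_TOPIC_SEGMENT_BYTES) {
            throw new IllegalArgumentException("INVALID VALUE: segment.bytes="
                    + segmentBytes + " is below the minimum " + MIN_TOPIC_SEGMENT_BYTES);
        }
    }

    public static void main(String[] args) {
        validateSegmentBytes(10_485_760L); // 10 MiB: accepted
        try {
            validateSegmentBytes(1000L);   // tiny segments: rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```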
[jira] [Updated] (KAFKA-8500) member.id should always update upon static member rejoin despite of group state
[ https://issues.apache.org/jira/browse/KAFKA-8500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boyang Chen updated KAFKA-8500: --- Issue Type: Bug (was: Improvement) > member.id should always update upon static member rejoin despite of group state > --- > > Key: KAFKA-8500 > URL: https://issues.apache.org/jira/browse/KAFKA-8500 > Project: Kafka > Issue Type: Bug > Components: consumer, streams > Affects Versions: 2.3.0 > Reporter: Boyang Chen > Assignee: Boyang Chen > Priority: Blocker > > A blocking bug was detected by [~guozhang]: the `member.id` doesn't get updated upon static member rejoin when the group is not in a stable state. This could make duplicate-member fencing harder and potentially yield incorrect processing outputs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-8487) Consumer should not resetGeneration upon REBALANCE_IN_PROGRESS in commit response handler
[ https://issues.apache.org/jira/browse/KAFKA-8487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boyang Chen updated KAFKA-8487: --- Affects Version/s: 2.3 > Consumer should not resetGeneration upon REBALANCE_IN_PROGRESS in commit response handler > - > > Key: KAFKA-8487 > URL: https://issues.apache.org/jira/browse/KAFKA-8487 > Project: Kafka > Issue Type: Bug > Components: consumer > Affects Versions: 2.3 > Reporter: Guozhang Wang > Assignee: Guozhang Wang > Priority: Blocker > > In the consumer, we handle the errors in the sync / heartbeat / join responses such that: > 1. UNKNOWN_MEMBER_ID / ILLEGAL_GENERATION: we reset the generation and request a re-join. > 2. REBALANCE_IN_PROGRESS: do nothing if a re-join will be executed, or request a re-join explicitly. > However, for the commit response, we call resetGeneration for REBALANCE_IN_PROGRESS as well. This is flawed in two ways: > 1. As in KIP-345, with static members, resetting the generation will lose the member.id and hence may cause incorrect fencing. > 2. As in KIP-429, resetting the generation will cause partitions to be "lost" unnecessarily before re-joining the group. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-8487) Consumer should not resetGeneration upon REBALANCE_IN_PROGRESS in commit response handler
[ https://issues.apache.org/jira/browse/KAFKA-8487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boyang Chen updated KAFKA-8487: --- Component/s: streams > Consumer should not resetGeneration upon REBALANCE_IN_PROGRESS in commit response handler > - > > Key: KAFKA-8487 > URL: https://issues.apache.org/jira/browse/KAFKA-8487 > Project: Kafka > Issue Type: Bug > Components: consumer, streams > Affects Versions: 2.3 > Reporter: Guozhang Wang > Assignee: Guozhang Wang > Priority: Blocker > > In the consumer, we handle the errors in the sync / heartbeat / join responses such that: > 1. UNKNOWN_MEMBER_ID / ILLEGAL_GENERATION: we reset the generation and request a re-join. > 2. REBALANCE_IN_PROGRESS: do nothing if a re-join will be executed, or request a re-join explicitly. > However, for the commit response, we call resetGeneration for REBALANCE_IN_PROGRESS as well. This is flawed in two ways: > 1. As in KIP-345, with static members, resetting the generation will lose the member.id and hence may cause incorrect fencing. > 2. As in KIP-429, resetting the generation will cause partitions to be "lost" unnecessarily before re-joining the group. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7937) Flaky Test ResetConsumerGroupOffsetTest.testResetOffsetsNotExistingGroup
[ https://issues.apache.org/jira/browse/KAFKA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860318#comment-16860318 ] Matthias J. Sax commented on KAFKA-7937: Another failure in trunk: [https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk11/detail/kafka-trunk-jdk11/616/tests] > Flaky Test ResetConsumerGroupOffsetTest.testResetOffsetsNotExistingGroup > > > Key: KAFKA-7937 > URL: https://issues.apache.org/jira/browse/KAFKA-7937 > Project: Kafka > Issue Type: Bug > Components: admin, clients, unit tests >Affects Versions: 2.2.0, 2.1.1, 2.3.0 >Reporter: Matthias J. Sax >Assignee: Gwen Shapira >Priority: Critical > Fix For: 2.4.0 > > > To get stable nightly builds for `2.2` release, I create tickets for all > observed test failures. > https://builds.apache.org/blue/organizations/jenkins/kafka-2.2-jdk8/detail/kafka-2.2-jdk8/19/pipeline > {quote}kafka.admin.ResetConsumerGroupOffsetTest > > testResetOffsetsNotExistingGroup FAILED > java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.CoordinatorNotAvailableException: The > coordinator is not available. at > org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45) > at > org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32) > at > org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89) > at > org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260) > at > kafka.admin.ConsumerGroupCommand$ConsumerGroupService.resetOffsets(ConsumerGroupCommand.scala:306) > at > kafka.admin.ResetConsumerGroupOffsetTest.testResetOffsetsNotExistingGroup(ResetConsumerGroupOffsetTest.scala:89) > Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: > The coordinator is not available.{quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-8520) TimeoutException in client side doesn't have stack trace
Shixiong Zhu created KAFKA-8520: --- Summary: TimeoutException in client side doesn't have stack trace Key: KAFKA-8520 URL: https://issues.apache.org/jira/browse/KAFKA-8520 Project: Kafka Issue Type: New Feature Components: clients Reporter: Shixiong Zhu When a TimeoutException is thrown directly on the client side, it doesn't have any stack trace because it inherits from "org.apache.kafka.common.errors.ApiException". This makes it hard for users to debug timeout issues, because they cannot tell which line in their code threw the TimeoutException. It would be great to add a new client-side TimeoutException that contains the stack trace. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
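The missing trace comes from ApiException overriding fillInStackTrace() so that construction skips stack capture. This plain-Java demo models that behaviour and the kind of fix the ticket asks for; the class names here are illustrative, not Kafka's:

```java
public class StackTraceDemo {
    // Models how org.apache.kafka.common.errors.ApiException suppresses the
    // trace: fillInStackTrace() is overridden to do nothing (a performance
    // optimization that costs debuggability).
    static class TracelessTimeoutException extends RuntimeException {
        TracelessTimeoutException(String msg) { super(msg); }
        @Override
        public synchronized Throwable fillInStackTrace() {
            return this; // no stack frames are ever recorded
        }
    }

    // The ticket's proposal, in spirit: a client-side TimeoutException that
    // keeps the normal stack-capture behaviour.
    static class TracedTimeoutException extends RuntimeException {
        TracedTimeoutException(String msg) { super(msg); }
    }

    public static void main(String[] args) {
        // The traceless variant reports zero stack frames...
        System.out.println(new TracelessTimeoutException("x").getStackTrace().length);
        // ...while the traced variant records where it was constructed.
        System.out.println(new TracedTimeoutException("x").getStackTrace().length > 0);
    }
}
```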
[jira] [Created] (KAFKA-8521) Client unable to get a complete transaction set of messages using a single poll call
Boris Rybalkin created KAFKA-8521: - Summary: Client unable to get a complete transaction set of messages using a single poll call Key: KAFKA-8521 URL: https://issues.apache.org/jira/browse/KAFKA-8521 Project: Kafka Issue Type: Bug Components: clients Reporter: Boris Rybalkin I am unable to reliably get the complete list of messages from a successful transaction on the client side. What I sometimes get instead is a subset of a complete transaction in one poll and the second half of the transaction in a second poll. Am I right that poll should always give me the full transaction message set if the transaction was committed and the client uses the read_committed isolation level, or not? Pseudo code: Server: begin transaction send (1, "test1") send (2, "test2") commit transaction Client: isolation level: read_committed poll -> [ 1 ] poll -> [ 2 ] What I want is: poll -> [1, 2] I also observed that when the keys of the messages in the transaction are the same, I always get the complete message set in one poll, but when the keys differ inside a transaction, I usually get the transaction spread across multiple polls. I can provide a working example if you think this is a bug and not a misunderstanding of how poll works. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
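For context on the question above: read_committed guarantees only that no records from aborted or still-open transactions are returned; it does not promise that a committed transaction arrives within a single poll(), since poll boundaries follow fetch sizes and partition assignment (messages with different keys may land on different partitions). Enabling it is a one-line consumer setting; a minimal sketch of the relevant configuration ("isolation.level" is a real consumer config, while the bootstrap address and group id are placeholders):

```java
import java.util.Properties;

public class ReadCommittedConfigDemo {
    // Consumer settings for reading transactional topics.
    static Properties consumerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("group.id", "example-group");           // placeholder
        // Only return records from committed transactions; the default
        // is "read_uncommitted".
        props.put("isolation.level", "read_committed");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(consumerProps().getProperty("isolation.level"));
    }
}
```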
[jira] [Commented] (KAFKA-8516) Consider allowing all replicas to have read/write permissions
[ https://issues.apache.org/jira/browse/KAFKA-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860350#comment-16860350 ] Richard Yu commented on KAFKA-8516: --- Thanks for the heads up! Will add the link to the issue. They are pretty close, but I think that KIP only covers adding read permissions to replicas, not write permissions (i.e. fetches = reads). I think implementing read permissions for all replicas is considerably easier than write permissions (since we have to guarantee consistency). > Consider allowing all replicas to have read/write permissions > - > > Key: KAFKA-8516 > URL: https://issues.apache.org/jira/browse/KAFKA-8516 > Project: Kafka > Issue Type: Improvement > Reporter: Richard Yu > Priority: Major > > Currently, in Kafka internals, a leader is responsible for all the read and write operations requested by the user. This naturally incurs a bottleneck, since one replica, as the leader, experiences a significantly heavier workload than the other replicas, and it also means that all client commands must pass through a chokepoint. If a leader fails, all processing effectively comes to a halt until another leader is elected. To help solve this problem, we could think about redesigning Kafka core so that any replica is able to handle read and write operations. That is, the system would be changed so that _all_ replicas have read/write permissions. > > This has multiple positives. Notably the following: > * Workload can be distributed more evenly, since leader replicas currently carry more weight than follower replicas (in this new design, all replicas are equal) > * Some failures would not be as catastrophic as in the leader-follower paradigm. There is no single "leader". If one replica goes down, others are still able to read/write as needed, and processing could continue without interruption.
> The implementation for such a change like this will be very extensive and > discussion would be needed to decide if such an improvement as described > above would warrant such a drastic redesign of Kafka internals. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-8516) Consider allowing all replicas to have read/write permissions
[ https://issues.apache.org/jira/browse/KAFKA-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Richard Yu updated KAFKA-8516: -- Description: unchanged, except for an appended pointer to the relevant KIP for read permissions: [https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
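For the read side, the KIP-392 design linked above (as it eventually shipped in Kafka 2.4) lets a consumer advertise its location via the `client.rack` config, while the broker is configured with `replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector` so that fetches can be served by a nearby follower. Below is a minimal sketch of the consumer side only; the broker address, group id, and rack name are hypothetical placeholders.

```java
import java.util.Properties;

// Sketch of the consumer-side configuration for KIP-392 (read permissions
// only): the consumer advertises its rack via client.rack, and a rack-aware
// replica selector on the broker may then direct its fetches to a nearby
// follower instead of the leader. All concrete values here are placeholders.
public class RackAwareConsumerConfig {
    public static Properties build(String bootstrapServers, String rack) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("group.id", "example-group");  // hypothetical consumer group
        props.put("client.rack", rack);          // rack/zone of this consumer
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("broker1:9092", "us-east-1a");
        System.out.println(props.getProperty("client.rack")); // prints "us-east-1a"
    }
}
```

Note this only changes where reads are served from; writes still go through the leader, which is why the comments above treat write permissions as the much harder problem.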
[jira] [Commented] (KAFKA-8519) Trogdor should support network degradation
[ https://issues.apache.org/jira/browse/KAFKA-8519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860371#comment-16860371 ] ASF GitHub Bot commented on KAFKA-8519: --- mumrah commented on pull request #6912: KAFKA-8519 Add trogdor action to slow down a network URL: https://github.com/apache/kafka/pull/6912 TODO This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Trogdor should support network degradation > -- > > Key: KAFKA-8519 > URL: https://issues.apache.org/jira/browse/KAFKA-8519 > Project: Kafka > Issue Type: Improvement > Components: system tests >Reporter: David Arthur >Assignee: David Arthur >Priority: Minor > > Trogdor should allow us to simulate degraded networks, similar to the network > partition spec. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8513) Add kafka-streams-application-reset.bat for Windows platform
[ https://issues.apache.org/jira/browse/KAFKA-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860465#comment-16860465 ] Guozhang Wang commented on KAFKA-8513: -- I don't think this needs a KIP discussion, as it just proposes adding a wrapper around an existing Java tool. [~mjsax], feel free to merge it if you've reviewed it and think it's mergeable. > Add kafka-streams-application-reset.bat for Windows platform > > > Key: KAFKA-8513 > URL: https://issues.apache.org/jira/browse/KAFKA-8513 > Project: Kafka > Issue Type: Improvement > Components: streams, tools >Reporter: Kengo Seki >Assignee: Kengo Seki >Priority: Minor > > For improving Windows support, it'd be nice if there were a batch file > corresponding to bin/kafka-streams-application-reset.sh. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8333) Load high watermark checkpoint only once when handling LeaderAndIsr requests
[ https://issues.apache.org/jira/browse/KAFKA-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860471#comment-16860471 ] ASF GitHub Bot commented on KAFKA-8333: --- hachikuji commented on pull request #6800: KAFKA-8333; Load high watermark checkpoint lazily when initializing replicas URL: https://github.com/apache/kafka/pull/6800 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Load high watermark checkpoint only once when handling LeaderAndIsr requests > > > Key: KAFKA-8333 > URL: https://issues.apache.org/jira/browse/KAFKA-8333 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > > Currently we reload the checkpoint file separately for every partition that > is first initialized on the broker. It would be more efficient to do this one > time only when we receive the LeaderAndIsr request and to reuse the state. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
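The optimization described above (load the checkpoint once per LeaderAndIsr request, then reuse it for every partition) can be sketched as a lazily loaded, memoized map. The class and method names below are hypothetical illustrations, not Kafka's actual internals.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: parse the high-watermark checkpoint file once and
// reuse the resulting map for every partition being initialized, instead of
// re-reading the file separately for each partition.
public class CheckpointCache {
    private final Supplier<Map<String, Long>> loader; // reads and parses the checkpoint file
    private Map<String, Long> highWatermarks;         // lazily loaded, then reused
    int fileLoads = 0;                                // instrumentation for this sketch

    public CheckpointCache(Supplier<Map<String, Long>> loader) {
        this.loader = loader;
    }

    public long highWatermarkFor(String topicPartition) {
        if (highWatermarks == null) {      // first access: load the file once
            highWatermarks = loader.get();
            fileLoads++;
        }
        return highWatermarks.getOrDefault(topicPartition, 0L);
    }

    public static void main(String[] args) {
        Map<String, Long> fake = new HashMap<>();
        fake.put("topic-0", 42L);
        fake.put("topic-1", 7L);
        CheckpointCache cache = new CheckpointCache(() -> new HashMap<>(fake));
        cache.highWatermarkFor("topic-0");
        cache.highWatermarkFor("topic-1");
        cache.highWatermarkFor("topic-2"); // missing partition falls back to 0
        System.out.println(cache.fileLoads); // prints 1: one load for three partitions
    }
}
```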
[jira] [Updated] (KAFKA-8333) Load high watermark checkpoint only once when handling LeaderAndIsr requests
[ https://issues.apache.org/jira/browse/KAFKA-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-8333: --- Issue Type: Improvement (was: Bug) > Load high watermark checkpoint only once when handling LeaderAndIsr requests > > > Key: KAFKA-8333 > URL: https://issues.apache.org/jira/browse/KAFKA-8333 > Project: Kafka > Issue Type: Improvement >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > Fix For: 2.4.0 > > > Currently we reload the checkpoint file separately for every partition that > is first initialized on the broker. It would be more efficient to do this one > time only when we receive the LeaderAndIsr request and to reuse the state. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (KAFKA-8333) Load high watermark checkpoint only once when handling LeaderAndIsr requests
[ https://issues.apache.org/jira/browse/KAFKA-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-8333. Resolution: Fixed Fix Version/s: 2.4.0 > Load high watermark checkpoint only once when handling LeaderAndIsr requests > > > Key: KAFKA-8333 > URL: https://issues.apache.org/jira/browse/KAFKA-8333 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson >Priority: Major > Fix For: 2.4.0 > > > Currently we reload the checkpoint file separately for every partition that > is first initialized on the broker. It would be more efficient to do this one > time only when we receive the LeaderAndIsr request and to reuse the state. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8513) Add kafka-streams-application-reset.bat for Windows platform
[ https://issues.apache.org/jira/browse/KAFKA-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860483#comment-16860483 ] Kengo Seki commented on KAFKA-8513: --- Thank you for the comments [~mjsax] and [~guozhang]! I also think that a KIP is unnecessary in this case, just as for KAFKA-5143 and KAFKA-8349, because its purpose and implementation are simple and straightforward, and it's not a change to existing features, so it doesn't break current behaviour. > Add kafka-streams-application-reset.bat for Windows platform > > > Key: KAFKA-8513 > URL: https://issues.apache.org/jira/browse/KAFKA-8513 > Project: Kafka > Issue Type: Improvement > Components: streams, tools >Reporter: Kengo Seki >Assignee: Kengo Seki >Priority: Minor > > For improving Windows support, it'd be nice if there were a batch file > corresponding to bin/kafka-streams-application-reset.sh. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (KAFKA-8349) Add Windows batch files corresponding to kafka-delete-records.sh and kafka-log-dirs.sh
[ https://issues.apache.org/jira/browse/KAFKA-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kengo Seki resolved KAFKA-8349. --- Resolution: Fixed Closing this since it's already been merged. Thanks [~hachikuji]! > Add Windows batch files corresponding to kafka-delete-records.sh and > kafka-log-dirs.sh > -- > > Key: KAFKA-8349 > URL: https://issues.apache.org/jira/browse/KAFKA-8349 > Project: Kafka > Issue Type: Improvement > Components: tools >Reporter: Kengo Seki >Assignee: Kengo Seki >Priority: Minor > > Some shell scripts don't have corresponding batch files in bin\windows. > For improving Windows platform support, I'd like to add the following batch > files: > - bin\windows\kafka-delete-records.bat > - bin\windows\kafka-log-dirs.bat -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-8522) Tombstones can survive forever
Evelyn Bayes created KAFKA-8522: --- Summary: Tombstones can survive forever Key: KAFKA-8522 URL: https://issues.apache.org/jira/browse/KAFKA-8522 Project: Kafka Issue Type: Bug Components: log cleaner Reporter: Evelyn Bayes This is a bit of a grey area as to whether it's a "bug", but it is certainly unintended behaviour. Under specific conditions, tombstones effectively survive forever: * Low throughput; * min.cleanable.dirty.ratio near or at 0; and * other parameters at default. What happens is that all the data continuously gets cycled into the oldest segment. Old records get compacted away, but the new records continuously update the timestamp of the oldest segment, resetting the countdown for deleting tombstones. So tombstones build up in the oldest segment forever. While you could "fix" this by reducing the segment size, this can be undesirable, as a sudden change in throughput could cause a dangerous number of segments to be created. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
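The countdown-reset effect described above can be illustrated with a tiny simulation. This is hypothetical illustrative code, not the log cleaner's actual logic: the fixed cleaning interval stands in for however often a near-zero min.cleanable.dirty.ratio triggers a cleaning pass.

```java
// Hypothetical simulation of the reported behaviour, not Kafka code.
// A tombstone may be dropped only once the oldest segment has not been
// modified for deleteRetentionMs. If cleaning passes keep rewriting
// surviving records into that segment more often than deleteRetentionMs,
// the countdown keeps resetting and the tombstone survives indefinitely.
public class TombstoneRetentionSim {
    static boolean tombstoneSurvives(long deleteRetentionMs, long cleanIntervalMs, long horizonMs) {
        long segmentLastModified = 0;
        for (long now = cleanIntervalMs; now <= horizonMs; now += cleanIntervalMs) {
            if (now - segmentLastModified > deleteRetentionMs) {
                return false;              // countdown elapsed: tombstone deleted
            }
            segmentLastModified = now;     // cleaning pass refreshes the segment timestamp
        }
        return true;                       // tombstone still present at the horizon
    }

    public static void main(String[] args) {
        // Cleaning every hour with one day of tombstone retention: survives a whole year.
        System.out.println(tombstoneSurvives(86_400_000L, 3_600_000L, 365L * 86_400_000L));   // true
        // Cleaning only every two days: the countdown elapses and the tombstone is deleted.
        System.out.println(tombstoneSurvives(86_400_000L, 2 * 86_400_000L, 365L * 86_400_000L)); // false
    }
}
```

The simulation also shows why shrinking the segment size merely shortens how often records cycle into the oldest segment; it does not break the refresh-before-expiry pattern when throughput is low.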
[jira] [Commented] (KAFKA-8516) Consider allowing all replicas to have read/write permissions
[ https://issues.apache.org/jira/browse/KAFKA-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860587#comment-16860587 ] Matthias J. Sax commented on KAFKA-8516: Agreed. In fact, I am not even sure whether we can (or want to) allow writing to different replicas at all. Solving the consistency problem is very, very(!) hard, and might not be possible without a major performance hit. Hence, I tend to think that it will never be implemented. > Consider allowing all replicas to have read/write permissions > - > > Key: KAFKA-8516 > URL: https://issues.apache.org/jira/browse/KAFKA-8516 > Project: Kafka > Issue Type: Improvement > Reporter: Richard Yu > Priority: Major -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8513) Add kafka-streams-application-reset.bat for Windows platform
[ https://issues.apache.org/jira/browse/KAFKA-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860588#comment-16860588 ] Matthias J. Sax commented on KAFKA-8513: Ack. Fine with me. The PR looks good in general. > Add kafka-streams-application-reset.bat for Windows platform > > > Key: KAFKA-8513 > URL: https://issues.apache.org/jira/browse/KAFKA-8513 > Project: Kafka > Issue Type: Improvement > Components: streams, tools >Reporter: Kengo Seki >Assignee: Kengo Seki >Priority: Minor > > For improving Windows support, it'd be nice if there were a batch file > corresponding to bin/kafka-streams-application-reset.sh. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8450) Augment processed in MockProcessor as KeyValueAndTimestamp
[ https://issues.apache.org/jira/browse/KAFKA-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860589#comment-16860589 ] Matthias J. Sax commented on KAFKA-8450: Yes. Instead of calling `makeRecord`, which returns a String, the idea is to use `KeyValueTimestamp`. > Augment processed in MockProcessor as KeyValueAndTimestamp > -- > > Key: KAFKA-8450 > URL: https://issues.apache.org/jira/browse/KAFKA-8450 > Project: Kafka > Issue Type: Improvement > Components: streams, unit tests >Reporter: Guozhang Wang >Assignee: SuryaTeja Duggi >Priority: Major > Labels: newbie > > Today the book-keeping list of `processed` records in MockProcessor is in the form of Strings: we just call the key/value types' toString functions in order to book-keep, which loses the type information and drops the record timestamp. > It's better to translate its type to `KeyValueAndTimestamp` and refactor the impacted unit tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
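The proposed refactoring can be sketched as below. Kafka Streams has its own `KeyValueTimestamp` class; the simplified stand-in here only illustrates the book-keeping change from Strings to a typed triple, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Minimal sketch of the proposed change, with simplified, hypothetical names.
// Instead of book-keeping processed records as Strings (losing the key/value
// types and the timestamp), the mock processor records a typed triple.
public class TypedMockProcessor<K, V> {
    public static final class KeyValueTimestamp<K, V> {
        public final K key;
        public final V value;
        public final long timestamp;

        public KeyValueTimestamp(K key, V value, long timestamp) {
            this.key = key;
            this.value = value;
            this.timestamp = timestamp;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof KeyValueTimestamp)) return false;
            KeyValueTimestamp<?, ?> other = (KeyValueTimestamp<?, ?>) o;
            return Objects.equals(key, other.key)
                    && Objects.equals(value, other.value)
                    && timestamp == other.timestamp;
        }

        @Override public int hashCode() { return Objects.hash(key, value, timestamp); }
    }

    // Typed book-keeping list: tests can assert on keys, values, and timestamps
    // directly instead of comparing concatenated Strings.
    public final List<KeyValueTimestamp<K, V>> processed = new ArrayList<>();

    public void process(K key, V value, long timestamp) {
        processed.add(new KeyValueTimestamp<>(key, value, timestamp));
    }

    public static void main(String[] args) {
        TypedMockProcessor<String, Integer> p = new TypedMockProcessor<>();
        p.process("a", 1, 100L);
        System.out.println(p.processed.get(0).timestamp); // prints 100
    }
}
```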
[jira] [Commented] (KAFKA-8516) Consider allowing all replicas to have read/write permissions
[ https://issues.apache.org/jira/browse/KAFKA-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860591#comment-16860591 ] Richard Yu commented on KAFKA-8516: --- Well, this is where we start straying into an area called "consensus algorithms". Kafka's current leader-replica model closely follows an algorithm referred to as Raft (research paper here: [https://raft.github.io/raft.pdf] ). If we wish to implement the write-permissions part (which looks like a pretty big if), then we would perhaps have to consider something along the lines of EPaxos ( [https://www.cs.cmu.edu/~dga/papers/epaxos-sosp2013.pdf] ). cc [~hachikuji] your thoughts on this? > Consider allowing all replicas to have read/write permissions > - > > Key: KAFKA-8516 > URL: https://issues.apache.org/jira/browse/KAFKA-8516 > Project: Kafka > Issue Type: Improvement > Reporter: Richard Yu > Priority: Major -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (KAFKA-8516) Consider allowing all replicas to have read/write permissions
[ https://issues.apache.org/jira/browse/KAFKA-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860591#comment-16860591 ] Richard Yu edited comment on KAFKA-8516 at 6/11/19 6:16 AM: Well, this is where we start straying into an area called "consensus algorithms". Kafka's current leader-replica model loosely follows an algorithm referred to as Raft (research paper here: [https://raft.github.io/raft.pdf] ). If we wish to implement the write-permissions part (which looks like a pretty big if), then we would perhaps have to consider something along the lines of EPaxos ( [https://www.cs.cmu.edu/~dga/papers/epaxos-sosp2013.pdf] ). cc [~hachikuji] your thoughts on this? Edit: If implemented correctly, there should be a pretty good performance gain (the results in the Raft paper seem to indicate this). > Consider allowing all replicas to have read/write permissions > - > > Key: KAFKA-8516 > URL: https://issues.apache.org/jira/browse/KAFKA-8516 > Project: Kafka > Issue Type: Improvement > Reporter: Richard Yu > Priority: Major -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8041) Flaky Test LogDirFailureTest#testIOExceptionDuringLogRoll
[ https://issues.apache.org/jira/browse/KAFKA-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860594#comment-16860594 ] Matthias J. Sax commented on KAFKA-8041: Failed again: [https://builds.apache.org/blue/organizations/jenkins/kafka-2.0-jdk8/detail/kafka-2.0-jdk8/275/tests] > Flaky Test LogDirFailureTest#testIOExceptionDuringLogRoll > - > > Key: KAFKA-8041 > URL: https://issues.apache.org/jira/browse/KAFKA-8041 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Affects Versions: 2.0.1, 2.3.0 >Reporter: Matthias J. Sax >Priority: Critical > Labels: flaky-test > Fix For: 2.4.0 > > > [https://builds.apache.org/blue/organizations/jenkins/kafka-2.0-jdk8/detail/kafka-2.0-jdk8/236/tests] > {quote}java.lang.AssertionError: Expected some messages > at kafka.utils.TestUtils$.fail(TestUtils.scala:357) > at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:787) > at > kafka.server.LogDirFailureTest.testProduceAfterLogDirFailureOnLeader(LogDirFailureTest.scala:189) > at > kafka.server.LogDirFailureTest.testIOExceptionDuringLogRoll(LogDirFailureTest.scala:63){quote} > STDOUT > {quote}[2019-03-05 03:44:58,614] ERROR [ReplicaFetcher replicaId=1, > leaderId=0, fetcherId=0] Error for partition topic-6 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-05 03:44:58,614] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=0] Error for partition topic-0 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. 
> [2019-03-05 03:44:58,615] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=0] Error for partition topic-10 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-05 03:44:58,615] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=0] Error for partition topic-4 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-05 03:44:58,615] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=0] Error for partition topic-8 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. > [2019-03-05 03:44:58,615] ERROR [ReplicaFetcher replicaId=1, leaderId=0, > fetcherId=0] Error for partition topic-2 at offset 0 > (kafka.server.ReplicaFetcherThread:76) > org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server > does not host this topic-partition. 
> [2019-03-05 03:45:00,248] ERROR Error while rolling log segment for topic-0 > in dir > /home/jenkins/jenkins-slave/workspace/kafka-2.0-jdk8/core/data/kafka-3869208920357262216 > (kafka.server.LogDirFailureChannel:76) > java.io.FileNotFoundException: > /home/jenkins/jenkins-slave/workspace/kafka-2.0-jdk8/core/data/kafka-3869208920357262216/topic-0/.index > (Not a directory) > at java.io.RandomAccessFile.open0(Native Method) > at java.io.RandomAccessFile.open(RandomAccessFile.java:316) > at java.io.RandomAccessFile.(RandomAccessFile.java:243) > at kafka.log.AbstractIndex.$anonfun$resize$1(AbstractIndex.scala:121) > at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:12) > at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251) > at kafka.log.AbstractIndex.resize(AbstractIndex.scala:115) > at kafka.log.AbstractIndex.$anonfun$trimToValidSize$1(AbstractIndex.scala:184) > at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:12) > at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251) > at kafka.log.AbstractIndex.trimToValidSize(AbstractIndex.scala:184) > at kafka.log.LogSegment.onBecomeInactiveSegment(LogSegment.scala:501) > at kafka.log.Log.$anonfun$roll$8(Log.scala:1520) > at kafka.log.Log.$anonfun$roll$8$adapted(Log.scala:1520) > at scala.Option.foreach(Option.scala:257) > at kafka.log.Log.$anonfun$roll$2(Log.scala:1520) > at kafka.log.Log.maybeHandleIOException(Log.scala:1881) > at kafka.log.Log.roll(Log.scala:1484) > at > kafka.server.LogDirFailureTest.testProduceAfterLogDirFailureOnLeader(LogDirFailureTest.scala:154) > at > kafka.server.LogDirFailureTest.testIOExceptionDuringLogRoll(LogDirFailureTest.scala:63) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.Del
[jira] [Commented] (KAFKA-8078) Flaky Test TableTableJoinIntegrationTest#testInnerInner
[ https://issues.apache.org/jira/browse/KAFKA-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860596#comment-16860596 ] Matthias J. Sax commented on KAFKA-8078: Failed again with timeout: [https://builds.apache.org/blue/organizations/jenkins/kafka-2.3-jdk8/detail/kafka-2.3-jdk8/47/tests] testLeftOuter, caching enabled > Flaky Test TableTableJoinIntegrationTest#testInnerInner > --- > > Key: KAFKA-8078 > URL: https://issues.apache.org/jira/browse/KAFKA-8078 > Project: Kafka > Issue Type: Bug > Components: streams, unit tests >Affects Versions: 2.3.0 >Reporter: Matthias J. Sax >Priority: Major > Labels: flaky-test > Fix For: 2.4.0 > > > [https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3445/tests] > {quote}java.lang.AssertionError: Condition not met within timeout 15000. > Never received expected final result. > at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:365) > at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:325) > at > org.apache.kafka.streams.integration.AbstractJoinIntegrationTest.runTest(AbstractJoinIntegrationTest.java:246) > at > org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testInnerInner(TableTableJoinIntegrationTest.java:196){quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7988) Flaky Test DynamicBrokerReconfigurationTest#testThreadPoolResize
[ https://issues.apache.org/jira/browse/KAFKA-7988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax updated KAFKA-7988: --- Affects Version/s: 2.3.0 > Flaky Test DynamicBrokerReconfigurationTest#testThreadPoolResize > > > Key: KAFKA-7988 > URL: https://issues.apache.org/jira/browse/KAFKA-7988 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Affects Versions: 2.2.0, 2.3.0 >Reporter: Matthias J. Sax >Assignee: Rajini Sivaram >Priority: Critical > Labels: flaky-test > Fix For: 2.4.0 > > > To get stable nightly builds for `2.2` release, I create tickets for all > observed test failures. > [https://builds.apache.org/blue/organizations/jenkins/kafka-2.2-jdk8/detail/kafka-2.2-jdk8/30/] > {quote}kafka.server.DynamicBrokerReconfigurationTest > testThreadPoolResize > FAILED java.lang.AssertionError: Invalid threads: expected 6, got 5: > List(ReplicaFetcherThread-0-0, ReplicaFetcherThread-0-1, > ReplicaFetcherThread-0-0, ReplicaFetcherThread-0-2, ReplicaFetcherThread-0-1) > at org.junit.Assert.fail(Assert.java:88) at > org.junit.Assert.assertTrue(Assert.java:41) at > kafka.server.DynamicBrokerReconfigurationTest.verifyThreads(DynamicBrokerReconfigurationTest.scala:1260) > at > kafka.server.DynamicBrokerReconfigurationTest.maybeVerifyThreadPoolSize$1(DynamicBrokerReconfigurationTest.scala:531) > at > kafka.server.DynamicBrokerReconfigurationTest.resizeThreadPool$1(DynamicBrokerReconfigurationTest.scala:550) > at > kafka.server.DynamicBrokerReconfigurationTest.reducePoolSize$1(DynamicBrokerReconfigurationTest.scala:536) > at > kafka.server.DynamicBrokerReconfigurationTest.$anonfun$testThreadPoolResize$3(DynamicBrokerReconfigurationTest.scala:559) > at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) at > kafka.server.DynamicBrokerReconfigurationTest.verifyThreadPoolResize$1(DynamicBrokerReconfigurationTest.scala:558) > at > 
kafka.server.DynamicBrokerReconfigurationTest.testThreadPoolResize(DynamicBrokerReconfigurationTest.scala:572){quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-7988) Flaky Test DynamicBrokerReconfigurationTest#testThreadPoolResize
[ https://issues.apache.org/jira/browse/KAFKA-7988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860597#comment-16860597 ] Matthias J. Sax commented on KAFKA-7988: Failed in 2.3: [https://builds.apache.org/blue/organizations/jenkins/kafka-2.3-jdk8/detail/kafka-2.3-jdk8/47/tests] {code:java} java.lang.AssertionError: expected:<{0=10, 1=11, 2=12, 3=13, 4=4, 5=5, 6=6, 7=7, 8=8, 9=9}> but was:<{0=10, 1=11, 2=12, 3=13, 4=14, 5=5, 6=6, 7=7, 8=8, 9=9}> at org.junit.Assert.fail(Assert.java:89) at org.junit.Assert.failNotEquals(Assert.java:835) at org.junit.Assert.assertEquals(Assert.java:120) at org.junit.Assert.assertEquals(Assert.java:146) at kafka.server.DynamicBrokerReconfigurationTest.stopAndVerifyProduceConsume(DynamicBrokerReconfigurationTest.scala:1353) at kafka.server.DynamicBrokerReconfigurationTest.verifyThreadPoolResize$1(DynamicBrokerReconfigurationTest.scala:615) at kafka.server.DynamicBrokerReconfigurationTest.testThreadPoolResize(DynamicBrokerReconfigurationTest.scala:629) {code} > Flaky Test DynamicBrokerReconfigurationTest#testThreadPoolResize > > > Key: KAFKA-7988 > URL: https://issues.apache.org/jira/browse/KAFKA-7988 > Project: Kafka > Issue Type: Bug > Components: core, unit tests >Affects Versions: 2.2.0 >Reporter: Matthias J. Sax >Assignee: Rajini Sivaram >Priority: Critical > Labels: flaky-test > Fix For: 2.4.0 > > > To get stable nightly builds for `2.2` release, I create tickets for all > observed test failures. 
> [https://builds.apache.org/blue/organizations/jenkins/kafka-2.2-jdk8/detail/kafka-2.2-jdk8/30/] > {quote}kafka.server.DynamicBrokerReconfigurationTest > testThreadPoolResize > FAILED java.lang.AssertionError: Invalid threads: expected 6, got 5: > List(ReplicaFetcherThread-0-0, ReplicaFetcherThread-0-1, > ReplicaFetcherThread-0-0, ReplicaFetcherThread-0-2, ReplicaFetcherThread-0-1) > at org.junit.Assert.fail(Assert.java:88) at > org.junit.Assert.assertTrue(Assert.java:41) at > kafka.server.DynamicBrokerReconfigurationTest.verifyThreads(DynamicBrokerReconfigurationTest.scala:1260) > at > kafka.server.DynamicBrokerReconfigurationTest.maybeVerifyThreadPoolSize$1(DynamicBrokerReconfigurationTest.scala:531) > at > kafka.server.DynamicBrokerReconfigurationTest.resizeThreadPool$1(DynamicBrokerReconfigurationTest.scala:550) > at > kafka.server.DynamicBrokerReconfigurationTest.reducePoolSize$1(DynamicBrokerReconfigurationTest.scala:536) > at > kafka.server.DynamicBrokerReconfigurationTest.$anonfun$testThreadPoolResize$3(DynamicBrokerReconfigurationTest.scala:559) > at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) at > kafka.server.DynamicBrokerReconfigurationTest.verifyThreadPoolResize$1(DynamicBrokerReconfigurationTest.scala:558) > at > kafka.server.DynamicBrokerReconfigurationTest.testThreadPoolResize(DynamicBrokerReconfigurationTest.scala:572){quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-8418) Connect System tests are not waiting for REST resources to be registered
[ https://issues.apache.org/jira/browse/KAFKA-8418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax updated KAFKA-8418: --- Fix Version/s: (was: 2.3) > Connect System tests are not waiting for REST resources to be registered > > > Key: KAFKA-8418 > URL: https://issues.apache.org/jira/browse/KAFKA-8418 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 2.2.0 >Reporter: Oleksandr Diachenko >Assignee: Oleksandr Diachenko >Priority: Blocker > Fix For: 1.0.3, 1.1.2, 2.0.2, 2.3.0, 2.1.2, 2.2.1 > > > I am getting an error while running Kafka tests: > {code} > Traceback (most recent call last): File > "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.7.5-py2.7.egg/ducktape/tests/runner_client.py", > line 132, in run data = self.run_test() File > "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.7.5-py2.7.egg/ducktape/tests/runner_client.py", > line 189, in run_test return self.test_context.function(self.test) File > "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/tests/kafkatest/tests/connect/connect_rest_test.py", > line 89, in test_rest_api assert set([connector_plugin['class'] for > connector_plugin in self.cc.list_connector_plugins()]) == > \{self.FILE_SOURCE_CONNECTOR, self.FILE_SINK_CONNECTOR} File > "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/tests/kafkatest/services/connect.py", > line 218, in list_connector_plugins return self._rest('/connector-plugins/', > node=node) File > "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/tests/kafkatest/services/connect.py", > line 234, in _rest raise ConnectRestError(resp.status_code, resp.text, > resp.url) ConnectRestError > {code} > From the logs, I see two messages: > {code} > [2019-05-23 16:09:59,373] INFO REST server listening at > http://172.31.39.205:8083/, advertising URL http://worker1:8083/ > 
(org.apache.kafka.connect.runtime.rest.RestServer) > {code} > and {code} > [2019-05-23 16:10:00,738] INFO REST resources initialized; server is started > and ready to handle requests > (org.apache.kafka.connect.runtime.rest.RestServer) > {code} > It takes 1365 ms to actually load the REST resources, but the test only > waits for the port to be listening. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
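The fix this ticket implies is to gate the system test on the REST resources actually responding, not merely on the port listening. A minimal sketch in Python (the language of the kafkatest suite), with the probe injected so the wait logic is self-contained; the function name and endpoint choice are illustrative, not taken from the actual patch:

```python
import time

def wait_until(probe, timeout_s=60.0, backoff_s=1.0):
    """Poll `probe` until it returns True or `timeout_s` elapses.

    Returns True on success, False on timeout. In the system test, `probe`
    would issue an HTTP GET against the worker's /connector-plugins/ endpoint
    (the path from the traceback) and check for a 200 response, instead of
    only checking that the port is listening.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if probe():
            return True        # REST resources are up; stop polling
        time.sleep(backoff_s)  # back off before retrying
    return False
```

The service's `start()` in `connect.py` could then call this with a REST probe before returning, closing the roughly 1.4-second window between the port opening and the resources being registered.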
[jira] [Updated] (KAFKA-8265) Connect Client Config Override policy
[ https://issues.apache.org/jira/browse/KAFKA-8265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax updated KAFKA-8265: --- Fix Version/s: (was: 2.3) 2.3.0 > Connect Client Config Override policy > - > > Key: KAFKA-8265 > URL: https://issues.apache.org/jira/browse/KAFKA-8265 > Project: Kafka > Issue Type: New Feature > Components: KafkaConnect >Reporter: Magesh kumar Nandakumar >Assignee: Magesh kumar Nandakumar >Priority: Major > Fix For: 2.3.0 > > > Right now, each source connector and sink connector inherits its client > configurations from the worker properties. Within the worker properties, all > configurations with a prefix of "producer." or "consumer." are applied > to all source connectors and sink connectors, respectively. > We should allow the "producer." and "consumer." configurations to be overridden in > accordance with an override policy determined by the administrator. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
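This ticket tracks KIP-458. A sketch of how the feature fits together, assuming the worker property and override prefixes defined by that KIP; the example values are placeholders:

```properties
# Worker config: the administrator selects the override policy.
# Built-in policies: None (default, no overrides allowed), Principal, All.
connector.client.config.override.policy=All

# Connector config: with a permissive policy in place, a source connector
# may then override its producer settings, e.g.:
#   producer.override.linger.ms=100
# and a sink connector its consumer settings, e.g.:
#   consumer.override.max.poll.records=500
```

The design choice here is that the worker stays in control: connectors can only request overrides, and the configured policy decides which are honored.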
[jira] [Updated] (KAFKA-8473) Adjust Connect system tests for incremental cooperative rebalancing and enable them for both eager and incremental cooperative rebalancing
[ https://issues.apache.org/jira/browse/KAFKA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax updated KAFKA-8473: --- Affects Version/s: (was: 2.3) 2.3.0 > Adjust Connect system tests for incremental cooperative rebalancing and > enable them for both eager and incremental cooperative rebalancing > -- > > Key: KAFKA-8473 > URL: https://issues.apache.org/jira/browse/KAFKA-8473 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 2.3.0 >Reporter: Konstantine Karantasis >Assignee: Konstantine Karantasis >Priority: Critical > Fix For: 2.3.0 > > > > {{connect.protocol=compatible}}, which enables incremental cooperative > rebalancing when all workers are on that version, is now the default option. > System tests should be parameterized to keep running for the eager > rebalancing protocol as well, to make sure no regressions have happened. > Also, for the incremental cooperative protocol, a few tests need to be > adjusted to use a lower rebalancing delay (the delay applied while waiting > for returning workers). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-8475) Temporarily restore SslFactory.sslContext() helper
[ https://issues.apache.org/jira/browse/KAFKA-8475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax updated KAFKA-8475: --- Fix Version/s: (was: 2.3) 2.3.0 > Temporarily restore SslFactory.sslContext() helper > -- > > Key: KAFKA-8475 > URL: https://issues.apache.org/jira/browse/KAFKA-8475 > Project: Kafka > Issue Type: Bug >Reporter: Colin P. McCabe >Assignee: Randall Hauch >Priority: Blocker > Fix For: 2.3.0 > > > Temporarily restore the SslFactory.sslContext() function, which some > connectors use. This function is not a public API and it will be removed > eventually. For now, we will mark it as deprecated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-8473) Adjust Connect system tests for incremental cooperative rebalancing and enable them for both eager and incremental cooperative rebalancing
[ https://issues.apache.org/jira/browse/KAFKA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax updated KAFKA-8473: --- Fix Version/s: (was: 2.3) 2.3.0 > Adjust Connect system tests for incremental cooperative rebalancing and > enable them for both eager and incremental cooperative rebalancing > -- > > Key: KAFKA-8473 > URL: https://issues.apache.org/jira/browse/KAFKA-8473 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 2.3 >Reporter: Konstantine Karantasis >Assignee: Konstantine Karantasis >Priority: Critical > Fix For: 2.3.0 > > > > {{connect.protocol=compatible}}, which enables incremental cooperative > rebalancing when all workers are on that version, is now the default option. > System tests should be parameterized to keep running for the eager > rebalancing protocol as well, to make sure no regressions have happened. > Also, for the incremental cooperative protocol, a few tests need to be > adjusted to use a lower rebalancing delay (the delay applied while waiting > for returning workers). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-8263) Flaky Test MetricsIntegrationTest#testStreamMetricOfWindowStore
[ https://issues.apache.org/jira/browse/KAFKA-8263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860599#comment-16860599 ] Matthias J. Sax commented on KAFKA-8263: [https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3714/tests] > Flaky Test MetricsIntegrationTest#testStreamMetricOfWindowStore > --- > > Key: KAFKA-8263 > URL: https://issues.apache.org/jira/browse/KAFKA-8263 > Project: Kafka > Issue Type: Bug > Components: streams, unit tests >Affects Versions: 2.3.0 >Reporter: Matthias J. Sax >Assignee: Bruno Cadonna >Priority: Major > Labels: flaky-test > Fix For: 2.4.0 > > > [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3900/testReport/junit/org.apache.kafka.streams.integration/MetricsIntegrationTest/testStreamMetricOfWindowStore/] > {quote}java.lang.AssertionError: Condition not met within timeout 1. > testStoreMetricWindow -> Size of metrics of type:'put-latency-avg' must be > equal to:2 but it's equal to 0 expected:<2> but was:<0> at > org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:361){quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (KAFKA-8193) Flaky Test MetricsIntegrationTest#testStreamMetricOfWindowStore
[ https://issues.apache.org/jira/browse/KAFKA-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax resolved KAFKA-8193. Resolution: Duplicate > Flaky Test MetricsIntegrationTest#testStreamMetricOfWindowStore > --- > > Key: KAFKA-8193 > URL: https://issues.apache.org/jira/browse/KAFKA-8193 > Project: Kafka > Issue Type: Bug > Components: streams, unit tests >Affects Versions: 2.3.0 >Reporter: Konstantine Karantasis >Assignee: Guozhang Wang >Priority: Major > Labels: flaky-test > Fix For: 2.4.0 > > > [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3576/console] > *14:14:48* org.apache.kafka.streams.integration.MetricsIntegrationTest > > testStreamMetricOfWindowStore STARTED > *14:14:59* > org.apache.kafka.streams.integration.MetricsIntegrationTest.testStreamMetricOfWindowStore > failed, log available in > /home/jenkins/jenkins-slave/workspace/kafka-pr-jdk11-scala2.12/streams/build/reports/testOutput/org.apache.kafka.streams.integration.MetricsIntegrationTest.testStreamMetricOfWindowStore.test.stdout > *14:14:59* > *14:14:59* org.apache.kafka.streams.integration.MetricsIntegrationTest > > testStreamMetricOfWindowStore FAILED > *14:14:59* java.lang.AssertionError: Condition not met within timeout 1. > testStoreMetricWindow -> Size of metrics of type:'put-latency-avg' must be > equal to:2 but it's equal to 0 expected:<2> but was:<0> > *14:14:59* at > org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:361) > *14:14:59* at > org.apache.kafka.streams.integration.MetricsIntegrationTest.testStreamMetricOfWindowStore(MetricsIntegrationTest.java:260) > *14:15:01* -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-8193) Flaky Test MetricsIntegrationTest#testStreamMetricOfWindowStore
[ https://issues.apache.org/jira/browse/KAFKA-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax updated KAFKA-8193: --- Fix Version/s: (was: 2.4.0) > Flaky Test MetricsIntegrationTest#testStreamMetricOfWindowStore > --- > > Key: KAFKA-8193 > URL: https://issues.apache.org/jira/browse/KAFKA-8193 > Project: Kafka > Issue Type: Bug > Components: streams, unit tests >Affects Versions: 2.3.0 >Reporter: Konstantine Karantasis >Assignee: Guozhang Wang >Priority: Major > Labels: flaky-test > > [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3576/console] > *14:14:48* org.apache.kafka.streams.integration.MetricsIntegrationTest > > testStreamMetricOfWindowStore STARTED > *14:14:59* > org.apache.kafka.streams.integration.MetricsIntegrationTest.testStreamMetricOfWindowStore > failed, log available in > /home/jenkins/jenkins-slave/workspace/kafka-pr-jdk11-scala2.12/streams/build/reports/testOutput/org.apache.kafka.streams.integration.MetricsIntegrationTest.testStreamMetricOfWindowStore.test.stdout > *14:14:59* > *14:14:59* org.apache.kafka.streams.integration.MetricsIntegrationTest > > testStreamMetricOfWindowStore FAILED > *14:14:59* java.lang.AssertionError: Condition not met within timeout 1. > testStoreMetricWindow -> Size of metrics of type:'put-latency-avg' must be > equal to:2 but it's equal to 0 expected:<2> but was:<0> > *14:14:59* at > org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:361) > *14:14:59* at > org.apache.kafka.streams.integration.MetricsIntegrationTest.testStreamMetricOfWindowStore(MetricsIntegrationTest.java:260) > *14:15:01* -- This message was sent by Atlassian JIRA (v7.6.3#76005)