[jira] [Commented] (KAFKA-8504) Suppressed do not emit with TimeWindows

2019-06-10 Thread Simone (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16859804#comment-16859804
 ] 

Simone commented on KAFKA-8504:
---

Yeah, I understand that due to backward compatibility this is not easy to fix, 
but I agree with you about the docs. Given that the workaround is quite simple 
(you just need to set a custom grace period), well-written documentation is 
probably enough (for now, at least).

Thank you very much for the reply :)

> Suppressed do not emit with TimeWindows
> ---
>
> Key: KAFKA-8504
> URL: https://issues.apache.org/jira/browse/KAFKA-8504
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation, streams
>Affects Versions: 2.2.1
>Reporter: Simone
>Priority: Minor
>
> Hi, I'm playing a bit with Kafka Streams and the new suppress feature. I 
> noticed that when using a {{TimeWindows}} without explicitly setting the 
> grace period, {{suppress}} will not emit any messages if used with 
> {{Suppressed.untilWindowCloses}}.
> I looked a bit into the code, and from what I understood, with this 
> configuration {{suppress}} should use the {{grace}} setting of the 
> {{TimeWindows}}. But {{TimeWindows.of(Duration)}} defaults the grace to 
> {{-1}}, and when the grace is read via {{TimeWindows.gracePeriodMs()}}, a 
> grace of -1 makes it return {{maintainMs() - size()}}, so I think the end of 
> the window is not calculated as expected.
> Of course, it is possible to avoid this problem by forcing the {{grace}} to 0 
> when creating the {{TimeWindows}}, but I think that should be the default 
> behaviour, at least when it comes to the suppress feature.
> I hope I have not misunderstood the code in my analysis, thank you :)
> Simone
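The grace fallback described in the quoted report can be sketched like this (a Python sketch of the described behaviour, not Kafka's actual code; the one-day `maintainMs()` default is an assumption based on Kafka Streams' default retention):

```python
# Python sketch of the grace-period fallback described above.
# Not Kafka's actual code; the one-day maintainMs() default is an assumption.
DEFAULT_MAINTAIN_MS = 24 * 60 * 60 * 1000  # 1 day

def grace_period_ms(size_ms, grace_ms=-1):
    """Mimic TimeWindows.gracePeriodMs(): when grace was never set (-1),
    fall back to maintainMs() - size(), so a window only "closes" for
    suppress() after roughly a full day."""
    if grace_ms != -1:
        return grace_ms
    return DEFAULT_MAINTAIN_MS - size_ms

# A 5-minute window with no explicit grace effectively closes ~24h later,
# while grace=0 closes it right at the window end:
print(grace_period_ms(5 * 60 * 1000))     # 86100000 ms (~23.9 h)
print(grace_period_ms(5 * 60 * 1000, 0))  # 0
```

This is why records can appear to "never" be emitted in a short test: the windows simply have not closed yet under the fallback grace.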



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8517) A lot of WARN messages in kafka log "Received a PartitionLeaderEpoch assignment for an epoch < latestEpoch:

2019-06-10 Thread JIRA
Jacek Żoch created KAFKA-8517:
-

 Summary: A lot of WARN messages in kafka log "Received a 
PartitionLeaderEpoch assignment for an epoch < latestEpoch: 
 Key: KAFKA-8517
 URL: https://issues.apache.org/jira/browse/KAFKA-8517
 Project: Kafka
  Issue Type: Bug
  Components: logging
Affects Versions: 0.11.0.1
 Environment: PRD
Reporter: Jacek Żoch


We are on version 2.0, but this was already happening in version 0.11.

The Kafka log contains a lot of messages like:

"Received a PartitionLeaderEpoch assignment for an epoch < latestEpoch. This 
implies messages have arrived out of order."

On 23.05 we had:

Received a PartitionLeaderEpoch assignment for an epoch < latestEpoch. This 
implies messages have arrived out of order. New: \{epoch:181, 
offset:23562380995}, Current: \{epoch:362, offset:10365488611} for Partition: 
__consumer_offsets-30 (kafka.server.epoch.LeaderEpochFileCache)

Currently we have:

Received a PartitionLeaderEpoch assignment for an epoch < latestEpoch. This 
implies messages have arrived out of order. New: \{epoch:199, 
offset:24588072027}, Current: \{epoch:362, offset:10365488611} for Partition: 
__consumer_offsets-30 (kafka.server.epoch.LeaderEpochFileCache)

I think Kafka should either fix this "under the hood" or document how to fix 
it. There is currently no information on how dangerous it is or how to 
resolve it.
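The condition behind the warning can be sketched roughly as follows (a hedged Python sketch derived only from what the log line says, not the actual kafka.server.epoch.LeaderEpochFileCache code):

```python
# Sketch of the epoch-ordering check behind the WARN line above.
# Simplified from the behaviour the message describes; not Kafka's code.
def assign_epoch(latest_epoch, latest_offset, new_epoch, new_offset):
    """Return a warning string instead of updating the cache when the
    new epoch is older than the latest known one (out-of-order messages)."""
    if new_epoch < latest_epoch:
        return ("Received a PartitionLeaderEpoch assignment for an epoch < "
                f"latestEpoch. New: {{epoch:{new_epoch}, offset:{new_offset}}}, "
                f"Current: {{epoch:{latest_epoch}, offset:{latest_offset}}}")
    return None  # assignment accepted

# The 23.05 example: new epoch 181 is behind the current epoch 362
warn = assign_epoch(362, 10365488611, 181, 23562380995)
assert warn is not None
```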

 

 





[jira] [Commented] (KAFKA-5876) IQ should throw different exceptions for different errors

2019-06-10 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16859868#comment-16859868
 ] 

ASF GitHub Bot commented on KAFKA-5876:
---

vitojeng commented on pull request #5814: KAFKA-5876: IQ should throw different 
exceptions for different errors(KIP-216)
URL: https://github.com/apache/kafka/pull/5814
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> IQ should throw different exceptions for different errors
> -
>
> Key: KAFKA-5876
> URL: https://issues.apache.org/jira/browse/KAFKA-5876
> Project: Kafka
>  Issue Type: Task
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Vito Jeng
>Priority: Major
>  Labels: needs-kip, newbie++
>
> Currently, IQ only throws {{InvalidStateStoreException}} for all errors 
> that occur. However, we have different types of errors and should throw 
> different exceptions for those types.
> For example, if a store was migrated, it must be rediscovered, while if a 
> store cannot be queried yet because it is still being re-created after a 
> rebalance, the user just needs to wait until the re-creation is finished.
> There might be other examples, too.
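One possible shape for such a hierarchy (a sketch only; the subclass names below are illustrative of the retriable-vs-fatal split, not necessarily what KIP-216 settled on):

```python
# Sketch: split the catch-all exception into cases callers can act on.
# Class names are illustrative, not confirmed KIP-216 names.
class InvalidStateStoreException(Exception):
    """Base class, matching today's single catch-all exception."""

class StateStoreMigratedException(InvalidStateStoreException):
    """The store moved to another instance: rediscover it, don't retry."""

class StateStoreNotAvailableException(InvalidStateStoreException):
    """The store is still being rebuilt after a rebalance: retry later."""

def query_with_retry(query, retries=3):
    # StateStoreMigratedException is deliberately not caught here:
    # the caller must rediscover the store's new location instead.
    for _ in range(retries):
        try:
            return query()
        except StateStoreNotAvailableException:
            continue  # transient: store still rebuilding, retry
    raise InvalidStateStoreException("store not available after retries")
```

With distinct types, a caller can retry only the transient case instead of blindly retrying every `InvalidStateStoreException`.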





[jira] [Commented] (KAFKA-8450) Augment processed in MockProcessor as KeyValueAndTimestamp

2019-06-10 Thread SuryaTeja Duggi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16859958#comment-16859958
 ] 

SuryaTeja Duggi commented on KAFKA-8450:


[~guozhang] [~mjsax] Can someone comment on this ticket?

> Augment processed in MockProcessor as KeyValueAndTimestamp
> --
>
> Key: KAFKA-8450
> URL: https://issues.apache.org/jira/browse/KAFKA-8450
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, unit tests
>Reporter: Guozhang Wang
>Assignee: SuryaTeja Duggi
>Priority: Major
>  Labels: newbie
>
> Today, the book-keeping list of `processed` records in MockProcessor is in 
> the form of String: we just call the key / value types' toString functions 
> in order to book-keep. This loses the type information and does not keep 
> the timestamp associated with the record.
> It's better to translate its type to `KeyValueAndTimestamp` and refactor 
> the impacted unit tests.
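A minimal sketch of the suggested change (Python stand-in for the Java test utility; the record type mirrors the `KeyValueAndTimestamp` idea from the ticket):

```python
# Sketch: book-keep processed records as a typed record with a timestamp
# instead of a "key:value" string. Python stand-in for the Java change.
from dataclasses import dataclass
from typing import Generic, List, TypeVar

K = TypeVar("K")
V = TypeVar("V")

@dataclass(frozen=True)
class KeyValueTimestamp(Generic[K, V]):
    key: K
    value: V
    timestamp: int

class MockProcessor(Generic[K, V]):
    def __init__(self) -> None:
        # before: List[str] built via str(key) + ":" + str(value),
        # which drops types and has no timestamp
        self.processed: List[KeyValueTimestamp[K, V]] = []

    def process(self, key: K, value: V, timestamp: int) -> None:
        self.processed.append(KeyValueTimestamp(key, value, timestamp))
```

Tests can then assert on keys, values, and timestamps directly instead of parsing strings.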





[jira] [Created] (KAFKA-8518) Update GitHub repo description to make it obvious that it is not a "mirror" anymore

2019-06-10 Thread Etienne Neveu (JIRA)
Etienne Neveu created KAFKA-8518:


 Summary: Update GitHub repo description to make it obvious that it 
is not a "mirror" anymore
 Key: KAFKA-8518
 URL: https://issues.apache.org/jira/browse/KAFKA-8518
 Project: Kafka
  Issue Type: Improvement
  Components: documentation
Reporter: Etienne Neveu


When I go to [https://github.com/apache/kafka], the description at the top is 
"Mirror of Apache Kafka", which makes me think that development is done 
elsewhere and this is just a mirrored GitHub repo.

But I think the main development has now moved to GitHub, so it would be nice 
to update this description to make that obvious.





[jira] [Assigned] (KAFKA-8514) Kafka clients should not include Scala's Java 8 compatibility lib

2019-06-10 Thread Almog Gavra (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Almog Gavra reassigned KAFKA-8514:
--

Assignee: Almog Gavra

> Kafka clients should not include Scala's Java 8 compatibility lib
> -
>
> Key: KAFKA-8514
> URL: https://issues.apache.org/jira/browse/KAFKA-8514
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Reporter: Enno Runne
>Assignee: Almog Gavra
>Priority: Major
>
> The work with KAFKA-8305 brought in 
> "org.scala-lang.modules:scala-java8-compat_2.12" 
> as a dependency of the client lib. This gives Scala users an extra 
> headache, as it would need to be excluded when working with another Scala 
> version.
> Instead, it should be moved to be a dependency of *"core"*, which uses it 
> for the convenience of converting between Java and Scala Option instances.
> See 
> [https://github.com/apache/kafka/commit/8e161580b859b2fcd54c59625e232b99f3bb48d0#diff-c197962302397baf3a4cc36463dce5eaR942]
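Until the dependency is moved, a user on another Scala version would need an exclusion along these lines (illustrative Gradle fragment, not from the Kafka docs; the kafka-clients version shown is an assumption):

```groovy
// Illustrative workaround: exclude the transitive Scala 2.12 compat lib
// when depending on kafka-clients from a project on another Scala version.
dependencies {
    implementation("org.apache.kafka:kafka-clients:2.3.0") {
        exclude group: "org.scala-lang.modules", module: "scala-java8-compat_2.12"
    }
}
```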
>  





[jira] [Commented] (KAFKA-8514) Kafka clients should not include Scala's Java 8 compatibility lib

2019-06-10 Thread Almog Gavra (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860099#comment-16860099
 ] 

Almog Gavra commented on KAFKA-8514:


ack. thanks for pointing this out [~ennru]! I'll change this and have a PR out 
soon after making sure everything compiles.

> Kafka clients should not include Scala's Java 8 compatibility lib
> -
>
> Key: KAFKA-8514
> URL: https://issues.apache.org/jira/browse/KAFKA-8514
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Reporter: Enno Runne
>Assignee: Almog Gavra
>Priority: Major
>
> The work with KAFKA-8305 brought in 
> "org.scala-lang.modules:scala-java8-compat_2.12" 
> as a dependency of the client lib. This gives Scala users an extra 
> headache, as it would need to be excluded when working with another Scala 
> version.
> Instead, it should be moved to be a dependency of *"core"*, which uses it 
> for the convenience of converting between Java and Scala Option instances.
> See 
> [https://github.com/apache/kafka/commit/8e161580b859b2fcd54c59625e232b99f3bb48d0#diff-c197962302397baf3a4cc36463dce5eaR942]
>  





[jira] [Updated] (KAFKA-8504) Suppressed do not emit with TimeWindows

2019-06-10 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-8504:
---
Labels: newbie  (was: )

> Suppressed do not emit with TimeWindows
> ---
>
> Key: KAFKA-8504
> URL: https://issues.apache.org/jira/browse/KAFKA-8504
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation, streams
>Affects Versions: 2.2.1
>Reporter: Simone
>Priority: Minor
>  Labels: newbie
>
> Hi, I'm playing a bit with Kafka Streams and the new suppress feature. I 
> noticed that when using a {{TimeWindows}} without explicitly setting the 
> grace period, {{suppress}} will not emit any messages if used with 
> {{Suppressed.untilWindowCloses}}.
> I looked a bit into the code, and from what I understood, with this 
> configuration {{suppress}} should use the {{grace}} setting of the 
> {{TimeWindows}}. But {{TimeWindows.of(Duration)}} defaults the grace to 
> {{-1}}, and when the grace is read via {{TimeWindows.gracePeriodMs()}}, a 
> grace of -1 makes it return {{maintainMs() - size()}}, so I think the end of 
> the window is not calculated as expected.
> Of course, it is possible to avoid this problem by forcing the {{grace}} to 0 
> when creating the {{TimeWindows}}, but I think that should be the default 
> behaviour, at least when it comes to the suppress feature.
> I hope I have not misunderstood the code in my analysis, thank you :)
> Simone





[jira] [Updated] (KAFKA-8504) Update Docs to explain how to use suppress() in more details

2019-06-10 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-8504:
---
Summary: Update Docs to explain how to use suppress() in more details  
(was: Suppressed do not emit with TimeWindows)

> Update Docs to explain how to use suppress() in more details
> 
>
> Key: KAFKA-8504
> URL: https://issues.apache.org/jira/browse/KAFKA-8504
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation, streams
>Affects Versions: 2.2.1
>Reporter: Simone
>Priority: Minor
>  Labels: newbie
>
> Hi, I'm playing a bit with Kafka Streams and the new suppress feature. I 
> noticed that when using a {{TimeWindows}} without explicitly setting the 
> grace period, {{suppress}} will not emit any messages if used with 
> {{Suppressed.untilWindowCloses}}.
> I looked a bit into the code, and from what I understood, with this 
> configuration {{suppress}} should use the {{grace}} setting of the 
> {{TimeWindows}}. But {{TimeWindows.of(Duration)}} defaults the grace to 
> {{-1}}, and when the grace is read via {{TimeWindows.gracePeriodMs()}}, a 
> grace of -1 makes it return {{maintainMs() - size()}}, so I think the end of 
> the window is not calculated as expected.
> Of course, it is possible to avoid this problem by forcing the {{grace}} to 0 
> when creating the {{TimeWindows}}, but I think that should be the default 
> behaviour, at least when it comes to the suppress feature.
> I hope I have not misunderstood the code in my analysis, thank you :)
> Simone





[jira] [Commented] (KAFKA-7315) Streams serialization docs contain a broken link for Avro

2019-06-10 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860163#comment-16860163
 ] 

ASF GitHub Bot commented on KAFKA-7315:
---

mjsax commented on pull request #212: KAFKA-7315 update TOC internal links 
serdes all versions
URL: https://github.com/apache/kafka-site/pull/212
 
 
   
 



> Streams serialization docs contain a broken link for Avro
> -
>
> Key: KAFKA-7315
> URL: https://issues.apache.org/jira/browse/KAFKA-7315
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: John Roesler
>Assignee: Victoria Bialas
>Priority: Major
>  Labels: documentation, newbie
>
> https://kafka.apache.org/documentation/streams/developer-guide/datatypes.html#avro





[jira] [Commented] (KAFKA-7315) Streams serialization docs contain a broken link for Avro

2019-06-10 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860167#comment-16860167
 ] 

ASF GitHub Bot commented on KAFKA-7315:
---

mjsax commented on pull request #6875: KAFKA-7315 DOCS update TOC internal 
links serdes all versions
URL: https://github.com/apache/kafka/pull/6875
 
 
   
 



> Streams serialization docs contain a broken link for Avro
> -
>
> Key: KAFKA-7315
> URL: https://issues.apache.org/jira/browse/KAFKA-7315
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: John Roesler
>Assignee: Victoria Bialas
>Priority: Major
>  Labels: documentation, newbie
>
> https://kafka.apache.org/documentation/streams/developer-guide/datatypes.html#avro





[jira] [Resolved] (KAFKA-7315) Streams serialization docs contain a broken link for Avro

2019-06-10 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax resolved KAFKA-7315.

Resolution: Fixed

> Streams serialization docs contain a broken link for Avro
> -
>
> Key: KAFKA-7315
> URL: https://issues.apache.org/jira/browse/KAFKA-7315
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: John Roesler
>Assignee: Victoria Bialas
>Priority: Major
>  Labels: documentation, newbie
>
> https://kafka.apache.org/documentation/streams/developer-guide/datatypes.html#avro





[jira] [Commented] (KAFKA-8516) Consider allowing all replicas to have read/write permissions

2019-06-10 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860225#comment-16860225
 ] 

Matthias J. Sax commented on KAFKA-8516:


[~Yohan123] Are you aware of 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica]?

This ticket might be a duplicate of that KIP.

> Consider allowing all replicas to have read/write permissions
> -
>
> Key: KAFKA-8516
> URL: https://issues.apache.org/jira/browse/KAFKA-8516
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Richard Yu
>Priority: Major
>
> Currently, in Kafka internals, a leader is responsible for all the read and 
> write operations requested by the user. This naturally creates a bottleneck, 
> since one replica, as the leader, experiences a significantly heavier 
> workload than the other replicas, and it also means that all client commands 
> must pass through a chokepoint. If a leader fails, all processing effectively 
> comes to a halt until another leader is elected. To help solve this 
> problem, we could think about redesigning Kafka core so that any replica is 
> able to serve read and write operations as well. That is, the system would 
> be changed so that _all_ replicas have read/write permissions.
>   
>  This has multiple positives. Notably the following:
>  * Workload can be more evenly distributed, since leader replicas currently 
> carry more weight than follower replicas (in this new design, all replicas 
> are equal).
>  * Some failures would not be as catastrophic as in the leader-follower 
> paradigm. There is no single "leader": if one replica goes down, others 
> are still able to read/write as needed, and processing could continue 
> without interruption.
> The implementation of such a change would be very extensive, and discussion 
> would be needed to decide whether the improvement described above warrants 
> such a drastic redesign of Kafka internals.





[jira] [Commented] (KAFKA-8514) Kafka clients should not include Scala's Java 8 compatibility lib

2019-06-10 Thread Almog Gavra (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860241#comment-16860241
 ] 

Almog Gavra commented on KAFKA-8514:


[https://github.com/apache/kafka/pull/6910] - everything compiles and passes 
locally

> Kafka clients should not include Scala's Java 8 compatibility lib
> -
>
> Key: KAFKA-8514
> URL: https://issues.apache.org/jira/browse/KAFKA-8514
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Reporter: Enno Runne
>Assignee: Almog Gavra
>Priority: Major
>
> The work with KAFKA-8305 brought in 
> "org.scala-lang.modules:scala-java8-compat_2.12" 
> as dependency of the client lib. This will give users from Scala an extra 
> headache as it would need to be excluded when working with another Scala 
> version.
> Instead, it should be moved to be a dependency of *"core"* for the 
> convenience of converting Java and Scala Option instances.
> See 
> [https://github.com/apache/kafka/commit/8e161580b859b2fcd54c59625e232b99f3bb48d0#diff-c197962302397baf3a4cc36463dce5eaR942]
>  





[jira] [Commented] (KAFKA-8514) Kafka clients should not include Scala's Java 8 compatibility lib

2019-06-10 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860240#comment-16860240
 ] 

ASF GitHub Bot commented on KAFKA-8514:
---

agavra commented on pull request #6910: KAFKA-8514: move the scala-java8-compat 
import to the :core project instead of :clients
URL: https://github.com/apache/kafka/pull/6910
 
 
   See https://issues.apache.org/jira/browse/KAFKA-8514 - this makes it so that 
only the `:core` module depends on the newly added Java 8 converters.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 



> Kafka clients should not include Scala's Java 8 compatibility lib
> -
>
> Key: KAFKA-8514
> URL: https://issues.apache.org/jira/browse/KAFKA-8514
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Reporter: Enno Runne
>Assignee: Almog Gavra
>Priority: Major
>
> The work with KAFKA-8305 brought in 
> "org.scala-lang.modules:scala-java8-compat_2.12" 
> as dependency of the client lib. This will give users from Scala an extra 
> headache as it would need to be excluded when working with another Scala 
> version.
> Instead, it should be moved to be a dependency of *"core"* for the 
> convenience of converting Java and Scala Option instances.
> See 
> [https://github.com/apache/kafka/commit/8e161580b859b2fcd54c59625e232b99f3bb48d0#diff-c197962302397baf3a4cc36463dce5eaR942]
>  





[jira] [Created] (KAFKA-8519) Trogdor should support network degradation

2019-06-10 Thread David Arthur (JIRA)
David Arthur created KAFKA-8519:
---

 Summary: Trogdor should support network degradation
 Key: KAFKA-8519
 URL: https://issues.apache.org/jira/browse/KAFKA-8519
 Project: Kafka
  Issue Type: Improvement
  Components: system tests
Reporter: David Arthur


Trogdor should allow us to simulate degraded networks, similar to the network 
partition spec.





[jira] [Assigned] (KAFKA-8519) Trogdor should support network degradation

2019-06-10 Thread David Arthur (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Arthur reassigned KAFKA-8519:
---

Assignee: David Arthur

> Trogdor should support network degradation
> --
>
> Key: KAFKA-8519
> URL: https://issues.apache.org/jira/browse/KAFKA-8519
> Project: Kafka
>  Issue Type: Improvement
>  Components: system tests
>Reporter: David Arthur
>Assignee: David Arthur
>Priority: Minor
>
> Trogdor should allow us to simulate degraded networks, similar to the network 
> partition spec.





[jira] [Commented] (KAFKA-7760) Add broker configuration to set minimum value for segment.bytes and segment.ms

2019-06-10 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860287#comment-16860287
 ] 

ASF GitHub Bot commented on KAFKA-7760:
---

dulvinw commented on pull request #6911: KAFKA-7760
URL: https://github.com/apache/kafka/pull/6911
 
 
   *Add broker configuration to set minimum value for segment.bytes and 
segment.ms.*
   
   
 



> Add broker configuration to set minimum value for segment.bytes and segment.ms
> --
>
> Key: KAFKA-7760
> URL: https://issues.apache.org/jira/browse/KAFKA-7760
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Badai Aqrandista
>Assignee: Dulvin Witharane
>Priority: Major
>  Labels: kip, newbie
>
> If someone sets segment.bytes or segment.ms at the topic level to a very 
> small value (e.g. segment.bytes=1000 or segment.ms=1000), Kafka will generate 
> a very high number of segment files. This can bring down the whole broker by 
> hitting the maximum number of open files (for logs) or the maximum number of 
> mmap-ed files (for indexes).
> To prevent that from happening, I would like to suggest adding two new items 
> to the broker configuration:
>  * min.topic.segment.bytes, defaults to 1048576: The minimum value for 
> segment.bytes. When someone sets the topic configuration segment.bytes to a 
> value lower than this, Kafka throws an INVALID VALUE error.
>  * min.topic.segment.ms, defaults to 360: The minimum value for 
> segment.ms. When someone sets the topic configuration segment.ms to a value 
> lower than this, Kafka throws an INVALID VALUE error.
> Thanks
> Badai
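The proposed validation could look roughly like this (a Python sketch of the suggested broker-side check; the min.topic.* names come from the proposal above and are not existing Kafka configs):

```python
# Sketch of the proposed broker-side validation. The min.topic.* names
# come from the proposal above; they are not existing Kafka configs.
BROKER_DEFAULTS = {
    "min.topic.segment.bytes": 1048576,
    "min.topic.segment.ms": 360,  # value as given in the proposal
}

def validate_topic_config(topic_config, broker_config=BROKER_DEFAULTS):
    """Raise when a topic-level override is below the broker minimum."""
    for key, min_key in (("segment.bytes", "min.topic.segment.bytes"),
                         ("segment.ms", "min.topic.segment.ms")):
        value = topic_config.get(key)
        if value is not None and value < broker_config[min_key]:
            raise ValueError(
                f"INVALID VALUE: {key}={value} is below broker minimum "
                f"{broker_config[min_key]}")

validate_topic_config({"segment.bytes": 10485760})  # passes silently
```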





[jira] [Updated] (KAFKA-8500) member.id should always update upon static member rejoin despite of group state

2019-06-10 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-8500:
---
Issue Type: Bug  (was: Improvement)

> member.id should always update upon static member rejoin despite of group 
> state
> ---
>
> Key: KAFKA-8500
> URL: https://issues.apache.org/jira/browse/KAFKA-8500
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, streams
>Affects Versions: 2.3.0
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Blocker
>
> A blocking bug was detected by [~guozhang]: the `member.id` wasn't getting 
> updated upon static member rejoin when the group is not in a stable state. 
> This could make duplicate-member fencing harder and potentially yield 
> incorrect processing outputs.





[jira] [Updated] (KAFKA-8487) Consumer should not resetGeneration upon REBALANCE_IN_PROGRESS in commit response handler

2019-06-10 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-8487:
---
Affects Version/s: 2.3

> Consumer should not resetGeneration upon REBALANCE_IN_PROGRESS in commit 
> response handler
> -
>
> Key: KAFKA-8487
> URL: https://issues.apache.org/jira/browse/KAFKA-8487
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 2.3
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
>Priority: Blocker
>
> In the consumer, we handle the errors in sync / heartbeat / join responses 
> such that:
> 1. UNKNOWN_MEMBER_ID / ILLEGAL_GENERATION: we reset the generation and 
> request a re-join.
> 2. REBALANCE_IN_PROGRESS: do nothing if a re-join will be executed, or 
> request a re-join explicitly.
> However, for the commit response, we call resetGeneration for 
> REBALANCE_IN_PROGRESS as well. This is flawed in two ways:
> 1. As in KIP-345, with static members, resetting the generation will lose 
> the member.id and hence may cause incorrect fencing.
> 2. As in KIP-429, resetting the generation will cause partitions to be 
> "lost" unnecessarily before re-joining the group. 
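The intended distinction can be sketched as a small policy table (a Python sketch of the handling described above, not the actual consumer coordinator code):

```python
# Sketch of the error handling described above: which errors should reset
# the generation vs. only trigger a re-join. Not actual Kafka code.
RESET_AND_REJOIN = {"UNKNOWN_MEMBER_ID", "ILLEGAL_GENERATION"}
REJOIN_ONLY = {"REBALANCE_IN_PROGRESS"}

def handle_commit_error(error, state):
    """Proposed behaviour: REBALANCE_IN_PROGRESS requests a re-join but
    keeps the generation/member.id, which matters for KIP-345 static
    members (fencing) and KIP-429 incremental rebalancing (no lost
    partitions)."""
    if error in RESET_AND_REJOIN:
        state["generation"] = -1
        state["member_id"] = ""
        state["rejoin"] = True
    elif error in REJOIN_ONLY:
        state["rejoin"] = True  # generation and member.id are preserved
    return state

s = handle_commit_error("REBALANCE_IN_PROGRESS",
                        {"generation": 5, "member_id": "m-1", "rejoin": False})
assert s["member_id"] == "m-1" and s["rejoin"] is True
```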





[jira] [Updated] (KAFKA-8487) Consumer should not resetGeneration upon REBALANCE_IN_PROGRESS in commit response handler

2019-06-10 Thread Boyang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-8487:
---
Component/s: streams

> Consumer should not resetGeneration upon REBALANCE_IN_PROGRESS in commit 
> response handler
> -
>
> Key: KAFKA-8487
> URL: https://issues.apache.org/jira/browse/KAFKA-8487
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, streams
>Affects Versions: 2.3
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
>Priority: Blocker
>
> In consumer, we handle the errors in sync / heartbeat / join response such 
> that:
> 1. UNKNOWN_MEMBER_ID / ILLEGAL_GENERATION: we reset the generation and 
> request re-join.
> 2. REBALANCE_IN_PROGRESS: do nothing if a rejoin will be executed, or request 
> re-join explicitly.
> However, for commit response, we require resetGeneration for 
> REBALANCE_IN_PROGRESS as well. This is a flaw in two folds:
> 1. As in KIP-345, with static members, reseting generation will lose the 
> member.id and hence may cause incorrect fencing.
> 2. As in KIP-429, resetting generation will cause partitions to be "lost" 
> unnecessarily before re-joining the group. 





[jira] [Commented] (KAFKA-7937) Flaky Test ResetConsumerGroupOffsetTest.testResetOffsetsNotExistingGroup

2019-06-10 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860318#comment-16860318
 ] 

Matthias J. Sax commented on KAFKA-7937:


Another failure in trunk: 
[https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk11/detail/kafka-trunk-jdk11/616/tests]

> Flaky Test ResetConsumerGroupOffsetTest.testResetOffsetsNotExistingGroup
> 
>
> Key: KAFKA-7937
> URL: https://issues.apache.org/jira/browse/KAFKA-7937
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, clients, unit tests
>Affects Versions: 2.2.0, 2.1.1, 2.3.0
>Reporter: Matthias J. Sax
>Assignee: Gwen Shapira
>Priority: Critical
> Fix For: 2.4.0
>
>
> To get stable nightly builds for `2.2` release, I create tickets for all 
> observed test failures.
> https://builds.apache.org/blue/organizations/jenkins/kafka-2.2-jdk8/detail/kafka-2.2-jdk8/19/pipeline
> {quote}kafka.admin.ResetConsumerGroupOffsetTest > 
> testResetOffsetsNotExistingGroup FAILED 
> java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.CoordinatorNotAvailableException: The 
> coordinator is not available. at 
> org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
>  at 
> org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
>  at 
> org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
>  at 
> org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
>  at 
> kafka.admin.ConsumerGroupCommand$ConsumerGroupService.resetOffsets(ConsumerGroupCommand.scala:306)
>  at 
> kafka.admin.ResetConsumerGroupOffsetTest.testResetOffsetsNotExistingGroup(ResetConsumerGroupOffsetTest.scala:89)
>  Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: 
> The coordinator is not available.{quote}





[jira] [Created] (KAFKA-8520) TimeoutException in client side doesn't have stack trace

2019-06-10 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created KAFKA-8520:
---

 Summary: TimeoutException in client side doesn't have stack trace
 Key: KAFKA-8520
 URL: https://issues.apache.org/jira/browse/KAFKA-8520
 Project: Kafka
  Issue Type: New Feature
  Components: clients
Reporter: Shixiong Zhu


When a TimeoutException is thrown directly on the client side, it doesn't have 
any stack trace because it inherits from 
"org.apache.kafka.common.errors.ApiException". This makes it hard for users to 
debug timeout issues, because it's hard to know which line in the user code 
threw the TimeoutException.

It would be great to add a new client-side TimeoutException that contains the 
stack trace.





[jira] [Created] (KAFKA-8521) Client unable to get a complete transaction set of messages using a single poll call

2019-06-10 Thread Boris Rybalkin (JIRA)
Boris Rybalkin created KAFKA-8521:
-

 Summary: Client unable to get a complete transaction set of 
messages using a single poll call
 Key: KAFKA-8521
 URL: https://issues.apache.org/jira/browse/KAFKA-8521
 Project: Kafka
  Issue Type: Bug
  Components: clients
Reporter: Boris Rybalkin


I am unable to reliably get the complete list of messages from a successful 
transaction on the client side.

What I get instead is sometimes a subset of a complete transaction in one poll 
and the second half of the transaction in a second poll.

Am I right that poll should always give me the full message set of a committed 
transaction when the client uses the read_committed isolation level, or not?

 

Pseudo code:

Server:
begin transaction
send (1, "test1")
send (2, "test2")
commit transaction

Client (isolation level: read_committed):
poll -> [ 1 ]
poll -> [ 2 ]

What I want is:
poll -> [1, 2]

 

I also observed that when all the messages in the transaction have the same 
key I always get the complete message set in one poll, but when the keys 
within the transaction differ, the transaction is usually spread across 
multiple polls.

 

I can provide a working example if you think that this is a bug and not a 
misunderstanding of how poll works.
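A plausible explanation (my reading, not confirmed in this thread): read_committed only guarantees atomic visibility of a transaction within each partition, and records with different keys usually land in different partitions, so one poll may return the committed records from only some of those partitions. The toy sketch below uses Java's String.hashCode() as a stand-in for the murmur2-based default partitioner, just to show how distinct keys spread across partitions:

```java
// Illustrative sketch only: distinct keys typically map to distinct
// partitions, and poll() makes no cross-partition atomicity promise,
// so a committed transaction spanning partitions can surface over
// several polls. String.hashCode() stands in for Kafka's murmur2
// default partitioner; the spreading behaviour is what matters here.
public class PartitionSpreadDemo {

    // Simplified stand-in for the default key -> partition mapping
    public static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        System.out.println("key 1 -> partition " + partitionFor("1", 6));
        System.out.println("key 2 -> partition " + partitionFor("2", 6));
        // Identical keys always map to the same partition, which matches
        // the observation that same-key transactions arrive in one poll.
        System.out.println("same key is stable: "
                + (partitionFor("a", 6) == partitionFor("a", 6)));
    }
}
```

Records of one transaction that live in the same partition are still delivered together, which would explain the same-key observation above.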



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8516) Consider allowing all replicas to have read/write permissions

2019-06-10 Thread Richard Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860350#comment-16860350
 ] 

Richard Yu commented on KAFKA-8516:
---

Thanks for the heads up! Will add the link to the issue.

They are pretty close, but I think that KIP only covers adding read 
permissions for replicas (i.e. fetches = reads), not write permissions. I 
think implementing read permissions for all replicas is considerably easier 
than write permissions, since for writes we would have to guarantee 
consistency.

> Consider allowing all replicas to have read/write permissions
> -
>
> Key: KAFKA-8516
> URL: https://issues.apache.org/jira/browse/KAFKA-8516
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Richard Yu
>Priority: Major
>
> Currently, in Kafka internals, a leader is responsible for all the read and 
> write operations requested by the user. This naturally incurs a bottleneck 
> since one replica, as the leader, would experience a significantly heavier 
> workload than other replicas and also means that all client commands must 
> pass through a chokepoint. If a leader fails, all processing effectively 
> comes to a halt until another leader election. In order to help solve this 
> problem, we could think about redesigning Kafka core so that any replica is 
> able to do read and write operations as well. That is, the system be changed 
> so that _all_ replicas have read/write permissions.
>   
>  This has multiple positives. Notably the following:
>  * Workload can be more evenly distributed since leader replicas are weighted 
> more than follower replicas (in this new design, all replicas are equal)
>  * Some failures would not be as catastrophic as in the leader-follower 
> paradigm. There is no one single "leader". If one replica goes down, others 
> are still able to read/write as needed. Processing could continue without 
> interruption.
> The implementation for such a change like this will be very extensive and 
> discussion would be needed to decide if such an improvement as described 
> above would warrant such a drastic redesign of Kafka internals.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8516) Consider allowing all replicas to have read/write permissions

2019-06-10 Thread Richard Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Yu updated KAFKA-8516:
--
Description: 
Currently, in Kafka internals, a leader is responsible for all the read and 
write operations requested by the user. This naturally incurs a bottleneck 
since one replica, as the leader, would experience a significantly heavier 
workload than other replicas and also means that all client commands must pass 
through a chokepoint. If a leader fails, all processing effectively comes to a 
halt until another leader election. In order to help solve this problem, we 
could think about redesigning Kafka core so that any replica is able to do read 
and write operations as well. That is, the system be changed so that _all_ 
replicas have read/write permissions.
  
 This has multiple positives. Notably the following:
 * Workload can be more evenly distributed since leader replicas are weighted 
more than follower replicas (in this new design, all replicas are equal)
 * Some failures would not be as catastrophic as in the leader-follower 
paradigm. There is no one single "leader". If one replica goes down, others are 
still able to read/write as needed. Processing could continue without 
interruption.

The implementation for such a change like this will be very extensive and 
discussion would be needed to decide if such an improvement as described above 
would warrant such a drastic redesign of Kafka internals.

Relevant KIP for read permissions can be found here:

[KIP-392|https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica]

  was:
Currently, in Kafka internals, a leader is responsible for all the read and 
write operations requested by the user. This naturally incurs a bottleneck 
since one replica, as the leader, would experience a significantly heavier 
workload than other replicas and also means that all client commands must pass 
through a chokepoint. If a leader fails, all processing effectively comes to a 
halt until another leader election. In order to help solve this problem, we 
could think about redesigning Kafka core so that any replica is able to do read 
and write operations as well. That is, the system be changed so that _all_ 
replicas have read/write permissions.
  
 This has multiple positives. Notably the following:
 * Workload can be more evenly distributed since leader replicas are weighted 
more than follower replicas (in this new design, all replicas are equal)
 * Some failures would not be as catastrophic as in the leader-follower 
paradigm. There is no one single "leader". If one replica goes down, others are 
still able to read/write as needed. Processing could continue without 
interruption.

The implementation for such a change like this will be very extensive and 
discussion would be needed to decide if such an improvement as described above 
would warrant such a drastic redesign of Kafka internals.


> Consider allowing all replicas to have read/write permissions
> -
>
> Key: KAFKA-8516
> URL: https://issues.apache.org/jira/browse/KAFKA-8516
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Richard Yu
>Priority: Major
>
> Currently, in Kafka internals, a leader is responsible for all the read and 
> write operations requested by the user. This naturally incurs a bottleneck 
> since one replica, as the leader, would experience a significantly heavier 
> workload than other replicas and also means that all client commands must 
> pass through a chokepoint. If a leader fails, all processing effectively 
> comes to a halt until another leader election. In order to help solve this 
> problem, we could think about redesigning Kafka core so that any replica is 
> able to do read and write operations as well. That is, the system be changed 
> so that _all_ replicas have read/write permissions.
>   
>  This has multiple positives. Notably the following:
>  * Workload can be more evenly distributed since leader replicas are weighted 
> more than follower replicas (in this new design, all replicas are equal)
>  * Some failures would not be as catastrophic as in the leader-follower 
> paradigm. There is no one single "leader". If one replica goes down, others 
> are still able to read/write as needed. Processing could continue without 
> interruption.
> The implementation for such a change like this will be very extensive and 
> discussion would be needed to decide if such an improvement as described 
> above would warrant such a drastic redesign of Kafka internals.
> Relevant KIP for read permissions can be found here:
> [KIP-392|https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (KAFKA-8516) Consider allowing all replicas to have read/write permissions

2019-06-10 Thread Richard Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Yu updated KAFKA-8516:
--
Description: 
Currently, in Kafka internals, a leader is responsible for all the read and 
write operations requested by the user. This naturally incurs a bottleneck 
since one replica, as the leader, would experience a significantly heavier 
workload than other replicas and also means that all client commands must pass 
through a chokepoint. If a leader fails, all processing effectively comes to a 
halt until another leader election. In order to help solve this problem, we 
could think about redesigning Kafka core so that any replica is able to do read 
and write operations as well. That is, the system be changed so that _all_ 
replicas have read/write permissions.
  
 This has multiple positives. Notably the following:
 * Workload can be more evenly distributed since leader replicas are weighted 
more than follower replicas (in this new design, all replicas are equal)
 * Some failures would not be as catastrophic as in the leader-follower 
paradigm. There is no one single "leader". If one replica goes down, others are 
still able to read/write as needed. Processing could continue without 
interruption.

The implementation for such a change like this will be very extensive and 
discussion would be needed to decide if such an improvement as described above 
would warrant such a drastic redesign of Kafka internals.

Relevant KIP for read permissions can be found here:

[https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica]

  was:
Currently, in Kafka internals, a leader is responsible for all the read and 
write operations requested by the user. This naturally incurs a bottleneck 
since one replica, as the leader, would experience a significantly heavier 
workload than other replicas and also means that all client commands must pass 
through a chokepoint. If a leader fails, all processing effectively comes to a 
halt until another leader election. In order to help solve this problem, we 
could think about redesigning Kafka core so that any replica is able to do read 
and write operations as well. That is, the system be changed so that _all_ 
replicas have read/write permissions.
  
 This has multiple positives. Notably the following:
 * Workload can be more evenly distributed since leader replicas are weighted 
more than follower replicas (in this new design, all replicas are equal)
 * Some failures would not be as catastrophic as in the leader-follower 
paradigm. There is no one single "leader". If one replica goes down, others are 
still able to read/write as needed. Processing could continue without 
interruption.

The implementation for such a change like this will be very extensive and 
discussion would be needed to decide if such an improvement as described above 
would warrant such a drastic redesign of Kafka internals.

Relevant KIP for read permissions can be found here:

[KIP-392|https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica]


> Consider allowing all replicas to have read/write permissions
> -
>
> Key: KAFKA-8516
> URL: https://issues.apache.org/jira/browse/KAFKA-8516
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Richard Yu
>Priority: Major
>
> Currently, in Kafka internals, a leader is responsible for all the read and 
> write operations requested by the user. This naturally incurs a bottleneck 
> since one replica, as the leader, would experience a significantly heavier 
> workload than other replicas and also means that all client commands must 
> pass through a chokepoint. If a leader fails, all processing effectively 
> comes to a halt until another leader election. In order to help solve this 
> problem, we could think about redesigning Kafka core so that any replica is 
> able to do read and write operations as well. That is, the system be changed 
> so that _all_ replicas have read/write permissions.
>   
>  This has multiple positives. Notably the following:
>  * Workload can be more evenly distributed since leader replicas are weighted 
> more than follower replicas (in this new design, all replicas are equal)
>  * Some failures would not be as catastrophic as in the leader-follower 
> paradigm. There is no one single "leader". If one replica goes down, others 
> are still able to read/write as needed. Processing could continue without 
> interruption.
> The implementation for such a change like this will be very extensive and 
> discussion would be needed to decide if such an improvement as described 
> above would warrant such a drastic redesign of Kafka internals.
> Relevant KIP for read permissions can be found here:
> [https://cwiki.

[jira] [Commented] (KAFKA-8519) Trogdor should support network degradation

2019-06-10 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860371#comment-16860371
 ] 

ASF GitHub Bot commented on KAFKA-8519:
---

mumrah commented on pull request #6912: KAFKA-8519 Add trogdor action to slow 
down a network
URL: https://github.com/apache/kafka/pull/6912
 
 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Trogdor should support network degradation
> --
>
> Key: KAFKA-8519
> URL: https://issues.apache.org/jira/browse/KAFKA-8519
> Project: Kafka
>  Issue Type: Improvement
>  Components: system tests
>Reporter: David Arthur
>Assignee: David Arthur
>Priority: Minor
>
> Trogdor should allow us to simulate degraded networks, similar to the network 
> partition spec.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8513) Add kafka-streams-application-reset.bat for Windows platform

2019-06-10 Thread Guozhang Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860465#comment-16860465
 ] 

Guozhang Wang commented on KAFKA-8513:
--

I don't think this needs a KIP discussion, as it just proposes adding a 
wrapper around an existing Java tool. [~mjsax], feel free to merge it if 
you've reviewed it and think it's mergeable.

> Add kafka-streams-application-reset.bat for Windows platform
> 
>
> Key: KAFKA-8513
> URL: https://issues.apache.org/jira/browse/KAFKA-8513
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, tools
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Minor
>
> For improving Windows support, it'd be nice if there were a batch file 
> corresponding to bin/kafka-streams-application-reset.sh.
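For reference, the existing Windows wrappers are one-line shims around kafka-run-class.bat. A hypothetical sketch of the proposed file, assuming the tool's main class is kafka.tools.StreamsResetter as invoked by the .sh script:

```bat
@echo off
rem Hypothetical bin\windows\kafka-streams-application-reset.bat, mirroring
rem bin/kafka-streams-application-reset.sh and the other batch wrappers.
"%~dp0kafka-run-class.bat" kafka.tools.StreamsResetter %*
```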



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8333) Load high watermark checkpoint only once when handling LeaderAndIsr requests

2019-06-10 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860471#comment-16860471
 ] 

ASF GitHub Bot commented on KAFKA-8333:
---

hachikuji commented on pull request #6800: KAFKA-8333; Load high watermark 
checkpoint lazily when initializing replicas
URL: https://github.com/apache/kafka/pull/6800
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Load high watermark checkpoint only once when handling LeaderAndIsr requests
> 
>
> Key: KAFKA-8333
> URL: https://issues.apache.org/jira/browse/KAFKA-8333
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Major
>
> Currently we reload the checkpoint file separately for every partition that 
> is first initialized on the broker. It would be more efficient to do this one 
> time only when we receive the LeaderAndIsr request and to reuse the state.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8333) Load high watermark checkpoint only once when handling LeaderAndIsr requests

2019-06-10 Thread Jason Gustafson (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson updated KAFKA-8333:
---
Issue Type: Improvement  (was: Bug)

> Load high watermark checkpoint only once when handling LeaderAndIsr requests
> 
>
> Key: KAFKA-8333
> URL: https://issues.apache.org/jira/browse/KAFKA-8333
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Major
> Fix For: 2.4.0
>
>
> Currently we reload the checkpoint file separately for every partition that 
> is first initialized on the broker. It would be more efficient to do this one 
> time only when we receive the LeaderAndIsr request and to reuse the state.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8333) Load high watermark checkpoint only once when handling LeaderAndIsr requests

2019-06-10 Thread Jason Gustafson (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-8333.

   Resolution: Fixed
Fix Version/s: 2.4.0

> Load high watermark checkpoint only once when handling LeaderAndIsr requests
> 
>
> Key: KAFKA-8333
> URL: https://issues.apache.org/jira/browse/KAFKA-8333
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Major
> Fix For: 2.4.0
>
>
> Currently we reload the checkpoint file separately for every partition that 
> is first initialized on the broker. It would be more efficient to do this one 
> time only when we receive the LeaderAndIsr request and to reuse the state.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8513) Add kafka-streams-application-reset.bat for Windows platform

2019-06-10 Thread Kengo Seki (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860483#comment-16860483
 ] 

Kengo Seki commented on KAFKA-8513:
---

Thank you for the comments [~mjsax] and [~guozhang]! I also think a KIP is 
unnecessary in this case, just as with KAFKA-5143 and KAFKA-8349, because its 
purpose and implementation are simple and straightforward, and since it 
doesn't change any existing feature it doesn't break current behaviour.

> Add kafka-streams-application-reset.bat for Windows platform
> 
>
> Key: KAFKA-8513
> URL: https://issues.apache.org/jira/browse/KAFKA-8513
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, tools
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Minor
>
> For improving Windows support, it'd be nice if there were a batch file 
> corresponding to bin/kafka-streams-application-reset.sh.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8349) Add Windows batch files corresponding to kafka-delete-records.sh and kafka-log-dirs.sh

2019-06-10 Thread Kengo Seki (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kengo Seki resolved KAFKA-8349.
---
Resolution: Fixed

Closing this since it's already been merged. Thanks [~hachikuji]!

> Add Windows batch files corresponding to kafka-delete-records.sh and 
> kafka-log-dirs.sh
> --
>
> Key: KAFKA-8349
> URL: https://issues.apache.org/jira/browse/KAFKA-8349
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Minor
>
> Some shell scripts don't have corresponding batch files in bin\windows.
> For improving Windows platform support, I'd like to add the following batch 
> files:
> - bin\windows\kafka-delete-records.bat
> - bin\windows\kafka-log-dirs.bat



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8522) Tombstones can survive forever

2019-06-10 Thread Evelyn Bayes (JIRA)
Evelyn Bayes created KAFKA-8522:
---

 Summary: Tombstones can survive forever
 Key: KAFKA-8522
 URL: https://issues.apache.org/jira/browse/KAFKA-8522
 Project: Kafka
  Issue Type: Bug
  Components: log cleaner
Reporter: Evelyn Bayes


This is a bit of a grey area as to whether it's a "bug", but it is certainly 
unintended behaviour.

Under specific conditions, tombstones effectively survive forever:
 * a small amount of throughput;
 * min.cleanable.dirty.ratio near or at 0; and
 * other parameters at default.

What happens is that all the data continuously gets cycled into the oldest 
segment. Old records get compacted away, but the new records continuously 
update the timestamp of the oldest segment, resetting the countdown for 
deleting tombstones.

So tombstones build up in the oldest segment forever.

While you could "fix" this by reducing the segment size, this can be 
undesirable, as a sudden change in throughput could cause a dangerous number 
of segments to be created.
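The conditions above can be expressed as topic configuration. A hypothetical configuration that would reproduce them (values are illustrative; the defaults noted are those of this Kafka era):

```properties
# Compacted topic whose tombstones can effectively survive forever:
cleanup.policy=compact
min.cleanable.dirty.ratio=0.01   # cleaner re-runs almost continuously
delete.retention.ms=86400000     # default 24h tombstone-deletion countdown
segment.bytes=1073741824         # default 1 GiB: with low throughput, new
                                 # records keep merging into the oldest
                                 # cleaned segment, whose latest timestamp
                                 # keeps advancing and restarts the countdown
```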



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8516) Consider allowing all replicas to have read/write permissions

2019-06-10 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860587#comment-16860587
 ] 

Matthias J. Sax commented on KAFKA-8516:


Agreed.

In fact, I am not even sure if we can (or want to) allow writing to different 
replicas at all. Solving the consistency problem is very, very(!) hard, and 
might not be possible without a major performance hit. Hence, I tend to think 
that it will never be implemented.

> Consider allowing all replicas to have read/write permissions
> -
>
> Key: KAFKA-8516
> URL: https://issues.apache.org/jira/browse/KAFKA-8516
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Richard Yu
>Priority: Major
>
> Currently, in Kafka internals, a leader is responsible for all the read and 
> write operations requested by the user. This naturally incurs a bottleneck 
> since one replica, as the leader, would experience a significantly heavier 
> workload than other replicas and also means that all client commands must 
> pass through a chokepoint. If a leader fails, all processing effectively 
> comes to a halt until another leader election. In order to help solve this 
> problem, we could think about redesigning Kafka core so that any replica is 
> able to do read and write operations as well. That is, the system be changed 
> so that _all_ replicas have read/write permissions.
>   
>  This has multiple positives. Notably the following:
>  * Workload can be more evenly distributed since leader replicas are weighted 
> more than follower replicas (in this new design, all replicas are equal)
>  * Some failures would not be as catastrophic as in the leader-follower 
> paradigm. There is no one single "leader". If one replica goes down, others 
> are still able to read/write as needed. Processing could continue without 
> interruption.
> The implementation for such a change like this will be very extensive and 
> discussion would be needed to decide if such an improvement as described 
> above would warrant such a drastic redesign of Kafka internals.
> Relevant KIP for read permissions can be found here:
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8513) Add kafka-streams-application-reset.bat for Windows platform

2019-06-10 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860588#comment-16860588
 ] 

Matthias J. Sax commented on KAFKA-8513:


Ack. Fine with me. The PR looks good in general.

> Add kafka-streams-application-reset.bat for Windows platform
> 
>
> Key: KAFKA-8513
> URL: https://issues.apache.org/jira/browse/KAFKA-8513
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, tools
>Reporter: Kengo Seki
>Assignee: Kengo Seki
>Priority: Minor
>
> For improving Windows support, it'd be nice if there were a batch file 
> corresponding to bin/kafka-streams-application-reset.sh.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8450) Augment processed in MockProcessor as KeyValueAndTimestamp

2019-06-10 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860589#comment-16860589
 ] 

Matthias J. Sax commented on KAFKA-8450:


Yes. Instead of calling `makeRecord`, which returns a String, the idea is to 
use `KeyValueTimestamp`.

> Augment processed in MockProcessor as KeyValueAndTimestamp
> --
>
> Key: KAFKA-8450
> URL: https://issues.apache.org/jira/browse/KAFKA-8450
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams, unit tests
>Reporter: Guozhang Wang
>Assignee: SuryaTeja Duggi
>Priority: Major
>  Labels: newbie
>
> Today the book-keeping list of `processed` records in MockProcessor is in the 
> form of String: we just call the key/value types' toString functions in order 
> to book-keep. This loses the type information and does not associate a 
> timestamp with the record.
> It's better to translate its type to `KeyValueAndTimestamp` and refactor 
> impacted unit tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8516) Consider allowing all replicas to have read/write permissions

2019-06-10 Thread Richard Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860591#comment-16860591
 ] 

Richard Yu commented on KAFKA-8516:
---

Well, this is when we start straying into an area called "consensus 
algorithms". Kafka's current leader-replica model closely follows an algorithm 
referred to as Raft (research paper here: [https://raft.github.io/raft.pdf] ). 
If we wish to implement the write permissions part (which looks like a pretty 
big if), then we would perhaps have to consider something along the lines of 
EPaxos ( [https://www.cs.cmu.edu/~dga/papers/epaxos-sosp2013.pdf] ). 

cc [~hachikuji] your thoughts on this?

> Consider allowing all replicas to have read/write permissions
> -
>
> Key: KAFKA-8516
> URL: https://issues.apache.org/jira/browse/KAFKA-8516
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Richard Yu
>Priority: Major
>
> Currently, in Kafka internals, a leader is responsible for all the read and 
> write operations requested by the user. This naturally incurs a bottleneck 
> since one replica, as the leader, would experience a significantly heavier 
> workload than other replicas and also means that all client commands must 
> pass through a chokepoint. If a leader fails, all processing effectively 
> comes to a halt until another leader election. In order to help solve this 
> problem, we could think about redesigning Kafka core so that any replica is 
> able to do read and write operations as well. That is, the system be changed 
> so that _all_ replicas have read/write permissions.
>   
>  This has multiple positives. Notably the following:
>  * Workload can be more evenly distributed since leader replicas are weighted 
> more than follower replicas (in this new design, all replicas are equal)
>  * Some failures would not be as catastrophic as in the leader-follower 
> paradigm. There is no one single "leader". If one replica goes down, others 
> are still able to read/write as needed. Processing could continue without 
> interruption.
> The implementation for such a change like this will be very extensive and 
> discussion would be needed to decide if such an improvement as described 
> above would warrant such a drastic redesign of Kafka internals.
> Relevant KIP for read permissions can be found here:
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KAFKA-8516) Consider allowing all replicas to have read/write permissions

2019-06-10 Thread Richard Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860591#comment-16860591
 ] 

Richard Yu edited comment on KAFKA-8516 at 6/11/19 6:15 AM:


Well, this is when we start straying into an area called "consensus 
algorithms". Kafka's current leader-replica model closely follows an algorithm 
referred to as Raft (research paper here: [https://raft.github.io/raft.pdf] ). 
If we wish to implement the write permissions part (which looks like a pretty 
big if), then we would perhaps have to consider something along the lines of 
EPaxos ( [https://www.cs.cmu.edu/~dga/papers/epaxos-sosp2013.pdf] ).

cc [~hachikuji] your thoughts on this?

Edit: If implemented correctly, there should be a pretty good performance gain 
(the results in the Raft paper seem to indicate this, anyway).


was (Author: yohan123):
Well, this is when we start straying into an area called "consensus 
algorithms". Kafka's current leader-replica model closely follows an algorithm 
referred to as Raft (research paper here: [https://raft.github.io/raft.pdf] ). 
If we wish to implement the write permissions part (which looks like a pretty 
big if), then we would perhaps have to consider something along the lines of 
EPaxos ( [https://www.cs.cmu.edu/~dga/papers/epaxos-sosp2013.pdf] ). 

cc [~hachikuji] your thoughts on this?

> Consider allowing all replicas to have read/write permissions
> -
>
> Key: KAFKA-8516
> URL: https://issues.apache.org/jira/browse/KAFKA-8516
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Richard Yu
>Priority: Major
>
> Currently, in Kafka internals, a leader is responsible for all the read and 
> write operations requested by the user. This naturally incurs a bottleneck 
> since one replica, as the leader, would experience a significantly heavier 
> workload than other replicas and also means that all client commands must 
> pass through a chokepoint. If a leader fails, all processing effectively 
> comes to a halt until another leader election. In order to help solve this 
> problem, we could think about redesigning Kafka core so that any replica is 
> able to do read and write operations as well. That is, the system be changed 
> so that _all_ replicas have read/write permissions.
>   
>  This has multiple positives. Notably the following:
>  * Workload can be more evenly distributed since leader replicas are weighted 
> more than follower replicas (in this new design, all replicas are equal)
>  * Some failures would not be as catastrophic as in the leader-follower 
> paradigm. There is no one single "leader". If one replica goes down, others 
> are still able to read/write as needed. Processing could continue without 
> interruption.
> Implementing a change like this would be very extensive, and discussion 
> would be needed to decide whether the improvement described above warrants 
> such a drastic redesign of Kafka internals.
> Relevant KIP for read permissions can be found here:
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KAFKA-8516) Consider allowing all replicas to have read/write permissions

2019-06-10 Thread Richard Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860591#comment-16860591
 ] 

Richard Yu edited comment on KAFKA-8516 at 6/11/19 6:16 AM:


Well, this is when we start straying into an area called "consensus 
algorithms". Kafka's current leader-replica model loosely follows an algorithm 
referred to as Raft (research paper here: [https://raft.github.io/raft.pdf] ). 
If we wish to implement the write permissions part (which looks like a pretty 
big if), then we would perhaps have to consider something along the lines of 
EPaxos ( [https://www.cs.cmu.edu/~dga/papers/epaxos-sosp2013.pdf] ).

cc [~hachikuji] your thoughts on this?

Edit: If implemented correctly, there should be a pretty good performance gain 
(the results in the Raft paper seem to indicate this).


was (Author: yohan123):
Well, this is when we start straying into an area called "consensus 
algorithms". Kafka's current leader-replica model closely follows an algorithm 
referred to as Raft (research paper here: [https://raft.github.io/raft.pdf] ). 
If we wish to implement the write permissions part (which looks like a pretty 
big if), then we would perhaps have to consider something along the lines of 
EPaxos ( [https://www.cs.cmu.edu/~dga/papers/epaxos-sosp2013.pdf] ).

cc [~hachikuji] your thoughts on this?

Edit: If implemented correctly, there should be a pretty good performance gain 
(the results in the Raft paper seem to indicate this).

> Consider allowing all replicas to have read/write permissions
> -
>
> Key: KAFKA-8516
> URL: https://issues.apache.org/jira/browse/KAFKA-8516
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Richard Yu
>Priority: Major
>
> Currently, in Kafka internals, a leader is responsible for all the read and 
> write operations requested by the user. This naturally incurs a bottleneck 
> since one replica, as the leader, would experience a significantly heavier 
> workload than other replicas and also means that all client commands must 
> pass through a chokepoint. If a leader fails, all processing effectively 
> comes to a halt until another leader is elected. In order to help solve this 
> problem, we could think about redesigning Kafka core so that any replica is 
> able to do read and write operations as well. That is, the system would be 
> changed so that _all_ replicas have read/write permissions.
>   
>  This has multiple positives. Notably the following:
>  * Workload can be more evenly distributed since leader replicas are weighted 
> more than follower replicas (in this new design, all replicas are equal)
>  * Some failures would not be as catastrophic as in the leader-follower 
> paradigm. There is no one single "leader". If one replica goes down, others 
> are still able to read/write as needed. Processing could continue without 
> interruption.
> Implementing a change like this would be very extensive, and discussion 
> would be needed to decide whether the improvement described above warrants 
> such a drastic redesign of Kafka internals.
> Relevant KIP for read permissions can be found here:
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-392%3A+Allow+consumers+to+fetch+from+closest+replica]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8041) Flaky Test LogDirFailureTest#testIOExceptionDuringLogRoll

2019-06-10 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860594#comment-16860594
 ] 

Matthias J. Sax commented on KAFKA-8041:


Failed again: 
[https://builds.apache.org/blue/organizations/jenkins/kafka-2.0-jdk8/detail/kafka-2.0-jdk8/275/tests]

> Flaky Test LogDirFailureTest#testIOExceptionDuringLogRoll
> -
>
> Key: KAFKA-8041
> URL: https://issues.apache.org/jira/browse/KAFKA-8041
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.0.1, 2.3.0
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.4.0
>
>
> [https://builds.apache.org/blue/organizations/jenkins/kafka-2.0-jdk8/detail/kafka-2.0-jdk8/236/tests]
> {quote}java.lang.AssertionError: Expected some messages
> at kafka.utils.TestUtils$.fail(TestUtils.scala:357)
> at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:787)
> at 
> kafka.server.LogDirFailureTest.testProduceAfterLogDirFailureOnLeader(LogDirFailureTest.scala:189)
> at 
> kafka.server.LogDirFailureTest.testIOExceptionDuringLogRoll(LogDirFailureTest.scala:63){quote}
> STDOUT
> {quote}[2019-03-05 03:44:58,614] ERROR [ReplicaFetcher replicaId=1, 
> leaderId=0, fetcherId=0] Error for partition topic-6 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-05 03:44:58,614] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
> fetcherId=0] Error for partition topic-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-05 03:44:58,615] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
> fetcherId=0] Error for partition topic-10 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-05 03:44:58,615] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
> fetcherId=0] Error for partition topic-4 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-05 03:44:58,615] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
> fetcherId=0] Error for partition topic-8 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-05 03:44:58,615] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
> fetcherId=0] Error for partition topic-2 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-05 03:45:00,248] ERROR Error while rolling log segment for topic-0 
> in dir 
> /home/jenkins/jenkins-slave/workspace/kafka-2.0-jdk8/core/data/kafka-3869208920357262216
>  (kafka.server.LogDirFailureChannel:76)
> java.io.FileNotFoundException: 
> /home/jenkins/jenkins-slave/workspace/kafka-2.0-jdk8/core/data/kafka-3869208920357262216/topic-0/.index
>  (Not a directory)
> at java.io.RandomAccessFile.open0(Native Method)
> at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
> at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
> at kafka.log.AbstractIndex.$anonfun$resize$1(AbstractIndex.scala:121)
> at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:12)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251)
> at kafka.log.AbstractIndex.resize(AbstractIndex.scala:115)
> at kafka.log.AbstractIndex.$anonfun$trimToValidSize$1(AbstractIndex.scala:184)
> at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:12)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251)
> at kafka.log.AbstractIndex.trimToValidSize(AbstractIndex.scala:184)
> at kafka.log.LogSegment.onBecomeInactiveSegment(LogSegment.scala:501)
> at kafka.log.Log.$anonfun$roll$8(Log.scala:1520)
> at kafka.log.Log.$anonfun$roll$8$adapted(Log.scala:1520)
> at scala.Option.foreach(Option.scala:257)
> at kafka.log.Log.$anonfun$roll$2(Log.scala:1520)
> at kafka.log.Log.maybeHandleIOException(Log.scala:1881)
> at kafka.log.Log.roll(Log.scala:1484)
> at 
> kafka.server.LogDirFailureTest.testProduceAfterLogDirFailureOnLeader(LogDirFailureTest.scala:154)
> at 
> kafka.server.LogDirFailureTest.testIOExceptionDuringLogRoll(LogDirFailureTest.scala:63)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.Del

[jira] [Commented] (KAFKA-8078) Flaky Test TableTableJoinIntegrationTest#testInnerInner

2019-06-10 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860596#comment-16860596
 ] 

Matthias J. Sax commented on KAFKA-8078:


Failed again with timeout: 
[https://builds.apache.org/blue/organizations/jenkins/kafka-2.3-jdk8/detail/kafka-2.3-jdk8/47/tests]

testLeftOuter, caching enabled

> Flaky Test TableTableJoinIntegrationTest#testInnerInner
> ---
>
> Key: KAFKA-8078
> URL: https://issues.apache.org/jira/browse/KAFKA-8078
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Affects Versions: 2.3.0
>Reporter: Matthias J. Sax
>Priority: Major
>  Labels: flaky-test
> Fix For: 2.4.0
>
>
> [https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3445/tests]
> {quote}java.lang.AssertionError: Condition not met within timeout 15000. 
> Never received expected final result.
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:365)
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:325)
> at 
> org.apache.kafka.streams.integration.AbstractJoinIntegrationTest.runTest(AbstractJoinIntegrationTest.java:246)
> at 
> org.apache.kafka.streams.integration.TableTableJoinIntegrationTest.testInnerInner(TableTableJoinIntegrationTest.java:196){quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7988) Flaky Test DynamicBrokerReconfigurationTest#testThreadPoolResize

2019-06-10 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-7988:
---
Affects Version/s: 2.3.0

> Flaky Test DynamicBrokerReconfigurationTest#testThreadPoolResize
> 
>
> Key: KAFKA-7988
> URL: https://issues.apache.org/jira/browse/KAFKA-7988
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Matthias J. Sax
>Assignee: Rajini Sivaram
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.4.0
>
>
> To get stable nightly builds for the `2.2` release, I created tickets for 
> all observed test failures.
> [https://builds.apache.org/blue/organizations/jenkins/kafka-2.2-jdk8/detail/kafka-2.2-jdk8/30/]
> {quote}kafka.server.DynamicBrokerReconfigurationTest > testThreadPoolResize 
> FAILED java.lang.AssertionError: Invalid threads: expected 6, got 5: 
> List(ReplicaFetcherThread-0-0, ReplicaFetcherThread-0-1, 
> ReplicaFetcherThread-0-0, ReplicaFetcherThread-0-2, ReplicaFetcherThread-0-1) 
> at org.junit.Assert.fail(Assert.java:88) at 
> org.junit.Assert.assertTrue(Assert.java:41) at 
> kafka.server.DynamicBrokerReconfigurationTest.verifyThreads(DynamicBrokerReconfigurationTest.scala:1260)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.maybeVerifyThreadPoolSize$1(DynamicBrokerReconfigurationTest.scala:531)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.resizeThreadPool$1(DynamicBrokerReconfigurationTest.scala:550)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.reducePoolSize$1(DynamicBrokerReconfigurationTest.scala:536)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.$anonfun$testThreadPoolResize$3(DynamicBrokerReconfigurationTest.scala:559)
>  at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) at 
> kafka.server.DynamicBrokerReconfigurationTest.verifyThreadPoolResize$1(DynamicBrokerReconfigurationTest.scala:558)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.testThreadPoolResize(DynamicBrokerReconfigurationTest.scala:572){quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7988) Flaky Test DynamicBrokerReconfigurationTest#testThreadPoolResize

2019-06-10 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860597#comment-16860597
 ] 

Matthias J. Sax commented on KAFKA-7988:


Failed in 2.3: 
[https://builds.apache.org/blue/organizations/jenkins/kafka-2.3-jdk8/detail/kafka-2.3-jdk8/47/tests]
{code:java}
java.lang.AssertionError: expected:<{0=10, 1=11, 2=12, 3=13, 4=4, 5=5, 6=6, 
7=7, 8=8, 9=9}> but was:<{0=10, 1=11, 2=12, 3=13, 4=14, 5=5, 6=6, 7=7, 8=8, 
9=9}>
at org.junit.Assert.fail(Assert.java:89)
at org.junit.Assert.failNotEquals(Assert.java:835)
at org.junit.Assert.assertEquals(Assert.java:120)
at org.junit.Assert.assertEquals(Assert.java:146)
at 
kafka.server.DynamicBrokerReconfigurationTest.stopAndVerifyProduceConsume(DynamicBrokerReconfigurationTest.scala:1353)
at 
kafka.server.DynamicBrokerReconfigurationTest.verifyThreadPoolResize$1(DynamicBrokerReconfigurationTest.scala:615)
at 
kafka.server.DynamicBrokerReconfigurationTest.testThreadPoolResize(DynamicBrokerReconfigurationTest.scala:629)
{code}

> Flaky Test DynamicBrokerReconfigurationTest#testThreadPoolResize
> 
>
> Key: KAFKA-7988
> URL: https://issues.apache.org/jira/browse/KAFKA-7988
> Project: Kafka
>  Issue Type: Bug
>  Components: core, unit tests
>Affects Versions: 2.2.0
>Reporter: Matthias J. Sax
>Assignee: Rajini Sivaram
>Priority: Critical
>  Labels: flaky-test
> Fix For: 2.4.0
>
>
> To get stable nightly builds for the `2.2` release, I created tickets for 
> all observed test failures.
> [https://builds.apache.org/blue/organizations/jenkins/kafka-2.2-jdk8/detail/kafka-2.2-jdk8/30/]
> {quote}kafka.server.DynamicBrokerReconfigurationTest > testThreadPoolResize 
> FAILED java.lang.AssertionError: Invalid threads: expected 6, got 5: 
> List(ReplicaFetcherThread-0-0, ReplicaFetcherThread-0-1, 
> ReplicaFetcherThread-0-0, ReplicaFetcherThread-0-2, ReplicaFetcherThread-0-1) 
> at org.junit.Assert.fail(Assert.java:88) at 
> org.junit.Assert.assertTrue(Assert.java:41) at 
> kafka.server.DynamicBrokerReconfigurationTest.verifyThreads(DynamicBrokerReconfigurationTest.scala:1260)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.maybeVerifyThreadPoolSize$1(DynamicBrokerReconfigurationTest.scala:531)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.resizeThreadPool$1(DynamicBrokerReconfigurationTest.scala:550)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.reducePoolSize$1(DynamicBrokerReconfigurationTest.scala:536)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.$anonfun$testThreadPoolResize$3(DynamicBrokerReconfigurationTest.scala:559)
>  at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) at 
> kafka.server.DynamicBrokerReconfigurationTest.verifyThreadPoolResize$1(DynamicBrokerReconfigurationTest.scala:558)
>  at 
> kafka.server.DynamicBrokerReconfigurationTest.testThreadPoolResize(DynamicBrokerReconfigurationTest.scala:572){quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8418) Connect System tests are not waiting for REST resources to be registered

2019-06-10 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-8418:
---
Fix Version/s: (was: 2.3)

> Connect System tests are not waiting for REST resources to be registered
> 
>
> Key: KAFKA-8418
> URL: https://issues.apache.org/jira/browse/KAFKA-8418
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.2.0
>Reporter: Oleksandr Diachenko
>Assignee: Oleksandr Diachenko
>Priority: Blocker
> Fix For: 1.0.3, 1.1.2, 2.0.2, 2.3.0, 2.1.2, 2.2.1
>
>
> I am getting an error while running Kafka tests:
> {code}
> Traceback (most recent call last): File 
> "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.7.5-py2.7.egg/ducktape/tests/runner_client.py",
>  line 132, in run data = self.run_test() File 
> "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.7.5-py2.7.egg/ducktape/tests/runner_client.py",
>  line 189, in run_test return self.test_context.function(self.test) File 
> "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/tests/kafkatest/tests/connect/connect_rest_test.py",
>  line 89, in test_rest_api assert set([connector_plugin['class'] for 
> connector_plugin in self.cc.list_connector_plugins()]) == 
> \{self.FILE_SOURCE_CONNECTOR, self.FILE_SINK_CONNECTOR} File 
> "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/tests/kafkatest/services/connect.py",
>  line 218, in list_connector_plugins return self._rest('/connector-plugins/', 
> node=node) File 
> "/home/jenkins/workspace/system-test-kafka_5.3.x/kafka/tests/kafkatest/services/connect.py",
>  line 234, in _rest raise ConnectRestError(resp.status_code, resp.text, 
> resp.url) ConnectRestError
> {code}
> From the logs, I see two messages:
> {code}
> [2019-05-23 16:09:59,373] INFO REST server listening at 
> http://172.31.39.205:8083/, advertising URL http://worker1:8083/ 
> (org.apache.kafka.connect.runtime.rest.RestServer)
> {code}
> and {code}
> [2019-05-23 16:10:00,738] INFO REST resources initialized; server is started 
> and ready to handle requests 
> (org.apache.kafka.connect.runtime.rest.RestServer)
> {code}
> It takes 1365 ms to actually load the REST resources, but the test only 
> waits for the port to be listening. 
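The fix implied by the log timestamps can be sketched generically: poll the REST resources themselves until they respond, instead of returning as soon as the port is listening. This is a minimal illustration only, not the actual ducktape code; `wait_for_rest_ready` and the probe are hypothetical names.

```python
import time

def wait_for_rest_ready(probe, timeout_sec=60, poll_sec=0.5):
    """Poll `probe` (a callable returning True once the REST resources
    respond) until it succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout_sec
    while time.monotonic() < deadline:
        try:
            if probe():
                return True
        except Exception:
            # The port may accept connections before the resources are
            # registered; treat errors as "not ready yet" and retry.
            pass
        time.sleep(poll_sec)
    raise TimeoutError("REST resources did not become ready in time")

# Example: a probe that only succeeds on the third attempt, standing in
# for a GET against a hypothetical /connector-plugins/ endpoint.
attempts = {"n": 0}

def fake_probe():
    attempts["n"] += 1
    return attempts["n"] >= 3

print(wait_for_rest_ready(fake_probe, timeout_sec=5, poll_sec=0.01))  # True
```

In the real test this probe would issue the same `/connector-plugins/` request the assertion later makes, so the test cannot race ahead of resource registration.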



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8265) Connect Client Config Override policy

2019-06-10 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-8265:
---
Fix Version/s: (was: 2.3)
   2.3.0

> Connect Client Config Override policy
> -
>
> Key: KAFKA-8265
> URL: https://issues.apache.org/jira/browse/KAFKA-8265
> Project: Kafka
>  Issue Type: New Feature
>  Components: KafkaConnect
>Reporter: Magesh kumar Nandakumar
>Assignee: Magesh kumar Nandakumar
>Priority: Major
> Fix For: 2.3.0
>
>
> Right now, each source connector and sink connector inherit their client 
> configurations from the worker properties. Within the worker properties, all 
> configurations that have a prefix of "producer." or "consumer." are applied 
> to all source connectors and sink connectors respectively.
> We should allow the "producer." and "consumer." configurations to be 
> overridden in accordance with an override policy determined by the 
> administrator.
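The mechanism eventually specified in KIP-458 (which this ticket tracks) can be sketched as configuration fragments. The property names below follow that KIP (`connector.client.config.override.policy`, the `producer.override.` prefix) and should be verified against the released documentation:

```properties
# Worker properties: prefixed client configs currently apply to every
# connector the worker runs.
producer.buffer.memory=33554432
consumer.max.poll.records=500

# Administrator-chosen policy governing which client configs an individual
# connector may override (e.g. None, Principal, or All).
connector.client.config.override.policy=All
```

An individual connector could then override a client setting with a prefixed key such as `producer.override.buffer.memory=67108864`, subject to the configured policy.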



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8473) Adjust Connect system tests for incremental cooperative rebalancing and enable them for both eager and incremental cooperative rebalancing

2019-06-10 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-8473:
---
Affects Version/s: (was: 2.3)
   2.3.0

> Adjust Connect system tests for incremental cooperative rebalancing and 
> enable them for both eager and incremental cooperative rebalancing
> --
>
> Key: KAFKA-8473
> URL: https://issues.apache.org/jira/browse/KAFKA-8473
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.3.0
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>Priority: Critical
> Fix For: 2.3.0
>
>
>  
> {{connect.protocol=compatible}}, which enables incremental cooperative 
> rebalancing when all workers are on that version, is now the default option. 
> System tests should be parameterized to keep running for the eager 
> rebalancing protocol as well, to make sure no regressions have happened. 
> Also, for the incremental cooperative protocol, a few tests need to be 
> adjusted to use a lower rebalancing delay (the delay applied while waiting 
> for returning workers). 
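A hedged sketch of the worker properties such a parameterized system test might set; the delay property name is `scheduled.rebalance.max.delay.ms` per KIP-415 and should be checked against the released configuration reference:

```properties
# Worker properties for a test run exercising the new default protocol.
connect.protocol=compatible

# Lower the delay applied while waiting for departed workers to return,
# so tests don't sit idle for the default five minutes.
scheduled.rebalance.max.delay.ms=30000

# To parameterize the same test for the old protocol instead:
# connect.protocol=eager
```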



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8475) Temporarily restore SslFactory.sslContext() helper

2019-06-10 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-8475:
---
Fix Version/s: (was: 2.3)
   2.3.0

> Temporarily restore SslFactory.sslContext() helper
> --
>
> Key: KAFKA-8475
> URL: https://issues.apache.org/jira/browse/KAFKA-8475
> Project: Kafka
>  Issue Type: Bug
>Reporter: Colin P. McCabe
>Assignee: Randall Hauch
>Priority: Blocker
> Fix For: 2.3.0
>
>
> Temporarily restore the SslFactory.sslContext() function, which some 
> connectors use.  This function is not a public API and it will be removed 
> eventually.  For now, we will mark it as deprecated.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8473) Adjust Connect system tests for incremental cooperative rebalancing and enable them for both eager and incremental cooperative rebalancing

2019-06-10 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-8473:
---
Fix Version/s: (was: 2.3)
   2.3.0

> Adjust Connect system tests for incremental cooperative rebalancing and 
> enable them for both eager and incremental cooperative rebalancing
> --
>
> Key: KAFKA-8473
> URL: https://issues.apache.org/jira/browse/KAFKA-8473
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.3
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>Priority: Critical
> Fix For: 2.3.0
>
>
>  
> {{connect.protocol=compatible}}, which enables incremental cooperative 
> rebalancing when all workers are on that version, is now the default option. 
> System tests should be parameterized to keep running for the eager 
> rebalancing protocol as well, to make sure no regressions have happened. 
> Also, for the incremental cooperative protocol, a few tests need to be 
> adjusted to use a lower rebalancing delay (the delay applied while waiting 
> for returning workers). 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8263) Flaky Test MetricsIntegrationTest#testStreamMetricOfWindowStore

2019-06-10 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860599#comment-16860599
 ] 

Matthias J. Sax commented on KAFKA-8263:


[https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/3714/tests]

> Flaky Test MetricsIntegrationTest#testStreamMetricOfWindowStore
> ---
>
> Key: KAFKA-8263
> URL: https://issues.apache.org/jira/browse/KAFKA-8263
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Affects Versions: 2.3.0
>Reporter: Matthias J. Sax
>Assignee: Bruno Cadonna
>Priority: Major
>  Labels: flaky-test
> Fix For: 2.4.0
>
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3900/testReport/junit/org.apache.kafka.streams.integration/MetricsIntegrationTest/testStreamMetricOfWindowStore/]
> {quote}java.lang.AssertionError: Condition not met within timeout 1. 
> testStoreMetricWindow -> Size of metrics of type:'put-latency-avg' must be 
> equal to:2 but it's equal to 0 expected:<2> but was:<0> at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:361){quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8193) Flaky Test MetricsIntegrationTest#testStreamMetricOfWindowStore

2019-06-10 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax resolved KAFKA-8193.

Resolution: Duplicate

> Flaky Test MetricsIntegrationTest#testStreamMetricOfWindowStore
> ---
>
> Key: KAFKA-8193
> URL: https://issues.apache.org/jira/browse/KAFKA-8193
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Affects Versions: 2.3.0
>Reporter: Konstantine Karantasis
>Assignee: Guozhang Wang
>Priority: Major
>  Labels: flaky-test
> Fix For: 2.4.0
>
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3576/console]
>  *14:14:48* org.apache.kafka.streams.integration.MetricsIntegrationTest > 
> testStreamMetricOfWindowStore STARTED
> *14:14:59* 
> org.apache.kafka.streams.integration.MetricsIntegrationTest.testStreamMetricOfWindowStore
>  failed, log available in 
> /home/jenkins/jenkins-slave/workspace/kafka-pr-jdk11-scala2.12/streams/build/reports/testOutput/org.apache.kafka.streams.integration.MetricsIntegrationTest.testStreamMetricOfWindowStore.test.stdout
> *14:14:59* 
> *14:14:59* org.apache.kafka.streams.integration.MetricsIntegrationTest > 
> testStreamMetricOfWindowStore FAILED
> *14:14:59* java.lang.AssertionError: Condition not met within timeout 1. 
> testStoreMetricWindow -> Size of metrics of type:'put-latency-avg' must be 
> equal to:2 but it's equal to 0 expected:<2> but was:<0>
> *14:14:59* at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:361)
> *14:14:59* at 
> org.apache.kafka.streams.integration.MetricsIntegrationTest.testStreamMetricOfWindowStore(MetricsIntegrationTest.java:260)
> *14:15:01*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8193) Flaky Test MetricsIntegrationTest#testStreamMetricOfWindowStore

2019-06-10 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-8193:
---
Fix Version/s: (was: 2.4.0)

> Flaky Test MetricsIntegrationTest#testStreamMetricOfWindowStore
> ---
>
> Key: KAFKA-8193
> URL: https://issues.apache.org/jira/browse/KAFKA-8193
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Affects Versions: 2.3.0
>Reporter: Konstantine Karantasis
>Assignee: Guozhang Wang
>Priority: Major
>  Labels: flaky-test
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/3576/console]
>  *14:14:48* org.apache.kafka.streams.integration.MetricsIntegrationTest > 
> testStreamMetricOfWindowStore STARTED
> *14:14:59* 
> org.apache.kafka.streams.integration.MetricsIntegrationTest.testStreamMetricOfWindowStore
>  failed, log available in 
> /home/jenkins/jenkins-slave/workspace/kafka-pr-jdk11-scala2.12/streams/build/reports/testOutput/org.apache.kafka.streams.integration.MetricsIntegrationTest.testStreamMetricOfWindowStore.test.stdout
> *14:14:59* 
> *14:14:59* org.apache.kafka.streams.integration.MetricsIntegrationTest > 
> testStreamMetricOfWindowStore FAILED
> *14:14:59* java.lang.AssertionError: Condition not met within timeout 1. 
> testStoreMetricWindow -> Size of metrics of type:'put-latency-avg' must be 
> equal to:2 but it's equal to 0 expected:<2> but was:<0>
> *14:14:59* at 
> org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:361)
> *14:14:59* at 
> org.apache.kafka.streams.integration.MetricsIntegrationTest.testStreamMetricOfWindowStore(MetricsIntegrationTest.java:260)
> *14:15:01*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)