[jira] [Commented] (KAFKA-4400) Prefix for sink task consumer groups should be configurable

2017-06-08 Thread Paolo Patierno (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042325#comment-16042325
 ] 

Paolo Patierno commented on KAFKA-4400:
---

[~ewencp] I haven't seen any activity on this for a few months. I'd like to start 
contributing to the Kafka project, so this JIRA could be a starting point for 
doing that. What do you think? Can you add me to the contributors list so that 
I can assign JIRAs to myself?

Thanks,
Paolo.

> Prefix for sink task consumer groups should be configurable
> ---
>
> Key: KAFKA-4400
> URL: https://issues.apache.org/jira/browse/KAFKA-4400
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.1.0
>Reporter: Ewen Cheslack-Postava
>  Labels: newbie
>
> Currently the prefix for creating consumer groups is fixed. This means that 
> if you run multiple Connect clusters using the same Kafka cluster and create 
> connectors with the same name, sink tasks in different clusters will join the 
> same group. Making this prefix configurable at the worker level would protect 
> against this.
> An alternative would be to define unique cluster IDs for each connect 
> cluster, which would allow us to construct a unique name for the group 
> without requiring yet another config (but presents something of a 
> compatibility challenge).
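For illustration only (not from the issue): with the current fixed "connect-" prefix, a sink connector named "my-sink" created in two different Connect clusters ends up in the same consumer group on the shared Kafka cluster. The worker group.id values below are hypothetical:

{code}
# Connect cluster A worker (connect-distributed.properties)
group.id=connect-cluster-a
# Connect cluster B worker
group.id=connect-cluster-b

# Both clusters define a sink connector named "my-sink".
# With the hard-coded prefix, the sink tasks of both clusters join the
# same consumer group, "connect-my-sink", on the shared Kafka cluster.
{code}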



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5153) KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting

2017-06-08 Thread Qingsong Xie (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042350#comment-16042350
 ] 

Qingsong Xie commented on KAFKA-5153:
-

[~dhirajpraj] hi, this bug has been fixed in 0.10.1.1. I've upgraded Kafka to 
0.10.1.1 and it has been running normally for 2 weeks now.

> KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting
> ---
>
> Key: KAFKA-5153
> URL: https://issues.apache.org/jira/browse/KAFKA-5153
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.0
> Environment: RHEL 6
> Java Version  1.8.0_91-b14
>Reporter: Arpan
>Priority: Critical
> Attachments: server_1_72server.log, server_2_73_server.log, 
> server_3_74Server.log, server.properties, ThreadDump_1493564142.dump, 
> ThreadDump_1493564177.dump, ThreadDump_1493564249.dump
>
>
> Hi Team, 
> I was earlier referring to issue KAFKA-4477 because the problem I am facing 
> is similar. I tried to find the same reference in the release docs as well, but 
> did not find anything in 0.10.1.1 or 0.10.2.0. I am currently using 
> 2.11_0.10.2.0.
> I have a 3-node cluster for Kafka and a ZooKeeper cluster as well, on the same 
> set of servers in cluster mode. We have around 240 GB of data getting 
> transferred through Kafka every day. What we are observing is a server getting 
> disconnected from the cluster, the ISR shrinking, and the service starting to 
> be impacted.
> I have also observed the file descriptor count increasing a bit; in normal 
> circumstances we have not observed an FD count of more than 500, but when the 
> issue started we were observing it in the range of 650-700 on all 3 servers. 
> Attaching thread dumps of all 3 servers from when we recently started facing 
> the issue.
> The issue vanishes once you bounce the nodes, and the setup does not run for 
> more than 5 days without this issue occurring. Attaching server logs as well.
> Kindly let me know if you need any additional information. Attaching 
> server.properties as well for one of the servers (it's similar on all 3 
> servers).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4633) Always use regex pattern subscription to avoid auto create topics

2017-06-08 Thread Andres Gomez Ferrer (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042360#comment-16042360
 ] 

Andres Gomez Ferrer commented on KAFKA-4633:


[~mjsax] ok!! I understand, so I will use auto-repartitioning. Thanks a lot for 
the explanation.

> Always use regex pattern subscription to avoid auto create topics
> -
>
> Key: KAFKA-4633
> URL: https://issues.apache.org/jira/browse/KAFKA-4633
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
>  Labels: architecture
> Fix For: 0.10.2.0
>
>
> In {{KafkaConsumer}}, a metadata update is requested whenever 
> {{subscribe(List<String> topics, ..)}} is called. When such a metadata 
> request is sent to the broker upon the first {{poll}} call, it will cause the 
> broker to auto-create any topics that do not exist if the broker-side config 
> {{topic.auto.create}} is turned on.
> To work around this issue until that config defaults to false and is 
> gradually deprecated, we will let Streams always use the other 
> {{subscribe}} function with a regex pattern, which sends the metadata 
> request with an empty topic list and hence won't trigger broker-side auto 
> topic creation.
> The side effect is that the metadata response will be larger, since it 
> contains info for all topics; but since we only refresh it infrequently this 
> adds negligible overhead.
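Not part of the original issue, but a minimal sketch of the two subscription styles on a plain {{KafkaConsumer}}; the bootstrap server, group id, topic name and the empty rebalance listener are assumptions for illustration:

{code}
import java.util.Arrays;
import java.util.Collection;
import java.util.Properties;
import java.util.regex.Pattern;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class SubscribeModes {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "example-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);

        // List-based subscription: the metadata request names the topics, so a
        // broker with auto topic creation enabled may create missing ones on poll().
        consumer.subscribe(Arrays.asList("input-topic"));

        // Pattern-based subscription: the metadata request carries no explicit
        // topic list, so it does not trigger broker-side auto topic creation
        // (the behaviour the issue proposes Streams should rely on).
        consumer.subscribe(Pattern.compile("input-topic"), new ConsumerRebalanceListener() {
            public void onPartitionsRevoked(Collection<TopicPartition> partitions) { }
            public void onPartitionsAssigned(Collection<TopicPartition> partitions) { }
        });

        consumer.close();
    }
}
{code}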



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5408) Using the ConsumerRecord instead of BaseConsumerRecord in the ConsoleConsumer

2017-06-08 Thread Paolo Patierno (JIRA)
Paolo Patierno created KAFKA-5408:
-

 Summary: Using the ConsumerRecord instead of BaseConsumerRecord in 
the ConsoleConsumer
 Key: KAFKA-5408
 URL: https://issues.apache.org/jira/browse/KAFKA-5408
 Project: Kafka
  Issue Type: Improvement
  Components: tools
Reporter: Paolo Patierno
Priority: Minor


Hi,
because the BaseConsumerRecord class is marked as deprecated and will be removed in 
future versions, it could be worth starting to remove its usage in the 
ConsoleConsumer. 
If it makes sense for you, I'd like to work on that as a way to start contributing 
to the project.

Thanks,
Paolo.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5408) Using the ConsumerRecord instead of BaseConsumerRecord in the ConsoleConsumer

2017-06-08 Thread Paolo Patierno (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paolo Patierno updated KAFKA-5408:
--
Description: 
Hi,
because the BaseConsumerRecord class is marked as deprecated and will be removed in 
future versions, it could be worth starting to remove its usage in the 
ConsoleConsumer. 
If it makes sense to you, I'd like to work on that as a way to start contributing 
to the project.

Thanks,
Paolo.

  was:
Hi,
because the BaseConsumerRecord class is marked as deprecated and will be removed in 
future versions, it could be worth starting to remove its usage in the 
ConsoleConsumer. 
If it makes sense for you, I'd like to work on that as a way to start contributing 
to the project.

Thanks,
Paolo.


> Using the ConsumerRecord instead of BaseConsumerRecord in the ConsoleConsumer
> -
>
> Key: KAFKA-5408
> URL: https://issues.apache.org/jira/browse/KAFKA-5408
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Paolo Patierno
>Priority: Minor
>
> Hi,
> because the BaseConsumerRecord class is marked as deprecated and will be removed in 
> future versions, it could be worth starting to remove its usage in the 
> ConsoleConsumer. 
> If it makes sense to you, I'd like to work on that as a way to start contributing 
> to the project.
> Thanks,
> Paolo.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3267: MINOR: more details in error message descriptions

2017-06-08 Thread reftel
GitHub user reftel opened a pull request:

https://github.com/apache/kafka/pull/3267

MINOR: more details in error message descriptions



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/reftel/kafka feature/error_descriptions

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3267.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3267


commit 01f9cf1fbba97f01de717adfda3b5980fe321ef4
Author: Magnus Reftel 
Date:   2017-06-08T08:28:38Z

MINOR: more details in error message descriptions




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-5409) Providing a custom client-id to the ConsoleProducer tool

2017-06-08 Thread Paolo Patierno (JIRA)
Paolo Patierno created KAFKA-5409:
-

 Summary: Providing a custom client-id to the ConsoleProducer tool
 Key: KAFKA-5409
 URL: https://issues.apache.org/jira/browse/KAFKA-5409
 Project: Kafka
  Issue Type: Improvement
  Components: tools
Reporter: Paolo Patierno
Priority: Minor


Hi,

I see that the client-id property for the ConsoleProducer tool is always 
"console-producer". It could be useful to have it as a parameter on the command 
line, or to generate a random one as happens for the ConsoleConsumer.
If it makes sense to you, I can work on that.

Thanks,
Paolo.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4501) Support Java 9

2017-06-08 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042438#comment-16042438
 ] 

Ismael Juma commented on KAFKA-4501:


I had a quick look at making the build work with Java 9 by using the various 
escape hatches available. We get an NPE from Zinc (the Scala incremental compiler 
used by Gradle):

{code}
10:11:11.328 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter] 
Caused by: java.lang.NullPointerException
10:11:11.328 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at sbt.IO$.pathSplit(IO.scala:744)
10:11:11.328 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at sbt.IO$.parseClasspath(IO.scala:859)
10:11:11.328 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at sbt.compiler.CompilerArguments.extClasspath(CompilerArguments.scala:62)
10:11:11.328 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at sbt.compiler.AggressiveCompile.withBootclasspath(AggressiveCompile.scala:50)
10:11:11.328 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at sbt.compiler.AggressiveCompile.compile2(AggressiveCompile.scala:83)
10:11:11.328 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at sbt.compiler.AggressiveCompile.compile1(AggressiveCompile.scala:70)
10:11:11.328 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at com.typesafe.zinc.Compiler.compile(Compiler.scala:201)
10:11:11.328 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at com.typesafe.zinc.Compiler.compile(Compiler.scala:183)
10:11:11.329 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at com.typesafe.zinc.Compiler.compile(Compiler.scala:174)
10:11:11.329 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at com.typesafe.zinc.Compiler.compile(Compiler.scala:165)
10:11:11.329 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at 
org.gradle.api.internal.tasks.scala.ZincScalaCompiler$Compiler.execute(ZincScalaCompiler.java:81)
10:11:11.329 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at 
org.gradle.api.internal.tasks.scala.ZincScalaCompiler.execute(ZincScalaCompiler.java:52)
10:11:11.329 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at 
org.gradle.api.internal.tasks.scala.ZincScalaCompiler.execute(ZincScalaCompiler.java:38)
10:11:11.329 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at 
org.gradle.api.internal.tasks.compile.daemon.AbstractDaemonCompiler$CompilerWorkerAdapter.execute(AbstractDaemonCompiler.java:73)
10:11:11.329 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at 
org.gradle.api.internal.tasks.compile.daemon.AbstractDaemonCompiler$CompilerWorkerAdapter.execute(AbstractDaemonCompiler.java:64)
10:11:11.329 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at 
org.gradle.workers.internal.WorkerDaemonServer.execute(WorkerDaemonServer.java:29)
10:11:11.329 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at 
org.gradle.api.internal.tasks.compile.daemon.AbstractDaemonCompiler$CompilerDaemonServer.execute(AbstractDaemonCompiler.java:91)
10:11:11.329 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
10:11:11.329 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
10:11:11.329 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
10:11:11.330 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at 
org.gradle.process.internal.worker.request.WorkerAction.run(WorkerAction.java:88)
10:11:11.330 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
10:11:11.330 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
10:11:11.330 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
10:11:11.330 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
10:11:11.330 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
10:11:11.330 [ERROR] [org.gradle.internal.buildevents.BuildExceptionReporter]   
at 
org.gradle.internal.remote

[jira] [Updated] (KAFKA-5257) Change Default unclean.leader.election.enabled from True to False

2017-06-08 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5257:
---
Summary: Change Default unclean.leader.election.enabled from True to False  
(was: CLONE - Change Default unclean.leader.election.enabled from True to False)

> Change Default unclean.leader.election.enabled from True to False
> -
>
> Key: KAFKA-5257
> URL: https://issues.apache.org/jira/browse/KAFKA-5257
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ben Stopford
>Assignee: Sharad
> Fix For: 0.11.0.0
>
>
> See KIP-106
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-106+-+Change+Default+unclean.leader.election.enabled+from+True+to+False



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5410) Fix taskClass() method name in Connector and flush() signature in SinkTask

2017-06-08 Thread Paolo Patierno (JIRA)
Paolo Patierno created KAFKA-5410:
-

 Summary: Fix taskClass() method name in Connector and flush() 
signature in SinkTask
 Key: KAFKA-5410
 URL: https://issues.apache.org/jira/browse/KAFKA-5410
 Project: Kafka
  Issue Type: Improvement
  Components: documentation
Reporter: Paolo Patierno


Hi,
the current documentation refers to getTaskClass() for the Connector class 
in the file connector example. At the same time, a different signature is shown 
for the flush() method in SinkTask, which now takes OffsetAndMetadata as well.

Thanks,
Paolo.
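For reference, a minimal sketch (not from the issue) of the signatures the documentation should match; the class name below is a hypothetical placeholder:

{code}
import java.util.Map;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.connect.sink.SinkTask;

public abstract class ExampleSinkTask extends SinkTask {
    // The Connector class exposes taskClass(), not getTaskClass():
    //   public Class<? extends Task> taskClass();

    // flush() now receives the current offsets per partition:
    @Override
    public void flush(Map<TopicPartition, OffsetAndMetadata> currentOffsets) {
        // write out any buffered records up to the given offsets
    }
}
{code}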



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5410) Fix taskClass() method name in Connector and flush() signature in SinkTask

2017-06-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042538#comment-16042538
 ] 

ASF GitHub Bot commented on KAFKA-5410:
---

GitHub user ppatierno opened a pull request:

https://github.com/apache/kafka/pull/3268

KAFKA-5410: Fix taskClass() method name in Connector and flush() signature 
in SinkTask 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ppatierno/kafka connect-doc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3268.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3268


commit 52ecadd6ede849903cd705f5b5e2b470a8e138bb
Author: ppatierno 
Date:   2017-06-07T12:09:53Z

Updated .gitignore for excluding out dir

commit 4b32e4109f67a3b586c3d018277c502fbfad21a2
Author: ppatierno 
Date:   2017-06-08T10:34:52Z

Merge remote-tracking branch 'upstream/trunk' into trunk

commit 812439d03974ef1794601b85aec5b6c1d54b1e54
Author: ppatierno 
Date:   2017-06-08T10:45:21Z

Fixed method name for taskClass() in Connector class
Fixed method signature for flush() in SinkTask class




> Fix taskClass() method name in Connector and flush() signature in SinkTask
> --
>
> Key: KAFKA-5410
> URL: https://issues.apache.org/jira/browse/KAFKA-5410
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Paolo Patierno
>
> Hi,
> the current documentation refers to getTaskClass() for the Connector class 
> in the file connector example. At the same time, a different signature is shown 
> for the flush() method in SinkTask, which now takes OffsetAndMetadata as well.
> Thanks,
> Paolo.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3268: KAFKA-5410: Fix taskClass() method name in Connect...

2017-06-08 Thread ppatierno
GitHub user ppatierno opened a pull request:

https://github.com/apache/kafka/pull/3268

KAFKA-5410: Fix taskClass() method name in Connector and flush() signature 
in SinkTask 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ppatierno/kafka connect-doc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3268.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3268


commit 52ecadd6ede849903cd705f5b5e2b470a8e138bb
Author: ppatierno 
Date:   2017-06-07T12:09:53Z

Updated .gitignore for excluding out dir

commit 4b32e4109f67a3b586c3d018277c502fbfad21a2
Author: ppatierno 
Date:   2017-06-08T10:34:52Z

Merge remote-tracking branch 'upstream/trunk' into trunk

commit 812439d03974ef1794601b85aec5b6c1d54b1e54
Author: ppatierno 
Date:   2017-06-08T10:45:21Z

Fixed method name for taskClass() in Connector class
Fixed method signature for flush() in SinkTask class




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2729) Cached zkVersion not equal to that in zookeeper, broker not recovering.

2017-06-08 Thread Pablo (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042543#comment-16042543
 ] 

Pablo commented on KAFKA-2729:
--

Guys, this issue is not only affecting 0.8.2.1, as many people here are 
saying. We had this same problem on 0.10.2 during an upgrade from 0.8.2; we 
worked around it by increasing the ZooKeeper session and connection timeouts and 
it worked fine, but we don't feel very safe.

I suggest adding all the affected versions people are reporting.
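A sketch of that workaround in the broker's server.properties; the property names are the standard ZooKeeper client settings, but the values here are illustrative assumptions, not the ones the commenter used:

{code}
# Increase ZooKeeper session/connection timeouts to ride out short network wobbles.
# Values are illustrative only.
zookeeper.session.timeout.ms=30000
zookeeper.connection.timeout.ms=30000
{code}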

> Cached zkVersion not equal to that in zookeeper, broker not recovering.
> ---
>
> Key: KAFKA-2729
> URL: https://issues.apache.org/jira/browse/KAFKA-2729
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
>Reporter: Danil Serdyuchenko
>
> After a small network wobble where zookeeper nodes couldn't reach each other, 
> we started seeing a large number of undereplicated partitions. The zookeeper 
> cluster recovered, however we continued to see a large number of 
> undereplicated partitions. Two brokers in the kafka cluster were showing this 
> in the logs:
> {code}
> [2015-10-27 11:36:00,888] INFO Partition 
> [__samza_checkpoint_event-creation_1,3] on broker 5: Shrinking ISR for 
> partition [__samza_checkpoint_event-creation_1,3] from 6,5 to 5 
> (kafka.cluster.Partition)
> [2015-10-27 11:36:00,891] INFO Partition 
> [__samza_checkpoint_event-creation_1,3] on broker 5: Cached zkVersion [66] 
> not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
> {code}
> For all of the topics on the affected brokers. Both brokers only recovered 
> after a restart. Our own investigation yielded nothing; I was hoping you 
> could shed some light on this issue. Possibly it's related to: 
> https://issues.apache.org/jira/browse/KAFKA-1382 , however we're using 
> 0.8.2.1.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3256: MINOR: Updated .gitignore for excluding out direct...

2017-06-08 Thread ppatierno
Github user ppatierno closed the pull request at:

https://github.com/apache/kafka/pull/3256


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3268: KAFKA-5410: Fix taskClass() method name in Connect...

2017-06-08 Thread ppatierno
Github user ppatierno closed the pull request at:

https://github.com/apache/kafka/pull/3268


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5410) Fix taskClass() method name in Connector and flush() signature in SinkTask

2017-06-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042585#comment-16042585
 ] 

ASF GitHub Bot commented on KAFKA-5410:
---

Github user ppatierno closed the pull request at:

https://github.com/apache/kafka/pull/3268


> Fix taskClass() method name in Connector and flush() signature in SinkTask
> --
>
> Key: KAFKA-5410
> URL: https://issues.apache.org/jira/browse/KAFKA-5410
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Paolo Patierno
>
> Hi,
> the current documentation refers to getTaskClass() for the Connector class 
> in the file connector example. At the same time, a different signature is shown 
> for the flush() method in SinkTask, which now takes OffsetAndMetadata as well.
> Thanks,
> Paolo.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3269: KAFKA-5410: Fix taskClass() method name in Connect...

2017-06-08 Thread ppatierno
GitHub user ppatierno opened a pull request:

https://github.com/apache/kafka/pull/3269

KAFKA-5410: Fix taskClass() method name in Connector and flush() signature 
in SinkTask



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ppatierno/kafka connect-doc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3269.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3269


commit 1cd05d9154bf64c84651bb0aadb380dafb749bfe
Author: ppatierno 
Date:   2017-06-08T11:50:45Z

Fixed method name for taskClass() in Connector class
Fixed method signature for flush() in SinkTask class




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5410) Fix taskClass() method name in Connector and flush() signature in SinkTask

2017-06-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042592#comment-16042592
 ] 

ASF GitHub Bot commented on KAFKA-5410:
---

GitHub user ppatierno opened a pull request:

https://github.com/apache/kafka/pull/3269

KAFKA-5410: Fix taskClass() method name in Connector and flush() signature 
in SinkTask



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ppatierno/kafka connect-doc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3269.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3269


commit 1cd05d9154bf64c84651bb0aadb380dafb749bfe
Author: ppatierno 
Date:   2017-06-08T11:50:45Z

Fixed method name for taskClass() in Connector class
Fixed method signature for flush() in SinkTask class




> Fix taskClass() method name in Connector and flush() signature in SinkTask
> --
>
> Key: KAFKA-5410
> URL: https://issues.apache.org/jira/browse/KAFKA-5410
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Paolo Patierno
>
> Hi,
> the current documentation refers to getTaskClass() for the Connector class 
> in the file connector example. At the same time, a different signature is shown 
> for the flush() method in SinkTask, which now takes OffsetAndMetadata as well.
> Thanks,
> Paolo.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5411) Generate javadoc for AdminClient package

2017-06-08 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-5411:
--

 Summary: Generate javadoc for AdminClient package
 Key: KAFKA-5411
 URL: https://issues.apache.org/jira/browse/KAFKA-5411
 Project: Kafka
  Issue Type: Improvement
Reporter: Ismael Juma
Assignee: Ismael Juma
Priority: Blocker
 Fix For: 0.11.0.0






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5399) Crash Kafka & Zookeper with an basic Nmap Scan

2017-06-08 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042619#comment-16042619
 ] 

Rajini Sivaram commented on KAFKA-5399:
---

[~morozov...@gmail.com] Thank you for reporting this. Both ZooKeeper and Kafka 
catch the exception (per the description), log a warning/error, ignore it and 
carry on. Were there other errors in the logs? Did the processes 
crash or were they hung? And does this happen every time you run nmap?

> Crash Kafka & Zookeper with an basic Nmap Scan
> --
>
> Key: KAFKA-5399
> URL: https://issues.apache.org/jira/browse/KAFKA-5399
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.1
> Environment: OS-X
>Reporter: Ivan Morozov
>
> Kafka running locally on OS-X can be crashed by an nmap scan. The cluster 
> cannot be recovered and has to be restarted.
> Reproduce:
> 1. Start
> ```
> zookeeper-server-start /usr/local/etc/kafka/zookeeper.properties & 
> kafka-server-start /usr/local/etc/kafka/server.properties
> ```
> 2. Run scan
> ```
> nmap localhost
> ```
> Exceptions from Zookeeper:
> ```
> [2017-06-07 17:14:35,913] INFO Accepted socket connection from /0.0.0.80:0 
> (org.apache.zookeeper.server.NIOServerCnxnFactory)
> [2017-06-07 17:14:35,914] WARN Ignoring exception 
> (org.apache.zookeeper.server.NIOServerCnxnFactory)
> java.net.SocketException: Invalid argument
> at sun.nio.ch.Net.setIntOption0(Native Method)
> at sun.nio.ch.Net.setSocketOption(Net.java:334)
> at sun.nio.ch.SocketChannelImpl.setOption(SocketChannelImpl.java:190)
> at sun.nio.ch.SocketAdaptor.setBooleanOption(SocketAdaptor.java:271)
> at sun.nio.ch.SocketAdaptor.setTcpNoDelay(SocketAdaptor.java:306)
> at 
> org.apache.zookeeper.server.NIOServerCnxn.(NIOServerCnxn.java:105)
> at 
> org.apache.zookeeper.server.NIOServerCnxnFactory.createConnection(NIOServerCnxnFactory.java:156)
> at 
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:197)
> at java.lang.Thread.run(Thread.java:748)
> [2017-06-07 17:14:35,916] WARN Ignoring unexpected runtime exception 
> (org.apache.zookeeper.server.NIOServerCnxnFactory)
> java.lang.NullPointerException
> at 
> org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:185)
> at java.lang.Thread.run(Thread.java:748)
> ```
> Exceptions from Kafka:
> ```
> [2017-06-07 17:14:17,072] ERROR Error while accepting connection 
> (kafka.network.Acceptor)
> java.net.SocketException: Invalid argument
> at sun.nio.ch.Net.setIntOption0(Native Method)
> at sun.nio.ch.Net.setSocketOption(Net.java:334)
> at 
> sun.nio.ch.SocketChannelImpl.setOption(SocketChannelImpl.java:190)
> at 
> sun.nio.ch.SocketAdaptor.setBooleanOption(SocketAdaptor.java:271)
> at 
> sun.nio.ch.SocketAdaptor.setTcpNoDelay(SocketAdaptor.java:306)
> at kafka.network.Acceptor.accept(SocketServer.scala:344)
> at kafka.network.Acceptor.run(SocketServer.scala:283)
> at java.lang.Thread.run(Thread.java:748)
> ```



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5401) Kafka Over TLS Error - Failed to send SSL Close message - Broken Pipe

2017-06-08 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042627#comment-16042627
 ] 

Rajini Sivaram commented on KAFKA-5401:
---

This is a duplicate of KAFKA-3702.

> Kafka Over TLS Error - Failed to send SSL Close message - Broken Pipe
> -
>
> Key: KAFKA-5401
> URL: https://issues.apache.org/jira/browse/KAFKA-5401
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.1.1
> Environment: SLES 11 , Kakaf Over TLS 
>Reporter: PaVan
>  Labels: security
>
> SLES 11 
> WARN Failed to send SSL Close message 
> (org.apache.kafka.common.network.SslTransportLayer)
> java.io.IOException: Broken pipe
>  at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>  at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>  at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>  at sun.nio.ch.IOUtil.write(IOUtil.java:65)
>  at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
>  at 
> org.apache.kafka.common.network.SslTransportLayer.flush(SslTransportLayer.java:194)
>  at 
> org.apache.kafka.common.network.SslTransportLayer.close(SslTransportLayer.java:148)
>  at 
> org.apache.kafka.common.network.KafkaChannel.close(KafkaChannel.java:45)
>  at org.apache.kafka.common.network.Selector.close(Selector.java:442)
>  at org.apache.kafka.common.network.Selector.poll(Selector.java:310)
>  at 
> org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:256)
>  at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:216)
>  at 
> org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128)
>  at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3270: Fixed file URI path to log4j in bin/windows batch ...

2017-06-08 Thread iotmani
GitHub user iotmani opened a pull request:

https://github.com/apache/kafka/pull/3270

Fixed file URI path to log4j in bin/windows batch files

Similar to what was done here, just more issues:
https://github.com/apache/kafka/pull/3260

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/iotmani/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3270.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3270


commit 900e6863981d5b3e73fc5da4c8019ac32607f397
Author: iotmani 
Date:   2017-06-07T17:57:42Z

Fixed path to log4j config file

Similar to https://github.com/apache/kafka/pull/3260

commit 95c2f1a27346b3e731604c10a2b100a9a5235d9f
Author: iotmani 
Date:   2017-06-08T12:32:31Z

Fixed path to log4j config file

commit a2fd3c5294c3cca6a825f305e5b2262f310ee2af
Author: iotmani 
Date:   2017-06-08T12:33:16Z

bin/windows/connect-distributed.bat

commit 63f9bb9b59954a07d5841a9408e8aa6f2a032d26
Author: iotmani 
Date:   2017-06-08T12:34:41Z

/bin/windows/kafka-run-class.bat




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-5411) Generate javadoc for AdminClient and show configs in documentation

2017-06-08 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5411:
---
Description: Also fix the table of contents.

> Generate javadoc for AdminClient and show configs in documentation
> --
>
> Key: KAFKA-5411
> URL: https://issues.apache.org/jira/browse/KAFKA-5411
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Blocker
> Fix For: 0.11.0.0
>
>
> Also fix the table of contents.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5411) Generate javadoc for AdminClient and show configs in documentation

2017-06-08 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5411:
---
Summary: Generate javadoc for AdminClient and show configs in documentation 
 (was: Generate javadoc for AdminClient package)

> Generate javadoc for AdminClient and show configs in documentation
> --
>
> Key: KAFKA-5411
> URL: https://issues.apache.org/jira/browse/KAFKA-5411
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Blocker
> Fix For: 0.11.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[VOTE] 0.11.0.0 RC0

2017-06-08 Thread Ismael Juma
Hello Kafka users, developers and client-developers,

This is the first candidate for release of Apache Kafka 0.11.0.0. It's
worth noting that there are a small number of unresolved issues (including
documentation and system tests) related to the new AdminClient and
Exactly-once functionality[1] that we hope to resolve in the next few days.
To encourage early testing, we are releasing the first release candidate
now, but there will be at least one more release candidate.

Any and all testing is welcome, but the following areas are worth
highlighting:

1. Client developers should verify that their clients can produce/consume
to/from 0.11.0 brokers (ideally with compressed and uncompressed data).
Even though we have compatibility tests for older Java clients and we have
verified that librdkafka works fine, the only way to be sure is to test
every client.
2. Performance and stress testing. Heroku and LinkedIn have helped with
this in the past (and issues have been found and fixed).
3. End users can verify that their apps work correctly with the new release.

This is a major version release of Apache Kafka. It includes 32 new KIPs. See
the release notes and release plan (https://cwiki.apache.org/
confluence/display/KAFKA/Release+Plan+0.11.0.0) for more details. A few
feature highlights:

* Exactly-once delivery and transactional messaging
* Streams exactly-once semantics
* Admin client with support for topic, ACLs and config management
* Record headers
* Request rate quotas
* Improved resiliency: replication protocol improvement and single-threaded
controller
* Richer and more efficient message format

Release notes for the 0.11.0.0 release:
http://home.apache.org/~ijuma/kafka-0.11.0.0-rc0/RELEASE_NOTES.html

Kafka's KEYS file containing PGP keys we use to sign the release:
http://kafka.apache.org/KEYS

* Release artifacts to be voted upon (source and binary):
http://home.apache.org/~ijuma/kafka-0.11.0.0-rc0/

* Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging/org/apache/kafka/

* Javadoc:
http://home.apache.org/~ijuma/kafka-0.11.0.0-rc0/javadoc/

* Tag to be voted upon (off 0.11.0 branch) is the 0.11.0.0 tag:
https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=
d5ee02b187fafe08b63deb52e6b07c8d1d12f18d

* Documentation:
http://kafka.apache.org/0110/documentation.html

* Protocol:
http://kafka.apache.org/0110/protocol.html

* Successful Jenkins builds for the 0.11.0 branch:
Unit/integration tests: https://builds.apache.org/job/kafka-0.11.0-jdk7/121/

Thanks,
Ismael

[1] https://issues.apache.org/jira/issues/?jql=project%20%
3D%20KAFKA%20AND%20fixVersion%20%3D%200.11.0.0%20AND%20resolution%20%3D%
20Unresolved%20ORDER%20BY%20due%20ASC%2C%20priority%
20DESC%2C%20created%20ASC


[GitHub] kafka pull request #3271: KAFKA-5411: AdminClient javadoc and documentation ...

2017-06-08 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/3271

KAFKA-5411: AdminClient javadoc and documentation improvements

- Show AdminClient configs in the docs.
- Update Javadoc config so that public classes exposed by
the AdminClient are included.
- Version and table of contents fixes.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-5411-admin-client-javadoc-configs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3271.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3271


commit 5f5f0a2339708d31fadafef1013057003ef6afdc
Author: Ismael Juma 
Date:   2017-06-08T15:15:34Z

KAFKA-5411: AdminClient javadoc and documentation improvements




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5411) Generate javadoc for AdminClient and show configs in documentation

2017-06-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042838#comment-16042838
 ] 

ASF GitHub Bot commented on KAFKA-5411:
---

GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/3271

KAFKA-5411: AdminClient javadoc and documentation improvements

- Show AdminClient configs in the docs.
- Update Javadoc config so that public classes exposed by
the AdminClient are included.
- Version and table of contents fixes.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-5411-admin-client-javadoc-configs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3271.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3271


commit 5f5f0a2339708d31fadafef1013057003ef6afdc
Author: Ismael Juma 
Date:   2017-06-08T15:15:34Z

KAFKA-5411: AdminClient javadoc and documentation improvements




> Generate javadoc for AdminClient and show configs in documentation
> --
>
> Key: KAFKA-5411
> URL: https://issues.apache.org/jira/browse/KAFKA-5411
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Blocker
> Fix For: 0.11.0.0
>
>
> Also fix the table of contents.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5411) Generate javadoc for AdminClient and show configs in documentation

2017-06-08 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5411:
---
Status: Patch Available  (was: Open)

> Generate javadoc for AdminClient and show configs in documentation
> --
>
> Key: KAFKA-5411
> URL: https://issues.apache.org/jira/browse/KAFKA-5411
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Blocker
> Fix For: 0.11.0.0
>
>
> Also fix the table of contents.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5391) Replace zkClient.delete* method with an equivalent zkUtils method

2017-06-08 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5391:
---
   Resolution: Fixed
Fix Version/s: 0.11.1.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 3245
[https://github.com/apache/kafka/pull/3245]

> Replace zkClient.delete* method with an equivalent zkUtils method
> -
>
> Key: KAFKA-5391
> URL: https://issues.apache.org/jira/browse/KAFKA-5391
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Reporter: Balint Molnar
>Assignee: Balint Molnar
> Fix For: 0.11.1.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3245: KAFKA-5391 Replace zkClient.delete* method with an...

2017-06-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3245


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5391) Replace zkClient.delete* method with an equivalent zkUtils method

2017-06-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042920#comment-16042920
 ] 

ASF GitHub Bot commented on KAFKA-5391:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3245


> Replace zkClient.delete* method with an equivalent zkUtils method
> -
>
> Key: KAFKA-5391
> URL: https://issues.apache.org/jira/browse/KAFKA-5391
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Reporter: Balint Molnar
>Assignee: Balint Molnar
> Fix For: 0.11.1.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Build failed in Jenkins: kafka-trunk-jdk8 #1677

2017-06-08 Thread Apache Jenkins Server
See 


Changes:

[ismael] KAFKA-5391; Replace zkClient.delete* method with an equivalent zkUtills

--
[...truncated 813.49 KB...]
at com.sun.proxy.$Proxy70.onOutput(Unknown Source)
at 
org.gradle.api.internal.tasks.testing.results.TestListenerAdapter.output(TestListenerAdapter.java:56)
at sun.reflect.GeneratedMethodAccessor281.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.event.AbstractBroadcastDispatch.dispatch(AbstractBroadcastDispatch.java:42)
at 
org.gradle.internal.event.BroadcastDispatch$SingletonDispatch.dispatch(BroadcastDispatch.java:221)
at 
org.gradle.internal.event.BroadcastDispatch$SingletonDispatch.dispatch(BroadcastDispatch.java:145)
at 
org.gradle.internal.event.AbstractBroadcastDispatch.dispatch(AbstractBroadcastDispatch.java:58)
at 
org.gradle.internal.event.BroadcastDispatch$CompositeDispatch.dispatch(BroadcastDispatch.java:315)
at 
org.gradle.internal.event.BroadcastDispatch$CompositeDispatch.dispatch(BroadcastDispatch.java:225)
at 
org.gradle.internal.event.ListenerBroadcast.dispatch(ListenerBroadcast.java:138)
at 
org.gradle.internal.event.ListenerBroadcast.dispatch(ListenerBroadcast.java:35)
at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy71.output(Unknown Source)
at 
org.gradle.api.internal.tasks.testing.results.StateTrackingTestResultProcessor.output(StateTrackingTestResultProcessor.java:87)
at 
org.gradle.api.internal.tasks.testing.results.AttachParentTestResultProcessor.output(AttachParentTestResultProcessor.java:48)
at sun.reflect.GeneratedMethodAccessor280.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.dispatch.FailureHandlingDispatch.dispatch(FailureHandlingDispatch.java:29)
at 
org.gradle.internal.dispatch.AsyncDispatch.dispatchMessages(AsyncDispatch.java:132)
at 
org.gradle.internal.dispatch.AsyncDispatch.access$000(AsyncDispatch.java:33)
at 
org.gradle.internal.dispatch.AsyncDispatch$1.run(AsyncDispatch.java:72)
at 
org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:63)
at 
org.gradle.internal.concurrent.StoppableExecutorImpl$1.run(StoppableExecutorImpl.java:46)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: No space left on device
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at com.esotericsoftware.kryo.io.Output.flush(Output.java:154)
... 52 more
java.io.IOException: No space left on device
com.esotericsoftware.kryo.KryoException: java.io.IOException: No space left on 
device
at com.esotericsoftware.kryo.io.Output.flush(Output.java:156)
at com.esotericsoftware.kryo.io.Output.require(Output.java:134)
at com.esotericsoftware.kryo.io.Output.writeBoolean(Output.java:578)
at 
org.gradle.internal.serialize.kryo.KryoBackedEncoder.writeBoolean(KryoBackedEncoder.java:63)
at 
org.gradle.api.internal.tasks.testing.junit.result.TestOutputStore$Writer.onOutput(TestOutputStore.java:99)
at 
org.gradle.api.internal.tasks.testing.junit.result.TestReportDataCollector.onOutput(TestReportDataCollector.java:141)
at sun.reflect.GeneratedMethodAccessor278.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.event.AbstractBroadcastDispatch.dispatch(AbstractBroadcastDispatch.java:42)
at 
org.gradle.internal.event.BroadcastDispatch$SingletonDispatch

[jira] [Created] (KAFKA-5412) Using connect-console-sink/source.properties raises an exception related to "file" property not found

2017-06-08 Thread Paolo Patierno (JIRA)
Paolo Patierno created KAFKA-5412:
-

 Summary: Using connect-console-sink/source.properties raises an 
exception related to "file" property not found
 Key: KAFKA-5412
 URL: https://issues.apache.org/jira/browse/KAFKA-5412
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 0.11.1.0
Reporter: Paolo Patierno


Hi,
with the latest 0.11.1.0-SNAPSHOT, the Kafka Connect example using 
connect-console-sink/source.properties doesn't work anymore because the 
required "file" property isn't found.
This is because the underlying FileStreamSink/Source connector and task 
define a ConfigDef with "file" as a mandatory parameter. In the console 
example we want file=null so that stdin and stdout are used.
One possible solution is to set "file=" inside the provided 
connect-console-sink/source.properties.
The other would be to modify the FileStreamSink/Source source code to 
remove the "file" definition from the ConfigDef.
What do you think?
I can provide a PR for that.

Thanks,
Paolo.





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5409) Providing a custom client-id to the ConsoleProducer tool

2017-06-08 Thread Bharat Viswanadham (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043002#comment-16043002
 ] 

Bharat Viswanadham commented on KAFKA-5409:
---

Hi Paolo,
There are currently two ways to pass in additional producer properties:
1. Using --producer-property, e.g. --producer-property client.id=producer1
2. Using --producer.config, e.g. --producer.config <config file>, and adding 
the parameter you are interested in to that config file.





> Providing a custom client-id to the ConsoleProducer tool
> 
>
> Key: KAFKA-5409
> URL: https://issues.apache.org/jira/browse/KAFKA-5409
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Paolo Patierno
>Priority: Minor
>
> Hi,
> I see that the client-id property for the ConsoleProducer tool is always 
> "console-producer". It could be useful to have it as a parameter on the command 
> line, or to generate a random one as happens for the ConsoleConsumer.
> If it makes sense to you, I can work on that.
> Thanks,
> Paolo.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (KAFKA-5409) Providing a custom client-id to the ConsoleProducer tool

2017-06-08 Thread Bharat Viswanadham (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043002#comment-16043002
 ] 

Bharat Viswanadham edited comment on KAFKA-5409 at 6/8/17 4:44 PM:
---

Hi Paolo,
There are currently two ways to pass in additional producer properties:
1. Using --producer-property, e.g. --producer-property client.id=producer1
2. Using --producer.config, e.g. --producer.config <config file>, and adding 
the parameter you are interested in to that config file.






was (Author: bharatviswa):
Hi Paolo,
There are currently two ways to pass in additional producer properties:
1. Using --producer-property, e.g. --producer-property client.id=producer1
2. Using --producer.config, e.g. --producer.config <config file>, and adding 
the parameter you are interested in to that config file.





> Providing a custom client-id to the ConsoleProducer tool
> 
>
> Key: KAFKA-5409
> URL: https://issues.apache.org/jira/browse/KAFKA-5409
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Paolo Patierno
>Priority: Minor
>
> Hi,
> I see that the client-id property for the ConsoleProducer tool is always 
> "console-producer". It could be useful to have it as a parameter on the command 
> line, or to generate a random one as happens for the ConsoleConsumer.
> If it makes sense to you, I can work on that.
> Thanks,
> Paolo.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5246) Remove backdoor that allows any client to produce to internal topics

2017-06-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043005#comment-16043005
 ] 

ASF GitHub Bot commented on KAFKA-5246:
---

Github user datalorax closed the pull request at:

https://github.com/apache/kafka/pull/3061


>  Remove backdoor that allows any client to produce to internal topics
> -
>
> Key: KAFKA-5246
> URL: https://issues.apache.org/jira/browse/KAFKA-5246
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.0.0, 0.10.0.1, 0.10.1.0, 0.10.1.1, 0.10.2.0, 
> 0.10.2.1
>Reporter: Andy Coates
>Assignee: Andy Coates
>Priority: Minor
>
> kafka.admin.AdminUtils defines an 'AdminClientId' val, which looks to be 
> unused in the code, with the exception of a single use in KafkaApis.scala in 
> handleProducerRequest, where it looks to allow any client, using the special 
> '__admin_client' client id, to append to internal topics.
> This looks like a security risk to me, as it would allow any client to 
> produce either rogue offsets or even a record containing something other than 
> group/offset info.
> Can we remove this please?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3061: KAFKA-5246: Remove backdoor that allows any client...

2017-06-08 Thread datalorax
Github user datalorax closed the pull request at:

https://github.com/apache/kafka/pull/3061


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3261: MINOR: Set log level for producer internals to tra...

2017-06-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3261


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-5246) Remove backdoor that allows any client to produce to internal topics

2017-06-08 Thread Andy Coates (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Coates updated KAFKA-5246:
---
Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

Discussion on the PR means we're closing this without fixing it. The workaround is 
to use ACLs to lock down the __consumer_offsets topic so that only the required 
use cases have direct access to produce to it.

>  Remove backdoor that allows any client to produce to internal topics
> -
>
> Key: KAFKA-5246
> URL: https://issues.apache.org/jira/browse/KAFKA-5246
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.0.0, 0.10.0.1, 0.10.1.0, 0.10.1.1, 0.10.2.0, 
> 0.10.2.1
>Reporter: Andy Coates
>Assignee: Andy Coates
>Priority: Minor
>
> kafka.admin.AdminUtils defines an 'AdminClientId' val, which looks to be 
> unused in the code, with the exception of a single use in KafkaApis.scala in 
> handleProducerRequest, where it looks to allow any client, using the special 
> '__admin_client' client id, to append to internal topics.
> This looks like a security risk to me, as it would allow any client to 
> produce either rogue offsets or even a record containing something other than 
> group/offset info.
> Can we remove this please?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5412) Using connect-console-sink/source.properties raises an exception related to "file" property not found

2017-06-08 Thread Randall Hauch (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043018#comment-16043018
 ] 

Randall Hauch commented on KAFKA-5412:
--

I believe the proper correction would be to add a null default value for "file" to 
the ConfigDef in FileStreamSink/Source. When no default value is provided, the 
value is treated as required. Clearly the FileStreamSink/Source is able to 
handle the "file" configuration not being specified, since it then works on 
standard output/input, respectively.
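
For illustration only, a minimal sketch (not the actual FileStreamSink/Source code; the doc 
string is assumed) of how a null default in the ConfigDef makes "file" optional:

{code}
import org.apache.kafka.common.config.ConfigDef

// With a null default value, ConfigDef no longer treats "file" as required,
// so the console example can fall back to stdin/stdout when it is not set.
val configDef: ConfigDef = new ConfigDef()
  .define("file", ConfigDef.Type.STRING, null, ConfigDef.Importance.HIGH,
    "Source/destination file; when not set, stdin/stdout is used.")
{code}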

> Using connect-console-sink/source.properties raises an exception related to 
> "file" property not found
> -
>
> Key: KAFKA-5412
> URL: https://issues.apache.org/jira/browse/KAFKA-5412
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.11.1.0
>Reporter: Paolo Patierno
>
> Hi,
> with the latest 0.11.1.0-SNAPSHOT the Kafka Connect example 
> using connect-console-sink/source.properties doesn't work anymore because the 
> needed "file" property isn't found.
> This is because the underlying FileStreamSink/Source connector and task 
> define a ConfigDef with "file" as a mandatory parameter. In the case of the 
> console example we want file=null so that stdin and stdout are used.
> One possible solution is to set "file=" inside the provided 
> connect-console-sink/source.properties.
> The other option would be to modify the FileStreamSink/Source source code to 
> remove the "file" definition from the ConfigDef.
> What do you think?
> I can provide a PR for that.
> Thanks,
> Paolo.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5412) Using connect-console-sink/source.properties raises an exception related to "file" property not found

2017-06-08 Thread Randall Hauch (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-5412:
-
Affects Version/s: (was: 0.11.1.0)
   0.11.0.0

> Using connect-console-sink/source.properties raises an exception related to 
> "file" property not found
> -
>
> Key: KAFKA-5412
> URL: https://issues.apache.org/jira/browse/KAFKA-5412
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.11.0.0
>Reporter: Paolo Patierno
>
> Hi,
> with the latest 0.11.1.0-SNAPSHOT the Kafka Connect example 
> using connect-console-sink/source.properties doesn't work anymore because the 
> needed "file" property isn't found.
> This is because the underlying FileStreamSink/Source connector and task 
> define a ConfigDef with "file" as a mandatory parameter. In the case of the 
> console example we want file=null so that stdin and stdout are used.
> One possible solution is to set "file=" inside the provided 
> connect-console-sink/source.properties.
> The other option would be to modify the FileStreamSink/Source source code to 
> remove the "file" definition from the ConfigDef.
> What do you think?
> I can provide a PR for that.
> Thanks,
> Paolo.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5412) Using connect-console-sink/source.properties raises an exception related to "file" property not found

2017-06-08 Thread Paolo Patierno (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043034#comment-16043034
 ] 

Paolo Patierno commented on KAFKA-5412:
---

Yes, I agree this is a better solution. I'd like to work on that as a newbie 
contributor to the project. Can you add me to the contributor list so that I 
can assign myself to this and, eventually, other JIRAs? What do you think?

> Using connect-console-sink/source.properties raises an exception related to 
> "file" property not found
> -
>
> Key: KAFKA-5412
> URL: https://issues.apache.org/jira/browse/KAFKA-5412
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.11.0.0
>Reporter: Paolo Patierno
>
> Hi,
> with the latest 0.11.1.0-SNAPSHOT the Kafka Connect example 
> using connect-console-sink/source.properties doesn't work anymore because the 
> needed "file" property isn't found.
> This is because the underlying FileStreamSink/Source connector and task 
> define a ConfigDef with "file" as a mandatory parameter. In the case of the 
> console example we want file=null so that stdin and stdout are used.
> One possible solution is to set "file=" inside the provided 
> connect-console-sink/source.properties.
> The other option would be to modify the FileStreamSink/Source source code to 
> remove the "file" definition from the ConfigDef.
> What do you think?
> I can provide a PR for that.
> Thanks,
> Paolo.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3272: MINOR: disable flaky Streams EOS integration tests

2017-06-08 Thread mjsax
GitHub user mjsax opened a pull request:

https://github.com/apache/kafka/pull/3272

MINOR: disable flaky Streams EOS integration tests



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/kafka minor-disable-eos-tests

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3272.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3272


commit 07f9b8fc90ed13bb551c5806b1147d63ec4c2b88
Author: Matthias J. Sax 
Date:   2017-06-08T17:01:41Z

MINOR: disable flaky Streams EOS integration tests




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [DISCUSS] KIP-167: Add a restoreAll method to StateRestoreCallback

2017-06-08 Thread Bill Bejeck
Guozhang, Damian thanks for the comments.

Giving developers the ability to hook into StateStore recovery phases was
part of my original intent. However, the KIP in its current state won't
provide this functionality.

As a result I'll be doing a significant revision of KIP-167.  I'll be sure
to incorporate all your comments in the new revision.

Thanks,
Bill

On Wed, Jun 7, 2017 at 5:24 AM, Damian Guy  wrote:

> I'm largely in agreement with what Guozhang has suggested, i.e.,
> StateRestoreContext shouldn't have any setters on it and also need to have
> the end offset available such that people can use it derive progress.
> Slightly different, maybe the StateRestoreContext interface could be:
>
> long beginOffset()
> long endOffset()
> long currentOffset()
>
> One further thing, this currently doesn't provide developers the ability to
> hook into this information if they are using the built-in StateStores. Is
> this something we should be considering?
>
>
> On Tue, 6 Jun 2017 at 23:32 Guozhang Wang  wrote:
>
> > Thanks for the updated KIP Bill, I have a couple of comments:
> >
> > 1) I'm assuming beginRestore / endRestore is called only once per store
> > throughout the whole restoration process, and restoreAll is called per
> > batch. In that case I feel we can set the StateRestoreContext as a second
> > parameter in restoreAll and in endRestore as well, and let the library to
> > set the corresponding values instead and only let users to read (since
> the
> > collection of key-value pairs do not contain offset information anyways
> > users cannot really set the offset). The "lastOffsetRestored" would be
> the
> > starting offset when called on `beginRestore`.
> >
> > 2) Users who wants to implement their own batch restoration callbacks
> would
> > now need to implement both `restore` and `restoreAll` while they either
> let
> > `restoreAll` to call `restore` or implement the logic in `restoreAll`
> only
> > and never call `restore`. Maybe we can provide two abstract impl of
> > BatchingStateRestoreCallbacks which does beginRestore / endRestore as
> > no-ops, with one callback implementing `restoreAll` to call abstract
> > `restore` while the other implement `restore` to throw "not supported
> > exception" and keep `restoreAll` abstract.
> >
> > 3) I think we can also return the "offset limit" in StateRestoreContext,
> > which is important for users to track the restoration progress since
> > otherwise they could not tell how many percent of restoration has
> > completed.  I.e.:
> >
> > public interface BatchingStateRestoreCallback extends
> StateRestoreCallback
> > {
> >
> >void restoreAll(Collection> records,
> > StateRestoreContext
> > restoreContext);
> >
> >void beginRestore(StateRestoreContext restoreContext);
> >
> >void endRestore(StateRestoreContext restoreContext);
> > }
> >
> > public interface StateRestoreContext {
> >
> >   long lastOffsetRestored();
> >
> >   long endOffsetToRestore();
> >
> >   int numberRestored();
> > }
> >
> >
> > Guozhang
> >
> >
> >
> > On Fri, Jun 2, 2017 at 9:16 AM, Bill Bejeck  wrote:
> >
> > > Guozhang, Matthias,
> > >
> > > Thanks for the comments.  I have updated the KIP, (JIRA title and
> > > description as well).
> > >
> > > I had thought about introducing a separate interface altogether, but
> > > extending the current one makes more sense.
> > >
> > > As for intermediate callbacks based on time or number of records, I
> think
> > > the latest update to the KIP addresses this point of querying for
> > > intermediate results, but it would be per batch restored.
> > >
> > > Thanks,
> > > Bill
> > >
> > >
> > >
> > >
> > >
> > > On Fri, Jun 2, 2017 at 8:36 AM, Jim Jagielski  wrote:
> > >
> > > >
> > > > > On Jun 2, 2017, at 12:54 AM, Matthias J. Sax <
> matth...@confluent.io>
> > > > wrote:
> > > > >
> > > > > With regard to backward compatibility, we should not change the
> > current
> > > > > interface, but add a new interface that extends the current one.
> > > > >
> > > >
> > > > ++1
> > > >
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>
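
For illustration, a hedged sketch of the two abstract helpers Guozhang suggests in point 2
of his quoted mail above. The BatchingStateRestoreCallback / StateRestoreContext shapes are
assumed from the proposal snippet in this thread (with the restoreAll element type assumed
to be KeyValue<byte[], byte[]>); they are not an existing Kafka API.

abstract class AbstractBatchingRestoreCallback extends BatchingStateRestoreCallback {
  // no-op lifecycle hooks; subclasses only implement restoreAll()
  override def beginRestore(restoreContext: StateRestoreContext): Unit = {}
  override def endRestore(restoreContext: StateRestoreContext): Unit = {}
  // batch-only: the single-record callback is not supported
  override def restore(key: Array[Byte], value: Array[Byte]): Unit =
    throw new UnsupportedOperationException("use restoreAll() for batch restoration")
}

abstract class AbstractNonBatchingRestoreCallback extends BatchingStateRestoreCallback {
  // no-op lifecycle hooks; subclasses only implement the per-record restore()
  override def beginRestore(restoreContext: StateRestoreContext): Unit = {}
  override def endRestore(restoreContext: StateRestoreContext): Unit = {}
  // adapt the batch call onto the existing per-record restore()
  override def restoreAll(records: java.util.Collection[org.apache.kafka.streams.KeyValue[Array[Byte], Array[Byte]]],
                          restoreContext: StateRestoreContext): Unit = {
    val it = records.iterator()
    while (it.hasNext) {
      val record = it.next()
      restore(record.key, record.value)
    }
  }
}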


[GitHub] kafka pull request #3273: HOTFIX: for flaky Streams EOS integration tests

2017-06-08 Thread mjsax
GitHub user mjsax opened a pull request:

https://github.com/apache/kafka/pull/3273

HOTFIX: for flaky Streams EOS integration tests



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/kafka hotfix-flaky-stream-eos-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3273.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3273


commit 643331fcc1b0925d18587633870ba09bb6b368cb
Author: Matthias J. Sax 
Date:   2017-06-08T17:36:53Z

HOTFIX: for flaky Streams EOS integration tests




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [DISCUSS] KIP-150 - Kafka-Streams Cogroup

2017-06-08 Thread Xavier Léauté
I have a minor suggestion to make the API a little bit more symmetric.
I feel it would make more sense to move the initializer and serde to the
final aggregate statement, since the serde only applies to the state store,
and the initializer doesn't bear any relation to the first group in
particular. It would end up looking like this:

KTable cogrouped =
grouped1.cogroup(aggregator1)
.cogroup(grouped2, aggregator2)
.cogroup(grouped3, aggregator3)
.aggregate(initializer1, aggValueSerde, storeName1);

Alternatively, we could move the first cogroup() method to
KStreamBuilder, similar to how we have .merge()
and end up with an api that would be even more symmetric.

KStreamBuilder.cogroup(grouped1, aggregator1)
  .cogroup(grouped2, aggregator2)
  .cogroup(grouped3, aggregator3)
  .aggregate(initializer1, aggValueSerde, storeName1);

This doesn't have to be a blocker, but I thought it would make the API just
a tad cleaner.

On Tue, Jun 6, 2017 at 3:59 PM Guozhang Wang  wrote:

> Kyle,
>
> Thanks a lot for the updated KIP. It looks good to me.
>
>
> Guozhang
>
>
> On Fri, Jun 2, 2017 at 5:37 AM, Jim Jagielski  wrote:
>
> > This makes much more sense to me. +1
> >
> > > On Jun 1, 2017, at 10:33 AM, Kyle Winkelman 
> > wrote:
> > >
> > > I have updated the KIP and my PR. Let me know what you think.
> > > To created a cogrouped stream just call cogroup on a KgroupedStream and
> > > supply the initializer, aggValueSerde, and an aggregator. Then continue
> > > adding kgroupedstreams and aggregators. Then call one of the many
> > aggregate
> > > calls to create a KTable.
> > >
> > > Thanks,
> > > Kyle
> > >
> > > On Jun 1, 2017 4:03 AM, "Damian Guy"  wrote:
> > >
> > >> Hi Kyle,
> > >>
> > >> Thanks for the update. I think just one initializer makes sense as it
> > >> should only be called once per key and generally it is just going to
> > create
> > >> a new instance of whatever the Aggregate class is.
> > >>
> > >> Cheers,
> > >> Damian
> > >>
> > >> On Wed, 31 May 2017 at 20:09 Kyle Winkelman  >
> > >> wrote:
> > >>
> > >>> Hello all,
> > >>>
> > >>> I have spent some more time on this and the best alternative I have
> > come
> > >> up
> > >>> with is:
> > >>> KGroupedStream has a single cogroup call that takes an initializer
> and
> > an
> > >>> aggregator.
> > >>> CogroupedKStream has a cogroup call that takes additional
> groupedStream
> > >>> aggregator pairs.
> > >>> CogroupedKStream has multiple aggregate methods that create the
> > different
> > >>> stores.
> > >>>
> > >>> I plan on updating the kip but I want people's input on if we should
> > have
> > >>> the initializer be passed in once at the beginning or if we should
> > >> instead
> > >>> have the initializer be required for each call to one of the
> aggregate
> > >>> calls. The first makes more sense to me but doesnt allow the user to
> > >>> specify different initializers for different tables.
> > >>>
> > >>> Thanks,
> > >>> Kyle
> > >>>
> > >>> On May 24, 2017 7:46 PM, "Kyle Winkelman" 
> > >>> wrote:
> > >>>
> >  Yea I really like that idea I'll see what I can do to update the kip
> > >> and
> >  my pr when I have some time. I'm not sure how well creating the
> >  kstreamaggregates will go though because at that point I will have
> > >> thrown
> >  away the type of the values. It will be type safe I just may need to
> > >> do a
> >  little forcing.
> > 
> >  Thanks,
> >  Kyle
> > 
> >  On May 24, 2017 3:28 PM, "Guozhang Wang" 
> wrote:
> > 
> > > Kyle,
> > >
> > > Thanks for the explanations, my previous read on the wiki examples
> > was
> > > wrong.
> > >
> > > So I guess my motivation should be "reduced" to: can we move the
> > >> window
> > > specs param from "KGroupedStream#cogroup(..)" to
> > > "CogroupedKStream#aggregate(..)", and my motivations are:
> > >
> > > 1. minor: we can reduce the #.generics in CogroupedKStream from 3
> to
> > >> 2.
> > > 2. major: this is for extensibility of the APIs, and since we are
> > >>> removing
> > > the "Evolving" annotations on Streams it may be harder to change it
> > >>> again
> > > in the future. The extended use cases are that people wanted to
> have
> > > windowed running aggregates on different granularities, e.g. "give
> me
> > >>> the
> > > counts per-minute, per-hour, per-day and per-week", and today in
> DSL
> > >> we
> > > need to specify that case in multiple aggregate operators, which
> gets
> > >> a
> > > state store / changelog, etc. And it is possible to optimize it as
> > >> well
> > >>> to
> > > a single state store. Its implementation would be tricky as you
> need
> > >> to
> > > contain different lengthed windows within your window store but
> just
> > >>> from
> > > the public API point of view, it could be specified as:
> > >
> > > CogroupedKStream stream = stream1.cogroup(stre

Re: [DISCUSS] KIP-150 - Kafka-Streams Cogroup

2017-06-08 Thread Guozhang Wang
This suggestion lgtm. I would vote for the first alternative rather than adding it
to the `KStreamBuilder`, though.

On Thu, Jun 8, 2017 at 10:58 AM, Xavier Léauté  wrote:

> I have a minor suggestion to make the API a little bit more symmetric.
> I feel it would make more sense to move the initializer and serde to the
> final aggregate statement, since the serde only applies to the state store,
> and the initializer doesn't bear any relation to the first group in
> particular. It would end up looking like this:
>
> KTable cogrouped =
> grouped1.cogroup(aggregator1)
> .cogroup(grouped2, aggregator2)
> .cogroup(grouped3, aggregator3)
> .aggregate(initializer1, aggValueSerde, storeName1);
>
> Alternatively, we could move the first cogroup() method to
> KStreamBuilder, similar to how we have .merge()
> and end up with an api that would be even more symmetric.
>
> KStreamBuilder.cogroup(grouped1, aggregator1)
>   .cogroup(grouped2, aggregator2)
>   .cogroup(grouped3, aggregator3)
>   .aggregate(initializer1, aggValueSerde, storeName1);
>
> This doesn't have to be a blocker, but I thought it would make the API just
> a tad cleaner.
>
> On Tue, Jun 6, 2017 at 3:59 PM Guozhang Wang  wrote:
>
> > Kyle,
> >
> > Thanks a lot for the updated KIP. It looks good to me.
> >
> >
> > Guozhang
> >
> >
> > On Fri, Jun 2, 2017 at 5:37 AM, Jim Jagielski  wrote:
> >
> > > This makes much more sense to me. +1
> > >
> > > > On Jun 1, 2017, at 10:33 AM, Kyle Winkelman <
> winkelman.k...@gmail.com>
> > > wrote:
> > > >
> > > > I have updated the KIP and my PR. Let me know what you think.
> > > > To created a cogrouped stream just call cogroup on a KgroupedStream
> and
> > > > supply the initializer, aggValueSerde, and an aggregator. Then
> continue
> > > > adding kgroupedstreams and aggregators. Then call one of the many
> > > aggregate
> > > > calls to create a KTable.
> > > >
> > > > Thanks,
> > > > Kyle
> > > >
> > > > On Jun 1, 2017 4:03 AM, "Damian Guy"  wrote:
> > > >
> > > >> Hi Kyle,
> > > >>
> > > >> Thanks for the update. I think just one initializer makes sense as
> it
> > > >> should only be called once per key and generally it is just going to
> > > create
> > > >> a new instance of whatever the Aggregate class is.
> > > >>
> > > >> Cheers,
> > > >> Damian
> > > >>
> > > >> On Wed, 31 May 2017 at 20:09 Kyle Winkelman <
> winkelman.k...@gmail.com
> > >
> > > >> wrote:
> > > >>
> > > >>> Hello all,
> > > >>>
> > > >>> I have spent some more time on this and the best alternative I have
> > > come
> > > >> up
> > > >>> with is:
> > > >>> KGroupedStream has a single cogroup call that takes an initializer
> > and
> > > an
> > > >>> aggregator.
> > > >>> CogroupedKStream has a cogroup call that takes additional
> > groupedStream
> > > >>> aggregator pairs.
> > > >>> CogroupedKStream has multiple aggregate methods that create the
> > > different
> > > >>> stores.
> > > >>>
> > > >>> I plan on updating the kip but I want people's input on if we
> should
> > > have
> > > >>> the initializer be passed in once at the beginning or if we should
> > > >> instead
> > > >>> have the initializer be required for each call to one of the
> > aggregate
> > > >>> calls. The first makes more sense to me but doesnt allow the user
> to
> > > >>> specify different initializers for different tables.
> > > >>>
> > > >>> Thanks,
> > > >>> Kyle
> > > >>>
> > > >>> On May 24, 2017 7:46 PM, "Kyle Winkelman" <
> winkelman.k...@gmail.com>
> > > >>> wrote:
> > > >>>
> > >  Yea I really like that idea I'll see what I can do to update the
> kip
> > > >> and
> > >  my pr when I have some time. I'm not sure how well creating the
> > >  kstreamaggregates will go though because at that point I will have
> > > >> thrown
> > >  away the type of the values. It will be type safe I just may need
> to
> > > >> do a
> > >  little forcing.
> > > 
> > >  Thanks,
> > >  Kyle
> > > 
> > >  On May 24, 2017 3:28 PM, "Guozhang Wang" 
> > wrote:
> > > 
> > > > Kyle,
> > > >
> > > > Thanks for the explanations, my previous read on the wiki
> examples
> > > was
> > > > wrong.
> > > >
> > > > So I guess my motivation should be "reduced" to: can we move the
> > > >> window
> > > > specs param from "KGroupedStream#cogroup(..)" to
> > > > "CogroupedKStream#aggregate(..)", and my motivations are:
> > > >
> > > > 1. minor: we can reduce the #.generics in CogroupedKStream from 3
> > to
> > > >> 2.
> > > > 2. major: this is for extensibility of the APIs, and since we are
> > > >>> removing
> > > > the "Evolving" annotations on Streams it may be harder to change
> it
> > > >>> again
> > > > in the future. The extended use cases are that people wanted to
> > have
> > > > windowed running aggregates on different granularities, e.g.
> "give
> > me
> > > >>> the
> > > > counts per-minute, per-hour, per-day and per-w

Re: [DISCUSS] KIP-150 - Kafka-Streams Cogroup

2017-06-08 Thread Guozhang Wang
On second thought... this is the current proposal API:


```

 CogroupedKStream cogroup(final Initializer initializer, final
Aggregator aggregator, final Serde
aggValueSerde)

```


If we do not have the initializer in the first co-group, it might be a bit
awkward for users to specify an aggregator that returns a typed value.
Maybe it is still better to put these two functions in the same API?



Guozhang

On Thu, Jun 8, 2017 at 11:08 AM, Guozhang Wang  wrote:

> This suggestion lgtm. I would vote for the first alternative rather than adding
> it to the `KStreamBuilder`, though.
>
> On Thu, Jun 8, 2017 at 10:58 AM, Xavier Léauté 
> wrote:
>
>> I have a minor suggestion to make the API a little bit more symmetric.
>> I feel it would make more sense to move the initializer and serde to the
>> final aggregate statement, since the serde only applies to the state
>> store,
>> and the initializer doesn't bear any relation to the first group in
>> particular. It would end up looking like this:
>>
>> KTable cogrouped =
>> grouped1.cogroup(aggregator1)
>> .cogroup(grouped2, aggregator2)
>> .cogroup(grouped3, aggregator3)
>> .aggregate(initializer1, aggValueSerde, storeName1);
>>
>> Alternatively, we could move the first cogroup() method to
>> KStreamBuilder, similar to how we have .merge()
>> and end up with an api that would be even more symmetric.
>>
>> KStreamBuilder.cogroup(grouped1, aggregator1)
>>   .cogroup(grouped2, aggregator2)
>>   .cogroup(grouped3, aggregator3)
>>   .aggregate(initializer1, aggValueSerde, storeName1);
>>
>> This doesn't have to be a blocker, but I thought it would make the API
>> just
>> a tad cleaner.
>>
>> On Tue, Jun 6, 2017 at 3:59 PM Guozhang Wang  wrote:
>>
>> > Kyle,
>> >
>> > Thanks a lot for the updated KIP. It looks good to me.
>> >
>> >
>> > Guozhang
>> >
>> >
>> > On Fri, Jun 2, 2017 at 5:37 AM, Jim Jagielski  wrote:
>> >
>> > > This makes much more sense to me. +1
>> > >
>> > > > On Jun 1, 2017, at 10:33 AM, Kyle Winkelman <
>> winkelman.k...@gmail.com>
>> > > wrote:
>> > > >
>> > > > I have updated the KIP and my PR. Let me know what you think.
>> > > > To created a cogrouped stream just call cogroup on a KgroupedStream
>> and
>> > > > supply the initializer, aggValueSerde, and an aggregator. Then
>> continue
>> > > > adding kgroupedstreams and aggregators. Then call one of the many
>> > > aggregate
>> > > > calls to create a KTable.
>> > > >
>> > > > Thanks,
>> > > > Kyle
>> > > >
>> > > > On Jun 1, 2017 4:03 AM, "Damian Guy"  wrote:
>> > > >
>> > > >> Hi Kyle,
>> > > >>
>> > > >> Thanks for the update. I think just one initializer makes sense as
>> it
>> > > >> should only be called once per key and generally it is just going
>> to
>> > > create
>> > > >> a new instance of whatever the Aggregate class is.
>> > > >>
>> > > >> Cheers,
>> > > >> Damian
>> > > >>
>> > > >> On Wed, 31 May 2017 at 20:09 Kyle Winkelman <
>> winkelman.k...@gmail.com
>> > >
>> > > >> wrote:
>> > > >>
>> > > >>> Hello all,
>> > > >>>
>> > > >>> I have spent some more time on this and the best alternative I
>> have
>> > > come
>> > > >> up
>> > > >>> with is:
>> > > >>> KGroupedStream has a single cogroup call that takes an initializer
>> > and
>> > > an
>> > > >>> aggregator.
>> > > >>> CogroupedKStream has a cogroup call that takes additional
>> > groupedStream
>> > > >>> aggregator pairs.
>> > > >>> CogroupedKStream has multiple aggregate methods that create the
>> > > different
>> > > >>> stores.
>> > > >>>
>> > > >>> I plan on updating the kip but I want people's input on if we
>> should
>> > > have
>> > > >>> the initializer be passed in once at the beginning or if we should
>> > > >> instead
>> > > >>> have the initializer be required for each call to one of the
>> > aggregate
>> > > >>> calls. The first makes more sense to me but doesnt allow the user
>> to
>> > > >>> specify different initializers for different tables.
>> > > >>>
>> > > >>> Thanks,
>> > > >>> Kyle
>> > > >>>
>> > > >>> On May 24, 2017 7:46 PM, "Kyle Winkelman" <
>> winkelman.k...@gmail.com>
>> > > >>> wrote:
>> > > >>>
>> > >  Yea I really like that idea I'll see what I can do to update the
>> kip
>> > > >> and
>> > >  my pr when I have some time. I'm not sure how well creating the
>> > >  kstreamaggregates will go though because at that point I will
>> have
>> > > >> thrown
>> > >  away the type of the values. It will be type safe I just may
>> need to
>> > > >> do a
>> > >  little forcing.
>> > > 
>> > >  Thanks,
>> > >  Kyle
>> > > 
>> > >  On May 24, 2017 3:28 PM, "Guozhang Wang" 
>> > wrote:
>> > > 
>> > > > Kyle,
>> > > >
>> > > > Thanks for the explanations, my previous read on the wiki
>> examples
>> > > was
>> > > > wrong.
>> > > >
>> > > > So I guess my motivation should be "reduced" to: can we move the
>> > > >> window
>> > > > specs param from "KGroupedStream#cogroup(

Re: [Vote] KIP-150 - Kafka-Streams Cogroup

2017-06-08 Thread Guozhang Wang
I think we can continue on this voting thread.

Currently we have one binding vote and 2 non-binding votes. I would like to
call on other people, especially committers, to also take a look at this
proposal and vote.


Guozhang


On Wed, Jun 7, 2017 at 6:37 PM, Kyle Winkelman 
wrote:

> Just bringing people's attention to the vote thread for my KIP. I started
> it before another round of discussion happened. I'm not sure of the protocol, so
> someone let me know if I am supposed to restart the vote.
> Thanks,
> Kyle
>
> On May 24, 2017 8:49 AM, "Bill Bejeck"  wrote:
>
> > +1  for the KIP and +1 what Xavier said as well.
> >
> > On Wed, May 24, 2017 at 3:57 AM, Damian Guy 
> wrote:
> >
> > > Also, +1 for the KIP
> > >
> > > On Wed, 24 May 2017 at 08:57 Damian Guy  wrote:
> > >
> > > > +1 to what Xavier said
> > > >
> > > > On Wed, 24 May 2017 at 06:45 Xavier Léauté 
> > wrote:
> > > >
> > > >> I don't think we should wait for entries from each stream, since
> that
> > > >> might
> > > >> limit the usefulness of the cogroup operator. There are instances
> > where
> > > it
> > > >> can be useful to compute something based on data from one or more
> > > stream,
> > > >> without having to wait for all the streams to produce something for
> > the
> > > >> group. In the example I gave in the discussion, it is possible to
> > > compute
> > > >> impression/auction statistics without having to wait for click data,
> > > which
> > > >> can typically arrive several minutes late.
> > > >>
> > > >> We could have a separate discussion around adding inner / outer
> > > modifiers
> > > >> to each of the streams to decide which fields are optional /
> required
> > > >> before sending updates if we think that might be useful.
> > > >>
> > > >>
> > > >>
> > > >> On Tue, May 23, 2017 at 6:28 PM Guozhang Wang 
> > > wrote:
> > > >>
> > > >> > The proposal LGTM, +1
> > > >> >
> > > >> > One question I have is about when to send the record to the
> resulted
> > > >> KTable
> > > >> > changelog. For example in your code snippet in the wiki page,
> before
> > > you
> > > >> > see the end result of
> > > >> >
> > > >> > 1L, Customer[
> > > >> >
> > > >> >   cart:{Item[no:01], Item[no:03],
> Item[no:04]},
> > > >> >   purchases:{Item[no:07], Item[no:08]},
> > > >> >   wishList:{Item[no:11]}
> > > >> >   ]
> > > >> >
> > > >> >
> > > >> > You will firs see
> > > >> >
> > > >> > 1L, Customer[
> > > >> >
> > > >> >   cart:{Item[no:01]},
> > > >> >   purchases:{},
> > > >> >   wishList:{}
> > > >> >   ]
> > > >> >
> > > >> > 1L, Customer[
> > > >> >
> > > >> >   cart:{Item[no:01]},
> > > >> >   purchases:{Item[no:07],Item[no:08]},
> > > >> >
> > > >> >   wishList:{}
> > > >> >   ]
> > > >> >
> > > >> > 1L, Customer[
> > > >> >
> > > >> >   cart:{Item[no:01]},
> > > >> >   purchases:{Item[no:07],Item[no:08]},
> > > >> >
> > > >> >   wishList:{}
> > > >> >   ]
> > > >> >
> > > >> > ...
> > > >> >
> > > >> >
> > > >> > I'm wondering if it makes more sense to only start sending the
> > update
> > > if
> > > >> > the corresponding agg-key has seen at least one input from each of
> > the
> > > >> > input stream? Maybe it is out of the scope of this KIP and we can
> > make
> > > >> it a
> > > >> > more general discussion in a separate one.
> > > >> >
> > > >> >
> > > >> > Guozhang
> > > >> >
> > > >> >
> > > >> > On Fri, May 19, 2017 at 8:37 AM, Xavier Léauté <
> xav...@confluent.io
> > >
> > > >> > wrote:
> > > >> >
> > > >> > > Hi Kyle, I left a few more comments in the discussion thread, if
> > you
> > > >> > > wouldn't mind taking a look
> > > >> > >
> > > >> > > On Fri, May 19, 2017 at 5:31 AM Kyle Winkelman <
> > > >> winkelman.k...@gmail.com
> > > >> > >
> > > >> > > wrote:
> > > >> > >
> > > >> > > > Hello all,
> > > >> > > >
> > > >> > > > I would like to start the vote on KIP-150.
> > > >> > > >
> > > >> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-150+-+
> > > >> > > Kafka-Streams+Cogroup
> > > >> > > >
> > > >> > > > Thanks,
> > > >> > > > Kyle
> > > >> > > >
> > > >> > >
> > > >> >
> > > >> >
> > > >> >
> > > >> > --
> > > >> > -- Guozhang
> > > >> >
> > > >>
> > > >
> > >
> >
>



-- 
-- Guozhang


[jira] [Commented] (KAFKA-5409) Providing a custom client-id to the ConsoleProducer tool

2017-06-08 Thread Paolo Patierno (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043185#comment-16043185
 ] 

Paolo Patierno commented on KAFKA-5409:
---

Hi Bharat,
you are right, but ... inside the getNewProducerProps() method, the call to 
producerProps(config) gets the "extraProducerProps" and fills the properties 
(i.e. the client.id passed by the --producer-property option) in the right way; 
then ... the client.id is overridden by:

{code}
props.put(ProducerConfig.CLIENT_ID_CONFIG, "console-producer")
{code}

so the --producer-property option loses its effect.
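
For illustration only, a hedged sketch (not the actual ConsoleProducer code) of one way to
let the --producer-property value survive, by applying the default only when no client.id
was supplied:

{code}
// hypothetical guard: keep a user-supplied client.id, otherwise fall back to the default
if (!props.containsKey(ProducerConfig.CLIENT_ID_CONFIG))
  props.put(ProducerConfig.CLIENT_ID_CONFIG, "console-producer")
{code}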

> Providing a custom client-id to the ConsoleProducer tool
> 
>
> Key: KAFKA-5409
> URL: https://issues.apache.org/jira/browse/KAFKA-5409
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Paolo Patierno
>Priority: Minor
>
> Hi,
> I see that the client-id property for the ConsoleProducer tool is always 
> "console-producer". It could be useful to have it as a parameter on the command 
> line, or to generate a random one as happens for the ConsoleConsumer.
> If it makes sense to you, I can work on that.
> Thanks,
> Paolo.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (KAFKA-5409) Providing a custom client-id to the ConsoleProducer tool

2017-06-08 Thread Paolo Patierno (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043185#comment-16043185
 ] 

Paolo Patierno edited comment on KAFKA-5409 at 6/8/17 6:23 PM:
---

Hi Bharat,
you are right, but ... inside the getNewProducerProps() method, the call to 
producerProps(config) gets the "extraProducerProps" and fills the properties 
(i.e. the client.id passed by the --producer-property option) in the right way 
... 

{code}
private def producerProps(config: ProducerConfig): Properties = {
  val props =
    if (config.options.has(config.producerConfigOpt))
      Utils.loadProps(config.options.valueOf(config.producerConfigOpt))
    else new Properties
  props.putAll(config.extraProducerProps)
  props
}
{code}

then ... the client.id is overridden by:

{code}
props.put(ProducerConfig.CLIENT_ID_CONFIG, "console-producer")
{code}

so the --producer-property option loses its effect.


was (Author: ppatierno):
Hi Bharat,
you are right but ... inside the getNewProducerProps() method, the call to 
producerProps(config) gets the "extraProducerProps" and fill the properties 
(i.e. the client.id passed by the --producer-property option) in the right way 
then ... the client.id is overridden by :

{code}
props.put(ProducerConfig.CLIENT_ID_CONFIG, "console-producer")
{code}

so the --producer-property option loses its effect.

> Providing a custom client-id to the ConsoleProducer tool
> 
>
> Key: KAFKA-5409
> URL: https://issues.apache.org/jira/browse/KAFKA-5409
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Paolo Patierno
>Priority: Minor
>
> Hi,
> I see that the client-id property for the ConsoleProducer tool is always 
> "console-producer". It could be useful to have it as a parameter on the command 
> line, or to generate a random one as happens for the ConsoleConsumer.
> If it makes sense to you, I can work on that.
> Thanks,
> Paolo.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[DISCUSS] KIP-163: Lower the Minimum Required ACL Permission of OffsetFetch

2017-06-08 Thread Vahid S Hashemian
Hi all,

I'm resending my earlier note hoping it would spark some conversation this 
time around :)

Thanks.
--Vahid




From:   "Vahid S Hashemian" 
To: dev , "Kafka User" 
Date:   05/30/2017 08:33 AM
Subject:KIP-163: Lower the Minimum Required ACL Permission of 
OffsetFetch



Hi,

I started a new KIP to improve the minimum required ACL permissions of 
some of the APIs: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-163%3A+Lower+the+Minimum+Required+ACL+Permission+of+OffsetFetch

The KIP is to address KAFKA-4585.

Feedback and suggestions are welcome!

Thanks.
--Vahid







Re: [VOTE] KIP-164 Add unavailablePartitionCount and per-partition Unavailable metrics

2017-06-08 Thread Dong Lin
Thanks to everyone who voted! I am waiting for committers to vote before
closing this thread.

On Mon, Jun 5, 2017 at 6:10 AM, Bill Bejeck  wrote:

> Thanks for the KIP.
>
> +1
>
> -Bill
>
> On Mon, Jun 5, 2017 at 8:51 AM, Edoardo Comar  wrote:
>
> > +1 (non binding)
> >
> > Dong, thanks for the KIP :-)
> > --
> > Edoardo Comar
> > IBM Message Hub
> > eco...@uk.ibm.com
> > IBM UK Ltd, Hursley Park, SO21 2JN
> >
> >
> >
> >
> > From:Michal Borowiecki 
> > To:dev@kafka.apache.org
> > Date:02/06/2017 10:25
> > Subject:Re: [VOTE] KIP-164 Add unavailablePartitionCount and
> > per-partition Unavailable metrics
> > --
> >
> >
> >
> > +1 (non binding)
> >
> > Thanks,
> > Michał
> >
> > On 02/06/17 10:18, Mickael Maison wrote:
> > +1 (non binding)
> > Thanks for the KIP
> >
> > On Thu, Jun 1, 2017 at 5:44 PM, Dong Lin  wrote:
> >
> > Hi all,
> >
> > Can you please vote for KIP-164? The KIP can be found at
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-164-+Add+UnderMinIsrPartitionCount+and+per-partition+UnderMinIsr+metrics
> > .
> >
> > Thanks,
> > Dong
> >
> >
> > --
> > Michal Borowiecki
> > Senior Software Engineer L4
> > T: +44 208 742 1600
> > +44 203 249 8448
> >
> > E: michal.borowie...@openbet.com
> > W: www.openbet.com
> > OpenBet Ltd
> > Chiswick Park Building 9
> > 566 Chiswick High Rd
> > London
> > W4 5XT
> > UK
> > 
> > This message is confidential and intended only for the addressee. If you
> > have received this message in error, please immediately notify the
> > *postmas...@openbet.com*  and delete it from
> your
> > system as well as any copies. The content of e-mails as well as traffic
> > data may be monitored by OpenBet for employment and security purposes. To
> > protect the environment please do not print this e-mail unless necessary.
> > OpenBet Ltd. Registered Office: Chiswick Park Building 9, 566 Chiswick
> High
> > Road, London, W4 5XT, United Kingdom. A company registered in England and
> > Wales. Registered no. 3134634. VAT no. GB927523612
> >
> >
> >
> > Unless stated otherwise above:
> > IBM United Kingdom Limited - Registered in England and Wales with number
> > 741598.
> > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
> 3AU
> >
> >
>


Re: [DISCUSS] KIP-164 Add unavailablePartitionCount and per-partition Unavailable metrics

2017-06-08 Thread Dong Lin
Hey Ismael,

Thanks for the feedback! Could you please vote for the KIP if it looks
good? Then I will find two more committers to vote as well.

Thanks,
Dong

On Tue, May 30, 2017 at 9:08 AM, Dong Lin  wrote:

> Thanks Edoardo and everyone for the comments! That is a very good point. I
> have updated the wiki to use UnderMinIsrPartitionCount as the per-broker
> metric name and UnderMinIsr as the per-partition metric name. The
> motivation section has also been updated to clarify how the existence of
> UnderMinIsrPartition reduces the availability of the Kafka service.
>
> Please find the latest wiki at https://cwiki.apache.org/
> confluence/display/KAFKA/KIP-164-+Add+UnderMinIsrPartitionCount+and+
> per-partition+UnderMinIsr+metrics .
>
>
>
> On Tue, May 30, 2017 at 5:35 AM, Ismael Juma  wrote:
>
>> Thanks for the KIP, Dong. I agree that that the metrics are useful. Like
>> Edoardo and Mickael said, it seems like it may be better to choose a
>> different name. A couple of additional suggestions:
>> `UnderMinIsrPartitionCount` and `UnderMinIsr`.
>>
>> Ismael
>>
>> On Tue, May 30, 2017 at 12:04 PM, Mickael Maison <
>> mickael.mai...@gmail.com>
>> wrote:
>>
>> > What about simply calling them 'BelowIsrPartitionCount' and 'BelowIsr' ?
>> >
>> > On Tue, May 30, 2017 at 11:40 AM, Edoardo Comar 
>> wrote:
>> > > Hi Dong,
>> > >
>> > > many thanks for the KIP. It's a very useful metric.
>> > >
>> > > by saying
>> > >> Unavailable partitions could be most easily defined as “The number of
>> > > partitions that this broker leads for which the ISR is insufficient to
>> > > meet the minimum ISR required.”
>> > >
>> > > I presume you meant to call 'Unavailable' the partitions whose
>> ISR.size <
>> > > min.insync  ?
>> > >
>> > > Now, a partition whose ISR is < min.insync can be still used to
>> consume
>> > > messages from. It also can be used to produce messages to, as long as
>> the
>> > > producer does not request acks=-1 (i.e. acks=all).
>> > >
>> > > So it is not exactly 'Unavailable' ... perhaps we could call it
>> 'Unsafe'
>> > ?
>> > > Or the community can come up with a better name.
>> > >
>> > > I recently had a few discussions about the issue, and I opened a PR to
>> > > update the docs (that's still hoping to be reviewed and merged ...
>> hint
>> > > hint :-)
>> > > https://github.com/apache/kafka/pull/3035
>> > > https://issues.apache.org/jira/browse/KAFKA-5290
>> > >
>> > > Thanks!
>> > > Edo
>> > > --
>> > > Edoardo Comar
>> > > IBM Message Hub
>> > > eco...@uk.ibm.com
>> > > IBM UK Ltd, Hursley Park, SO21 2JN
>> > >
>> > >
>> > >
>> > >
>> > > From:   Mickael Maison 
>> > > To: dev@kafka.apache.org
>> > > Date:   30/05/2017 10:51
>> > > Subject:Re: [DISCUSS] KIP-164 Add unavailablePartitionCount
>> and
>> > > per-partition Unavailable metrics
>> > >
>> > >
>> > >
>> > > +1
> > It's a mystery how this didn't already exist, as it's one of the key
> > cluster health indicators
>> > >
>> > > On Mon, May 29, 2017 at 9:18 AM, Gwen Shapira 
>> wrote:
>> > >> Hi,
>> > >>
>> > >> Sounds good. I was sure this existed already for some reason :)
>> > >>
>> > >> On Sun, May 28, 2017 at 11:06 AM Dong Lin 
>> wrote:
>> > >>
>> > >>> Hi,
>> > >>>
>> > >>> We created KIP-164 to propose adding per-partition metric
>> *Unavailable*
>> > > and
>> > >>> per-broker metric *UnavailablePartitionCount*
>> > >>>
>> > >>> The KIP wik can be found at
>> > >>>
>> > >>>
>> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-164-+Add+
>> > unavailablePartitionCount+and+per-partition+Unavailable+metrics
>> > >
>> > >>> .
>> > >>>
>> > >>> Comments are welcome.
>> > >>>
>> > >>> Thanks,
>> > >>> Dong
>> > >>>
>> > >
>> > >
>> > >
>> > >
>> > > Unless stated otherwise above:
>> > > IBM United Kingdom Limited - Registered in England and Wales with
>> number
>> > > 741598.
>> > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
>> > 3AU
>> > >
>> >
>>
>
>


Re: [DISCUSS] KIP-150 - Kafka-Streams Cogroup

2017-06-08 Thread Xavier Léauté
Another reason for the serde not to be in the first cogroup call is that
the serde should not be required if you pass a StateStoreSupplier to
aggregate().

Regarding the aggregated type, I don't see why the initializer should be
favored over the aggregator to define the type. In my mind, separating the
initializer into the last aggregate call clearly indicates that the
initializer is independent of any of the aggregators or streams and that we
don't wait for grouped1 events to initialize the co-group.

On Thu, Jun 8, 2017 at 11:14 AM Guozhang Wang  wrote:

> On a second thought... This is the current proposal API
>
>
> ```
>
>  CogroupedKStream cogroup(final Initializer initializer, final
> Aggregator aggregator, final Serde
> aggValueSerde)
>
> ```
>
>
> If we do not have the initializer in the first co-group it might be a bit
> awkward for users to specify the aggregator that returns a typed  value?
> Maybe it is still better to put these two functions in the same api?
>
>
>
> Guozhang
>
> On Thu, Jun 8, 2017 at 11:08 AM, Guozhang Wang  wrote:
>
> > This suggestion lgtm. I would vote for the first alternative rather than adding
> > it to the `KStreamBuilder`, though.
> >
> > On Thu, Jun 8, 2017 at 10:58 AM, Xavier Léauté 
> > wrote:
> >
> >> I have a minor suggestion to make the API a little bit more symmetric.
> >> I feel it would make more sense to move the initializer and serde to the
> >> final aggregate statement, since the serde only applies to the state
> >> store,
> >> and the initializer doesn't bear any relation to the first group in
> >> particular. It would end up looking like this:
> >>
> >> KTable cogrouped =
> >> grouped1.cogroup(aggregator1)
> >> .cogroup(grouped2, aggregator2)
> >> .cogroup(grouped3, aggregator3)
> >> .aggregate(initializer1, aggValueSerde, storeName1);
> >>
> >> Alternatively, we could move the first cogroup() method to
> >> KStreamBuilder, similar to how we have .merge()
> >> and end up with an api that would be even more symmetric.
> >>
> >> KStreamBuilder.cogroup(grouped1, aggregator1)
> >>   .cogroup(grouped2, aggregator2)
> >>   .cogroup(grouped3, aggregator3)
> >>   .aggregate(initializer1, aggValueSerde, storeName1);
> >>
> >> This doesn't have to be a blocker, but I thought it would make the API
> >> just
> >> a tad cleaner.
> >>
> >> On Tue, Jun 6, 2017 at 3:59 PM Guozhang Wang 
> wrote:
> >>
> >> > Kyle,
> >> >
> >> > Thanks a lot for the updated KIP. It looks good to me.
> >> >
> >> >
> >> > Guozhang
> >> >
> >> >
> >> > On Fri, Jun 2, 2017 at 5:37 AM, Jim Jagielski 
> wrote:
> >> >
> >> > > This makes much more sense to me. +1
> >> > >
> >> > > > On Jun 1, 2017, at 10:33 AM, Kyle Winkelman <
> >> winkelman.k...@gmail.com>
> >> > > wrote:
> >> > > >
> >> > > > I have updated the KIP and my PR. Let me know what you think.
> >> > > > To created a cogrouped stream just call cogroup on a
> KgroupedStream
> >> and
> >> > > > supply the initializer, aggValueSerde, and an aggregator. Then
> >> continue
> >> > > > adding kgroupedstreams and aggregators. Then call one of the many
> >> > > aggregate
> >> > > > calls to create a KTable.
> >> > > >
> >> > > > Thanks,
> >> > > > Kyle
> >> > > >
> >> > > > On Jun 1, 2017 4:03 AM, "Damian Guy" 
> wrote:
> >> > > >
> >> > > >> Hi Kyle,
> >> > > >>
> >> > > >> Thanks for the update. I think just one initializer makes sense
> as
> >> it
> >> > > >> should only be called once per key and generally it is just going
> >> to
> >> > > create
> >> > > >> a new instance of whatever the Aggregate class is.
> >> > > >>
> >> > > >> Cheers,
> >> > > >> Damian
> >> > > >>
> >> > > >> On Wed, 31 May 2017 at 20:09 Kyle Winkelman <
> >> winkelman.k...@gmail.com
> >> > >
> >> > > >> wrote:
> >> > > >>
> >> > > >>> Hello all,
> >> > > >>>
> >> > > >>> I have spent some more time on this and the best alternative I
> >> have
> >> > > come
> >> > > >> up
> >> > > >>> with is:
> >> > > >>> KGroupedStream has a single cogroup call that takes an
> initializer
> >> > and
> >> > > an
> >> > > >>> aggregator.
> >> > > >>> CogroupedKStream has a cogroup call that takes additional
> >> > groupedStream
> >> > > >>> aggregator pairs.
> >> > > >>> CogroupedKStream has multiple aggregate methods that create the
> >> > > different
> >> > > >>> stores.
> >> > > >>>
> >> > > >>> I plan on updating the kip but I want people's input on if we
> >> should
> >> > > have
> >> > > >>> the initializer be passed in once at the beginning or if we
> should
> >> > > >> instead
> >> > > >>> have the initializer be required for each call to one of the
> >> > aggregate
> >> > > >>> calls. The first makes more sense to me but doesnt allow the
> user
> >> to
> >> > > >>> specify different initializers for different tables.
> >> > > >>>
> >> > > >>> Thanks,
> >> > > >>> Kyle
> >> > > >>>
> >> > > >>> On May 24, 2017 7:46 PM, "Kyle Winkelman" <
> >> winkelman.k...@gmail.com>
> >> > > >>> wrote:
> >> > > >>>
> >> > > >

[GitHub] kafka pull request #3242: KAFKA-5357 follow-up: Yammer metrics, not Kafka Me...

2017-06-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3242


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5357) StackOverFlow error in transaction coordinator

2017-06-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043233#comment-16043233
 ] 

ASF GitHub Bot commented on KAFKA-5357:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3242


> StackOverFlow error in transaction coordinator
> --
>
> Key: KAFKA-5357
> URL: https://issues.apache.org/jira/browse/KAFKA-5357
> Project: Kafka
>  Issue Type: Sub-task
>Affects Versions: 0.11.0.0
>Reporter: Apurva Mehta
>Assignee: Damian Guy
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
> Attachments: KAFKA-5357.tar.gz
>
>
> I observed the following in the broker logs: 
> {noformat}
> [2017-06-01 04:10:36,664] ERROR [Replica Manager on Broker 1]: Error 
> processing append operation on partition __transaction_state-37 
> (kafka.server.ReplicaManager)
> [2017-06-01 04:10:36,667] ERROR [TxnMarkerSenderThread-1]: Error due to 
> (kafka.common.InterBrokerSendThread)
> java.lang.StackOverflowError
> at java.security.AccessController.doPrivileged(Native Method)
> at java.io.PrintWriter.(PrintWriter.java:116)
> at java.io.PrintWriter.(PrintWriter.java:100)
> at 
> org.apache.log4j.DefaultThrowableRenderer.render(DefaultThrowableRenderer.java:58)
> at 
> org.apache.log4j.spi.ThrowableInformation.getThrowableStrRep(ThrowableInformation.java:87)
> at 
> org.apache.log4j.spi.LoggingEvent.getThrowableStrRep(LoggingEvent.java:413)
> at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:313)
> at 
> org.apache.log4j.DailyRollingFileAppender.subAppend(DailyRollingFileAppender.java:369)
> at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
> at 
> org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
> at 
> org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
> at org.apache.log4j.Category.callAppenders(Category.java:206)
> at org.apache.log4j.Category.forcedLog(Category.java:391)
> at org.apache.log4j.Category.error(Category.java:322)
> at kafka.utils.Logging$class.error(Logging.scala:105)
> at kafka.server.ReplicaManager.error(ReplicaManager.scala:122)
> at 
> kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:557)
> at 
> kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:505)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.immutable.Map$Map1.foreach(Map.scala:116)
> at 
> scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> at scala.collection.AbstractTraversable.map(Traversable.scala:104)
> at 
> kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:505)
> at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:346)
> at 
> kafka.coordinator.transaction.TransactionStateManager$$anonfun$appendTransactionToLog$1.apply$mcV$sp(TransactionStateManager.scala:589)
> at 
> kafka.coordinator.transaction.TransactionStateManager$$anonfun$appendTransactionToLog$1.apply(TransactionStateManager.scala:570)
> at 
> kafka.coordinator.transaction.TransactionStateManager$$anonfun$appendTransactionToLog$1.apply(TransactionStateManager.scala:570)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:213)
> at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:219)
> at 
> kafka.coordinator.transaction.TransactionStateManager.appendTransactionToLog(TransactionStateManager.scala:564)
> at 
> kafka.coordinator.transaction.TransactionMarkerChannelManager.kafka$coordinator$transaction$TransactionMarkerChannelManager$$retryAppendCallback$1(TransactionMarkerChannelManager.scala:225)
> at 
> kafka.coordinator.transaction.TransactionMarkerChannelManager$$anonfun$kafka$coordinator$transaction$TransactionMarkerChannelManager$$retryAppendCallback$1$4.apply(TransactionMarkerChannelManager.scala:225)
> at 
> kafka.coordinator.transaction.TransactionMarkerChannelManager$$anonfun$kafka$coordinator$transaction$TransactionMarkerChannelManager$$retryAppendCallback$1$4.apply(TransactionMarkerChannelManager.scala:225)
> at 
> kafka.coordinator.transaction.TransactionStateManager.kafka$coordinator$transaction$TransactionStateManager$$updateCacheCallback$1(TransactionStateManager.scala:561)
> at 
> kafka.coordinator.transaction.TransactionStateManager$$anonfun$appendTransactionToLog$1$$anonfun$apply$mcV$sp$4

Jenkins build is back to normal : kafka-trunk-jdk8 #1678

2017-06-08 Thread Apache Jenkins Server
See 




[jira] [Commented] (KAFKA-5409) Providing a custom client-id to the ConsoleProducer tool

2017-06-08 Thread Bharat Viswanadham (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043268#comment-16043268
 ] 

Bharat Viswanadham commented on KAFKA-5409:
---

Hi Paolo,
Yes, I missed that point.
I think that to use the client.id sent by the user, we either need to add a 
--client-id option or 
update the code to call producerProps after setting the default props in 
getNewProducerProps.
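
For illustration, a hedged sketch of that reordering (not the actual code): apply the default 
first and overlay the user-supplied properties afterwards so that they win:

{code}
// hypothetical ordering inside getNewProducerProps(): default first, user overrides last
val props = new Properties()
props.put(ProducerConfig.CLIENT_ID_CONFIG, "console-producer")
props.putAll(producerProps(config))  // --producer-property / --producer.config values win
{code}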

> Providing a custom client-id to the ConsoleProducer tool
> 
>
> Key: KAFKA-5409
> URL: https://issues.apache.org/jira/browse/KAFKA-5409
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Paolo Patierno
>Priority: Minor
>
> Hi,
> I see that the client-id property for the ConsoleProducer tool is always 
> "console-producer". It could be useful to have it as a parameter on the command 
> line, or to generate a random one as happens for the ConsoleConsumer.
> If it makes sense to you, I can work on that.
> Thanks,
> Paolo.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-1044) change log4j to slf4j

2017-06-08 Thread Viktor Somogyi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043329#comment-16043329
 ] 

Viktor Somogyi commented on KAFKA-1044:
---

@ewencp, thank you for your help.
I've started this task and, all in all, my conclusion is that we should delay 
this until the release of slf4j 1.8 (alpha2 is already out), as there are some 
log4j features in use that are not part of slf4j, such as the 
LogManager in Log4jController or the Fatal log level.
It is hard to work around these effectively without breaking 
behaviour. The best options I know of are to use the error level instead of fatal for the 
time being, or to separate Log4jController into its own module and try 
to load it at runtime when log4j is being used, to get rid of the hard log4j 
dependency; but these are quite hacky and I'm not sure the effort is worth it 
now, as we have a workaround for the original issue. Please let me know what you 
think.
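
For illustration, a hedged sketch of the error-level workaround (slf4j itself has no FATAL 
level; the object and message below are placeholders, not Kafka code):

{code}
import org.slf4j.{LoggerFactory, MarkerFactory}

object FatalLoggingSketch {
  private val log = LoggerFactory.getLogger(getClass)
  // slf4j has no FATAL level, so log at ERROR and tag the event with a "FATAL" marker
  private val fatalMarker = MarkerFactory.getMarker("FATAL")

  def fatal(message: String, t: Throwable): Unit =
    log.error(fatalMarker, message, t)
}
{code}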

> change log4j to slf4j 
> --
>
> Key: KAFKA-1044
> URL: https://issues.apache.org/jira/browse/KAFKA-1044
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.8.0
>Reporter: shijinkui
>Assignee: Viktor Somogyi
>Priority: Minor
>  Labels: newbie
>
> Can you change log4j to slf4j? In my project I use logback, and it conflicts 
> with log4j.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Build failed in Jenkins: kafka-0.11.0-jdk7 #123

2017-06-08 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] KAFKA-5357 follow-up: Yammer metrics, not Kafka Metrics

--
[...truncated 2.38 MB...]

org.apache.kafka.streams.integration.KStreamsFineGrainedAutoResetIntegrationTest
 > 
shouldOnlyReadRecordsWhereEarliestSpecifiedWithNoCommittedOffsetsWithDefaultGlobalAutoOffsetResetEarliest
 PASSED

org.apache.kafka.streams.integration.KStreamsFineGrainedAutoResetIntegrationTest
 > shouldThrowStreamsExceptionNoResetSpecified STARTED

org.apache.kafka.streams.integration.KStreamsFineGrainedAutoResetIntegrationTest
 > shouldThrowStreamsExceptionNoResetSpecified PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testShouldReadFromRegexAndNamedTopics STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testShouldReadFromRegexAndNamedTopics PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenCreated STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenCreated PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testMultipleConsumersCanReadFromPartitionedTopic STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testMultipleConsumersCanReadFromPartitionedTopic PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenDeleted STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenDeleted PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testNoMessagesSentExceptionFromOverlappingPatterns STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testNoMessagesSentExceptionFromOverlappingPatterns PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
shouldAddStateStoreToRegexDefinedSource STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
shouldAddStateStoreToRegexDefinedSource PASSED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithZeroSizedCache STARTED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithZeroSizedCache PASSED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithNonZeroSizedCache STARTED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithNonZeroSizedCache PASSED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldCompactTopicsForStateChangelogs STARTED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldCompactTopicsForStateChangelogs PASSED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldUseCompactAndDeleteForWindowStoreChangelogs STARTED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldUseCompactAndDeleteForWindowStoreChangelogs PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceSessionWindows STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceSessionWindows PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduce STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduce PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregate STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregate PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCount STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCount PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldGroupByKey STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldGroupByKey PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountWithInternalStore STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountWithInternalStore PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceWindowed STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceWindowed PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountSessionWindows STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountSessionWindows PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregateWindowed STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregateWindowed P

[GitHub] kafka pull request #3273: HOTFIX: for flaky Streams EOS integration tests

2017-06-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3273




[GitHub] kafka pull request #3272: MINOR: disable flaky Streams EOS integration tests

2017-06-08 Thread mjsax
Github user mjsax closed the pull request at:

https://github.com/apache/kafka/pull/3272




[jira] [Issue Comment Deleted] (KAFKA-1044) change log4j to slf4j

2017-06-08 Thread Viktor Somogyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viktor Somogyi updated KAFKA-1044:
--
Comment: was deleted

(was: @ewencp, thank you for your help.
I've started this task and all in all my conclusion is that we should delay 
this until the release of slf4j 1.8 (alpha2 is already out), as there are some 
log4j features that are being used but are not part of slf4j, such as the 
LogManager in Log4jController or the Fatal log level.
It is a bit hard to work around these effectively without breaking 
behaviour. The best options I know of are to log at error level instead of fatal for 
the time being, or to separate Log4jController into its own module and load it at 
runtime only when log4j is actually used, which would remove the hard log4j 
dependency. Both are quite hacky, though, and I'm not sure the effort is worth it 
now that we have a workaround for the original issue. Please let me know what you 
think.)

> change log4j to slf4j 
> --
>
> Key: KAFKA-1044
> URL: https://issues.apache.org/jira/browse/KAFKA-1044
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.8.0
>Reporter: shijinkui
>Assignee: Viktor Somogyi
>Priority: Minor
>  Labels: newbie
>
> Can you change log4j to slf4j? In my project I use logback, and it conflicts 
> with log4j.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Pull-Requests not worked on

2017-06-08 Thread Colin McCabe
So, there has been some instability in the junit tests recently due to
some big features landing.  This is starting to improve a lot, though,
so hopefully it won't be a problem for much longer.

If you do hit what you think is a bogus unit test failure, you can type
"retest this please" in the PR to trigger another junit run.

It would be good to close some of the stale PRs.  I have some myself
that I should probably close since they have been superseded by
different approaches.

With regard to OSGi specifically, it might be hard to find someone who
has the relevant expertise to review the patch.  Since our understanding
is limited, maybe it makes sense to have  someone else who understands
OSGi repackage the software for that system?  I'm just tossing out ideas
here, though-- I could be wrong.

best,
Colin


On Tue, May 23, 2017, at 09:48, Jeff Widman wrote:
> I agree with this. As a new contributor, it was a bit demoralizing to look
> at all the open PRs and wonder whether, when I sent a patch, it would just
> be left to sit in the ether.
> 
> In other projects I'm involved with, more typically the maintainers go
> through periodically and close old PR's that will never be merged. I know
> at this point it's an intimidating amount of work, but I still think it'd
> be useful to cut down this backlog.
> 
> Maybe at the SF Kafka Summit sprint have a group that does this? It's a
> decent task for n00bs who want to help but don't know where to start: ask
> them to help identify PRs that are ancient and should be closed because they
> will never be merged.
> 
> On Tue, May 23, 2017 at 4:59 AM,  wrote:
> 
> > Hello everyone
> >
> > I am wondering how pull-requests are handled for Kafka? There is currently
> > a huge amount of PRs on Github and most of them are not getting any
> > attention.
> >
> > If the maintainers only have a look at PR which passed the CI (which makes
> > sense due to the amount), then there is a problem, because the CI-pipeline
> > is not stable. I've submitted a PR myself which adds OSGi-metadata to the
> > kafka-clients artifact (see 2882). The pipeline fails randomly even though
> > the change only adds some entries to the manifest.
> > The next issue I have is that people submitting PRs cannot have a look at
> > the failing CI job. So with regards to my PR, I don't have a clue what went
> > wrong.
> > If I am missing something in the process please let me know.
> > Regarding PR 2882, please consider merging because it would save the
> > OSGi community the effort of wrapping the kafka artifact and deploying it
> > with different coordinates on Maven Central (which can confuse users).
> > regards
> > Marc
> >


Build failed in Jenkins: kafka-trunk-jdk8 #1679

2017-06-08 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] KAFKA-5357 follow-up: Yammer metrics, not Kafka Metrics

--
[...truncated 4.58 MB...]

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCount PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldGroupByKey STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldGroupByKey PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountWithInternalStore STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountWithInternalStore PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceWindowed STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceWindowed PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountSessionWindows STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountSessionWindows PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregateWindowed STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregateWindowed PASSED

org.apache.kafka.streams.integration.FanoutIntegrationTest > 
shouldFanoutTheInput STARTED

org.apache.kafka.streams.integration.FanoutIntegrationTest > 
shouldFanoutTheInput PASSED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldBeAbleToRunWithEosEnabled STARTED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldBeAbleToRunWithEosEnabled PASSED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldBeAbleToCommitToMultiplePartitions STARTED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldBeAbleToCommitToMultiplePartitions PASSED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldNotViolateEosIfOneTaskFails STARTED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldNotViolateEosIfOneTaskFails FAILED
java.lang.AssertionError: Condition not met within timeout 6. Should 
receive uncaught exception from one StreamThread.
at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:274)
at 
org.apache.kafka.streams.integration.EosIntegrationTest.shouldNotViolateEosIfOneTaskFails(EosIntegrationTest.java:406)

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldBeAbleToPerformMultipleTransactions STARTED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldBeAbleToPerformMultipleTransactions PASSED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldBeAbleToCommitMultiplePartitionOffsets STARTED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldBeAbleToCommitMultiplePartitionOffsets PASSED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldNotViolateEosIfOneTaskFailsWithState STARTED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldNotViolateEosIfOneTaskFailsWithState PASSED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldBeAbleToRestartAfterClose STARTED

org.apache.kafka.streams.integration.EosIntegrationTest > 
shouldBeAbleToRestartAfterClose PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableJoin PASSED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldCompactTopicsForStateChangelogs STARTED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldCompactTopicsForStateChangelogs PASSED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldUseCompactAndDeleteForWindowStoreChangelogs STARTED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldUseCompactAndDeleteForWindowStoreChangelogs PASSED

org.apache.kafka.streams.StreamsConfigTest > testGetConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > testGetConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > shouldAcceptAtLeastOnce STARTED

org.apache.kafka.streams.StreamsConfigTest > shouldAcceptAtLeastOnce PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerEnableIdempotenceIfEosDisabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerEnableIdempotenceIfEosDisabled PASSED

org.apache.kafka.streams.StreamsConfigTest > defaultSerdeShouldBeConfigured 
STARTED

org.apache.kafka.streams.StreamsConfigTest > defaultSerdeShouldBeC

[GitHub] kafka pull request #3201: KAFKA-5362: Add EOS system tests for Streams API

2017-06-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3201




[jira] [Resolved] (KAFKA-5362) Add EOS system tests for Streams API

2017-06-08 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-5362.
--
Resolution: Fixed

Issue resolved by pull request 3201
[https://github.com/apache/kafka/pull/3201]

> Add EOS system tests for Streams API
> 
>
> Key: KAFKA-5362
> URL: https://issues.apache.org/jira/browse/KAFKA-5362
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, system tests
>Affects Versions: 0.11.0.0
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> We need to add more system tests for Streams API with exactly-once enabled.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5362) Add EOS system tests for Streams API

2017-06-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043446#comment-16043446
 ] 

ASF GitHub Bot commented on KAFKA-5362:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3201


> Add EOS system tests for Streams API
> 
>
> Key: KAFKA-5362
> URL: https://issues.apache.org/jira/browse/KAFKA-5362
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, system tests
>Affects Versions: 0.11.0.0
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> We need to add more system tests for Streams API with exactly-once enabled.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-1955) Explore disk-based buffering in new Kafka Producer

2017-06-08 Thread Colin P. McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043458#comment-16043458
 ] 

Colin P. McCabe commented on KAFKA-1955:


As Jay wrote, there are some potential problems with the disk-based buffering 
approach:

{quote}
The cons of the second approach are the following:
1. You end up depending on disks on all the producer machines. If you have 
10,000 producers, that is 10k places state is kept. These tend to fail a lot.
2. You can get data arbitrarily delayed
3. You still don't tolerate hard outages since there is no replication in the 
producer tier
4. This tends to make problems with duplicates more common in certain failure 
scenarios.
{quote}

Do we have potential solutions for these?

bq. I believe a malloc/free implementation over `MappedByteBuffer` will be the 
best choice. This will allow the producer buffers to use a file like a piece of 
memory at the cost of maintaining a more complex free list.

How do you plan on ensuring that the messages are written to disk in a timely 
fashion?  It seems possible that you could lose quite a lot of data if you lose 
power before the memory-mapped regions are written back to disk.  Also, a 
malloc implementation is quite a lot of complexity-- are we sure it's worth it?

If we are going to do this, we'd probably want to start with something like an 
append-only log on which we call {{fsync}} periodically.  Also, we would 
need a KIP...
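
As one possible starting point for that append-only approach, here is a rough, hypothetical illustration (not an actual Kafka patch; class and parameter names are my own) of a producer-side journal that appends records to a file and calls fsync on a fixed interval:

{code}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class AppendOnlyJournal implements AutoCloseable {
    private final FileChannel channel;
    private final long flushIntervalMs;
    private long lastFlushMs = System.currentTimeMillis();

    public AppendOnlyJournal(Path file, long flushIntervalMs) throws IOException {
        this.channel = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND);
        this.flushIntervalMs = flushIntervalMs;
    }

    public void append(ByteBuffer record) throws IOException {
        while (record.hasRemaining()) {
            channel.write(record);          // append-only: records are only ever added at the end
        }
        long now = System.currentTimeMillis();
        if (now - lastFlushMs >= flushIntervalMs) {
            channel.force(false);           // periodic fsync bounds how much unsynced data can be lost
            lastFlushMs = now;
        }
    }

    @Override
    public void close() throws IOException {
        channel.force(true);
        channel.close();
    }
}
{code}

Recovery would additionally need per-record framing (length prefix plus checksum) so that a partially written tail can be detected and discarded on startup, which is the recovery-procedure concern raised above.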

> Explore disk-based buffering in new Kafka Producer
> --
>
> Key: KAFKA-1955
> URL: https://issues.apache.org/jira/browse/KAFKA-1955
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jay Kreps
>Assignee: Jay Kreps
> Attachments: KAFKA-1955.patch, 
> KAFKA-1955-RABASED-TO-8th-AUG-2015.patch
>
>
> There are two approaches to using Kafka for capturing event data that has no 
> other "source of truth store":
> 1. Just write to Kafka and try hard to keep the Kafka cluster up as you would 
> a database.
> 2. Write to some kind of local disk store and copy from that to Kafka.
> The cons of the second approach are the following:
> 1. You end up depending on disks on all the producer machines. If you have 
> 10,000 producers, that is 10k places state is kept. These tend to fail a lot.
> 2. You can get data arbitrarily delayed
> 3. You still don't tolerate hard outages since there is no replication in the 
> producer tier
> 4. This tends to make problems with duplicates more common in certain failure 
> scenarios.
> There is one big pro, though: you don't have to keep Kafka running all the 
> time.
> So far we have done nothing in Kafka to help support approach (2), but people 
> have built a lot of buffering things. It's not clear that this is necessarily 
> bad.
> However implementing this in the new Kafka producer might actually be quite 
> easy. Here is an idea for how to do it. Implementation of this idea is 
> probably pretty easy but it would require some pretty thorough testing to see 
> if it was a success.
> The new producer maintains a pool of ByteBuffer instances which it attempts 
> to recycle and uses to buffer and send messages. When unsent data is queuing 
> waiting to be sent to the cluster it is hanging out in this pool.
> One approach to implementing a disk-backed buffer would be to slightly 
> generalize this so that the buffer pool has the option to use a mmap'd file 
> backend for its ByteBuffers. When the BufferPool was created with a 
> totalMemory setting of 1GB it would preallocate a 1GB sparse file and memory 
> map it, then chop the file into batchSize MappedByteBuffer pieces and 
> populate its buffer with those.
> Everything else would work normally except now all the buffered data would be 
> disk backed and in cases where there was significant backlog these would 
> start to fill up and page out.
> We currently allow messages larger than batchSize and to handle these we do a 
> one-off allocation of the necessary size. We would have to disallow this when 
> running in mmap mode. However since the disk buffer will be really big this 
> should not be a significant limitation as the batch size can be pretty big.
> We would want to ensure that the pooling always gives out the most recently 
> used ByteBuffer (I think it does). This way under normal operation where 
> requests are processed quickly a given buffer would be reused many times 
> before any physical disk write activity occurred.
> Note that although this lets the producer buffer very large amounts of data 
> the buffer isn't really fault-tolerant, since the ordering in the file isn't 
> known so there is no easy way to recover the producer's buffer in a failure. 
> So the scope of this feature would just be to
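
For illustration only, a bare-bones sketch of the mmap'd buffer-pool idea described above (slice a preallocated sparse file into batch-sized MappedByteBuffer pieces and hand them out most-recently-used first). The names and structure are assumptions, not the attached patch:

{code}
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.util.ArrayDeque;
import java.util.Deque;

public class MappedBufferPool {
    private final Deque<MappedByteBuffer> free = new ArrayDeque<>();

    public MappedBufferPool(String path, long totalMemory, int batchSize) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "rw")) {
            file.setLength(totalMemory);                       // preallocates a (usually sparse) file
            FileChannel channel = file.getChannel();
            for (long pos = 0; pos + batchSize <= totalMemory; pos += batchSize) {
                // mappings remain valid after the channel is closed
                free.addLast(channel.map(FileChannel.MapMode.READ_WRITE, pos, batchSize));
            }
        }
    }

    // Hand out the most recently returned buffer first, so hot buffers stay in the page cache.
    public MappedByteBuffer allocate() {
        MappedByteBuffer buffer = free.pollFirst();
        if (buffer != null) {
            buffer.clear();
        }
        return buffer;
    }

    public void release(MappedByteBuffer buffer) {
        free.addFirst(buffer);
    }
}
{code}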

Build failed in Jenkins: kafka-trunk-jdk7 #2372

2017-06-08 Thread Apache Jenkins Server
See 


Changes:

[jason] HOTFIX: for flaky Streams EOS integration tests

--
[...truncated 2.40 MB...]
org.apache.kafka.streams.kstream.internals.KStreamKStreamLeftJoinTest > 
testLeftJoin STARTED

org.apache.kafka.streams.kstream.internals.KStreamKStreamLeftJoinTest > 
testLeftJoin PASSED

org.apache.kafka.streams.kstream.internals.KStreamKStreamLeftJoinTest > 
testWindowing STARTED

org.apache.kafka.streams.kstream.internals.KStreamKStreamLeftJoinTest > 
testWindowing PASSED

org.apache.kafka.streams.kstream.internals.KTableForeachTest > testForeach 
STARTED

org.apache.kafka.streams.kstream.internals.KTableForeachTest > testForeach 
PASSED

org.apache.kafka.streams.kstream.internals.KTableForeachTest > testTypeVariance 
STARTED

org.apache.kafka.streams.kstream.internals.KTableForeachTest > testTypeVariance 
PASSED

org.apache.kafka.streams.kstream.internals.KStreamPrintTest > 
testPrintKeyValueWithName STARTED

org.apache.kafka.streams.kstream.internals.KStreamPrintTest > 
testPrintKeyValueWithName PASSED

org.apache.kafka.streams.kstream.internals.GlobalKTableJoinsTest > 
shouldInnerJoinWithStream STARTED

org.apache.kafka.streams.kstream.internals.GlobalKTableJoinsTest > 
shouldInnerJoinWithStream PASSED

org.apache.kafka.streams.kstream.internals.GlobalKTableJoinsTest > 
shouldLeftJoinWithStream STARTED

org.apache.kafka.streams.kstream.internals.GlobalKTableJoinsTest > 
shouldLeftJoinWithStream PASSED

org.apache.kafka.streams.kstream.internals.KStreamMapValuesTest > 
testFlatMapValues STARTED

org.apache.kafka.streams.kstream.internals.KStreamMapValuesTest > 
testFlatMapValues PASSED

org.apache.kafka.streams.kstream.internals.KStreamFlatMapTest > testFlatMap 
STARTED

org.apache.kafka.streams.kstream.internals.KStreamFlatMapTest > testFlatMap 
PASSED

org.apache.kafka.streams.kstream.internals.WindowedStreamPartitionerTest > 
testCopartitioning STARTED

org.apache.kafka.streams.kstream.internals.WindowedStreamPartitionerTest > 
testCopartitioning PASSED

org.apache.kafka.streams.kstream.internals.WindowedStreamPartitionerTest > 
testWindowedSerializerNoArgConstructors STARTED

org.apache.kafka.streams.kstream.internals.WindowedStreamPartitionerTest > 
testWindowedSerializerNoArgConstructors PASSED

org.apache.kafka.streams.kstream.internals.WindowedStreamPartitionerTest > 
testWindowedDeserializerNoArgConstructors STARTED

org.apache.kafka.streams.kstream.internals.WindowedStreamPartitionerTest > 
testWindowedDeserializerNoArgConstructors PASSED

org.apache.kafka.streams.kstream.internals.KStreamKTableLeftJoinTest > testJoin 
STARTED

org.apache.kafka.streams.kstream.internals.KStreamKTableLeftJoinTest > testJoin 
PASSED

org.apache.kafka.streams.kstream.internals.UnlimitedWindowTest > 
shouldAlwaysOverlap STARTED

org.apache.kafka.streams.kstream.internals.UnlimitedWindowTest > 
shouldAlwaysOverlap PASSED

org.apache.kafka.streams.kstream.internals.UnlimitedWindowTest > 
cannotCompareUnlimitedWindowWithDifferentWindowType STARTED

org.apache.kafka.streams.kstream.internals.UnlimitedWindowTest > 
cannotCompareUnlimitedWindowWithDifferentWindowType PASSED

org.apache.kafka.streams.kstream.internals.KTableImplTest > 
shouldNotAllowNullMapperOnMapValues STARTED

org.apache.kafka.streams.kstream.internals.KTableImplTest > 
shouldNotAllowNullMapperOnMapValues PASSED

org.apache.kafka.streams.kstream.internals.KTableImplTest > 
shouldNotAllowNullPredicateOnFilter STARTED

org.apache.kafka.streams.kstream.internals.KTableImplTest > 
shouldNotAllowNullPredicateOnFilter PASSED

org.apache.kafka.streams.kstream.internals.KTableImplTest > 
shouldNotAllowNullOtherTableOnJoin STARTED

org.apache.kafka.streams.kstream.internals.KTableImplTest > 
shouldNotAllowNullOtherTableOnJoin PASSED

org.apache.kafka.streams.kstream.internals.KTableImplTest > 
shouldNotAllowNullSelectorOnGroupBy STARTED

org.apache.kafka.streams.kstream.internals.KTableImplTest > 
shouldNotAllowNullSelectorOnGroupBy PASSED

org.apache.kafka.streams.kstream.internals.KTableImplTest > 
shouldNotAllowNullJoinerOnLeftJoin STARTED

org.apache.kafka.streams.kstream.internals.KTableImplTest > 
shouldNotAllowNullJoinerOnLeftJoin PASSED

org.apache.kafka.streams.kstream.internals.KTableImplTest > testStateStore 
STARTED

org.apache.kafka.streams.kstream.internals.KTableImplTest > testStateStore 
PASSED

org.apache.kafka.streams.kstream.internals.KTableImplTest > 
shouldNotAllowEmptyFilePathOnWriteAsText STARTED

org.apache.kafka.streams.kstream.internals.KTableImplTest > 
shouldNotAllowEmptyFilePathOnWriteAsText PASSED

org.apache.kafka.streams.kstream.internals.KTableImplTest > 
shouldNotAllowNullOtherTableOnLeftJoin STARTED

org.apache.kafka.streams.kstream.internals.KTableImplTest > 
shouldNotAllowNullOtherTableOnLeftJoin PASSED

org.apache.kafka.streams.kstream.internals.KTableImplTest > 
shouldNotAllowNullSelectorOnToStrea

Jenkins build is back to normal : kafka-0.11.0-jdk7 #124

2017-06-08 Thread Apache Jenkins Server
See 




[jira] [Commented] (KAFKA-3925) Default log.dir=/tmp/kafka-logs is unsafe

2017-06-08 Thread Colin P. McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043469#comment-16043469
 ] 

Colin P. McCabe commented on KAFKA-3925:


The issue with {{/var}} is that if you run Kafka as an ordinary user, you 
cannot write to this directory.  Perhaps we could write to 
{{$HOME/kafka-logs}}?  Although that would imply a different location for 
different users.

I also think a lot of unit tests rely on the {{/tmp}} location.

> Default log.dir=/tmp/kafka-logs is unsafe
> -
>
> Key: KAFKA-3925
> URL: https://issues.apache.org/jira/browse/KAFKA-3925
> Project: Kafka
>  Issue Type: Bug
>  Components: config
>Affects Versions: 0.10.0.0
> Environment: Various, depends on OS and configuration
>Reporter: Peter Davis
>
> Many operating systems are configured to delete files under /tmp.  For 
> example Ubuntu has 
> [tmpreaper|http://manpages.ubuntu.com/manpages/wily/man8/tmpreaper.8.html], 
> others use tmpfs, others delete /tmp on startup. 
> Defaults are OK to make getting started easier but should not be unsafe 
> (risking data loss). 
> Something under /var would be a better default log.dir under *nix.  Or 
> relative to the Kafka bin directory to avoid needing root.  
> If the default cannot be changed, I would suggest printing a special warning to 
> the console on broker startup if log.dir is under /tmp. 
> See [users list 
> thread|http://mail-archives.apache.org/mod_mbox/kafka-users/201607.mbox/%3cCAD5tkZb-0MMuWJqHNUJ3i1+xuNPZ4tnQt-RPm65grxE0=0o...@mail.gmail.com%3e].
>   I've also been bitten by this. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-3925) Default log.dir=/tmp/kafka-logs is unsafe

2017-06-08 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043472#comment-16043472
 ] 

Ismael Juma commented on KAFKA-3925:


Unit tests set the log.dir explicitly so they would not be affected by a change 
in the default.

> Default log.dir=/tmp/kafka-logs is unsafe
> -
>
> Key: KAFKA-3925
> URL: https://issues.apache.org/jira/browse/KAFKA-3925
> Project: Kafka
>  Issue Type: Bug
>  Components: config
>Affects Versions: 0.10.0.0
> Environment: Various, depends on OS and configuration
>Reporter: Peter Davis
>
> Many operating systems are configured to delete files under /tmp.  For 
> example Ubuntu has 
> [tmpreaper|http://manpages.ubuntu.com/manpages/wily/man8/tmpreaper.8.html], 
> others use tmpfs, others delete /tmp on startup. 
> Defaults are OK to make getting started easier but should not be unsafe 
> (risking data loss). 
> Something under /var would be a better default log.dir under *nix.  Or 
> relative to the Kafka bin directory to avoid needing root.  
> If the default cannot be changed, I would suggest printing a special warning to 
> the console on broker startup if log.dir is under /tmp. 
> See [users list 
> thread|http://mail-archives.apache.org/mod_mbox/kafka-users/201607.mbox/%3cCAD5tkZb-0MMuWJqHNUJ3i1+xuNPZ4tnQt-RPm65grxE0=0o...@mail.gmail.com%3e].
>   I've also been bitten by this. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka-site pull request #60: Update delivery semantics section for KIP-98

2017-06-08 Thread hachikuji
GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka-site/pull/60

Update delivery semantics section for KIP-98



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka-site update-delivery-semantics

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka-site/pull/60.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #60


commit 0759f3a22e377e12e857eb6da4977adb62261d30
Author: Jason Gustafson 
Date:   2017-06-08T21:28:13Z

Update delivery semantics section for KIP-98






[jira] [Resolved] (KAFKA-5336) The required ACL permission for ListGroup is invalid

2017-06-08 Thread Colin P. McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin P. McCabe resolved KAFKA-5336.

   Resolution: Duplicate
Fix Version/s: 0.11.0.0

KAFKA-5292 in 0.11 added {{Describe}} as a valid operation on {{Cluster}}

> The required ACL permission for ListGroup is invalid
> 
>
> Key: KAFKA-5336
> URL: https://issues.apache.org/jira/browse/KAFKA-5336
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.10.2.1
>Reporter: Vahid Hashemian
>Assignee: Vahid Hashemian
>Priority: Minor
> Fix For: 0.11.0.0
>
>
> The {{ListGroup}} API authorizes requests with _Describe_ access to the 
> cluster resource:
> {code}
>   def handleListGroupsRequest(request: RequestChannel.Request) {
> if (!authorize(request.session, Describe, Resource.ClusterResource)) {
>   sendResponseMaybeThrottle(request, requestThrottleMs =>
> ListGroupsResponse.fromError(requestThrottleMs, 
> Errors.CLUSTER_AUTHORIZATION_FAILED))
> } else {
>   ...
> {code}
>  However, the list of operations (or permissions) allowed for the cluster 
> resource does not include _Describe_:
> {code}
>   val ResourceTypeToValidOperations = Map[ResourceType, Set[Operation]] (
> ...
> Cluster -> Set(Create, ClusterAction, DescribeConfigs, AlterConfigs, 
> IdempotentWrite, All),
> ...
>   )
> {code}
> Only a user with _All_ cluster permission can successfully call the 
> {{ListGroup}} API. No other permission (not even any combination that does 
> not include _All_) would let a user use this API.
> The bug could be as simple as a typo in the API handler. Though it's not 
> obvious what actual permission was meant to be used there (perhaps 
> _DescribeConfigs_?)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka-site issue #60: Update delivery semantics section for KIP-98

2017-06-08 Thread hachikuji
Github user hachikuji commented on the issue:

https://github.com/apache/kafka-site/pull/60
  
cc @guozhangwang @apurva @ijuma 




[jira] [Updated] (KAFKA-5336) ListGroup requires Describe on Cluster, but the command-line AclCommand tool does not allow this to be set

2017-06-08 Thread Colin P. McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin P. McCabe updated KAFKA-5336:
---
Summary: ListGroup requires Describe on Cluster, but the command-line 
AclCommand tool does not allow this to be set  (was: The required ACL 
permission for ListGroup is invalid)

> ListGroup requires Describe on Cluster, but the command-line AclCommand tool 
> does not allow this to be set
> --
>
> Key: KAFKA-5336
> URL: https://issues.apache.org/jira/browse/KAFKA-5336
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.10.2.1
>Reporter: Vahid Hashemian
>Assignee: Vahid Hashemian
>Priority: Minor
> Fix For: 0.11.0.0
>
>
> The {{ListGroup}} API authorizes requests with _Describe_ access to the 
> cluster resource:
> {code}
>   def handleListGroupsRequest(request: RequestChannel.Request) {
> if (!authorize(request.session, Describe, Resource.ClusterResource)) {
>   sendResponseMaybeThrottle(request, requestThrottleMs =>
> ListGroupsResponse.fromError(requestThrottleMs, 
> Errors.CLUSTER_AUTHORIZATION_FAILED))
> } else {
>   ...
> {code}
>  However, the list of operations (or permissions) allowed for the cluster 
> resource does not include _Describe_:
> {code}
>   val ResourceTypeToValidOperations = Map[ResourceType, Set[Operation]] (
> ...
> Cluster -> Set(Create, ClusterAction, DescribeConfigs, AlterConfigs, 
> IdempotentWrite, All),
> ...
>   )
> {code}
> Only a user with _All_ cluster permission can successfully call the 
> {{ListGroup}} API. No other permission (not even any combination that does 
> not include _All_) would let a user use this API.
> The bug could be as simple as a typo in the API handler. Though it's not 
> obvious what actual permission was meant to be used there (perhaps 
> _DescribeConfigs_?)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5402) JmxReporter Fetch metrics for kafka.server should not be created when client quotas are not enabled

2017-06-08 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043480#comment-16043480
 ] 

Jun Rao commented on KAFKA-5402:


Not sure if this is the same as KAFKA-3980. Currently, if a metric is not being 
actively updated (after the client is gone), the metric is supposed to be 
automatically removed after 1 hour.

Also, it seems that currently even if quota is not enabled, we still create the 
quota metric with the client-id. Not sure how useful those metrics are w/o 
enabling quota. Perhaps we should only create the metric if quota is enabled. 
[~rsivaram], what do you think?

> JmxReporter Fetch metrics for kafka.server should not be created when client 
> quotas are not enabled
> ---
>
> Key: KAFKA-5402
> URL: https://issues.apache.org/jira/browse/KAFKA-5402
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.1.1
>Reporter: Koelli Mungee
> Attachments: Fetch.jpg, Metrics.jpg
>
>
> JMXReporter kafka.server Fetch metrics should not be created when client 
> quotas are not enforced for client fetch requests. Currently, these metrics 
> are created and this can cause OutOfMemoryException in the KafkaServer in 
> cases where a large number of consumers are being created rapidly.
> Attaching screenshots from a heapdump showing the 
> kafka.server:type=Fetch,client-id=consumer-358567 with different client.ids 
> from a kafkaserver where client quotas were not enabled.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-3199) LoginManager should allow using an existing Subject

2017-06-08 Thread Ji sun (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043484#comment-16043484
 ] 

Ji sun commented on KAFKA-3199:
---

Hi Adam, do you mind me taking this jira for Kafka 0.10?

> LoginManager should allow using an existing Subject
> ---
>
> Key: KAFKA-3199
> URL: https://issues.apache.org/jira/browse/KAFKA-3199
> Project: Kafka
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 0.9.0.0
>Reporter: Adam Kunicki
>Assignee: Adam Kunicki
>Priority: Critical
>
> LoginManager currently creates a new Login in the constructor which then 
> performs a login and starts a ticket renewal thread. The problem here is that 
> because Kafka performs its own login, it doesn't offer the ability to re-use 
> an existing subject that's already managed by the client application.
> The goal of LoginManager appears to be to return a valid Subject. 
> It would be a simple fix to have LoginManager.acquireLoginManager() check for 
> a new config e.g. kerberos.use.existing.subject. 
> Instead of creating a new Login in the constructor, this would simply call 
> Subject.getSubject(AccessController.getContext()) to use the already logged-in 
> Subject.
> This is also doable without introducing a new configuration and simply 
> checking if there is already a valid Subject available, but I think it may be 
> preferable to require that users explicitly request this behavior.
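
A minimal sketch of the reuse described above, assuming a hypothetical flag and JAAS context name (this is illustrative, not the existing LoginManager code):

{code}
import java.security.AccessController;
import javax.security.auth.Subject;
import javax.security.auth.login.LoginContext;
import javax.security.auth.login.LoginException;

public class SubjectSource {
    /**
     * Returns the Subject already attached to the caller's access-control context when
     * useExistingSubject is set, and falls back to a fresh JAAS login otherwise.
     * The jaasContextName and useExistingSubject parameters are illustrative assumptions.
     */
    public static Subject subjectFor(String jaasContextName, boolean useExistingSubject)
            throws LoginException {
        if (useExistingSubject) {
            Subject existing = Subject.getSubject(AccessController.getContext());
            if (existing != null) {
                return existing;    // reuse the Subject the client application already manages
            }
        }
        LoginContext loginContext = new LoginContext(jaasContextName);
        loginContext.login();       // current behaviour: perform a fresh login
        return loginContext.getSubject();
    }
}
{code}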



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Kafka-2170 & Kafka-1194

2017-06-08 Thread Colin McCabe
I think this is an area that needs some more work.  I don't think we'll
have a fix for it in 0.11.

best,
Colin

On Thu, Jun 1, 2017, at 05:51, Jacob Braaten wrote:
> Hello,
> 
> I am emailing to check on the status of the two bugs above. Both pertain
> to
> the same issue of not being able to delete old log segments on a Windows
> machine.
> 
> I am just curious if you know of a time table for when these would be
> patched.
> 
> KAFKA-1194 
> KAFKA-2170 
> 
> Thank you,
> Jake


Re: [DISCUSS] KIP-150 - Kafka-Streams Cogroup

2017-06-08 Thread Kyle Winkelman
I chose the current way so if you make multiple tables you don't need to
supply the serde and initializer multiple times. It is true that you
wouldn't need the serde if you use a StateStoreSupplier but I think we could
note that in the method call.

I am fine with the first option if that's what people like. Maybe Eno or
Damian can give their opinion.

I don't really like the KStreamBuilder option because I think it is kind of
hard to find unless you know it's there.
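
For reference, a compilable but hypothetical sketch of the interface shape being discussed; the generic parameters and method names are one reading of this thread, not the final KIP-150 signatures.

import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.streams.kstream.Aggregator;
import org.apache.kafka.streams.kstream.Initializer;
import org.apache.kafka.streams.kstream.KGroupedStream;
import org.apache.kafka.streams.kstream.KTable;

interface CogroupedKStreamSketch<K, T> {

    // Each chained call adds another grouped stream plus the aggregator that folds
    // its values into the shared aggregate of type T.
    <V> CogroupedKStreamSketch<K, T> cogroup(KGroupedStream<K, V> groupedStream,
                                             Aggregator<? super K, ? super V, T> aggregator);

    // Current proposal: the initializer and serde are supplied on the first cogroup()
    // call (on KGroupedStream), so the terminal aggregate() only needs a store name.
    KTable<K, T> aggregate(String storeName);

    // Xavier's alternative: move the initializer and serde to the terminal call instead.
    KTable<K, T> aggregate(Initializer<T> initializer, Serde<T> aggValueSerde, String storeName);
}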

On Jun 8, 2017 1:51 PM, "Xavier Léauté"  wrote:

> Another reason for the serde not to be in the first cogroup call, is that
> the serde should not be required if you pass a StateStoreSupplier to
> aggregate()
>
> Regarding the aggregated type, I don't see why the initializer should be
> favored over the aggregator to define the type. In my mind separating the
> initializer into the last aggregate call clearly indicates that the
> initializer is independent of any of the aggregators or streams and that we
> don't wait for grouped1 events to initialize the co-group.
>
> On Thu, Jun 8, 2017 at 11:14 AM Guozhang Wang  wrote:
>
> > On a second thought... This is the current proposal API
> >
> >
> > ```
> >
> >  CogroupedKStream cogroup(final Initializer initializer,
> final
> > Aggregator aggregator, final Serde
> > aggValueSerde)
> >
> > ```
> >
> >
> > If we do not have the initializer in the first co-group it might be a bit
> > awkward for users to specify the aggregator that returns a typed value?
> > Maybe it is still better to put these two functions in the same API?
> >
> >
> >
> > Guozhang
> >
> > On Thu, Jun 8, 2017 at 11:08 AM, Guozhang Wang 
> wrote:
> >
> > > This suggestion lgtm. I would vote for the first alternative than
> adding
> > > it to the `KStreamBuilder` though.
> > >
> > > On Thu, Jun 8, 2017 at 10:58 AM, Xavier Léauté 
> > > wrote:
> > >
> > >> I have a minor suggestion to make the API a little bit more symmetric.
> > >> I feel it would make more sense to move the initializer and serde to
> the
> > >> final aggregate statement, since the serde only applies to the state
> > >> store,
> > >> and the initializer doesn't bear any relation to the first group in
> > >> particular. It would end up looking like this:
> > >>
> > >> KTable cogrouped =
> > >> grouped1.cogroup(aggregator1)
> > >> .cogroup(grouped2, aggregator2)
> > >> .cogroup(grouped3, aggregator3)
> > >> .aggregate(initializer1, aggValueSerde, storeName1);
> > >>
> > >> Alternatively, we could move the first cogroup() method to
> > >> KStreamBuilder, similar to how we have .merge()
> > >> and end up with an api that would be even more symmetric.
> > >>
> > >> KStreamBuilder.cogroup(grouped1, aggregator1)
> > >>   .cogroup(grouped2, aggregator2)
> > >>   .cogroup(grouped3, aggregator3)
> > >>   .aggregate(initializer1, aggValueSerde, storeName1);
> > >>
> > >> This doesn't have to be a blocker, but I thought it would make the API
> > >> just
> > >> a tad cleaner.
> > >>
> > >> On Tue, Jun 6, 2017 at 3:59 PM Guozhang Wang 
> > wrote:
> > >>
> > >> > Kyle,
> > >> >
> > >> > Thanks a lot for the updated KIP. It looks good to me.
> > >> >
> > >> >
> > >> > Guozhang
> > >> >
> > >> >
> > >> > On Fri, Jun 2, 2017 at 5:37 AM, Jim Jagielski 
> > wrote:
> > >> >
> > >> > > This makes much more sense to me. +1
> > >> > >
> > >> > > > On Jun 1, 2017, at 10:33 AM, Kyle Winkelman <
> > >> winkelman.k...@gmail.com>
> > >> > > wrote:
> > >> > > >
> > >> > > > I have updated the KIP and my PR. Let me know what you think.
> > >> > > > To create a cogrouped stream just call cogroup on a
> > KGroupedStream
> > >> and
> > >> > > > supply the initializer, aggValueSerde, and an aggregator. Then
> > >> continue
> > >> > > > adding kgroupedstreams and aggregators. Then call one of the
> many
> > >> > > aggregate
> > >> > > > calls to create a KTable.
> > >> > > >
> > >> > > > Thanks,
> > >> > > > Kyle
> > >> > > >
> > >> > > > On Jun 1, 2017 4:03 AM, "Damian Guy" 
> > wrote:
> > >> > > >
> > >> > > >> Hi Kyle,
> > >> > > >>
> > >> > > >> Thanks for the update. I think just one initializer makes sense
> > as
> > >> it
> > >> > > >> should only be called once per key and generally it is just
> going
> > >> to
> > >> > > create
> > >> > > >> a new instance of whatever the Aggregate class is.
> > >> > > >>
> > >> > > >> Cheers,
> > >> > > >> Damian
> > >> > > >>
> > >> > > >> On Wed, 31 May 2017 at 20:09 Kyle Winkelman <
> > >> winkelman.k...@gmail.com
> > >> > >
> > >> > > >> wrote:
> > >> > > >>
> > >> > > >>> Hello all,
> > >> > > >>>
> > >> > > >>> I have spent some more time on this and the best alternative I
> > >> have
> > >> > > come
> > >> > > >> up
> > >> > > >>> with is:
> > >> > > >>> KGroupedStream has a single cogroup call that takes an
> > initializer
> > >> > and
> > >> > > an
> > >> > > >>> aggregator.
> > >> > > >>> CogroupedKStream has a cogroup call that takes additional
> > >> > groupedStream
> > >> > >

[jira] [Commented] (KAFKA-1955) Explore disk-based buffering in new Kafka Producer

2017-06-08 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043494#comment-16043494
 ] 

Jay Kreps commented on KAFKA-1955:
--

I think the patch I submitted was kind of a cool hack, but after thinking about 
it I don't think it is really what you want.

Here are the considerations I thought we should probably think through:
1. How will recovery work? The patch I gave didn't have a mechanism to recover 
from a failure. In the case of an OS crash the OS gives very weak guarantees on 
what is on disk for any data that hasn't been fsync'd. Not only can arbitrary 
bits of data be missing but it is even possible with some FS configurations to 
get arbitrary corrupt blocks that haven't been zero'd yet. I think to get this 
right you need a commit log and recovery procedure that verifies unsync'd data 
on startup. I'm not 100% sure you can do this with just the buffer pool, though 
maybe you can.
2. What are the ordering guarantees for buffered data?
3. How does this interact with transactions/EOS?
4. Should it be the case that all writes go through the commit log or should it 
be the case that only failures are journaled. If you journal all writes prior 
to sending to the server, the problem is that that amounts to significant 
overhead and leads to the possibility that logging or other I/O can slow you 
down. If you journal only failures you have the problem that your throughput 
may be very high in the non-failure scenario and then when Kafka goes down 
suddenly you start doing I/O but that is much slower and your throughput drops 
precipitously. Either may be okay but it is worth thinking through what the 
right behavior is.

> Explore disk-based buffering in new Kafka Producer
> --
>
> Key: KAFKA-1955
> URL: https://issues.apache.org/jira/browse/KAFKA-1955
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jay Kreps
>Assignee: Jay Kreps
> Attachments: KAFKA-1955.patch, 
> KAFKA-1955-RABASED-TO-8th-AUG-2015.patch
>
>
> There are two approaches to using Kafka for capturing event data that has no 
> other "source of truth store":
> 1. Just write to Kafka and try hard to keep the Kafka cluster up as you would 
> a database.
> 2. Write to some kind of local disk store and copy from that to Kafka.
> The cons of the second approach are the following:
> 1. You end up depending on disks on all the producer machines. If you have 
> 10,000 producers, that is 10k places state is kept. These tend to fail a lot.
> 2. You can get data arbitrarily delayed
> 3. You still don't tolerate hard outages since there is no replication in the 
> producer tier
> 4. This tends to make problems with duplicates more common in certain failure 
> scenarios.
> There is one big pro, though: you don't have to keep Kafka running all the 
> time.
> So far we have done nothing in Kafka to help support approach (2), but people 
> have built a lot of buffering things. It's not clear that this is necessarily 
> bad.
> However implementing this in the new Kafka producer might actually be quite 
> easy. Here is an idea for how to do it. Implementation of this idea is 
> probably pretty easy but it would require some pretty thorough testing to see 
> if it was a success.
> The new producer maintains a pool of ByteBuffer instances which it attempts 
> to recycle and uses to buffer and send messages. When unsent data is queuing 
> waiting to be sent to the cluster it is hanging out in this pool.
> One approach to implementing a disk-backed buffer would be to slightly 
> generalize this so that the buffer pool has the option to use a mmap'd file 
> backend for its ByteBuffers. When the BufferPool was created with a 
> totalMemory setting of 1GB it would preallocate a 1GB sparse file and memory 
> map it, then chop the file into batchSize MappedByteBuffer pieces and 
> populate its buffer with those.
> Everything else would work normally except now all the buffered data would be 
> disk backed and in cases where there was significant backlog these would 
> start to fill up and page out.
> We currently allow messages larger than batchSize and to handle these we do a 
> one-off allocation of the necessary size. We would have to disallow this when 
> running in mmap mode. However since the disk buffer will be really big this 
> should not be a significant limitation as the batch size can be pretty big.
> We would want to ensure that the pooling always gives out the most recently 
> used ByteBuffer (I think it does). This way under normal operation where 
> requests are processed quickly a given buffer would be reused many times 
> before any physical disk write activity occurred.
> Note that although this lets the producer buffer very large amo

[jira] [Comment Edited] (KAFKA-1955) Explore disk-based buffering in new Kafka Producer

2017-06-08 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043494#comment-16043494
 ] 

Jay Kreps edited comment on KAFKA-1955 at 6/8/17 9:41 PM:
--

I think the patch I submitted was kind of a cool hack, but after thinking about 
it I wasn't convinced it was really what you actually want.

Here are the considerations I thought we should probably think through:
1. How will recovery work? The patch I gave didn't have a mechanism to recover 
from a failure. In the case of an OS crash the OS gives very weak guarantees on 
what is on disk for any data that hasn't been fsync'd. Not only can arbitrary 
bits of data be missing but it is even possible with some FS configurations to 
get arbitrary corrupt blocks that haven't been zero'd yet. I think to get this 
right you need a commit log and recovery procedure that verifies unsync'd data 
on startup. I'm not 100% sure you can do this with just the buffer pool, though 
maybe you can.
2. What are the ordering guarantees for buffered data?
3. How does this interact with transactions/EOS?
4. Should it be the case that all writes go through the commit log or should it 
be the case that only failures are journaled. If you journal all writes prior 
to sending to the server, the problem is that that amounts to significant 
overhead and leads to the possibility that logging or other I/O can slow you 
down. If you journal only failures you have the problem that your throughput 
may be very high in the non-failure scenario and then when Kafka goes down 
suddenly you start doing I/O but that is much slower and your throughput drops 
precipitously. Either may be okay but it is worth thinking through what the 
right behavior is.


was (Author: jkreps):
I think the patch I submitted was kind of a cool hack, but after thinking about 
it I don't think it is really what you want.

Here are the considerations I thought we should probably think through:
1. How will recovery work? The patch I gave didn't have a mechanism to recover 
from a failure. In the case of an OS crash the OS gives very weak guarantees on 
what is on disk for any data that hasn't been fsync'd. Not only can arbitrary 
bits of data be missing but it is even possible with some FS configurations to 
get arbitrary corrupt blocks that haven't been zero'd yet. I think to get this 
right you need a commit log and recovery procedure that verifies unsync'd data 
on startup. I'm not 100% sure you can do this with just the buffer pool, though 
maybe you can.
2. What are the ordering guarantees for buffered data?
3. How does this interact with transactions/EOS?
4. Should it be the case that all writes go through the commit log or should it 
be the case that only failures are journaled. If you journal all writes prior 
to sending to the server, the problem is that that amounts to significant 
overhead and leads to the possibility that logging or other I/O can slow you 
down. If you journal only failures you have the problem that your throughput 
may be very high in the non-failure scenario and then when Kafka goes down 
suddenly you start doing I/O but that is much slower and your throughput drops 
precipitously. Either may be okay but it is worth thinking through what the 
right behavior is.

> Explore disk-based buffering in new Kafka Producer
> --
>
> Key: KAFKA-1955
> URL: https://issues.apache.org/jira/browse/KAFKA-1955
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jay Kreps
>Assignee: Jay Kreps
> Attachments: KAFKA-1955.patch, 
> KAFKA-1955-RABASED-TO-8th-AUG-2015.patch
>
>
> There are two approaches to using Kafka for capturing event data that has no 
> other "source of truth store":
> 1. Just write to Kafka and try hard to keep the Kafka cluster up as you would 
> a database.
> 2. Write to some kind of local disk store and copy from that to Kafka.
> The cons of the second approach are the following:
> 1. You end up depending on disks on all the producer machines. If you have 
> 10,000 producers, that is 10k places state is kept. These tend to fail a lot.
> 2. You can get data arbitrarily delayed
> 3. You still don't tolerate hard outages since there is no replication in the 
> producer tier
> 4. This tends to make problems with duplicates more common in certain failure 
> scenarios.
> There is one big pro, though: you don't have to keep Kafka running all the 
> time.
> So far we have done nothing in Kafka to help support approach (2), but people 
> have built a lot of buffering things. It's not clear that this is necessarily 
> bad.
> However implementing this in the new Kafka producer might actually be quite 
> easy. Here is an idea for how to do it. Implementation of this id

[jira] [Comment Edited] (KAFKA-1955) Explore disk-based buffering in new Kafka Producer

2017-06-08 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043494#comment-16043494
 ] 

Jay Kreps edited comment on KAFKA-1955 at 6/8/17 9:43 PM:
--

I think the patch I submitted was kind of a cool hack, but after thinking about 
it I wasn't convinced it was really what you actually want.

Here are the considerations I thought we should probably think through:
1. How will recovery work? The patch I gave didn't have a mechanism to recover 
from a failure. I don't think this is really good enough. It means that it is 
okay if Kafka goes down, or if the app goes down, but not both. This helps but 
seems like not really what you want. But to properly handle app failure isn't 
that easy. For example, in the case of an OS crash the OS gives very weak 
guarantees on what is on disk for any data that hasn't been fsync'd. Not only 
can arbitrary bits of data be missing but it is even possible with some FS 
configurations to get arbitrary corrupt blocks that haven't been zero'd yet. I 
think to get this right you need a commit log and recovery procedure that 
verifies unsync'd data on startup. I'm not 100% sure you can do this with just 
the buffer pool, though maybe you can.
2. What are the ordering guarantees for buffered data?
3. How does this interact with transactions/EOS?
4. Should it be the case that all writes go through the commit log or should it 
be the case that only failures are journaled. If you journal all writes prior 
to sending to the server, the problem is that that amounts to significant 
overhead and leads to the possibility that logging or other I/O can slow you 
down. If you journal only failures you have the problem that your throughput 
may be very high in the non-failure scenario and then when Kafka goes down 
suddenly you start doing I/O but that is much slower and your throughput drops 
precipitously. Either may be okay but it is worth thinking through what the 
right behavior is.


was (Author: jkreps):
I think the patch I submitted was kind of a cool hack, but after thinking about 
it I wasn't convinced it was really what you actually want.

Here are the considerations I thought we should probably think through:
1. How will recovery work? The patch I gave didn't have a mechanism to recover 
from a failure. In the case of an OS crash the OS gives very weak guarantees on 
what is on disk for any data that hasn't been fsync'd. Not only can arbitrary 
bits of data be missing but it is even possible with some FS configurations to 
get arbitrary corrupt blocks that haven't been zero'd yet. I think to get this 
right you need a commit log and recovery procedure that verifies unsync'd data 
on startup. I'm not 100% sure you can do this with just the buffer pool, though 
maybe you can.
2. What are the ordering guarantees for buffered data?
3. How does this interact with transactions/EOS?
4. Should it be the case that all writes go through the commit log or should it 
be the case that only failures are journaled. If you journal all writes prior 
to sending to the server, the problem is that that amounts to significant 
overhead and leads to the possibility that logging or other I/O can slow you 
down. If you journal only failures you have the problem that your throughput 
may be very high in the non-failure scenario and then when Kafka goes down 
suddenly you start doing I/O but that is much slower and your throughput drops 
precipitously. Either may be okay but it is worth thinking through what the 
right behavior is.

> Explore disk-based buffering in new Kafka Producer
> --
>
> Key: KAFKA-1955
> URL: https://issues.apache.org/jira/browse/KAFKA-1955
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer 
>Affects Versions: 0.8.2.0
>Reporter: Jay Kreps
>Assignee: Jay Kreps
> Attachments: KAFKA-1955.patch, 
> KAFKA-1955-RABASED-TO-8th-AUG-2015.patch
>
>
> There are two approaches to using Kafka for capturing event data that has no 
> other "source of truth store":
> 1. Just write to Kafka and try hard to keep the Kafka cluster up as you would 
> a database.
> 2. Write to some kind of local disk store and copy from that to Kafka.
> The cons of the second approach are the following:
> 1. You end up depending on disks on all the producer machines. If you have 
> 10,000 producers, that is 10k places state is kept. These tend to fail a lot.
> 2. You can get data arbitrarily delayed
> 3. You still don't tolerate hard outages since there is no replication in the 
> producer tier
> 4. This tends to make problems with duplicates more common in certain failure 
> scenarios.
> There is one big pro, though: you don't have to keep Kafka running all the 
> time.
> So far we have done nothing in Kafka to help support 

Re: [DISCUSS] KIP-150 - Kafka-Streams Cogroup

2017-06-08 Thread Guozhang Wang
Note that although the internal `AbstractStoreSupplier` does maintain the
key-value serdes, we do not enforce the interface of `StateStoreSupplier`
to always retain that information, and hence we cannot assume that
StateStoreSuppliers always retain key / value serdes.

On Thu, Jun 8, 2017 at 11:51 AM, Xavier Léauté  wrote:

> Another reason for the serde not to be in the first cogroup call is that
> the serde should not be required if you pass a StateStoreSupplier to
> aggregate()
>
> Regarding the aggregated type, I don't see why the initializer should be
> favored over the aggregator to define the type. In my mind separating the
> initializer into the last aggregate call clearly indicates that the
> initializer is independent of any of the aggregators or streams and that we
> don't wait for grouped1 events to initialize the co-group.
>
> On Thu, Jun 8, 2017 at 11:14 AM Guozhang Wang  wrote:
>
> > On second thought... This is the current proposal API:
> >
> >
> > ```
> >
> > CogroupedKStream cogroup(final Initializer initializer,
> >                          final Aggregator aggregator,
> >                          final Serde aggValueSerde)
> >
> > ```
> >
> >
> > If we do not have the initializer in the first co-group it might be a bit
> > awkward for users to specify the aggregator that returns a typed value?
> > Maybe it is still better to put these two functions in the same API?
> >
> >
> >
> > Guozhang
> >
> > On Thu, Jun 8, 2017 at 11:08 AM, Guozhang Wang 
> wrote:
> >
> > > This suggestion lgtm. I would vote for the first alternative rather than
> > > adding it to the `KStreamBuilder` though.
> > >
> > > On Thu, Jun 8, 2017 at 10:58 AM, Xavier Léauté 
> > > wrote:
> > >
> > >> I have a minor suggestion to make the API a little bit more symmetric.
> > >> I feel it would make more sense to move the initializer and serde to
> the
> > >> final aggregate statement, since the serde only applies to the state
> > >> store,
> > >> and the initializer doesn't bear any relation to the first group in
> > >> particular. It would end up looking like this:
> > >>
> > >> KTable cogrouped =
> > >> grouped1.cogroup(aggregator1)
> > >> .cogroup(grouped2, aggregator2)
> > >> .cogroup(grouped3, aggregator3)
> > >> .aggregate(initializer1, aggValueSerde, storeName1);
> > >>
> > >> Alternatively, we could move the first cogroup() method to
> > >> KStreamBuilder, similar to how we have .merge()
> > >> and end up with an api that would be even more symmetric.
> > >>
> > >> KStreamBuilder.cogroup(grouped1, aggregator1)
> > >>   .cogroup(grouped2, aggregator2)
> > >>   .cogroup(grouped3, aggregator3)
> > >>   .aggregate(initializer1, aggValueSerde, storeName1);
> > >>
> > >> This doesn't have to be a blocker, but I thought it would make the API
> > >> just
> > >> a tad cleaner.
> > >>
> > >> On Tue, Jun 6, 2017 at 3:59 PM Guozhang Wang 
> > wrote:
> > >>
> > >> > Kyle,
> > >> >
> > >> > Thanks a lot for the updated KIP. It looks good to me.
> > >> >
> > >> >
> > >> > Guozhang
> > >> >
> > >> >
> > >> > On Fri, Jun 2, 2017 at 5:37 AM, Jim Jagielski 
> > wrote:
> > >> >
> > >> > > This makes much more sense to me. +1
> > >> > >
> > >> > > > On Jun 1, 2017, at 10:33 AM, Kyle Winkelman <
> > >> winkelman.k...@gmail.com>
> > >> > > wrote:
> > >> > > >
> > >> > > > I have updated the KIP and my PR. Let me know what you think.
> > >> > > > To create a cogrouped stream, just call cogroup on a KGroupedStream
> > >> > > > and supply the initializer, aggValueSerde, and an aggregator. Then
> > >> > > > continue adding KGroupedStreams and aggregators. Then call one of the
> > >> > > > many aggregate calls to create a KTable.
> > >> > > >
> > >> > > > Thanks,
> > >> > > > Kyle
> > >> > > >
> > >> > > > On Jun 1, 2017 4:03 AM, "Damian Guy" 
> > wrote:
> > >> > > >
> > >> > > >> Hi Kyle,
> > >> > > >>
> > >> > > >> Thanks for the update. I think just one initializer makes sense
> > as
> > >> it
> > >> > > >> should only be called once per key and generally it is just
> going
> > >> to
> > >> > > create
> > >> > > >> a new instance of whatever the Aggregate class is.
> > >> > > >>
> > >> > > >> Cheers,
> > >> > > >> Damian
> > >> > > >>
> > >> > > >> On Wed, 31 May 2017 at 20:09 Kyle Winkelman <
> > >> winkelman.k...@gmail.com
> > >> > >
> > >> > > >> wrote:
> > >> > > >>
> > >> > > >>> Hello all,
> > >> > > >>>
> > >> > > >>> I have spent some more time on this and the best alternative I
> > >> have
> > >> > > come
> > >> > > >> up
> > >> > > >>> with is:
> > >> > > >>> KGroupedStream has a single cogroup call that takes an
> > initializer
> > >> > and
> > >> > > an
> > >> > > >>> aggregator.
> > >> > > >>> CogroupedKStream has a cogroup call that takes additional
> > >> > groupedStream
> > >> > > >>> aggregator pairs.
> > >> > > >>> CogroupedKStream has multiple aggregate methods that create
> the
> > >> > > different
> > >> > > >>> stores.
> > >> > > >>>
> > >> > > >>> I plan on upd
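
To make the two API shapes discussed in this thread easier to compare, here is 
an illustrative (non-compilable) sketch; the API is still under discussion, so 
the exact generics and the form of the final aggregate() call are assumptions 
based on the messages above:

```java
// Current proposal: initializer and serde supplied on the first cogroup() call.
KTable<K, CG> cogrouped =
    grouped1.cogroup(initializer1, aggregator1, aggValueSerde)
            .cogroup(grouped2, aggregator2)
            .cogroup(grouped3, aggregator3)
            .aggregate(storeName1);

// Xavier's alternative: initializer and serde moved to the final aggregate() call.
KTable<K, CG> cogroupedAlt =
    grouped1.cogroup(aggregator1)
            .cogroup(grouped2, aggregator2)
            .cogroup(grouped3, aggregator3)
            .aggregate(initializer1, aggValueSerde, storeName1);
```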

[jira] [Commented] (KAFKA-3199) LoginManager should allow using an existing Subject

2017-06-08 Thread Adam Kunicki (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043503#comment-16043503
 ] 

Adam Kunicki commented on KAFKA-3199:
-

Yes, thank you! [~jisunkim]

> LoginManager should allow using an existing Subject
> ---
>
> Key: KAFKA-3199
> URL: https://issues.apache.org/jira/browse/KAFKA-3199
> Project: Kafka
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 0.9.0.0
>Reporter: Adam Kunicki
>Assignee: Adam Kunicki
>Priority: Critical
>
> LoginManager currently creates a new Login in the constructor which then 
> performs a login and starts a ticket renewal thread. The problem here is that 
> because Kafka performs its own login, it doesn't offer the ability to re-use 
> an existing subject that's already managed by the client application.
> The goal of LoginManager appears to be returning a valid Subject. 
> It would be a simple fix to have LoginManager.acquireLoginManager() check for 
> a new config, e.g. kerberos.use.existing.subject. 
> Instead of creating a new Login in the constructor, this would simply call 
> Subject.getSubject(AccessController.getContext()) to use the already 
> logged-in Subject.
> This is also doable without introducing a new configuration, by simply 
> checking whether a valid Subject is already available, but I think it may be 
> preferable to require that users explicitly request this behavior.
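
For illustration only (this is not the Kafka code; the class and method names 
are mine), the core of the idea is the standard JAAS lookup mentioned above:

```java
import java.security.AccessController;
import javax.security.auth.Subject;

public final class ExistingSubjectCheck {
    public static void main(String[] args) {
        // Returns the Subject bound to the current access control context, if the
        // application has already logged in (e.g. via its own LoginContext);
        // null otherwise.
        Subject existing = Subject.getSubject(AccessController.getContext());
        if (existing != null) {
            System.out.println("Reusing existing Subject: " + existing.getPrincipals());
        } else {
            System.out.println("No existing Subject; a fresh login would be required.");
        }
    }
}
```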



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [VOTE] KIP-134: Delay initial consumer group rebalance

2017-06-08 Thread Guozhang Wang
Just recapping on client-side vs. broker-side config: we did discuss adding
this as a client-side config and bumping the join-group request (I think both
Ismael and Ewen questioned it) to include this configured value to the broker.
I cannot remember any strong motivation against going with the client-side
config, except that we felt a default non-zero value will benefit most users,
assuming they start with more than one member in their group, while only
advanced users would really realize this config exists and tune it themselves.

I agree that we could reconsider it for the next release if we observe
that it is actually affecting more users than benefiting them.
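
For reference, the broker-side configuration added by this KIP is
group.initial.rebalance.delay.ms (the "3 sec delay" mentioned below); the
quickstart compromise suggested by Jun and Ismael amounts to overriding it in
the development properties file, roughly:

```
# config/server.properties -- development override discussed in this thread;
# the broker default delays the initial group rebalance by a few seconds.
group.initial.rebalance.delay.ms=0
```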

Guozhang

On Wed, Jun 7, 2017 at 2:26 AM, Damian Guy  wrote:

> Hi Jun/Ismael,
>
> Sounds good to me.
>
> Thanks,
> Damian
>
> On Tue, 6 Jun 2017 at 23:08 Ismael Juma  wrote:
>
> > Hi Jun,
> >
> > The console consumer issue also came up in a conversation I was having
> > recently. Seems like the config/server.properties change is a reasonable
> > compromise given that we have other defaults that are for development.
> >
> > Ismael
> >
> > On Tue, Jun 6, 2017 at 10:59 PM, Jun Rao  wrote:
> >
> > > Hi, Everyone,
> > >
> > > Sorry for being late; I just came across this thread. I have
> > > a couple of concerns on this. (1) It seems the amount of delay will be
> > > application specific. So, it seems that it's better for the delay to
> be a
> > > client side config instead of a server side one? (2) When running
> console
> > > consumer in quickstart, a minimum of 3 sec delay seems to be a bad
> > > experience for our users.
> > >
> > > Since we are getting late into the release cycle, it may be a bit too
> > late
> > > to make big changes in the 0.11 release. Perhaps we should at least
> > > consider overriding the delay in config/server.properties to 0 to
> improve
> > > the quickstart experience?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > >
> > > On Tue, Apr 11, 2017 at 12:19 AM, Damian Guy 
> > wrote:
> > >
> > > > Hi Onur,
> > > >
> > > > It was in my previous email. But here it is again.
> > > >
> > > > 
> > > >
> > > > 1. Better rebalance timing. We will try to rebalance only when all the
> > > > consumers in a group have joined. The challenge would be that someone has to
> > > > define what ALL consumers means; it could either be a time or a number of
> > > > consumers, etc.
> > > >
> > > > 2. Avoid frequent rebalance. For example, if there are 100 consumers
> > in a
> > > > group, today, in the worst case, we may end up with 100 rebalances
> even
> > > if
> > > > all the consumers joined the group in a reasonably small amount of
> > time.
> > > > Frequent rebalance is also a bad thing for brokers.
> > > >
> > > > Having a client side configuration may solve problem 1 better because
> > > each
> > > > consumer group can potentially configure their own timing. However,
> it
> > > does
> > > > not really prevent frequent rebalance in general because some of the
> > > > consumers can be misconfigured. (This may have something to do with
> > > KIP-124
> > > > as well. But if quota is applied on the JoinGroup/SyncGroup request
> it
> > > may
> > > > cause some unwanted cascading effects.)
> > > >
> > > > Having a broker side configuration may result in less flexibility for
> > > each
> > > > consumer group, but it can prevent frequent rebalance better. I think
> > > with
> > > > some reasonable design, the rebalance timing issue can be resolved on
> > the
> > > > broker side as well. Matthias had a good point on extending the delay
> > > when
> > > > a new consumer joins a group (we actually did something similar to
> > batch
> > > > ISR change propagation). For example, let's say on the broker side,
> we
> > > will
> > > > always delay 2 seconds each time we see a new consumer joining a
> > consumer
> > > > group. This would probably work for most of the consumer groups and
> > will
> > > > also limit the rebalance frequency to protect the brokers.
> > > >
> > > > I am not sure about the streams use case here, but if something like
> 2
> > > > seconds of delay is acceptable for streams, I would prefer adding the
> > > > configuration to the broker so that we can address both problems.
> > > >
> > > > On Thu, 6 Apr 2017 at 17:11 Onur Karaman <
> onurkaraman.apa...@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > Hi Damian.
> > > > >
> > > > > Can you copy the point Becket made earlier that you say isn't
> > > addressed?
> > > > >
> > > > > On Thu, Apr 6, 2017 at 2:51 AM, Damian Guy 
> > > wrote:
> > > > >
> > > > > > Thanks all, the Vote is now closed and the KIP has been accepted
> > > with 9
> > > > > +1s
> > > > > >
> > > > > > 3 binding::
> > > > > > Guozhang,
> > > > > > Jason,
> > > > > > Ismael
> > > > > >
> > > > > > 6 non-binding:
> > > > > > Bill,
> > > > > > Eno,
> > > > > > Mathieu,
> > > > > > Matthias,
> > > > > > Dong,
> 

[jira] [Commented] (KAFKA-4628) Support KTable/GlobalKTable Joins

2017-06-08 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043528#comment-16043528
 ] 

Guozhang Wang commented on KAFKA-4628:
--

[~dminkovsky] Thanks for sharing your use cases. We are actively working on the 
table / global table join semantics now; stay tuned.

> Support KTable/GlobalKTable Joins
> -
>
> Key: KAFKA-4628
> URL: https://issues.apache.org/jira/browse/KAFKA-4628
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Damian Guy
> Fix For: 0.11.1.0
>
>
> In KIP-99 we have added support for GlobalKTables, however we don't currently 
> support KTable/GlobalKTable joins as they require materializing a state store 
> for the join. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5317) Update KIP-98 to reflect changes during implementation.

2017-06-08 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043541#comment-16043541
 ] 

Gwen Shapira commented on KAFKA-5317:
-

I understand the importance, but is it really a release blocker?

> Update KIP-98 to reflect changes during implementation.
> ---
>
> Key: KAFKA-5317
> URL: https://issues.apache.org/jira/browse/KAFKA-5317
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Apurva Mehta
>Priority: Blocker
>  Labels: documentation, exactly-once
> Fix For: 0.11.0.0
>
>
> While implementing the EOS design, there are some minor (or major?) tweaks we 
> made as we got hands-on with the code base. We will compile all these changes 
> in a single pass once the code is ready, and this JIRA is for keeping track of 
> all the changes we made.
> 02/27/2017: collapse the two types of transactional log messages into a 
> single type with key as transactional id, and value as [pid, epoch, 
> transaction_timeout, transaction_state, [topic partition ] ]. Also using a 
> single memory map in cache instead of two on the TC.
> 03/01/2017: for pid expiration, we decided to use min(transactional id 
> expiration timeout, topic retention). For topics enabled for compaction only, 
> we just use the transactional timeout. If the retention setting is larger 
> than the transactional id expiration timeout, then the pid will be 
> "logically" expired (i.e. we will remove it from the cached pid mapping and 
> ignore it when rebuilding the cache from the log)
> 03/20/2017: add a new exception type in `o.a.k.common.errors` for invalid 
> transaction timeout values.
> 03/25/2017: extend WriteTxnMarkerRequest to contain multiple markers for 
> multiple PIDs with a single request.
> 04/20/2017: add transactionStartTime to TransactionMetadata
> 04/26/2017: added a new retriable error: Errors.CONCURRENT_TRANSACTIONS
> 04/01/2017: We also enforce acks=all on the client when idempotence is 
> enabled. Without this, we again cannot guarantee idempotence.
> 04/10/2017: We also don't pass the underlying exception to 
> `RetriableOffsetCommitFailedException` anymore: 
> https://issues.apache.org/jira/browse/KAFKA-5052
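
To illustrate the 04/01/2017 item above (this snippet is not part of the KIP 
text; the broker address, topic, and serializers are assumptions), an 
idempotent producer must also be configured with acks=all:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class IdempotentProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
        // Anything weaker than acks=all is rejected when idempotence is enabled.
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("test-topic", "key", "value"));
        }
    }
}
```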



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3272: MINOR: disable flaky Streams EOS integration tests

2017-06-08 Thread mjsax
GitHub user mjsax reopened a pull request:

https://github.com/apache/kafka/pull/3272

MINOR: disable flaky Streams EOS integration tests



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/kafka minor-disable-eos-tests

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3272.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3272


commit 07f9b8fc90ed13bb551c5806b1147d63ec4c2b88
Author: Matthias J. Sax 
Date:   2017-06-08T17:01:41Z

MINOR: disable flaky Streams EOS integration tests




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Info regarding kafka topic

2017-06-08 Thread BigData dev
Hi,
I have a 3 node Kafka Broker cluster.
I have created a topic, and the leader for the topic is broker 1 (1001). Then
that broker died. But when I look at the information in ZooKeeper for the topic,
I see the leader is still set to broker 1 (1001) and the ISR is set to 1001. Is
this a bug in Kafka? Now that the leader has died, shouldn't the leader have
been set to none?

*[zk: localhost:2181(CONNECTED) 7] get
/brokers/topics/t3/partitions/0/state*

*{"controller_epoch":1,"leader":1001,"version":1,"leader_epoch":1,"isr":[1001]}*

*cZxid = 0x10078*

*ctime = Thu Jun 08 14:50:07 PDT 2017*

*mZxid = 0x1008c*

*mtime = Thu Jun 08 14:51:09 PDT 2017*

*pZxid = 0x10078*

*cversion = 0*

*dataVersion = 1*

*aclVersion = 0*

*ephemeralOwner = 0x0*

*dataLength = 78*

*numChildren = 0*

*[zk: localhost:2181(CONNECTED) 8] *


And when I use the describe command, the output is

*[root@meets2 kafka-broker]# bin/kafka-topics.sh --describe --topic t3
--zookeeper localhost:2181*

*Topic:t3 PartitionCount:1 ReplicationFactor:2 Configs:*

*Topic: t3 Partition: 0 Leader: 1001 Replicas: 1001,1003 Isr: 1001*


When I use the --unavailable-partitions option, I can see it correctly.

*[root@meets2 kafka-broker]# bin/kafka-topics.sh --describe --topic t3
--zookeeper localhost:2181 --unavailable-partitions*

* Topic: t3 Partition: 0 Leader: 1001 Replicas: 1001,1003 Isr: 1001*


But in the ZooKeeper topic state, shouldn't the leader have been set to none
rather than the dead broker? Is this by design, or is it a bug in Kafka? Could
you please provide any information on this?


*Thanks,*

*Bharat*


[GitHub] kafka pull request #3271: KAFKA-5411: AdminClient javadoc and documentation ...

2017-06-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3271


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-5411) Generate javadoc for AdminClient and show configs in documentation

2017-06-08 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira updated KAFKA-5411:

   Resolution: Fixed
Fix Version/s: 0.11.1.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 3271
[https://github.com/apache/kafka/pull/3271]

> Generate javadoc for AdminClient and show configs in documentation
> --
>
> Key: KAFKA-5411
> URL: https://issues.apache.org/jira/browse/KAFKA-5411
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Blocker
> Fix For: 0.11.0.0, 0.11.1.0
>
>
> Also fix the table of contents.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5411) Generate javadoc for AdminClient and show configs in documentation

2017-06-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043546#comment-16043546
 ] 

ASF GitHub Bot commented on KAFKA-5411:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3271


> Generate javadoc for AdminClient and show configs in documentation
> --
>
> Key: KAFKA-5411
> URL: https://issues.apache.org/jira/browse/KAFKA-5411
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Blocker
> Fix For: 0.11.0.0, 0.11.1.0
>
>
> Also fix the table of contents.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3272: MINOR: disable flaky Streams EOS integration tests

2017-06-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3272


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3274: KAFKA-3199 LoginManager should allow using an exis...

2017-06-08 Thread utenakr
GitHub user utenakr opened a pull request:

https://github.com/apache/kafka/pull/3274

KAFKA-3199 LoginManager should allow using an existing Subject

LoginManager or KerberosLogin (for Kafka > 0.10) should allow using an 
existing Subject. If there's an existing Subject, the JAAS configuration won't 
be needed in getService().

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/utenakr/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3274.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3274


commit 95d6b98440f02fee23f8063ad082ec3dae4bd0b2
Author: Ji Sun 
Date:   2017-06-08T22:21:50Z

KAFKA-3199 LoginManager should allow using an existing Subject




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3199) LoginManager should allow using an existing Subject

2017-06-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043552#comment-16043552
 ] 

ASF GitHub Bot commented on KAFKA-3199:
---

GitHub user utenakr opened a pull request:

https://github.com/apache/kafka/pull/3274

KAFKA-3199 LoginManager should allow using an existing Subject

LoginManager or KerberosLogin (for Kafka > 0.10) should allow using an 
existing Subject. If there's an existing Subject, the JAAS configuration won't 
be needed in getService().

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/utenakr/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3274.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3274


commit 95d6b98440f02fee23f8063ad082ec3dae4bd0b2
Author: Ji Sun 
Date:   2017-06-08T22:21:50Z

KAFKA-3199 LoginManager should allow using an existing Subject




> LoginManager should allow using an existing Subject
> ---
>
> Key: KAFKA-3199
> URL: https://issues.apache.org/jira/browse/KAFKA-3199
> Project: Kafka
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 0.9.0.0
>Reporter: Adam Kunicki
>Assignee: Adam Kunicki
>Priority: Critical
>
> LoginManager currently creates a new Login in the constructor which then 
> performs a login and starts a ticket renewal thread. The problem here is that 
> because Kafka performs its own login, it doesn't offer the ability to re-use 
> an existing subject that's already managed by the client application.
> The goal of LoginManager appears to be returning a valid Subject. 
> It would be a simple fix to have LoginManager.acquireLoginManager() check for 
> a new config, e.g. kerberos.use.existing.subject. 
> Instead of creating a new Login in the constructor, this would simply call 
> Subject.getSubject(AccessController.getContext()) to use the already 
> logged-in Subject.
> This is also doable without introducing a new configuration, by simply 
> checking whether a valid Subject is already available, but I think it may be 
> preferable to require that users explicitly request this behavior.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build is back to normal : kafka-trunk-jdk7 #2373

2017-06-08 Thread Apache Jenkins Server
See 




[jira] [Commented] (KAFKA-5317) Update KIP-98 to reflect changes during implementation.

2017-06-08 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043562#comment-16043562
 ] 

Ismael Juma commented on KAFKA-5317:


Technically, it should be.

> Update KIP-98 to reflect changes during implementation.
> ---
>
> Key: KAFKA-5317
> URL: https://issues.apache.org/jira/browse/KAFKA-5317
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Apurva Mehta
>Priority: Blocker
>  Labels: documentation, exactly-once
> Fix For: 0.11.0.0
>
>
> While implementing the EOS design, there are some minor (or major?) tweaks we 
> made as we got hands-on with the code base. We will compile all these changes 
> in a single pass once the code is ready, and this JIRA is for keeping track of 
> all the changes we made.
> 02/27/2017: collapse the two types of transactional log messages into a 
> single type with key as transactional id, and value as [pid, epoch, 
> transaction_timeout, transaction_state, [topic partition ] ]. Also using a 
> single memory map in cache instead of two on the TC.
> 03/01/2017: for pid expiration, we decided to use min(transactional id 
> expiration timeout, topic retention). For topics enabled for compaction only, 
> we just use the transactional timeout. If the retention setting is larger 
> than the transactional id expiration timeout, then the pid will be 
> "logically" expired (i.e. we will remove it from the cached pid mapping and 
> ignore it when rebuilding the cache from the log)
> 03/20/2017: add a new exception type in `o.a.k.common.errors` for invalid 
> transaction timeout values.
> 03/25/2017: extend WriteTxnMarkerRequest to contain multiple markers for 
> multiple PIDs with a single request.
> 04/20/2017: add transactionStartTime to TransactionMetadata
> 04/26/2017: added a new retriable error: Errors.CONCURRENT_TRANSACTIONS
> 04/01/2017: We also enforce acks=all on the client when idempotence is 
> enabled. Without this, we again cannot guarantee idempotence.
> 04/10/2017: We also don't pass the underlying exception to 
> `RetriableOffsetCommitFailedException` anymore: 
> https://issues.apache.org/jira/browse/KAFKA-5052



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

