[GitHub] kafka pull request: MINOR: Add property to configure showing of st...

2016-01-07 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/739

MINOR: Add property to configure showing of standard streams in Gradle

This is handy when debugging certain kinds of Jenkins failures.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka gradle-show-standard-streams

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/739.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #739


commit 8f9a51fab3dede011a770dda0225eb0ba5065678
Author: Ismael Juma 
Date:   2016-01-07T10:30:16Z

Fix indenting in build.gradle

commit 783dfa83495a4b03ccc4bbb061b86a5ee80ee097
Author: Ismael Juma 
Date:   2016-01-07T10:30:40Z

Add property to configure showing of standard streams in Gradle




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Work started] (KAFKA-3063) Transient exist(1) in unknown test cases

2016-01-07 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on KAFKA-3063 started by Ismael Juma.
--
> Transient exist(1) in unknown test cases
> 
>
> Key: KAFKA-3063
> URL: https://issues.apache.org/jira/browse/KAFKA-3063
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Guozhang Wang
>Assignee: Ismael Juma
> Fix For: 0.9.1.0
>
>
> We see transient failures like the following
> {code}
> FAILURE: Build failed with an exception.
> * What went wrong:
> Execution failed for task ':core:test'.
> > Process 'Gradle Test Executor 2' finished with non-zero exit value 1
> {code}
> which are likely to be from an unexpected System.exit(1). But with the 
> current logging settings it is hard to locate which test cases triggered 
> these failures. More investigation is needed.
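
Not part of the ticket, but one common way to pinpoint an unexpected
System.exit in a JVM test run is to install a SecurityManager that turns the
exit into an exception, so the offending test and call site show up with a
stack trace. A rough sketch (names are illustrative only):

{code}
import java.security.Permission;

// Diagnostic sketch (not from the ticket): convert any System.exit() during a
// test run into an exception so the caller appears in the test output.
public class ExitTrap {
    public static void install() {
        System.setSecurityManager(new SecurityManager() {
            @Override
            public void checkExit(int status) {
                throw new SecurityException("Unexpected System.exit(" + status + ")");
            }
            @Override
            public void checkPermission(Permission perm) {
                // allow everything else
            }
        });
    }

    public static void main(String[] args) {
        install();
        System.exit(1); // now throws SecurityException instead of killing the JVM
    }
}
{code}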



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3063) LogRecoveryTest exits with -1 occasionally

2016-01-07 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3063:
---
Summary: LogRecoveryTest exits with -1 occasionally  (was: Transient 
exist(1) in unknown test cases)

> LogRecoveryTest exits with -1 occasionally
> --
>
> Key: KAFKA-3063
> URL: https://issues.apache.org/jira/browse/KAFKA-3063
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Guozhang Wang
>Assignee: Ismael Juma
> Fix For: 0.9.1.0
>
>
> We see transient failures like the following
> {code}
> FAILURE: Build failed with an exception.
> * What went wrong:
> Execution failed for task ':core:test'.
> > Process 'Gradle Test Executor 2' finished with non-zero exit value 1
> {code}
> which are likely to be from an unexpected System.exit(1). But with the 
> current logging settings it is hard to locate which test cases triggered 
> these failures. More investigation is needed.





[jira] [Commented] (KAFKA-3068) NetworkClient may connect to a different Kafka cluster than originally configured

2016-01-07 Thread Eno Thereska (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15087309#comment-15087309
 ] 

Eno Thereska commented on KAFKA-3068:
-

[~junrao], [~hachikuji] I understand the concerns. What I don't like about 
using the bootstrap servers is that the problem is punted to the user (to 
provide enough bootstrap servers, to keep track of whether they have moved, and 
to restart producers when they do; for a long-running cluster of 100+ machines 
that is hard to do). [~junrao]: between these two non-ideal solutions, do we 
have a sense of which one is the least bad? I can change the code to use the 
bootstrap brokers, but I am worried an equal number of users may be 
dissatisfied with that.
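
For reference, the user-side mitigation being debated here amounts to listing
several bootstrap servers so the client can still fetch metadata when some of
them are retired. A minimal producer sketch, with placeholder host and topic
names:

{code}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Sketch: give the client several bootstrap servers instead of one, so losing
// a single host does not leave it without a way to refresh metadata.
public class BootstrapListSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092"); // placeholders
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("foo", "key", "value")); // example topic
        }
    }
}
{code}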

> NetworkClient may connect to a different Kafka cluster than originally 
> configured
> -
>
> Key: KAFKA-3068
> URL: https://issues.apache.org/jira/browse/KAFKA-3068
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>
> In https://github.com/apache/kafka/pull/290, we added the logic to cache all 
> brokers (id and ip) that the client has ever seen. If we can't find an 
> available broker from the current Metadata, we will pick a broker that we 
> have ever seen (in NetworkClient.leastLoadedNode()).
> One potential problem this logic can introduce is the following. Suppose that 
> we have a broker with id 1 in a Kafka cluster. A producer client remembers 
> this broker in nodesEverSeen. At some point, we bring down this broker and 
> use the host in a different Kafka cluster. Then, the producer client uses 
> this broker from nodesEverSeen to refresh metadata. It will find the metadata 
> in a different Kafka cluster and start producing data there.





[jira] [Commented] (KAFKA-2988) Change default configuration of the log cleaner

2016-01-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1508#comment-1508
 ] 

ASF GitHub Bot commented on KAFKA-2988:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/686


> Change default configuration of the log cleaner
> ---
>
> Key: KAFKA-2988
> URL: https://issues.apache.org/jira/browse/KAFKA-2988
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.9.0.0
>Reporter: Grant Henke
>Assignee: Grant Henke
>
> Since 0.9.0 the internal "__consumer_offsets" topic is being used more 
> heavily. Because this is a compacted topic, "log.cleaner.enable" needs to be 
> "true" in order for it to be compacted. 
> Since this is critical for core Kafka functionality, we should change the 
> default to true and potentially consider removing the option to disable it 
> altogether.





[GitHub] kafka pull request: KAFKA-2988: Change default configuration of th...

2016-01-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/686




Jenkins build is back to normal : kafka-trunk-jdk7 #943

2016-01-07 Thread Apache Jenkins Server
See 



Connecting with new consumers and existing group appears to cause existing group to rebalance

2016-01-07 Thread Cliff Rhyne
Hi,

I'm testing out some changes with the 0.9.0.0 new KafkaConsumer API.  We
re-use the same consumer group ID across different components of the
application (which consume from different topics).  One topic is always
being consumed from, the rest are turned on and off.

If I only run the always-on consumer, I have a very low occurrence rate of
the log message below:

org.apache.kafka.clients.consumer.internals.AbstractCoordinator - Attempt
to heart beat failed since the group is rebalancing, try to re-join group.


If I run both types of consumers, the log message occurs frequently and the
always-on consumer eventually doesn't succeed in rejoining (I see the
attempt in the logs to rejoin but nothing happens after that).  I only have
logs on the client side to work with; there's nothing showing up in the
kafka logs to show why the group's state isn't stable.

Thanks,
Cliff

-- 
Cliff Rhyne
Software Engineering Lead
e: crh...@signal.co
signal.co


Cut Through the Noise

This e-mail and any files transmitted with it are for the sole use of the
intended recipient(s) and may contain confidential and privileged
information. Any unauthorized use of this email is strictly prohibited.
©2015 Signal. All rights reserved.


Re: Connecting with new consumers and existing group appears to cause existing group to rebalance

2016-01-07 Thread Jason Gustafson
Hey Cliff,

Are you using the 0.9.0.0 release? We've fixed a few problems in the 0.9.0
branch, some of which might explain the behavior you're seeing. There was
one bug in particular which resulted in the consumer not fetching data for
a set of partitions after a rebalance.

-Jason


Re: Connecting with new consumers and existing group appears to cause existing group to rebalance

2016-01-07 Thread Cliff Rhyne
Hi Jason,

I'm just on the 0.9.0.0 release.  Are the fixes in the client, the kafka
service, or both?  I'll give it a try.

Is there a timeline for when 0.9.0.1 would be released?

Thanks,
Cliff


[jira] [Created] (KAFKA-3076) BrokerChangeListener should log the brokers in order

2016-01-07 Thread Jun Rao (JIRA)
Jun Rao created KAFKA-3076:
--

 Summary: BrokerChangeListener should log the brokers in order
 Key: KAFKA-3076
 URL: https://issues.apache.org/jira/browse/KAFKA-3076
 Project: Kafka
  Issue Type: Improvement
  Components: core
Affects Versions: 0.9.0.0
Reporter: Jun Rao


Currently, in BrokerChangeListener, we log the full, new, and deleted broker sets 
in random order. It would be better to log them in sorted order.
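
A minimal illustration of the idea (in Java for brevity; the actual listener
lives in the controller code): sort the broker ids before building the log
line so the sets are easy to compare across log entries.

{code}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

// Illustration only: print broker ids in sorted order rather than set-iteration order.
public class SortedBrokerLogging {
    public static void main(String[] args) {
        Set<Integer> newBrokerIds = new HashSet<>(Arrays.asList(3, 1, 2)); // example ids
        String sorted = newBrokerIds.stream()
                .sorted()
                .map(String::valueOf)
                .collect(Collectors.joining(","));
        System.out.println("Newly added brokers: " + sorted); // always "1,2,3"
    }
}
{code}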





[jira] [Assigned] (KAFKA-3068) NetworkClient may connect to a different Kafka cluster than originally configured

2016-01-07 Thread Eno Thereska (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eno Thereska reassigned KAFKA-3068:
---

Assignee: Eno Thereska

> NetworkClient may connect to a different Kafka cluster than originally 
> configured
> -
>
> Key: KAFKA-3068
> URL: https://issues.apache.org/jira/browse/KAFKA-3068
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Eno Thereska
>
> In https://github.com/apache/kafka/pull/290, we added the logic to cache all 
> brokers (id and ip) that the client has ever seen. If we can't find an 
> available broker from the current Metadata, we will pick a broker that we 
> have ever seen (in NetworkClient.leastLoadedNode()).
> One potential problem this logic can introduce is the following. Suppose that 
> we have a broker with id 1 in a Kafka cluster. A producer client remembers 
> this broker in nodesEverSeen. At some point, we bring down this broker and 
> use the host in a different Kafka cluster. Then, the producer client uses 
> this broker from nodesEverSeen to refresh metadata. It will find the metadata 
> in a different Kafka cluster and start producing data there.





Re: Connecting with new consumers and existing group appears to cause existing group to rebalance

2016-01-07 Thread Jason Gustafson
There have been bugs affecting both the client and the server. The one I
mentioned above only affected the client, so you could try updating it
alone if that's easier, but it would be better to do both.

I'll leave it to others to comment on the release timeline. I haven't seen
any major consumer-related bugs in the past couple weeks, so my feeling is
that it's starting to stabilize. It would be nice to get KIP-41 into the
next release though.

-Jason


[jira] [Commented] (KAFKA-3068) NetworkClient may connect to a different Kafka cluster than originally configured

2016-01-07 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088145#comment-15088145
 ] 

Ewen Cheslack-Postava commented on KAFKA-3068:
--

You can get into bad states without losing everything simultaneously, and in 
situations that aren't unrealistic in practice: see 
https://issues.apache.org/jira/browse/KAFKA-1843?focusedCommentId=14517659&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14517659

> NetworkClient may connect to a different Kafka cluster than originally 
> configured
> -
>
> Key: KAFKA-3068
> URL: https://issues.apache.org/jira/browse/KAFKA-3068
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Eno Thereska
>
> In https://github.com/apache/kafka/pull/290, we added the logic to cache all 
> brokers (id and ip) that the client has ever seen. If we can't find an 
> available broker from the current Metadata, we will pick a broker that we 
> have ever seen (in NetworkClient.leastLoadedNode()).
> One potential problem this logic can introduce is the following. Suppose that 
> we have a broker with id 1 in a Kafka cluster. A producer client remembers 
> this broker in nodesEverSeen. At some point, we bring down this broker and 
> use the host in a different Kafka cluster. Then, the producer client uses 
> this broker from nodesEverSeen to refresh metadata. It will find the metadata 
> in a different Kafka cluster and start producing data there.





Re: Connecting with new consumers and existing group appears to cause existing group to rebalance

2016-01-07 Thread Cliff Rhyne
I'll give the 0.9.0 trunk a try.

By the way, it looks to me like this might be a separate issue.  I just
set up unique group IDs for the stop/start topics, separate from the always-on
topic, and my test passed.  I think the issue is that
GroupCoordinator.doJoinGroup() only tracks the group and not the topic when
deciding whether to rebalance (GroupCoordinator.prepareRebalance()).  If a
new consumer joins an existing group, it triggers a rebalance even if it's
consuming from a new topic (which matches my symptoms).

What do you think?


Re: Connecting with new consumers and existing group appears to cause existing group to rebalance

2016-01-07 Thread Jason Gustafson
>
> If a new consumer joins an existing group, it triggers a rebalance even
> if it's
> consuming from a new topic (which matches my symptoms).
>

Not sure I understand the issue. Rebalances are triggered when either 1)
group membership changes, 2) a consumer's subscription changes, or 3) the
number of partitions for a topic changes.

-Jason
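
For anyone wanting to watch these rebalances from the application side, the
0.9 consumer accepts a ConsumerRebalanceListener at subscribe time; a small
sketch with placeholder broker address and topic:

{code}
import java.util.Arrays;
import java.util.Collection;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

// Sketch: log every rebalance this consumer goes through, which makes it
// easier to see when another member joining or leaving the group (case 1
// above) causes partitions to be revoked and reassigned.
public class RebalanceLoggingConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("group.id", "app");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Arrays.asList("foo"), new ConsumerRebalanceListener() {
                @Override
                public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                    System.out.println("Rebalance starting, revoked: " + partitions);
                }
                @Override
                public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                    System.out.println("Rebalance finished, assigned: " + partitions);
                }
            });
            while (true)
                consumer.poll(1000); // the callbacks fire inside poll()
        }
    }
}
{code}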


Re: TMR should not create topic if not existing.

2016-01-07 Thread Mayuresh Gharat
+ dev

Hi

There has been discussion on the ticket
https://issues.apache.org/jira/browse/KAFKA-2887 that we are going to
deprecate auto-creation of topics, or at least turn it off by default, once we
have the CreateTopics API. It also says the patch is available for this.

The only other ticket I came across in that discussion is
https://issues.apache.org/jira/browse/KAFKA-1507.

I wanted to confirm that https://issues.apache.org/jira/browse/KAFKA-1507
is the ticket that has the CreateTopics API patch.


-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: Connecting with new consumers and existing group appears to cause existing group to rebalance

2016-01-07 Thread Cliff Rhyne
I'll explain.

Say there are two topics, foo and bar.  Foo has two partitions (foo-0 and
foo-1).  Bar has one partition (bar-0).  The application uses one consumer
group for all KafkaConsumers called "app".  The application has two
consumers always consuming from the two foo partitions, but
creates/destroys a consumer for bar as needed.  Connecting to bar to
consume shouldn't cause a rebalance of foo, but because all three consumers
use "app" as their group, it does.

One of my assumptions is that topics and rebalancing should operate
independently from each other regardless of the group.

Thanks,
Cliff
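
A minimal sketch of the setup described above (0.9.0.0 Java consumer; broker
address and deserializers are placeholders): because all three consumers share
the group id "app", starting or stopping the bar consumer forces the two foo
consumers through a rebalance as well. Giving the on-demand consumer its own
group id, as suggested later in the thread, removes that coupling.

{code}
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Sketch of the scenario above: all consumers share group.id "app", so the
// bar consumer's lifecycle changes the group membership and rebalances the
// two always-on foo consumers too.
public class SharedGroupSketch {
    static KafkaConsumer<String, String> newConsumer(String groupId, String topic) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("group.id", groupId);
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Arrays.asList(topic));
        return consumer;
    }

    public static void main(String[] args) {
        KafkaConsumer<String, String> foo1 = newConsumer("app", "foo"); // always-on
        KafkaConsumer<String, String> foo2 = newConsumer("app", "foo"); // always-on
        KafkaConsumer<String, String> bar  = newConsumer("app", "bar"); // on demand: joining triggers a group-wide rebalance
        // ... poll loops omitted; closing bar later triggers another rebalance.
        foo1.close(); foo2.close(); bar.close();
    }
}
{code}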


[jira] [Commented] (KAFKA-3068) NetworkClient may connect to a different Kafka cluster than originally configured

2016-01-07 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088200#comment-15088200
 ] 

Ewen Cheslack-Postava commented on KAFKA-3068:
--

[~junrao] It's definitely a real issue -- see 
https://issues.apache.org/jira/browse/KAFKA-1843?focusedCommentId=14517659&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14517659
 where you can get into unrecoverable situations if a Kafka node is 
decommissioned or in the case of a failure. This doesn't necessarily seem like 
it would be that rare to me -- sure, it requires there to be failures, but any 
app that produces to a single topic could encounter a case like this since 
it'll have a very limited set of brokers to work with. Multiple people reported 
encountering this issue, although as far as I've seen it may only have been in 
testing environments: 
http://mail-archives.apache.org/mod_mbox/kafka-users/201505.mbox/%3ccaj7fv7qbfqzrd0eodjmtrq725q+ziirnctwjhyhqldm+5cw...@mail.gmail.com%3E

It sounds like KAFKA-1843 is messy enough that it needs a more complete 
discussion to resolve and maybe a KIP. Just waiting on the old list doesn't 
help in the situation described above if there's an actual hardware failure 
that takes the node out entirely -- eventually you need to try something else. 
If we're going to change anything now, I'd suggest either making the entries in 
nodesEverSeen expire but at a longer interval than the metadata refresh or 
falling back to bootstrap nodes. I think the former is probably the better 
solution, though the latter may be easier for people to reason about until we 
work out a more complete solution. The former seems nicer because it decouples 
the expiration from other failures. Currently a broker failure that triggers a 
metadata refresh can force that broker out of the set of nodes to try 
immediately, which is why you can get into situations like the one linked above 
(and in much less time than the metadata refresh interval since these events 
force a metadata refresh).

Ultimately, the problem boils down to the fact that we're not getting the 
information we want from these metadata updates -- we get the info we need 
about topic leaders, but we're also using this metadata for basic connectivity 
to the cluster. That breaks down if you're only working with one or a small 
number of topics and they end up with too few or crashed replicas. I think 
caching old entries is not the ideal solution here, but it's a way to work 
around the current situation. It would probably be better if metadata responses 
could include a random subset of the current set of brokers so that rather than 
relying on the original bootstrap servers we could get new, definitely valid 
bootstrap brokers with each metadata refresh. (It might even be possible to 
implement this just by having the broker return extra brokers in the metadata 
response even if there aren't entries in the topic metadata section of the 
response referencing those brokers; but I'm not sure if that'd break anything, 
either in our clients or in third party clients. It also doesn't fully fix the 
problem since the linked issue can still occur, but should only ever happen in 
the case of their tiny test case, not in practice.)
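
A rough sketch of the expiry idea floated above -- not the actual
NetworkClient code; the TTL value and the bookkeeping structure are
assumptions purely for illustration:

{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of expiring nodesEverSeen entries: brokers not seen in any metadata
// response for longer than a TTL (chosen to be much larger than the metadata
// refresh interval) stop being offered as fallback bootstrap candidates.
public class NodesEverSeenSketch {
    private static final long TTL_MS = 10 * 60 * 1000L; // assumed expiry, >> metadata refresh interval
    private final Map<Integer, Long> lastSeenMs = new HashMap<>(); // broker id -> last seen in metadata

    // Called for every broker id listed in a metadata response.
    public void recordSeen(int brokerId, long nowMs) {
        lastSeenMs.put(brokerId, nowMs);
    }

    // Candidates to contact when the current metadata has no reachable broker.
    public List<Integer> fallbackCandidates(long nowMs) {
        List<Integer> candidates = new ArrayList<>();
        for (Map.Entry<Integer, Long> entry : lastSeenMs.entrySet())
            if (nowMs - entry.getValue() <= TTL_MS)
                candidates.add(entry.getKey());
        return candidates;
    }
}
{code}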

> NetworkClient may connect to a different Kafka cluster than originally 
> configured
> -
>
> Key: KAFKA-3068
> URL: https://issues.apache.org/jira/browse/KAFKA-3068
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Eno Thereska
>
> In https://github.com/apache/kafka/pull/290, we added the logic to cache all 
> brokers (id and ip) that the client has ever seen. If we can't find an 
> available broker from the current Metadata, we will pick a broker that we 
> have ever seen (in NetworkClient.leastLoadedNode()).
> One potential problem this logic can introduce is the following. Suppose that 
> we have a broker with id 1 in a Kafka cluster. A producer client remembers 
> this broker in nodesEverSeen. At some point, we bring down this broker and 
> use the host in a different Kafka cluster. Then, the producer client uses 
> this broker from nodesEverSeen to refresh metadata. It will find the metadata 
> in a different Kafka cluster and start producing data there.





[jira] [Commented] (KAFKA-3068) NetworkClient may connect to a different Kafka cluster than originally configured

2016-01-07 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088209#comment-15088209
 ] 

Jason Gustafson commented on KAFKA-3068:


[~ewencp] Is it not correct that the topic metadata response always includes 
all alive brokers?

> NetworkClient may connect to a different Kafka cluster than originally 
> configured
> -
>
> Key: KAFKA-3068
> URL: https://issues.apache.org/jira/browse/KAFKA-3068
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Eno Thereska
>
> In https://github.com/apache/kafka/pull/290, we added the logic to cache all 
> brokers (id and ip) that the client has ever seen. If we can't find an 
> available broker from the current Metadata, we will pick a broker that we 
> have ever seen (in NetworkClient.leastLoadedNode()).
> One potential problem this logic can introduce is the following. Suppose that 
> we have a broker with id 1 in a Kafka cluster. A producer client remembers 
> this broker in nodesEverSeen. At some point, we bring down this broker and 
> use the host in a different Kafka cluster. Then, the producer client uses 
> this broker from nodesEverSeen to refresh metadata. It will find the metadata 
> in a different Kafka cluster and start producing data there.





Re: Connecting with new consumers and existing group appears to cause existing group to rebalance

2016-01-07 Thread Jason Gustafson
Thanks for the explanation. Unfortunately, the consumer doesn't work that
way. Rebalances affect all members of the group regardless of
subscriptions. Partial rebalances could be an interesting idea to consider
for the future, but we haven't had any cases where a group had differing
subscriptions in a steady state. Usually users just use separate groups.
Would that not make sense in this case?

-Jason


[jira] [Commented] (KAFKA-3068) NetworkClient may connect to a different Kafka cluster than originally configured

2016-01-07 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088222#comment-15088222
 ] 

Ewen Cheslack-Postava commented on KAFKA-3068:
--

[~hachikuji] Sorry, yeah, that's right. But you still have the same issues in 
any smaller cluster or in the case of a partition. So actually, even having the 
extra brokers returned doesn't work out great since as soon as you lose too 
many from the active set you can get into that same situation where you're no 
longer able to connect to any of the nodes from the last metadata refresh.

> NetworkClient may connect to a different Kafka cluster than originally 
> configured
> -
>
> Key: KAFKA-3068
> URL: https://issues.apache.org/jira/browse/KAFKA-3068
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Eno Thereska
>
> In https://github.com/apache/kafka/pull/290, we added the logic to cache all 
> brokers (id and ip) that the client has ever seen. If we can't find an 
> available broker from the current Metadata, we will pick a broker that we 
> have ever seen (in NetworkClient.leastLoadedNode()).
> One potential problem this logic can introduce is the following. Suppose that 
> we have a broker with id 1 in a Kafka cluster. A producer client remembers 
> this broker in nodesEverSeen. At some point, we bring down this broker and 
> use the host in a different Kafka cluster. Then, the producer client uses 
> this broker from nodesEverSeen to refresh metadata. It will find the metadata 
> in a different Kafka cluster and start producing data there.





[jira] [Updated] (KAFKA-2856) add KTable

2016-01-07 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-2856:
-
Assignee: Yasuhiro Matsuda

> add KTable
> --
>
> Key: KAFKA-2856
> URL: https://issues.apache.org/jira/browse/KAFKA-2856
> Project: Kafka
>  Issue Type: Sub-task
>  Components: kafka streams
>Reporter: Yasuhiro Matsuda
>Assignee: Yasuhiro Matsuda
> Fix For: 0.9.1.0
>
>
> KTable is a special type of the stream that represents a changelog of a 
> database table (or a key-value store).
> A changelog has to meet the following requirements.
> * Key-value mapping is surjective in the database table (the key must be the 
> primary key).
> * All insert/update/delete events are delivered in order for the same key
> * An update event has the whole data (not just delta).
> * A delete event is represented by the null value.
> KTable is not necessarily materialized as a local store. It may be 
> materialized when necessary (see below).
> KTable supports look-up by key. KTable is materialized implicitly when 
> look-up is necessary.
> * KTable may be created from a topic. (Base KTable)
> * KTable may be created from another KTable by filter(), filterOut(), 
> mapValues(). (Derived KTable)
> * A call to the user supplied function is skipped when the value is null 
> since such an event represents a deletion. 
> * Instead of dropping, events filtered out by filter() or filterOut() are 
> converted to delete events. (Can we avoid this?)
> * map(), flatMap() and flatMapValues() are not supported since they may 
> violate the changelog requirements
> A derived KTable may be persisted to a topic by to() or through(). through() 
> creates another base KTable. 
> KTable can be converted to KStream by the toStream() method.
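
As an illustration of the changelog requirements listed above (updates carry
the whole value, a null value means delete, events for a key arrive in order),
a small standalone sketch of applying such a stream to a table; this is not
Kafka Streams code:

{code}
import java.util.HashMap;
import java.util.Map;

// Sketch: applying a changelog to a table under the semantics described above.
public class ChangelogSketch {
    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();
        apply(table, "user-1", "{\"name\":\"alice\"}");  // insert
        apply(table, "user-1", "{\"name\":\"alicia\"}"); // update carries the full value, not a delta
        apply(table, "user-2", "{\"name\":\"bob\"}");
        apply(table, "user-2", null);                    // null value = delete
        System.out.println(table);                       // {user-1={"name":"alicia"}}
    }

    static void apply(Map<String, String> table, String key, String value) {
        if (value == null)
            table.remove(key); // delete event
        else
            table.put(key, value); // insert or update event
    }
}
{code}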





Re: Connecting with new consumers and existing group appears to cause existing group to rebalance

2016-01-07 Thread Cliff Rhyne
Using separate groups would be only a slight inconvenience but is entirely
doable.  It sounds like that's the preferred model.

Thanks for the explanation.


[jira] [Created] (KAFKA-3077) Enable KafkaLog4jAppender to work with SASL enabled brokers.

2016-01-07 Thread Ashish K Singh (JIRA)
Ashish K Singh created KAFKA-3077:
-

 Summary: Enable KafkaLog4jAppender to work with SASL enabled 
brokers.
 Key: KAFKA-3077
 URL: https://issues.apache.org/jira/browse/KAFKA-3077
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Reporter: Ashish K Singh
Assignee: Ashish K Singh


KafkaLog4jAppender currently cannot talk to a SASL-enabled cluster. This JIRA 
aims at adding that support, enabling users of the log4j appender to 
publish to a SASL-enabled Kafka cluster.





[jira] [Created] (KAFKA-3078) Add ducktape tests for KafkaLog4jAppender producing to SASL enabled Kafka cluster

2016-01-07 Thread Ashish K Singh (JIRA)
Ashish K Singh created KAFKA-3078:
-

 Summary: Add ducktape tests for KafkaLog4jAppender producing to 
SASL enabled Kafka cluster
 Key: KAFKA-3078
 URL: https://issues.apache.org/jira/browse/KAFKA-3078
 Project: Kafka
  Issue Type: Sub-task
Reporter: Ashish K Singh
Assignee: Ashish K Singh


Parent JIRA, KAFKA-3077, enables KafkaLog4jAppender to produce to SASL enabled 
clusters. This JIRA must add ducktape system tests to verify the functionality.





[jira] [Commented] (KAFKA-3077) Enable KafkaLog4jAppender to work with SASL enabled brokers.

2016-01-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088284#comment-15088284
 ] 

ASF GitHub Bot commented on KAFKA-3077:
---

GitHub user SinghAsDev opened a pull request:

https://github.com/apache/kafka/pull/740

KAFKA-3077: Enable KafkaLog4jAppender to work with SASL enabled brokers



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/SinghAsDev/kafka KAFKA-3077

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/740.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #740


commit 52a9e37f7cee7b6d565dbf9da28595ce6de85c74
Author: Ashish Singh 
Date:   2016-01-07T22:32:51Z

KAFKA-3077: Enable KafkaLog4jAppender to work with SASL enabled brokers




> Enable KafkaLog4jAppender to work with SASL enabled brokers.
> 
>
> Key: KAFKA-3077
> URL: https://issues.apache.org/jira/browse/KAFKA-3077
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: Ashish K Singh
>Assignee: Ashish K Singh
>
> KafkaLog4jAppender cannot currently talk to a SASL-enabled cluster. This JIRA 
> aims to add that support, enabling users of the log4j appender to publish to a 
> SASL-enabled Kafka cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3077: Enable KafkaLog4jAppender to work ...

2016-01-07 Thread SinghAsDev
GitHub user SinghAsDev opened a pull request:

https://github.com/apache/kafka/pull/740

KAFKA-3077: Enable KafkaLog4jAppender to work with SASL enabled brokers



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/SinghAsDev/kafka KAFKA-3077

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/740.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #740


commit 52a9e37f7cee7b6d565dbf9da28595ce6de85c74
Author: Ashish Singh 
Date:   2016-01-07T22:32:51Z

KAFKA-3077: Enable KafkaLog4jAppender to work with SASL enabled brokers




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: KIP-41: KafkaConsumer Max Records

2016-01-07 Thread Guozhang Wang
Thanks Jason. I think it is a good feature to add, +1.

As suggested in KIP-32, we'd better keep the end state of the KIP wiki with
finalized implementation details rather than leaving a list of options. I
agree that for both fairness and pre-fetching the simpler approach would be
sufficient most of the time. So could we move the other approach to
"rejected"?

Guozhang

On Wed, Jan 6, 2016 at 6:14 PM, Gwen Shapira  wrote:

> I like the fair-consumption approach you chose - "pull as many records as
> possible from each partition in a similar round-robin fashion", it is very
> intuitive and close enough to fair.
>
> Overall, I'm +1 on the KIP. But you'll need a formal vote :)
>
> On Wed, Jan 6, 2016 at 6:05 PM, Jason Gustafson 
> wrote:
>
> > Thanks for the suggestion, Ismael. I updated the KIP.
> >
> > -Jason
> >
> > On Wed, Jan 6, 2016 at 6:57 AM, Ismael Juma  wrote:
> >
> > > Thanks Jason. I read the KIP and it makes sense to me. A minor
> > suggestion:
> > > in the "Ensuring Fair Consumption" section, there are 3 paragraphs
> with 2
> > > examples (2 partitions with 100 max.poll.records and 3 partitions with
> 30
> > > max.poll.records). I think you could simplify this by using one of the
> > > examples in the 3 paragraphs.
> > >
> > > Ismael
> > >
> > > On Tue, Jan 5, 2016 at 7:32 PM, Jason Gustafson 
> > > wrote:
> > >
> > > > I've updated the KIP with some implementation details. I also added
> > more
> > > > discussion on the heartbeat() alternative. The short answer for why
> we
> > > > rejected this API is that it doesn't seem to work well with offset
> > > commits.
> > > > This would tend to make correct usage complicated and difficult to
> > > explain.
> > > > Additionally, we don't see any clear advantages over having a way to
> > set
> > > > the max records. For example, using max.records=1 would be equivalent
> > to
> > > > invoking heartbeat() on each iteration of the message processing
> loop.
> > > >
> > > > Going back to the discussion on whether we should use a configuration
> > > value
> > > > or overload poll(), I'm leaning toward the configuration option
> mainly
> > > for
> > > > compatibility and to keep the KafkaConsumer API from getting any more
> > > > complex. Also, as others have mentioned, it seems reasonable to want
> to
> > > > tune this setting in the same place that the session timeout and
> > > heartbeat
> > > > interval are configured. I still feel a little uncomfortable with the
> > > need
> > > > to do a lot of configuration tuning to get the consumer working for a
> > > > particular environment, but hopefully the defaults are conservative
> > > enough
> > > > that most users won't need to. However, if it remains a problem, then
> > we
> > > > could still look into better options for managing the size of batches
> > > > including overloading poll() with a max records argument or possibly
> by
> > > > implementing a batch scaling algorithm internally.
> > > >
> > > > -Jason
> > > >
> > > >
> > > > On Mon, Jan 4, 2016 at 12:18 PM, Jason Gustafson  >
> > > > wrote:
> > > >
> > > > > Hi Cliff,
> > > > >
> > > > > I think we're all agreed that the current contract of poll() should
> > be
> > > > > kept. The consumer wouldn't wait for max messages to become
> available
> > > in
> > > > > this proposal; it would only sure that it never returns more than
> max
> > > > > messages.
> > > > >
> > > > > -Jason
> > > > >
> > > > > On Mon, Jan 4, 2016 at 11:52 AM, Cliff Rhyne 
> > wrote:
> > > > >
> > > > >> Instead of a heartbeat, I'd prefer poll() to return whatever
> > messages
> > > > the
> > > > >> client has.  Either a) I don't care if I get less than my max
> > message
> > > > >> limit
> > > > >> or b) I do care and will set a larger timeout.  Case B is less
> > common
> > > > than
> > > > >> A and is fairly easy to handle in the application's code.
> > > > >>
> > > > >> On Mon, Jan 4, 2016 at 1:47 PM, Gwen Shapira 
> > > wrote:
> > > > >>
> > > > >> > 1. Agree that TCP window style scaling will be cool. I'll try to
> > > think
> > > > >> of a
> > > > >> > good excuse to use it ;)
> > > > >> >
> > > > >> > 2. I'm very concerned about the challenges of getting the
> > timeouts,
> > > > >> > hearbeats and max messages right.
> > > > >> >
> > > > >> > Another option could be to expose "heartbeat" API to consumers.
> If
> > > my
> > > > >> app
> > > > >> > is still processing data but is still alive, it could initiate a
> > > > >> heartbeat
> > > > >> > to signal its alive without having to handle additional
> messages.
> > > > >> >
> > > > >> > I don't know if this improves more than it complicates though :(
> > > > >> >
> > > > >> > On Mon, Jan 4, 2016 at 11:40 AM, Jason Gustafson <
> > > ja...@confluent.io>
> > > > >> > wrote:
> > > > >> >
> > > > >> > > Hey Gwen,
> > > > >> > >
> > > > >> > > I was thinking along the lines of TCP window scaling in order
> to
> > > > >> > > dynamically find a good consumption rate. Basically you'd
> start
> > > off
> > > > >> > > consuming s
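
A minimal sketch of the consumer loop with the proposed max.poll.records setting 
discussed in this thread; the property name follows the KIP-41 draft and is an 
assumption until the KIP is finalized, and the topic, group, and serializer choices 
are placeholders.

{code}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class MaxRecordsExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "example-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // Proposed by KIP-41 (assumed name): cap how many records one poll() may return.
        props.put("max.poll.records", "100");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("example-topic"));
        while (true) {
            // At most 100 records per iteration, so each pass through the loop stays
            // short enough for the consumer to heartbeat within its session timeout.
            ConsumerRecords<String, String> records = consumer.poll(1000);
            for (ConsumerRecord<String, String> record : records)
                System.out.println(record.offset() + ": " + record.value());
            consumer.commitSync();
        }
    }
}
{code}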

[jira] [Commented] (KAFKA-2649) Add support for custom partitioner in sink nodes

2016-01-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088312#comment-15088312
 ] 

ASF GitHub Bot commented on KAFKA-2649:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/309


> Add support for custom partitioner in sink nodes
> 
>
> Key: KAFKA-2649
> URL: https://issues.apache.org/jira/browse/KAFKA-2649
> Project: Kafka
>  Issue Type: Sub-task
>  Components: kafka streams
>Reporter: Randall Hauch
>Assignee: Randall Hauch
> Fix For: 0.9.1.0
>
>
> The only way for Processor implementations to control partitioning of 
> forwarded messages is to set the partitioner class as property 
> {{ProducerConfig.PARTITIONER_CLASS_CONFIG}} in the StreamsConfig, which 
> should be set to the name of a 
> {{org.apache.kafka.clients.producer.Partitioner}} implementation. However, 
> doing this requires the partitioner knowing how to properly partition *all* 
> topics, not just the one or few topics used by the Processor.
> Instead, Kafka Streams should make it easy to optionally add a partitioning 
> function for each sink used in a topology. Each sink represents a single 
> output topic, and thus is far simpler to implement. Additionally, the sink is 
> already typed with the key and value types (via serdes for the keys and 
> values), so the partitioner can also be typed with the key and value 
> types. Finally, this also keeps the logic of choosing partitioning strategies 
> where it belongs, as part of building the topology.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-2649 Add support for custom partitioning...

2016-01-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/309


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-2649) Add support for custom partitioner in sink nodes

2016-01-07 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-2649:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Issue resolved by pull request 309
[https://github.com/apache/kafka/pull/309]

> Add support for custom partitioner in sink nodes
> 
>
> Key: KAFKA-2649
> URL: https://issues.apache.org/jira/browse/KAFKA-2649
> Project: Kafka
>  Issue Type: Sub-task
>  Components: kafka streams
>Reporter: Randall Hauch
>Assignee: Randall Hauch
> Fix For: 0.9.1.0
>
>
> The only way for Processor implementations to control partitioning of 
> forwarded messages is to set the partitioner class as property 
> {{ProducerConfig.PARTITIONER_CLASS_CONFIG}} in the StreamsConfig, which 
> should be set to the name of a 
> {{org.apache.kafka.clients.producer.Partitioner}} implementation. However, 
> doing this requires the partitioner knowing how to properly partition *all* 
> topics, not just the one or few topics used by the Processor.
> Instead, Kafka Streams should make it easy to optionally add a partitioning 
> function for each sink used in a topology. Each sink represents a single 
> output topic, and thus is far simpler to implement. Additionally, the sink is 
> already typed with the key and value types (via serdes for the keys and 
> values), so the partitioner can also be typed with the key and value 
> types. Finally, this also keeps the logic of choosing partitioning strategies 
> where it belongs, as part of building the topology.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: TMR should not create topic if not existing.

2016-01-07 Thread Jason Gustafson
Hey Mayuresh,

The ticket that Grant Henke has been working on is here:
https://issues.apache.org/jira/browse/KAFKA-2945. Is that what you were
looking for?

-Jason

On Thu, Jan 7, 2016 at 1:50 PM, Mayuresh Gharat 
wrote:

> + dev
>
> Hi
>
> There has been discussion on the ticket :
> https://issues.apache.org/jira/browse/KAFKA-2887, that we are going to
> deprecate auto-creation of topic or at least turn it off by default once we
> have the CreateTopics API. It also says the patch is available for this.
>
> The only other ticket that I came across on the ticket is
> https://issues.apache.org/jira/browse/KAFKA-1507.
>
> I wanted to confirm that https://issues.apache.org/jira/browse/KAFKA-1507
> is the ticket that has the CreateTopics API patch.
> <%28862%29%20250-7125>
>
>
> --
> -Regards,
> Mayuresh R. Gharat
> (862) 250-7125
>


Build failed in Jenkins: kafka-trunk-jdk8 #274

2016-01-07 Thread Apache Jenkins Server
See 

Changes:

[wangguoz] KAFKA-2649: Add support for custom partitioning in topology sinks

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on H11 (Ubuntu ubuntu) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url 
 > https://git-wip-us.apache.org/repos/asf/kafka.git # timeout=10
Fetching upstream changes from https://git-wip-us.apache.org/repos/asf/kafka.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > https://git-wip-us.apache.org/repos/asf/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/trunk^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/trunk^{commit} # timeout=10
Checking out Revision 4836e525c851640e0da9d8d11321621a2c70e8f0 
(refs/remotes/origin/trunk)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 4836e525c851640e0da9d8d11321621a2c70e8f0
 > git rev-list ee1770e00e841ba68dc25f03a8d3f2ac647b0eb3 # timeout=10
Setting 
JDK1_8_0_45_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_45
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
[kafka-trunk-jdk8] $ /bin/bash -xe /tmp/hudson4515497567174603508.sh
+ 
/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2/bin/gradle
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
http://gradle.org/docs/2.4-rc-2/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:downloadWrapper

BUILD SUCCESSFUL

Total time: 8.562 secs
Setting 
JDK1_8_0_45_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_45
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
[kafka-trunk-jdk8] $ /bin/bash -xe /tmp/hudson4253661522591035469.sh
+ export GRADLE_OPTS=-Xmx1024m
+ GRADLE_OPTS=-Xmx1024m
+ ./gradlew -Dorg.gradle.project.maxParallelForks=1 clean jarAll testAll
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
https://docs.gradle.org/2.10/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:clean UP-TO-DATE
:clients:clean UP-TO-DATE
:connect:clean UP-TO-DATE
:core:clean UP-TO-DATE
:examples:clean UP-TO-DATE
:log4j-appender:clean UP-TO-DATE
:streams:clean UP-TO-DATE
:tools:clean UP-TO-DATE
:connect:api:clean UP-TO-DATE
:connect:file:clean UP-TO-DATE
:connect:json:clean UP-TO-DATE
:connect:runtime:clean UP-TO-DATE
:jar_core_2_10
Building project 'core' with Scala version 2.10.6
:kafka-trunk-jdk8:clients:compileJava
:jar_core_2_10 FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Failed to capture snapshot of input files for task 'compileJava' during 
up-to-date check.  See stacktrace for details.
> Could not add entry 
> '
>  to cache fileHashes.bin 
> (

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug 
option to get more log output.

BUILD FAILED

Total time: 9.933 secs
Build step 'Execute shell' marked build as failure
Recording test results
Setting 
JDK1_8_0_45_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_45
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
ERROR: Publisher 'Publish JUnit test result report' failed: No test report 
files were found. Configuration error?
Setting 
JDK1_8_0_45_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_45
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2


[GitHub] kafka pull request: KAFKA-3021: Centralize dependency version mana...

2016-01-07 Thread granthenke
GitHub user granthenke opened a pull request:

https://github.com/apache/kafka/pull/741

KAFKA-3021: Centralize dependency version management



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/granthenke/kafka central-deps

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/741.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #741


commit 2e7e5bda1bc4801e17441c7ede6c523f69500e15
Author: Grant Henke 
Date:   2015-12-24T05:02:58Z

KAFKA-3021: Centralize dependency version management




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3021) Centralize dependency version managment

2016-01-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088367#comment-15088367
 ] 

ASF GitHub Bot commented on KAFKA-3021:
---

GitHub user granthenke opened a pull request:

https://github.com/apache/kafka/pull/741

KAFKA-3021: Centralize dependency version management



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/granthenke/kafka central-deps

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/741.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #741


commit 2e7e5bda1bc4801e17441c7ede6c523f69500e15
Author: Grant Henke 
Date:   2015-12-24T05:02:58Z

KAFKA-3021: Centralize dependency version management




> Centralize dependency version managment
> ---
>
> Key: KAFKA-3021
> URL: https://issues.apache.org/jira/browse/KAFKA-3021
> Project: Kafka
>  Issue Type: Sub-task
>  Components: build
>Reporter: Grant Henke
>Assignee: Grant Henke
> Fix For: 0.9.1.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3021) Centralize dependency version managment

2016-01-07 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KAFKA-3021:
---
Status: Patch Available  (was: Open)

> Centralize dependency version managment
> ---
>
> Key: KAFKA-3021
> URL: https://issues.apache.org/jira/browse/KAFKA-3021
> Project: Kafka
>  Issue Type: Sub-task
>  Components: build
>Reporter: Grant Henke
>Assignee: Grant Henke
> Fix For: 0.9.1.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: TMR should not create topic if not existing.

2016-01-07 Thread Mayuresh Gharat
Hi Jason,

Thanks. I had found this ticket before but the KAFKA-1507 also has some
context about this and I was confused basically exactly which patch is
going to go in.
Thanks a lot for confirming.

Thanks,

Mayuresh

On Thu, Jan 7, 2016 at 3:08 PM, Jason Gustafson  wrote:

> Hey Mayuresh,
>
> The ticket that Grant Henke has been working on is here:
> https://issues.apache.org/jira/browse/KAFKA-2945. Is that what you were
> looking for?
>
> -Jason
>
> On Thu, Jan 7, 2016 at 1:50 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com>
> wrote:
>
> > + dev
> >
> > Hi
> >
> > There has been discussion on the ticket :
> > https://issues.apache.org/jira/browse/KAFKA-2887, that we are going to
> > deprecate auto-creation of topic or at least turn it off by default once
> we
> > have the CreateTopics API. It also says the patch is available for this.
> >
> > The only other ticket that I came across on the ticket is
> > https://issues.apache.org/jira/browse/KAFKA-1507.
> >
> > I wanted to confirm that
> https://issues.apache.org/jira/browse/KAFKA-1507
> > is the ticket that has the CreateTopics API patch.
> > <%28862%29%20250-7125>
> >
> >
> > --
> > -Regards,
> > Mayuresh R. Gharat
> > (862) 250-7125
> >
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


[jira] [Assigned] (KAFKA-2886) WorkerSinkTask doesn't catch exceptions from rebalance callbacks

2016-01-07 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson reassigned KAFKA-2886:
--

Assignee: Jason Gustafson  (was: Ewen Cheslack-Postava)

> WorkerSinkTask doesn't catch exceptions from rebalance callbacks
> 
>
> Key: KAFKA-2886
> URL: https://issues.apache.org/jira/browse/KAFKA-2886
> Project: Kafka
>  Issue Type: Bug
>  Components: copycat
>Reporter: Ewen Cheslack-Postava
>Assignee: Jason Gustafson
>
> WorkerSinkTask exposes rebalance callbacks to tasks by invoking 
> onPartitionsRevoked and onPartitionsAssigned on the task. However, these 
> aren't guarded by try/catch blocks, so they can propagate the errors up to 
> the consumer:
> {quote}
> [2015-11-24 15:52:24,071] ERROR User provided listener 
> org.apache.kafka.connect.runtime.WorkerSinkTask$HandleRebalance failed on 
> partition assignment:  
> (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
> java.lang.UnsupportedOperationException
>   at 
> java.util.Collections$UnmodifiableCollection.clear(Collections.java:1094)
>   at 
> io.confluent.connect.hdfs.DataWriter.onPartitionsAssigned(DataWriter.java:207)
>   at 
> io.confluent.connect.hdfs.HdfsSinkTask.onPartitionsAssigned(HdfsSinkTask.java:103)
>   at 
> org.apache.kafka.connect.runtime.WorkerSinkTask$HandleRebalance.onPartitionsAssigned(WorkerSinkTask.java:369)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:189)
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:227)
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.ensurePartitionAssignment(ConsumerCoordinator.java:306)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:861)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:829)
>   at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:171)
>   at 
> org.apache.kafka.connect.runtime.WorkerSinkTaskThread.iteration(WorkerSinkTaskThread.java:90)
>   at 
> org.apache.kafka.connect.runtime.WorkerSinkTaskThread.execute(WorkerSinkTaskThread.java:58)
>   at 
> org.apache.kafka.connect.util.ShutdownableThread.run(ShutdownableThread.java:82)
> [2015-11-24 15:52:24,477] INFO Cannot acquire lease on WAL 
> hdfs://worker4:9000/logs/test/0/log (io.confluent.connect.hdfs.wal.FSWAL)
> {quote}
> This actually currently works ok for onPartitionsAssigned because the 
> callback is the last thing invoked. For onPartitionsRevoked, it causes 
> offsets to not be committed and the current message batch being processed to 
> not be cleared. Additionally, we may need to do something more to clean up, 
> e.g. the task may need to stop processing data entirely since the task may 
> now be in a bad state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
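
A hedged illustration of the kind of guard described above: wrapping a delegate 
listener's callbacks in try/catch so a task failure does not propagate into the 
consumer's rebalance path. The class name and error handling are made up; this is 
not the WorkerSinkTask patch itself.

{code}
import java.util.Collection;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.common.TopicPartition;

class GuardedRebalanceListener implements ConsumerRebalanceListener {
    private final ConsumerRebalanceListener delegate;

    GuardedRebalanceListener(ConsumerRebalanceListener delegate) {
        this.delegate = delegate;
    }

    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        try {
            delegate.onPartitionsRevoked(partitions);   // task code may throw
        } catch (RuntimeException e) {
            // Report instead of letting the error skip offset commits and batch cleanup.
            System.err.println("onPartitionsRevoked failed: " + e);
        }
    }

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        try {
            delegate.onPartitionsAssigned(partitions);
        } catch (RuntimeException e) {
            System.err.println("onPartitionsAssigned failed: " + e);
        }
    }
}
{code}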


[jira] [Resolved] (KAFKA-2655) Consumer.poll(0)'s overhead too large

2016-01-07 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-2655.

Resolution: Not A Problem

Closing this issue as not a problem (any longer). I think the cause of the 
observed overhead was the implementation of an earlier version of the inner 
while loop in KafkaConsumer.poll(). The logic was something like this:

{code}
long now = time.milliseconds();
long deadline = now + timeout;
while (now <= deadline) {
  // check for available records
  now = time.milliseconds();
}
{code}

Even when the timeout is 0, at least one millisecond had to tick off in order 
to exit the loop. We changed the logic to use a do/while loop which results in 
just one iteration of the loop when the timeout is 0. 
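
For illustration, a rough sketch of the shape of the new logic (variable names are 
illustrative, not the exact trunk code):

{code}
long start = time.milliseconds();
long remaining = timeout;
do {
  // check once for available records and return them immediately if any are ready
  long elapsed = time.milliseconds() - start;
  remaining = timeout - elapsed;
} while (remaining > 0);
{code}

With timeout = 0, remaining is already non-positive after the first pass, so the body 
runs exactly once instead of spinning until the millisecond clock advances.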

> Consumer.poll(0)'s overhead too large
> -
>
> Key: KAFKA-2655
> URL: https://issues.apache.org/jira/browse/KAFKA-2655
> Project: Kafka
>  Issue Type: Sub-task
>  Components: consumer
>Reporter: Guozhang Wang
>Assignee: Jason Gustafson
> Fix For: 0.9.0.1
>
>
> Currently with a single partition, even if it is paused, calling poll(0) 
> could still be costing as much as 1ms since it triggers a few system calls. 
> Some of those can possibly be optimized away.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-3073) KafkaConnect should support regular expression for topics

2016-01-07 Thread Liquan Pei (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liquan Pei reassigned KAFKA-3073:
-

Assignee: Liquan Pei

> KafkaConnect should support regular expression for topics
> -
>
> Key: KAFKA-3073
> URL: https://issues.apache.org/jira/browse/KAFKA-3073
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>Assignee: Liquan Pei
>
> KafkaConsumer supports both a list of topics or a pattern when subscribing. 
> KafkaConnect only supports a list of topics, which is not just more of a 
> hassle to configure - it also requires more maintenance.
> I suggest adding topics.pattern as a new configuration and letting users 
> choose between 'topics' or 'topics.pattern'.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
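
For reference, a minimal sketch of the consumer-side pattern subscription that the 
proposed 'topics.pattern' config would mirror; this uses the 0.9 KafkaConsumer API, 
and the topic regex, group name, and deserializers are placeholders.

{code}
import java.util.Collection;
import java.util.Properties;
import java.util.regex.Pattern;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class PatternSubscribeExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "pattern-example");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
        // Subscribe to every topic matching the regex; newly created matching topics
        // are picked up on the next metadata refresh.
        consumer.subscribe(Pattern.compile("metrics-.*"), new ConsumerRebalanceListener() {
            @Override
            public void onPartitionsAssigned(Collection<TopicPartition> partitions) { }
            @Override
            public void onPartitionsRevoked(Collection<TopicPartition> partitions) { }
        });
    }
}
{code}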


[jira] [Created] (KAFKA-3079) org.apache.kafka.common.KafkaException: java.lang.SecurityException: Configuration Error:

2016-01-07 Thread Mohit Anchlia (JIRA)
Mohit Anchlia created KAFKA-3079:


 Summary: org.apache.kafka.common.KafkaException: 
java.lang.SecurityException: Configuration Error:
 Key: KAFKA-3079
 URL: https://issues.apache.org/jira/browse/KAFKA-3079
 Project: Kafka
  Issue Type: Bug
  Components: security
Affects Versions: 0.9.0.0
 Environment: RHEL 6
Reporter: Mohit Anchlia


After enabling security I am seeing the following error even though the JAAS file 
has no mention of "Zookeeper". I used the following steps:



http://docs.confluent.io/2.0.0/kafka/sasl.html





[2016-01-07 19:05:15,329] FATAL Fatal error during KafkaServer startup. Prepare 
to shutdown (kafka.server.KafkaServer)
org.apache.kafka.common.KafkaException: java.lang.SecurityException: 
Configuration Error:
Line 8: expected [{], found [Zookeeper]
at 
org.apache.kafka.common.security.JaasUtils.isZkSecurityEnabled(JaasUtils.java:102)
at kafka.server.KafkaServer.initZk(KafkaServer.scala:262)
at kafka.server.KafkaServer.startup(KafkaServer.scala:168)
at 
kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
at kafka.Kafka$.main(Kafka.scala:67)
at kafka.Kafka.main(Kafka.scala)
Caused by: java.lang.SecurityException: Configuration Error:
Line 8: expected [{], found [Zookeeper]
at com.sun.security.auth.login.ConfigFile.(ConfigFile.java:110)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at java.lang.Class.newInstance(Class.java:374)
at javax.security.auth.login.Configuration$2.run(Configuration.java:258)
at javax.security.auth.login.Configuration$2.run(Configuration.java:250)
at java.security.AccessController.doPrivileged(Native Method)
at 
javax.security.auth.login.Configuration.getConfiguration(Configuration.java:249)
at 
org.apache.kafka.common.security.JaasUtils.isZkSecurityEnabled(JaasUtils.java:99)
... 5 more
Caused by: java.io.IOException: Configuration Error:



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
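
For comparison, a minimal sketch of the brace and semicolon layout the JAAS parser 
expects, modeled on the 0.9 SASL documentation; the section names, keytab path, and 
principal are placeholders, not taken from this report:

{code}
KafkaServer {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    keyTab="/etc/security/keytabs/kafka_server.keytab"
    principal="kafka/broker1.example.com@EXAMPLE.COM";
};

Client {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    storeKey=true
    keyTab="/etc/security/keytabs/kafka_server.keytab"
    principal="kafka/broker1.example.com@EXAMPLE.COM";
};
{code}

Each section name is followed by an opening brace, each login module entry ends with 
a semicolon, and the section closes with "};". A stray token such as "Zookeeper" 
where the brace is expected would produce exactly the "expected [{]" parse error above.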


[jira] [Created] (KAFKA-3080) ConsoleConsumerTest.test_version system test fails consistently

2016-01-07 Thread Ewen Cheslack-Postava (JIRA)
Ewen Cheslack-Postava created KAFKA-3080:


 Summary: ConsoleConsumerTest.test_version system test fails 
consistently
 Key: KAFKA-3080
 URL: https://issues.apache.org/jira/browse/KAFKA-3080
 Project: Kafka
  Issue Type: Bug
  Components: system tests
Reporter: Ewen Cheslack-Postava


This test on trunk is failing consistently:

{quote}
test_id:
2016-01-07--001.kafkatest.sanity_checks.test_console_consumer.ConsoleConsumerTest.test_version
status: FAIL
run time:   38.451 seconds


num_produced: 1000, num_consumed: 0
Traceback (most recent call last):
  File 
"/var/lib/jenkins/workspace/kafka_system_tests/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.3.8-py2.7.egg/ducktape/tests/runner.py",
 line 101, in run_all_tests
result.data = self.run_single_test()
  File 
"/var/lib/jenkins/workspace/kafka_system_tests/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.3.8-py2.7.egg/ducktape/tests/runner.py",
 line 151, in run_single_test
return self.current_test_context.function(self.current_test)
  File 
"/var/lib/jenkins/workspace/kafka_system_tests/kafka/tests/kafkatest/sanity_checks/test_console_consumer.py",
 line 93, in test_version
assert num_produced == num_consumed, "num_produced: %d, num_consumed: %d" % 
(num_produced, num_consumed)
AssertionError: num_produced: 1000, num_consumed: 0
{quote}

Example run where it fails: 
http://jenkins.confluent.io/job/kafka_system_tests/79/console



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: TMR should not create topic if not existing.

2016-01-07 Thread Mayuresh Gharat
Hi,

I might have missed out the discussions on this :(

I had some more questions :

1) We need a config in KafkaProducer for this. So when the KafkaProducer
issues a TMR for a topic and receives a response that the topic does not
exist, depending on the value of this config it should use the CreateTopic
request to create the topic or exit. Is there a ticket for this?

2) We need a config in KafkaConsumer for this. So when the
KafkaConsumer (old and new) issues a TMR for a topic and receives a response
that the topic does not exist, depending on the value of this config it
should use the CreateTopic request to create the topic or exit. Is there a
ticket for this?

Thanks,

Mayuresh

On Thu, Jan 7, 2016 at 3:27 PM, Mayuresh Gharat 
wrote:

> Hi Jason,
>
> Thanks. I had found this ticket before but the KAFKA-1507 also has some
> context about this and I was confused basically exactly which patch is
> going to go in.
> Thanks a lot for confirming.
>
> Thanks,
>
> Mayuresh
>
> On Thu, Jan 7, 2016 at 3:08 PM, Jason Gustafson 
> wrote:
>
>> Hey Mayuresh,
>>
>> The ticket that Grant Henke has been working on is here:
>> https://issues.apache.org/jira/browse/KAFKA-2945. Is that what you were
>> looking for?
>>
>> -Jason
>>
>> On Thu, Jan 7, 2016 at 1:50 PM, Mayuresh Gharat <
>> gharatmayures...@gmail.com>
>> wrote:
>>
>> > + dev
>> >
>> > Hi
>> >
>> > There has been discussion on the ticket :
>> > https://issues.apache.org/jira/browse/KAFKA-2887, that we are going to
>> > deprecate auto-creation of topic or at least turn it off by default
>> once we
>> > have the CreateTopics API. It also says the patch is available for this.
>> >
>> > The only other ticket that I came across on the ticket is
>> > https://issues.apache.org/jira/browse/KAFKA-1507.
>> >
>> > I wanted to confirm that
>> https://issues.apache.org/jira/browse/KAFKA-1507
>> > is the ticket that has the CreateTopics API patch.
>> > <%28862%29%20250-7125>
>> >
>> >
>> > --
>> > -Regards,
>> > Mayuresh R. Gharat
>> > (862) 250-7125
>> >
>>
>
>
>
> --
> -Regards,
> Mayuresh R. Gharat
> (862) 250-7125
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


[jira] [Commented] (KAFKA-3079) org.apache.kafka.common.KafkaException: java.lang.SecurityException: Configuration Error:

2016-01-07 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088504#comment-15088504
 ] 

Ismael Juma commented on KAFKA-3079:


C

> org.apache.kafka.common.KafkaException: java.lang.SecurityException: 
> Configuration Error:
> -
>
> Key: KAFKA-3079
> URL: https://issues.apache.org/jira/browse/KAFKA-3079
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.9.0.0
> Environment: RHEL 6
>Reporter: Mohit Anchlia
>
> After enabling security I am seeing the following error even though JAAS file 
> has no mention of "Zookeeper". I used the following steps:
> http://docs.confluent.io/2.0.0/kafka/sasl.html
> [2016-01-07 19:05:15,329] FATAL Fatal error during KafkaServer startup. 
> Prepare to shutdown (kafka.server.KafkaServer)
> org.apache.kafka.common.KafkaException: java.lang.SecurityException: 
> Configuration Error:
> Line 8: expected [{], found [Zookeeper]
> at 
> org.apache.kafka.common.security.JaasUtils.isZkSecurityEnabled(JaasUtils.java:102)
> at kafka.server.KafkaServer.initZk(KafkaServer.scala:262)
> at kafka.server.KafkaServer.startup(KafkaServer.scala:168)
> at 
> kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
> at kafka.Kafka$.main(Kafka.scala:67)
> at kafka.Kafka.main(Kafka.scala)
> Caused by: java.lang.SecurityException: Configuration Error:
> Line 8: expected [{], found [Zookeeper]
> at com.sun.security.auth.login.ConfigFile.(ConfigFile.java:110)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at java.lang.Class.newInstance(Class.java:374)
> at 
> javax.security.auth.login.Configuration$2.run(Configuration.java:258)
> at 
> javax.security.auth.login.Configuration$2.run(Configuration.java:250)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> javax.security.auth.login.Configuration.getConfiguration(Configuration.java:249)
> at 
> org.apache.kafka.common.security.JaasUtils.isZkSecurityEnabled(JaasUtils.java:99)
> ... 5 more
> Caused by: java.io.IOException: Configuration Error:



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-3079) org.apache.kafka.common.KafkaException: java.lang.SecurityException: Configuration Error:

2016-01-07 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088504#comment-15088504
 ] 

Ismael Juma edited comment on KAFKA-3079 at 1/8/16 12:52 AM:
-

Can you please include your JAAS config file?


was (Author: ijuma):
C

> org.apache.kafka.common.KafkaException: java.lang.SecurityException: 
> Configuration Error:
> -
>
> Key: KAFKA-3079
> URL: https://issues.apache.org/jira/browse/KAFKA-3079
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.9.0.0
> Environment: RHEL 6
>Reporter: Mohit Anchlia
>
> After enabling security I am seeing the following error even though JAAS file 
> has no mention of "Zookeeper". I used the following steps:
> http://docs.confluent.io/2.0.0/kafka/sasl.html
> [2016-01-07 19:05:15,329] FATAL Fatal error during KafkaServer startup. 
> Prepare to shutdown (kafka.server.KafkaServer)
> org.apache.kafka.common.KafkaException: java.lang.SecurityException: 
> Configuration Error:
> Line 8: expected [{], found [Zookeeper]
> at 
> org.apache.kafka.common.security.JaasUtils.isZkSecurityEnabled(JaasUtils.java:102)
> at kafka.server.KafkaServer.initZk(KafkaServer.scala:262)
> at kafka.server.KafkaServer.startup(KafkaServer.scala:168)
> at 
> kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
> at kafka.Kafka$.main(Kafka.scala:67)
> at kafka.Kafka.main(Kafka.scala)
> Caused by: java.lang.SecurityException: Configuration Error:
> Line 8: expected [{], found [Zookeeper]
> at com.sun.security.auth.login.ConfigFile.(ConfigFile.java:110)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at java.lang.Class.newInstance(Class.java:374)
> at 
> javax.security.auth.login.Configuration$2.run(Configuration.java:258)
> at 
> javax.security.auth.login.Configuration$2.run(Configuration.java:250)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> javax.security.auth.login.Configuration.getConfiguration(Configuration.java:249)
> at 
> org.apache.kafka.common.security.JaasUtils.isZkSecurityEnabled(JaasUtils.java:99)
> ... 5 more
> Caused by: java.io.IOException: Configuration Error:



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: KIP-41: KafkaConsumer Max Records

2016-01-07 Thread Aarti Gupta
Hi Json,

I am concerned about how many records can be prefetched into consumer
memory.
Currently we control the maximum number of bytes per topic and partition by
setting fetch.message.max.bytes

The max.partition.fetch.bytes = #no of partitions * fetch.message.max.bytes
However, partitions can be added dynamically, which would mean that a
single process (for example a single JVM with multiple consumers) that
consumes messages from a large number of partitions may not be able to keep all
the prefetched messages in memory.

Additionally, if the relative size of messages is highly variable, it would
be hard to correlate the max size in bytes for message fetch with the
number of records returned on a poll.
We previously observed (in a production setup), that, if the size of the
message is greater than fetch.message.max.bytes, the consumer gets stuck.
This encouraged us to increase the fetch.message.max.bytes to a
significantly large value. This would worsen the memory consumption fear
described above,( when the number of partitions is also large.)

While there may not be a single magic formula to predict the correct
combination of fetch.message.max.bytes and max.poll.records, maybe we
can make the prefetch algorithm a mathematical function of
fetch.message.max.bytes and the number of partitions?

thoughts?
Thanks
aarti

additional unimportant note: the link to the JIRA in the KIP is broken

On Thu, Jan 7, 2016 at 2:37 PM, Guozhang Wang  wrote:

> Thanks Jason. I think it is a good feature to add, +1.
>
> As suggested in KIP-32, we'd better to keep end state of the KIP wiki with
> finalized implementation details rather than leaving a list of options. I
> agree that for both fairness and pre-fetching the simpler approach would be
> sufficient for most of the time. So could we move the other approach to
> "rejected"?
>
> Guozhang
>
> On Wed, Jan 6, 2016 at 6:14 PM, Gwen Shapira  wrote:
>
> > I like the fair-consumption approach you chose - "pull as many records as
> > possible from each partition in a similar round-robin fashion", it is
> very
> > intuitive and close enough to fair.
> >
> > Overall, I'm +1 on the KIP. But you'll need a formal vote :)
> >
> > On Wed, Jan 6, 2016 at 6:05 PM, Jason Gustafson 
> > wrote:
> >
> > > Thanks for the suggestion, Ismael. I updated the KIP.
> > >
> > > -Jason
> > >
> > > On Wed, Jan 6, 2016 at 6:57 AM, Ismael Juma  wrote:
> > >
> > > > Thanks Jason. I read the KIP and it makes sense to me. A minor
> > > suggestion:
> > > > in the "Ensuring Fair Consumption" section, there are 3 paragraphs
> > with 2
> > > > examples (2 partitions with 100 max.poll.records and 3 partitions
> with
> > 30
> > > > max.poll.records). I think you could simplify this by using one of
> the
> > > > examples in the 3 paragraphs.
> > > >
> > > > Ismael
> > > >
> > > > On Tue, Jan 5, 2016 at 7:32 PM, Jason Gustafson 
> > > > wrote:
> > > >
> > > > > I've updated the KIP with some implementation details. I also added
> > > more
> > > > > discussion on the heartbeat() alternative. The short answer for why
> > we
> > > > > rejected this API is that it doesn't seem to work well with offset
> > > > commits.
> > > > > This would tend to make correct usage complicated and difficult to
> > > > explain.
> > > > > Additionally, we don't see any clear advantages over having a way
> to
> > > set
> > > > > the max records. For example, using max.records=1 would be
> equivalent
> > > to
> > > > > invoking heartbeat() on each iteration of the message processing
> > loop.
> > > > >
> > > > > Going back to the discussion on whether we should use a
> configuration
> > > > value
> > > > > or overload poll(), I'm leaning toward the configuration option
> > mainly
> > > > for
> > > > > compatibility and to keep the KafkaConsumer API from getting any
> more
> > > > > complex. Also, as others have mentioned, it seems reasonable to
> want
> > to
> > > > > tune this setting in the same place that the session timeout and
> > > > heartbeat
> > > > > interval are configured. I still feel a little uncomfortable with
> the
> > > > need
> > > > > to do a lot of configuration tuning to get the consumer working
> for a
> > > > > particular environment, but hopefully the defaults are conservative
> > > > enough
> > > > > that most users won't need to. However, if it remains a problem,
> then
> > > we
> > > > > could still look into better options for managing the size of
> batches
> > > > > including overloading poll() with a max records argument or
> possibly
> > by
> > > > > implementing a batch scaling algorithm internally.
> > > > >
> > > > > -Jason
> > > > >
> > > > >
> > > > > On Mon, Jan 4, 2016 at 12:18 PM, Jason Gustafson <
> ja...@confluent.io
> > >
> > > > > wrote:
> > > > >
> > > > > > Hi Cliff,
> > > > > >
> > > > > > I think we're all agreed that the current contract of poll()
> should
> > > be
> > > > > > kept. The consumer wouldn't wait for max messages to become
> > available
> > > > in
> > > > > > this proposal; it wo
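
To make the memory concern raised in this thread concrete, a back-of-the-envelope 
sketch; the partition count and per-partition fetch size are illustrative, not taken 
from this thread:

{code}
public class PrefetchEstimate {
    public static void main(String[] args) {
        // Worst case: every assigned partition returns a full fetch at the same time.
        long partitions = 500;
        long fetchBytesPerPartition = 2L * 1024 * 1024;            // 2 MB per partition
        long worstCaseBytes = partitions * fetchBytesPerPartition; // 1,048,576,000 bytes, roughly 1 GB
        System.out.println(worstCaseBytes + " bytes potentially held by one consumer process");
    }
}
{code}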

[GitHub] kafka pull request: KAFKA-2653: Alternative Kafka Streams Stateful...

2016-01-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/730


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (KAFKA-2653) Stateful operations in the KStream DSL layer

2016-01-07 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-2653.
--
   Resolution: Fixed
Fix Version/s: 0.9.1.0

Issue resolved by pull request 730
[https://github.com/apache/kafka/pull/730]

> Stateful operations in the KStream DSL layer
> 
>
> Key: KAFKA-2653
> URL: https://issues.apache.org/jira/browse/KAFKA-2653
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Fix For: 0.9.1.0
>
>
> This includes the interface design the implementation for stateful operations 
> including:
> 0. table representation in KStream.
> 1. stream-stream join.
> 2. stream-table join.
> 3. table-table join.
> 4. stream / table aggregations.
> With 0 and 3 being tackled in KAFKA-2856 and KAFKA-2962 separately, this 
> ticket is going to only focus on windowing definition and 1 / 2 / 4 above.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2653) Stateful operations in the KStream DSL layer

2016-01-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088533#comment-15088533
 ] 

ASF GitHub Bot commented on KAFKA-2653:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/730


> Stateful operations in the KStream DSL layer
> 
>
> Key: KAFKA-2653
> URL: https://issues.apache.org/jira/browse/KAFKA-2653
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Fix For: 0.9.1.0
>
>
> This includes the interface design the implementation for stateful operations 
> including:
> 0. table representation in KStream.
> 1. stream-stream join.
> 2. stream-table join.
> 3. table-table join.
> 4. stream / table aggregations.
> With 0 and 3 being tackled in KAFKA-2856 and KAFKA-2962 separately, this 
> ticket is going to only focus on windowing definition and 1 / 2 / 4 above.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3081) KTable Aggregation Implementation

2016-01-07 Thread Guozhang Wang (JIRA)
Guozhang Wang created KAFKA-3081:


 Summary: KTable Aggregation Implementation
 Key: KAFKA-3081
 URL: https://issues.apache.org/jira/browse/KAFKA-3081
 Project: Kafka
  Issue Type: Sub-task
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.9.1.0


We need to add the implementation of the KTable aggregation operation. We will 
translate it into two stages in the underlying topology:

Stage One:
1. No stores attached.

2. When receiving a Change record from the upstream processor, call 
selector.apply on both Change.newValue and Change.oldValue.

3. Forward the resulting two messages to an intermediate topic (no compaction), 
keyed by the selected key, with a value carrying the selected value and an isAdd 
boolean flag.

Stage Two:

1. Add a K-V store holding the aggregate per selected key, with the corresponding 
key and value ser-des.

2. Upon consuming a record from the intermediate topic:
2.1. First try to fetch from the store; if the key does not exist, call initialValue().
2.2. Based on "isAdd", call add(..) or remove(..).
2.3. Forward the aggregate value periodically, based on the emit duration, to the 
sink node, keyed by the selected key with a Change value.
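
A hypothetical, self-contained sketch of the Stage Two update path described above, 
using a plain HashMap in place of the real K-V store; the Aggregator method names 
(initialValue, add, remove) follow the ticket text, and everything else is illustrative:

{code}
import java.util.HashMap;
import java.util.Map;

class StageTwoSketch<K, V, T> {
    interface Aggregator<V, T> {
        T initialValue();
        T add(V value, T aggregate);
        T remove(V value, T aggregate);
    }

    private final Map<K, T> store = new HashMap<>();   // stand-in for the K-V store
    private final Aggregator<V, T> aggregator;

    StageTwoSketch(Aggregator<V, T> aggregator) {
        this.aggregator = aggregator;
    }

    // Called for each record consumed from the intermediate topic.
    T update(K key, V value, boolean isAdd) {
        T oldAgg = store.get(key);
        if (oldAgg == null)
            oldAgg = aggregator.initialValue();
        T newAgg = isAdd ? aggregator.add(value, oldAgg)      // new value joins the group
                         : aggregator.remove(value, oldAgg);  // old value is retracted
        store.put(key, newAgg);
        return newAgg; // forwarded downstream periodically as a Change(new, old) pair
    }
}
{code}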




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: KIP-41: KafkaConsumer Max Records

2016-01-07 Thread Aarti Gupta
@Jason,
(apologies, got your name wrong the first time round)

On Thu, Jan 7, 2016 at 5:15 PM, Aarti Gupta  wrote:

> Hi Json,
>
> I am concerned about how many records can be prefetched into consumer
> memory.
> Currently we control the maximum number of bytes per topic and partition
> by setting fetch.message.max.bytes
>
> The max.partition.fetch.bytes = #no of partitions *
> fetch.message.max.bytes
> However, partitions can be added dynamically, which would mean that a
> single process (for example a single JVM with multiple consumers), that
> consumes messages from large number of partitions may not able to keep all
> the pre fetched messages in memory.
>
> Additionally, if the relative size of messages is highly variable, it
> would be hard to correlate the max size in bytes for message fetch with the
> number of records returned on a poll.
> We previously observed (in a production setup), that, if the size of the
> message is greater than fetch.message.max.bytes, the consumer gets stuck.
> This encouraged us to increase the fetch.message.max.bytes to a
> significantly large value. This would worsen the memory consumption fear
> described above,( when the number of partitions is also large.)
>
> While there may not be a single magic formula to predict the correct
> combination of fetch.message.max.bytes and max.poll.records, maybe we
> can make the prefetch algorithm a mathematical function of
> fetch.message.max.bytes
> and the number of partitions?
>
> thoughts?
> Thanks
> aarti
>
> additional unimportant note: the link to the JIRA in the KIP is broken
>
> On Thu, Jan 7, 2016 at 2:37 PM, Guozhang Wang  wrote:
>
>> Thanks Jason. I think it is a good feature to add, +1.
>>
>> As suggested in KIP-32, we'd better to keep end state of the KIP wiki with
>> finalized implementation details rather than leaving a list of options. I
>> agree that for both fairness and pre-fetching the simpler approach would
>> be
>> sufficient for most of the time. So could we move the other approach to
>> "rejected"?
>>
>> Guozhang
>>
>> On Wed, Jan 6, 2016 at 6:14 PM, Gwen Shapira  wrote:
>>
>> > I like the fair-consumption approach you chose - "pull as many records
>> as
>> > possible from each partition in a similar round-robin fashion", it is
>> very
>> > intuitive and close enough to fair.
>> >
>> > Overall, I'm +1 on the KIP. But you'll need a formal vote :)
>> >
>> > On Wed, Jan 6, 2016 at 6:05 PM, Jason Gustafson 
>> > wrote:
>> >
>> > > Thanks for the suggestion, Ismael. I updated the KIP.
>> > >
>> > > -Jason
>> > >
>> > > On Wed, Jan 6, 2016 at 6:57 AM, Ismael Juma 
>> wrote:
>> > >
>> > > > Thanks Jason. I read the KIP and it makes sense to me. A minor
>> > > suggestion:
>> > > > in the "Ensuring Fair Consumption" section, there are 3 paragraphs
>> > with 2
>> > > > examples (2 partitions with 100 max.poll.records and 3 partitions
>> with
>> > 30
>> > > > max.poll.records). I think you could simplify this by using one of
>> the
>> > > > examples in the 3 paragraphs.
>> > > >
>> > > > Ismael
>> > > >
>> > > > On Tue, Jan 5, 2016 at 7:32 PM, Jason Gustafson > >
>> > > > wrote:
>> > > >
>> > > > > I've updated the KIP with some implementation details. I also
>> added
>> > > more
>> > > > > discussion on the heartbeat() alternative. The short answer for
>> why
>> > we
>> > > > > rejected this API is that it doesn't seem to work well with offset
>> > > > commits.
>> > > > > This would tend to make correct usage complicated and difficult to
>> > > > explain.
>> > > > > Additionally, we don't see any clear advantages over having a way
>> to
>> > > set
>> > > > > the max records. For example, using max.records=1 would be
>> equivalent
>> > > to
>> > > > > invoking heartbeat() on each iteration of the message processing
>> > loop.
>> > > > >
>> > > > > Going back to the discussion on whether we should use a
>> configuration
>> > > > value
>> > > > > or overload poll(), I'm leaning toward the configuration option
>> > mainly
>> > > > for
>> > > > > compatibility and to keep the KafkaConsumer API from getting any
>> more
>> > > > > complex. Also, as others have mentioned, it seems reasonable to
>> want
>> > to
>> > > > > tune this setting in the same place that the session timeout and
>> > > > heartbeat
>> > > > > interval are configured. I still feel a little uncomfortable with
>> the
>> > > > need
>> > > > > to do a lot of configuration tuning to get the consumer working
>> for a
>> > > > > particular environment, but hopefully the defaults are
>> conservative
>> > > > enough
>> > > > > that most users won't need to. However, if it remains a problem,
>> then
>> > > we
>> > > > > could still look into better options for managing the size of
>> batches
>> > > > > including overloading poll() with a max records argument or
>> possibly
>> > by
>> > > > > implementing a batch scaling algorithm internally.
>> > > > >
>> > > > > -Jason
>> > > > >
>> > > > >
>> > > > > On Mon, Jan 4, 2016 at 12:18 PM, Jason Gustafson <
>> ja...@

Re: KIP-41: KafkaConsumer Max Records

2016-01-07 Thread Guozhang Wang
I think it is a general issue to bound the memory footprint of the Java
consumer, no matter whether we do the prefetching in this KIP, as even today
we do not have any way to manage memory usage on the consumer.

Today on the producer side we bound the memory usage, and we may need to do
the same on the consumer. Here are some discussions about this:

https://issues.apache.org/jira/browse/KAFKA-2045

Guozhang


On Thu, Jan 7, 2016 at 5:21 PM, Aarti Gupta  wrote:

> @Jason,
> (apologies, got your name wrong the first time round)
>
> On Thu, Jan 7, 2016 at 5:15 PM, Aarti Gupta  wrote:
>
> > Hi Json,
> >
> > I am concerned about how many records can be prefetched into consumer
> > memory.
> > Currently we control the maximum number of bytes per topic and partition
> > by setting fetch.message.max.bytes
> >
> > The max.partition.fetch.bytes = #no of partitions *
> > fetch.message.max.bytes
> > However, partitions can be added dynamically, which would mean that a
> > single process (for example a single JVM with multiple consumers), that
> > consumes messages from large number of partitions may not able to keep
> all
> > the pre fetched messages in memory.
> >
> > Additionally, if the relative size of messages is highly variable, it
> > would be hard to correlate the max size in bytes for message fetch with
> the
> > number of records returned on a poll.
> > We previously observed (in a production setup), that, if the size of the
> > message is greater than fetch.message.max.bytes, the consumer gets stuck.
> > This encouraged us to increase the fetch.message.max.bytes to a
> > significantly large value. This would worsen the memory consumption fear
> > described above,( when the number of partitions is also large.)
> >
> > While there may not be a single magic formula to predict the correct
> > combination of fetch.message.max.bytes and max.poll.records, maybe we
> > can make the prefetch algorithm a mathematical function of
> > fetch.message.max.bytes
> > and the number of partitions?
> >
> > thoughts?
> > Thanks
> > aarti
> >
> > additional unimportant note: the link to the JIRA in the KIP is broken
> >
> > On Thu, Jan 7, 2016 at 2:37 PM, Guozhang Wang 
> wrote:
> >
> >> Thanks Jason. I think it is a good feature to add, +1.
> >>
> >> As suggested in KIP-32, we'd better to keep end state of the KIP wiki
> with
> >> finalized implementation details rather than leaving a list of options.
> I
> >> agree that for both fairness and pre-fetching the simpler approach would
> >> be
> >> sufficient for most of the time. So could we move the other approach to
> >> "rejected"?
> >>
> >> Guozhang
> >>
> >> On Wed, Jan 6, 2016 at 6:14 PM, Gwen Shapira  wrote:
> >>
> >> > I like the fair-consumption approach you chose - "pull as many records
> >> as
> >> > possible from each partition in a similar round-robin fashion", it is
> >> very
> >> > intuitive and close enough to fair.
> >> >
> >> > Overall, I'm +1 on the KIP. But you'll need a formal vote :)
> >> >
> >> > On Wed, Jan 6, 2016 at 6:05 PM, Jason Gustafson 
> >> > wrote:
> >> >
> >> > > Thanks for the suggestion, Ismael. I updated the KIP.
> >> > >
> >> > > -Jason
> >> > >
> >> > > On Wed, Jan 6, 2016 at 6:57 AM, Ismael Juma 
> >> wrote:
> >> > >
> >> > > > Thanks Jason. I read the KIP and it makes sense to me. A minor
> >> > > suggestion:
> >> > > > in the "Ensuring Fair Consumption" section, there are 3 paragraphs
> >> > with 2
> >> > > > examples (2 partitions with 100 max.poll.records and 3 partitions
> >> with
> >> > 30
> >> > > > max.poll.records). I think you could simplify this by using one of
> >> the
> >> > > > examples in the 3 paragraphs.
> >> > > >
> >> > > > Ismael
> >> > > >
> >> > > > On Tue, Jan 5, 2016 at 7:32 PM, Jason Gustafson <
> ja...@confluent.io
> >> >
> >> > > > wrote:
> >> > > >
> >> > > > > I've updated the KIP with some implementation details. I also
> >> added
> >> > > more
> >> > > > > discussion on the heartbeat() alternative. The short answer for
> >> why
> >> > we
> >> > > > > rejected this API is that it doesn't seem to work well with
> offset
> >> > > > commits.
> >> > > > > This would tend to make correct usage complicated and difficult
> to
> >> > > > explain.
> >> > > > > Additionally, we don't see any clear advantages over having a
> way
> >> to
> >> > > set
> >> > > > > the max records. For example, using max.records=1 would be
> >> equivalent
> >> > > to
> >> > > > > invoking heartbeat() on each iteration of the message processing
> >> > loop.
> >> > > > >
> >> > > > > Going back to the discussion on whether we should use a
> >> configuration
> >> > > > value
> >> > > > > or overload poll(), I'm leaning toward the configuration option
> >> > mainly
> >> > > > for
> >> > > > > compatibility and to keep the KafkaConsumer API from getting any
> >> more
> >> > > > > complex. Also, as others have mentioned, it seems reasonable to
> >> want
> >> > to
> >> > > > > tune this setting in the same place that the session timeout and
> >> > > 

Jenkins build is back to normal : kafka-trunk-jdk8 #275

2016-01-07 Thread Apache Jenkins Server
See 



[GitHub] kafka pull request: KAFKA-2979: Enable authorizer and ACLs in duck...

2016-01-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/683


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2979) Enable authorizer and ACLs in ducktape tests

2016-01-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088663#comment-15088663
 ] 

ASF GitHub Bot commented on KAFKA-2979:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/683


> Enable authorizer and ACLs in ducktape tests
> 
>
> Key: KAFKA-2979
> URL: https://issues.apache.org/jira/browse/KAFKA-2979
> Project: Kafka
>  Issue Type: Test
>  Components: system tests
>Affects Versions: 0.9.0.0
>Reporter: Flavio Junqueira
>Assignee: Flavio Junqueira
> Fix For: 0.9.1.0
>
>
> Add some support to test ACLs with ducktape tests and enable some test cases 
> to use it. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-2979) Enable authorizer and ACLs in ducktape tests

2016-01-07 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-2979.
--
Resolution: Fixed

Issue resolved by pull request 683
[https://github.com/apache/kafka/pull/683]

> Enable authorizer and ACLs in ducktape tests
> 
>
> Key: KAFKA-2979
> URL: https://issues.apache.org/jira/browse/KAFKA-2979
> Project: Kafka
>  Issue Type: Test
>  Components: system tests
>Affects Versions: 0.9.0.0
>Reporter: Flavio Junqueira
>Assignee: Flavio Junqueira
> Fix For: 0.9.1.0
>
>
> Add some support to test ACLs with ducktape tests and enable some test cases 
> to use it. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: kafka-trunk-jdk8 #276

2016-01-07 Thread Apache Jenkins Server
See 

Changes:

[me] KAFKA-2979: Enable authorizer and ACLs in ducktape tests

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on H11 (Ubuntu ubuntu) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url 
 > https://git-wip-us.apache.org/repos/asf/kafka.git # timeout=10
Fetching upstream changes from https://git-wip-us.apache.org/repos/asf/kafka.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > https://git-wip-us.apache.org/repos/asf/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/trunk^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/trunk^{commit} # timeout=10
Checking out Revision 49778b18446d691321026415aeaac1b265057ece 
(refs/remotes/origin/trunk)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 49778b18446d691321026415aeaac1b265057ece
 > git rev-list 40d731b8712950122915795acca43886851a73b6 # timeout=10
Setting 
JDK1_8_0_45_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_45
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
[kafka-trunk-jdk8] $ /bin/bash -xe /tmp/hudson1626206107519545826.sh
+ 
/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2/bin/gradle
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
http://gradle.org/docs/2.4-rc-2/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:downloadWrapper

BUILD SUCCESSFUL

Total time: 10.297 secs
Setting 
JDK1_8_0_45_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_45
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
[kafka-trunk-jdk8] $ /bin/bash -xe /tmp/hudson7298859838563530839.sh
+ export GRADLE_OPTS=-Xmx1024m
+ GRADLE_OPTS=-Xmx1024m
+ ./gradlew -Dorg.gradle.project.maxParallelForks=1 clean jarAll testAll
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
https://docs.gradle.org/2.10/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:clean UP-TO-DATE
:clients:clean UP-TO-DATE
:connect:clean UP-TO-DATE
:core:clean UP-TO-DATE
:examples:clean UP-TO-DATE
:log4j-appender:clean UP-TO-DATE
:streams:clean UP-TO-DATE
:tools:clean UP-TO-DATE
:connect:api:clean UP-TO-DATE
:connect:file:clean UP-TO-DATE
:connect:json:clean UP-TO-DATE
:connect:runtime:clean UP-TO-DATE
:jar_core_2_10
Building project 'core' with Scala version 2.10.6
:kafka-trunk-jdk8:clients:compileJava
:jar_core_2_10 FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Failed to capture snapshot of input files for task 'compileJava' during 
up-to-date check.  See stacktrace for details.
> Could not add entry 
> '
>  to cache fileHashes.bin 
> (

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug 
option to get more log output.

BUILD FAILED

Total time: 10.175 secs
Build step 'Execute shell' marked build as failure
Recording test results
Setting 
JDK1_8_0_45_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_45
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
ERROR: Publisher 'Publish JUnit test result report' failed: No test report 
files were found. Configuration error?
Setting 
JDK1_8_0_45_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_45
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2


Build failed in Jenkins: kafka-trunk-jdk7 #946

2016-01-07 Thread Apache Jenkins Server
See 

Changes:

[me] KAFKA-2979: Enable authorizer and ACLs in ducktape tests

--
[...truncated 47 lines...]
Building project 'core' with Scala version 2.10.6
:kafka-trunk-jdk7:clients:compileJavaNote: 

 uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

:kafka-trunk-jdk7:clients:processResources UP-TO-DATE
:kafka-trunk-jdk7:clients:classes
:kafka-trunk-jdk7:clients:determineCommitId UP-TO-DATE
:kafka-trunk-jdk7:clients:createVersionFile
:kafka-trunk-jdk7:clients:jar
:kafka-trunk-jdk7:core:compileJava UP-TO-DATE
:kafka-trunk-jdk7:core:compileScala
:79:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.

org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP
 ^
:36:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
 commitTimestamp: Long = 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP,

  ^
:37:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
 expireTimestamp: Long = 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP) {

  ^
:394:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
  if (value.expireTimestamp == 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP)

^
:293:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
if (offsetAndMetadata.commitTimestamp == 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP)

  ^
:301:
 a pure expression does nothing in statement position; you may be omitting 
necessary parentheses
ControllerStats.uncleanLeaderElectionRate
^
:302:
 a pure expression does nothing in statement position; you may be omitting 
necessary parentheses
ControllerStats.leaderElectionTimer
^
:389:
 class BrokerEndPoint in object UpdateMetadataRequest is deprecated: see 
corresponding Javadoc for more information.
  new UpdateMetadataRequest.BrokerEndPoint(brokerEndPoint.id, 
brokerEndPoint.host, brokerEndPoint.port)
^
:391:
 constructor UpdateMetadataRequest in class UpdateMetadataRequest is 
deprecated: see corresponding Javadoc for more information.
new UpdateMetadataRequest(controllerId, controllerEpoch, 
liveBrokers.asJava, partitionStates.asJava)
^
:129:
 method readFromReadableChannel in class NetworkReceive is deprecated: see 
corresponding Javadoc for more information.
  response.readFromReadableChannel(channel)
   ^
there were 15 feature warning(s); re-run with -feature for details
11 warnings found
:kafka-trunk-jdk7:core:processResources UP-TO-DATE
:kafka-trunk-jdk7:core:classes
:kafka-tr

[jira] [Created] (KAFKA-3082) Make LogManager.InitialTaskDelayMs configurable

2016-01-07 Thread Rado Buransky (JIRA)
Rado Buransky created KAFKA-3082:


 Summary: Make LogManager.InitialTaskDelayMs configurable
 Key: KAFKA-3082
 URL: https://issues.apache.org/jira/browse/KAFKA-3082
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 0.9.0.0
Reporter: Rado Buransky
Priority: Minor


At the moment it is hardcoded to 30 seconds, which makes it difficult to 
simulate some scenarios for application testing purposes.

Specifically, I am trying to write integration tests for a Spark Streaming 
application to ensure that it behaves correctly even when the Kafka log 
starts to be cleaned up, and I currently have to wait 30 seconds for that.
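
For context, a sketch of the kind of aggressive broker settings such a test might use; the values are illustrative, and the point of this report is that even with them the cleanup scheduler's first run still waits out the hardcoded 30-second initial delay:

{code}
import java.util.Properties;

public class FastCleanupBrokerConfig {
    public static void main(String[] args) {
        // Illustrative broker overrides for an integration test that wants log
        // cleanup to happen quickly (standard broker config names, made-up values).
        Properties brokerProps = new Properties();
        brokerProps.put("log.segment.bytes", "1024");               // roll segments quickly
        brokerProps.put("log.retention.ms", "1000");                // expire segments after ~1s
        brokerProps.put("log.retention.check.interval.ms", "500");  // check retention often
        // Even so, the first retention run is still delayed by the hardcoded
        // LogManager.InitialTaskDelayMs (30 seconds), which this issue asks to expose.
        brokerProps.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
{code}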



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Upcoming Kafka Summit this April

2016-01-07 Thread Jun Rao
Hi, Everyone,

I wanted to remind you of a couple of Kafka Summit related deadlines coming
up.

On Monday Jan 11 the Call for Proposals closes at 11:59pm Pacific Time. We
encourage you to participate - share your stories and best practices.
Submit your proposal at www.kafka-summit.org

On Friday Jan 15 the conference Early Bird price of $495 will expire -
don't miss out on this $100 discount off the full price. Additionally,
members of this mailing list can get an additional $50 off by entering this
promotional code: COMMUNITY-KS2016-50D

Remember not to tweet this code - it's a special code for the folks on the
mailing list.

We've been reviewing the proposal submissions and can assure you it will be
worth spending the day to learn, network, recruit, and more. Register at
www.kafka-summit.org

See you at Kafka Summit in San Francisco on Tuesday, April 26!

Thanks,

Jun