[jira] [Commented] (KAFKA-3007) new Consumer should expose mechanism to fetch single message, consumer.poll(timeout, maxMessageLimit)

2016-01-08 Thread Jens Rantil (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088919#comment-15088919
 ] 

Jens Rantil commented on KAFKA-3007:


Seems like I don't have access to change the title here, but I propose a new 
title for this which is more in line with the KIP: New Consumer should expose 
mechanism to limit number of messages returned by consumer.poll(...)

> new Consumer should expose mechanism to fetch single message, 
> consumer.poll(timeout, maxMessageLimit)
> -
>
> Key: KAFKA-3007
> URL: https://issues.apache.org/jira/browse/KAFKA-3007
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Affects Versions: 0.9.0.0
>Reporter: aarti gupta
>Assignee: Neha Narkhede
>
> Currently, consumer.poll(timeout)
> returns all messages that have not been acked since the last fetch.
> The only way to process a single message is to throw away all but the first
> message in the list.
> This means we are required to fetch all messages into memory. Coupled with
> the client not being thread-safe (i.e. we cannot use a different thread to
> ack messages), this makes it hard to consume messages when the order of
> message arrival is important and a large number of messages are pending to
> be consumed.





Re: KIP-41: KafkaConsumer Max Records

2016-01-08 Thread Jens Rantil
Hi,

I just publicly wanted to thank Jason for the work he's done with the KIP
and say that I've been in touch with him privately back and forth to work
out of some of its details. Thanks!

Since it feels like I initiated this KIP a bit I also want to say that I'm
happy with it and that its proposal solves the initial issue I reported in
https://issues.apache.org/jira/browse/KAFKA-2986. That said, I am open for a
[VOTE] on my behalf. I propose Jason decides when voting starts.

Cheers and keep up the good work,
Jens

On Tue, Jan 5, 2016 at 8:32 PM, Jason Gustafson  wrote:

> I've updated the KIP with some implementation details. I also added more
> discussion on the heartbeat() alternative. The short answer for why we
> rejected this API is that it doesn't seem to work well with offset commits.
> This would tend to make correct usage complicated and difficult to explain.
> Additionally, we don't see any clear advantages over having a way to set
> the max records. For example, using max.records=1 would be equivalent to
> invoking heartbeat() on each iteration of the message processing loop.
>
> Going back to the discussion on whether we should use a configuration value
> or overload poll(), I'm leaning toward the configuration option mainly for
> compatibility and to keep the KafkaConsumer API from getting any more
> complex. Also, as others have mentioned, it seems reasonable to want to
> tune this setting in the same place that the session timeout and heartbeat
> interval are configured. I still feel a little uncomfortable with the need
> to do a lot of configuration tuning to get the consumer working for a
> particular environment, but hopefully the defaults are conservative enough
> that most users won't need to. However, if it remains a problem, then we
> could still look into better options for managing the size of batches
> including overloading poll() with a max records argument or possibly by
> implementing a batch scaling algorithm internally.
>
> -Jason
>
>
> On Mon, Jan 4, 2016 at 12:18 PM, Jason Gustafson 
> wrote:
>
> > Hi Cliff,
> >
> > I think we're all agreed that the current contract of poll() should be
> > kept. The consumer wouldn't wait for max messages to become available in
> > this proposal; it would only ensure that it never returns more than max
> > messages.
> >
> > -Jason
> >
> > On Mon, Jan 4, 2016 at 11:52 AM, Cliff Rhyne  wrote:
> >
> >> Instead of a heartbeat, I'd prefer poll() to return whatever messages
> the
> >> client has.  Either a) I don't care if I get less than my max message
> >> limit
> >> or b) I do care and will set a larger timeout.  Case B is less common
> than
> >> A and is fairly easy to handle in the application's code.
> >>
> >> On Mon, Jan 4, 2016 at 1:47 PM, Gwen Shapira  wrote:
> >>
> >> > 1. Agree that TCP window style scaling will be cool. I'll try to think
> >> of a
> >> > good excuse to use it ;)
> >> >
> >> > 2. I'm very concerned about the challenges of getting the timeouts,
> >> > hearbeats and max messages right.
> >> >
> >> > Another option could be to expose "heartbeat" API to consumers. If my
> >> > app is still processing data but is still alive, it could initiate a
> >> > heartbeat to signal it's alive without having to handle additional
> >> > messages.
> >> >
> >> > I don't know if this improves more than it complicates though :(
> >> >
> >> > On Mon, Jan 4, 2016 at 11:40 AM, Jason Gustafson 
> >> > wrote:
> >> >
> >> > > Hey Gwen,
> >> > >
> >> > > I was thinking along the lines of TCP window scaling in order to
> >> > > dynamically find a good consumption rate. Basically you'd start off
> >> > > consuming say 100 records and you'd let it increase until the
> >> consumption
> >> > > took longer than half the session timeout (for example). You /might/
> >> be
> >> > > able to achieve the same thing using pause/resume, but it would be a
> >> lot
> >> > > trickier since you have to do it at the granularity of partitions.
> But
> >> > > yeah, database write performance doesn't always scale in a
> predictable
> >> > > enough way to accommodate this, so I'm not sure how useful it would
> >> be in
> >> > > practice. It might also be more difficult to implement since it
> >> wouldn't
> >> > be
> >> > > as clear when to initiate the next fetch. With a static setting, the
> >> > > consumer knows exactly how many records will be returned on the next
> >> call
> >> > > to poll() and can send fetches accordingly.
> >> > >
> >> > > On the other hand, I do feel a little wary of the need to tune the
> >> > session
> >> > > timeout and max messages though since these settings might depend on
> >> the
> >> > > environment that the consumer is deployed in. It wouldn't be a big
> >> deal
> >> > if
> >> > > the impact was relatively minor, but getting them wrong can cause a
> >> lot
> >> > of
> >> > > rebalance churn which could keep the consumer from making any
> >> progress.
> >> > > It's not a particularly graceful failure.
> >> > >
> >> > > -Jason
> >> > >
>
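
To make the proposal concrete, here is a minimal sketch of the
configuration-based approach, assuming the setting ships as a consumer
property named max.poll.records as discussed in the KIP (illustrative
only, not the final API). With max.poll.records=1, the loop returns to
poll() after each record, which is what makes it equivalent to calling a
hypothetical heartbeat() on every iteration:

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class MaxRecordsExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "example-group");
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        // Proposed in KIP-41; the property name and availability depend on
        // the final version of the KIP.
        props.put("max.poll.records", "1");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic"));
            while (true) {
                // Returns at most max.poll.records records, so the loop gets
                // back to poll() (and thus to heartbeating) after each record.
                ConsumerRecords<String, String> records = consumer.poll(100);
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d value=%s%n",
                        record.offset(), record.value());
                }
                consumer.commitSync();
            }
        }
    }
}
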

KIP process

2016-01-08 Thread Jens Rantil
Hi,

I just read through the KIP process[1] and have three specific questions:

   1. I notice that the voting is executed using "Lazy Majority"[2].
   However, I was surprised that "Lazy Majority" doesn't stipulate a time for
   how long a voting process will go on. I assume I can't vote "No" to issues
   that were voted on a month ago, right?
   2. Is there a list of the people who have binding votes somewhere? That
   is, a list of the active committers.
   3. What does "PMC" stand for?

[1]
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
[2]
https://cwiki.apache.org/confluence/display/KAFKA/Bylaws#Bylaws-Approvals

Thanks,
Jens

-- 
Jens Rantil
Backend engineer
Tink AB

Email: jens.ran...@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se



[GitHub] kafka pull request: MINOR: Fix typos in Kafka website page

2016-01-08 Thread smalldirector
GitHub user smalldirector opened a pull request:

https://github.com/apache/kafka/pull/742

MINOR: Fix typos in Kafka website page

Fix two minor typos on the official Kafka website page.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/smalldirector/kafka kafka-document-typos-fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/742.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #742


commit cf6635367115bde7ad337eb2e448ae0c9bd7b69f
Author: Gabriel Zhang 
Date:   2016-01-08T12:33:56Z

fix typos in Kafka document






[jira] [Commented] (KAFKA-1382) Update zkVersion on partition state update failures

2016-01-08 Thread dude (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089248#comment-15089248
 ] 

dude commented on KAFKA-1382:
-

We hit this bug in Kafka 0.8.2.1 (three nodes); the ZooKeeper version is 3.4.6.
The log is:


[2016-01-05 08:49:27,047] INFO Partition [error-signatureId-956a8fd7-a3ec-4718-bb77-45b3a97eb0cd,0] on broker 3: Shrinking ISR for partition [error-signatureId-956a8fd7-a3ec-4718-bb77-45b3a97eb0cd,0] from 3,1 to 3 (kafka.cluster.Partition)
[2016-01-05 08:49:27,227] ERROR Uncaught exception in thread 'kafka-network-thread-39091-0': (kafka.utils.Utils$)
java.util.NoSuchElementException: None.get
at scala.None$.get(Option.scala:347)
at scala.None$.get(Option.scala:345)
at kafka.network.ConnectionQuotas.dec(SocketServer.scala:524)
at kafka.network.AbstractServerThread.close(SocketServer.scala:165)
at kafka.network.AbstractServerThread.close(SocketServer.scala:157)
at kafka.network.Processor.close(SocketServer.scala:374)
at kafka.network.Processor.processNewResponses(SocketServer.scala:406)
at kafka.network.Processor.run(SocketServer.scala:318)
at java.lang.Thread.run(Thread.java:745)
[2016-01-05 08:49:27,248] INFO Reconnect due to socket error: java.io.IOException: connection timeout (kafka.consumer.SimpleConsumer)
[2016-01-05 08:49:27,248] INFO Reconnect due to socket error: java.io.IOException: Connection reset by peer (kafka.consumer.SimpleConsumer)
[2016-01-05 08:49:27,278] ERROR Uncaught exception in thread 'kafka-network-thread-39091-1': (kafka.utils.Utils$)
java.util.NoSuchElementException: None.get
at scala.None$.get(Option.scala:347)
at scala.None$.get(Option.scala:345)
at kafka.network.ConnectionQuotas.dec(SocketServer.scala:524)
at kafka.network.AbstractServerThread.close(SocketServer.scala:165)
at kafka.network.AbstractServerThread.close(SocketServer.scala:157)
at kafka.network.Processor.close(SocketServer.scala:374)
at kafka.network.Processor.processNewResponses(SocketServer.scala:406)
at kafka.network.Processor.run(SocketServer.scala:318)
at java.lang.Thread.run(Thread.java:745)
[2016-01-05 08:49:27,918] INFO re-registering broker info in ZK for broker 3 (kafka.server.KafkaHealthcheck)
[2016-01-05 08:49:36,312] INFO Registered broker 3 at path /brokers/ids/3 with address AI-iPaaS-ATS03:39091. (kafka.utils.ZkUtils$)
[2016-01-05 08:49:36,312] INFO done re-registering broker (kafka.server.KafkaHealthcheck)
[2016-01-05 08:49:36,313] INFO Subscribing to /brokers/topics path to watch for new topics (kafka.server.KafkaHealthcheck)
[2016-01-05 08:49:36,332] INFO New leader is 1 (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
[2016-01-05 08:49:36,343] INFO Partition [error-signatureId-956a8fd7-a3ec-4718-bb77-45b3a97eb0cd,0] on broker 3: Cached zkVersion [0] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2016-01-05 08:49:36,343] INFO Partition [error-signatureId-e8c1c145-4109-48d8-a46f-4eca92143594,0] on broker 3: Shrinking ISR for partition [error-signatureId-e8c1c145-4109-48d8-a46f-4eca92143594,0] from 3,2 to 3 (kafka.cluster.Partition)
[2016-01-05 08:49:36,372] INFO Partition [error-signatureId-e8c1c145-4109-48d8-a46f-4eca92143594,0] on broker 3: Cached zkVersion [0] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2016-01-05 08:49:36,373] INFO Partition [error-signatureId-59206ee6-e9b7-470d-9b1d-638e2cc7ebad,0] on broker 3: Shrinking ISR for partition [error-signatureId-59206ee6-e9b7-470d-9b1d-638e2cc7ebad,0] from 3,2 to 3 (kafka.cluster.Partition)
[2016-01-05 08:49:36,402] INFO Partition [error-signatureId-59206ee6-e9b7-470d-9b1d-638e2cc7ebad,0] on broker 3: Cached zkVersion [0] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2016-01-05 08:49:36,402] INFO Partition [error-signatureId-be6798c3-57d8-4ddc-a155-04983987b160,0] on broker 3: Shrinking ISR for partition [error-signatureId-be6798c3-57d8-4ddc-a155-04983987b160,0] from 3,1 to 3 (kafka.cluster.Partition)
[2016-01-05 08:49:36,426] INFO Partition [error-signatureId-be6798c3-57d8-4ddc-a155-04983987b160,0] on broker 3: Cached zkVersion [0] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
[2016-01-05 08:49:36,426] INFO Partition [error-signatureId-38fd31e8-3a0a-4b06-b278-a8f10bab232f,0] on broker 3: Shrinking ISR for partition [error-signatureId-38fd31e


> Update zkVersion on partition state update failures
> ---
>
> Key: KAFKA-1382
> URL: https://issues.apache.org/jira/browse/KAFKA-1382
> Project: Kafka
>  Issue Type: Bug
>Reporter: Joel Koshy
>Assignee: Sriharsha Chintalapani
> Fix For: 0.8.1.2, 
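
The None.get in the stack traces comes from ConnectionQuotas.dec. A rough
model of that failure mode, with illustrative names rather than the actual
SocketServer code: the per-address connection count is decremented for an
address that has no entry, so the lookup has nothing to return.

import java.net.InetAddress;
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

final class ConnectionCounts {
    private final Map<InetAddress, Integer> counts = new HashMap<>();

    synchronized void inc(InetAddress addr) {
        counts.merge(addr, 1, Integer::sum);
    }

    synchronized void dec(InetAddress addr) {
        Integer current = counts.get(addr);
        if (current == null) {
            // Equivalent of calling Option.get on None, which is what
            // surfaces as "None.get" in the stack trace above.
            throw new NoSuchElementException("None.get");
        }
        if (current == 1) {
            counts.remove(addr);
        } else {
            counts.put(addr, current - 1);
        }
    }
}
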

Status of multi-tenancy

2016-01-08 Thread Luciano Afranllie
Hi there

We are interested in adding support for multiple tenants to Kafka and I
came across this thread

http://grokbase.com/t/kafka/dev/154wsscrsk/adding-multi-tenancy-capabilities-to-kafka

Could you please let me know the status of this proposal?

Is this something we can move forward with?

Regards
Luciano


Status of multi-tenancy

2016-01-08 Thread Ashish Singh
Hello Luciano,

We have started a KIP to make progress in this direction. [KIP-37|
https://cwiki.apache.org/confluence/display/KAFKA/KIP-37+-+Add+Namespaces+to+Kafka]
covers the details. It would be nice if you could go over the KIP, its
associated discussion thread, and provide your thoughts. Honestly, the
discussion on this has dwindled for some time, but we will be actively
working on it in the coming weeks.

Thanks!


On Friday, January 8, 2016, Luciano Afranllie  wrote:

> Hi there
>
> We are interested in adding support for multiple tenants to Kafka and I
> came across this thread
>
>
> http://grokbase.com/t/kafka/dev/154wsscrsk/adding-multi-tenancy-capabilities-to-kafka
>
> Could you please let me know the status of this proposal?
>
> Is this something we can move forward with?
>
> Regards
> Luciano
>


-- 
Ashish Singh


[jira] [Created] (KAFKA-3083) a soft failure in controller may leave a topic partition in an inconsistent state

2016-01-08 Thread Jun Rao (JIRA)
Jun Rao created KAFKA-3083:
--

 Summary: a soft failure in controller may leave a topic partition 
in an inconsistent state
 Key: KAFKA-3083
 URL: https://issues.apache.org/jira/browse/KAFKA-3083
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.9.0.0
Reporter: Jun Rao


The following sequence can happen.

1. Broker A is the controller and is in the middle of processing a broker 
change event. As part of this process, let's say it's about to shrink the isr 
of a partition.

2. Then broker A's session expires and broker B takes over as the new 
controller. Broker B sends the initial leaderAndIsr request to all brokers.

3. Broker A continues by shrinking the isr of the partition in ZK and sends the 
new leaderAndIsr request to the broker (say C) that leads the partition. Broker 
C will reject this leaderAndIsr since the request comes from a controller with 
an older epoch. Now we could be in a situation that Broker C thinks the isr has 
all replicas, but the isr stored in ZK is different.




1. Originally, broker 12 was the controller with controller epoch 4. It 
received the following broker change event and was in the middle of processing 
this event by selecting new leaders and shrinking ISRs.

2015-12-25 09:10:57,339 INFO kafka.utils.Logging$class:68 
[ZkClient-EventThread-93-ec2-107-20-175-177.compute-1.amazonaws.com:2181,ec2-107-20-175-179.compute-1.amazonaws.com:2181,ec2-107-20-175-226.compute-1.amazonaws.com:2181,ec2-107-20-175-229.compute-1.amazonaws.com:2181,ec2-107-20-175-232.compute-1.amazonaws.com:2181/kskafka/everest]
 [info] [BrokerChangeListener on Controller 12]: Newly added brokers: , deleted 
brokers: 
0,10,56,42,25,20,29,1,33,9,53,41,64,59,27,49,7,39,35,11,55,8,30,19,4,47,68, all 
live brokers: 
5,24,37,52,14,46,57,61,6,60,28,38,70,21,65,13,2,32,34,45,17,22,44,71,54,66,3,48,63,18,50,67,16,31,43,40,26,23,58,36,51,15,62

2. Then broker 12's ZK session expired and broker 30 took over as the 
controller with controller epoch 6.

2015-12-25 09:11:11,012 INFO kafka.utils.Logging$class:68 
[ZkClient-EventThread-93-ec2-107-20-175-177.compute-1.amazonaws.com:2181,ec2-107-20-175-179.compute-1.amazonaws.com:2181,ec2-107-20-175-226.compute-1.amazonaws.com:2181,ec2-107-20-175-229.compute-1.amazonaws.com:2181,ec2-107-20-175-232.compute-1.amazonaws.com:2181/kskafka/everest]
 [info] [Controller 30]: Controller 30 incremented epoch to 6

3. Controller 30 read the current leaderAndIsr for [streaming_client_log,3] 
(with leader epoch 5) from ZK during initialization and sent it to broker 31 
(the leader of streaming_client_log,3) with controller epoch 6

4. Old controller 12 continued from step 1. It shrank the ISR for 
[streaming_client_log,3] and changed leader epoch to 6. 
2015-12-25 09:11:13,274 INFO kafka.utils.Logging$class:68 
[ZkClient-EventThread-93-ec2-107-20-175-177.compute-1.amazonaws.com:2181,ec2-107-20-175-179.compute-1.amazonaws.com:2181,ec2-107-20-175-226.compute-1.amazonaws.com:2181,ec2-107-20-175-229.compute-1.amazonaws.com:2181,ec2-107-20-175-232.compute-1.amazonaws.com:2181/kskafka/everest]
 [info] [Controller 12]: New leader and ISR for partition 
[streaming_client_log,3] is {"leader":31,"leader_epoch":6,"isr":[31]}

5. Old controller 12 sent leaderAndIsr to broker 31, but it's ignored since the 
highest controller epoch on broker 31 is 6, which is higher than the controller 
epoch 4 in leaderAndIsr. 
2015-12-25 09:11:15,484 WARN kafka.utils.Logging$class:83 
[kafka-request-handler-6] [warn] Broker 31 ignoring LeaderAndIsr request from 
controller 12 with correlation id 769 since its controller epoch 4 is old. 
Latest known controller epoch is 6

6. Old controller 12 finally received the ZK session expiration event and 
stopped acting as the controller.
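
The rejection in step 5 works because brokers fence requests by controller
epoch. A minimal sketch of that check (illustrative only, not the actual
broker code):

final class ControllerEpochFencing {
    private int highestKnownControllerEpoch = 0;

    // A broker ignores LeaderAndIsr requests whose controller epoch is
    // older than the highest epoch it has already seen.
    synchronized boolean shouldProcess(int requestControllerEpoch) {
        if (requestControllerEpoch < highestKnownControllerEpoch) {
            return false; // stale controller (step 5 above): ignore the request
        }
        highestKnownControllerEpoch = requestControllerEpoch;
        return true;
    }
}
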






[jira] [Updated] (KAFKA-3083) a soft failure in controller may leave a topic partition in an inconsistent state

2016-01-08 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-3083:
---
Description: 
The following sequence can happen.

1. Broker A is the controller and is in the middle of processing a broker 
change event. As part of this process, let's say it's about to shrink the isr 
of a partition.

2. Then broker A's session expires and broker B takes over as the new 
controller. Broker B sends the initial leaderAndIsr request to all brokers.

3. Broker A continues by shrinking the isr of the partition in ZK and sends the 
new leaderAndIsr request to the broker (say C) that leads the partition. Broker 
C will reject this leaderAndIsr since the request comes from a controller with 
an older epoch. Now we could be in a situation that Broker C thinks the isr has 
all replicas, but the isr stored in ZK is different.


  was:
The following sequence can happen.

1. Broker A is the controller and is in the middle of processing a broker 
change event. As part of this process, let's say it's about to shrink the isr 
of a partition.

2. Then broker A's session expires and broker B takes over as the new 
controller. Broker B sends the initial leaderAndIsr request to all brokers.

3. Broker A continues by shrinking the isr of the partition in ZK and sends the 
new leaderAndIsr request to the broker (say C) that leads the partition. Broker 
C will reject this leaderAndIsr since the request comes from a controller with 
an older epoch. Now we could be in a situation that Broker C thinks the isr has 
all replicas, but the isr stored in ZK is different.




1. Originally, broker 12 was the controller with controller epoch 4. It 
received the following broker change event and was in the middle of processing 
this event by selecting new leaders and shrinking ISRs.

2015-12-25 09:10:57,339 INFO kafka.utils.Logging$class:68 
[ZkClient-EventThread-93-ec2-107-20-175-177.compute-1.amazonaws.com:2181,ec2-107-20-175-179.compute-1.amazonaws.com:2181,ec2-107-20-175-226.compute-1.amazonaws.com:2181,ec2-107-20-175-229.compute-1.amazonaws.com:2181,ec2-107-20-175-232.compute-1.amazonaws.com:2181/kskafka/everest]
 [info] [BrokerChangeListener on Controller 12]: Newly added brokers: , deleted 
brokers: 
0,10,56,42,25,20,29,1,33,9,53,41,64,59,27,49,7,39,35,11,55,8,30,19,4,47,68, all 
live brokers: 
5,24,37,52,14,46,57,61,6,60,28,38,70,21,65,13,2,32,34,45,17,22,44,71,54,66,3,48,63,18,50,67,16,31,43,40,26,23,58,36,51,15,62

2. Then broker 12's ZK session expired and broker 30 took over as the 
controller with controller epoch 6.

2015-12-25 09:11:11,012 INFO kafka.utils.Logging$class:68 
[ZkClient-EventThread-93-ec2-107-20-175-177.compute-1.amazonaws.com:2181,ec2-107-20-175-179.compute-1.amazonaws.com:2181,ec2-107-20-175-226.compute-1.amazonaws.com:2181,ec2-107-20-175-229.compute-1.amazonaws.com:2181,ec2-107-20-175-232.compute-1.amazonaws.com:2181/kskafka/everest]
 [info] [Controller 30]: Controller 30 incremented epoch to 6

3. Controller 30 read the current leaderAndIsr for [streaming_client_log,3] 
(with leader epoch 5) from ZK during initialization and sent it to broker 31 
(the leader of streaming_client_log,3) with controller epoch 6

4. Old controller 12 continued from step 1. It shrank the ISR for 
[streaming_client_log,3] and changed leader epoch to 6. 
2015-12-25 09:11:13,274 INFO kafka.utils.Logging$class:68 
[ZkClient-EventThread-93-ec2-107-20-175-177.compute-1.amazonaws.com:2181,ec2-107-20-175-179.compute-1.amazonaws.com:2181,ec2-107-20-175-226.compute-1.amazonaws.com:2181,ec2-107-20-175-229.compute-1.amazonaws.com:2181,ec2-107-20-175-232.compute-1.amazonaws.com:2181/kskafka/everest]
 [info] [Controller 12]: New leader and ISR for partition 
[streaming_client_log,3] is {"leader":31,"leader_epoch":6,"isr":[31]}

5. Old controller 12 sent leaderAndIsr to broker 31, but it's ignored since the 
highest controller epoch on broker 31 is 6, which is higher than the controller 
epoch 4 in leaderAndIsr. 
2015-12-25 09:11:15,484 WARN kafka.utils.Logging$class:83 
[kafka-request-handler-6] [warn] Broker 31 ignoring LeaderAndIsr request from 
controller 12 with correlation id 769 since its controller epoch 4 is old. 
Latest known controller epoch is 6

6. Old controller 12 finally received the ZK session expiration event and 
stopped acting as the controller.



> a soft failure in controller may leave a topic partition in an inconsistent 
> state
> --
>
> Key: KAFKA-3083
> URL: https://issues.apache.org/jira/browse/KAFKA-3083
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>
> The following sequence can happen.
> 1. Broker A is the controller and is in the middle of processing a broker 
> change event. As part of this process

[jira] [Commented] (KAFKA-3083) a soft failure in controller may leave a topic partition in an inconsistent state

2016-01-08 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089422#comment-15089422
 ] 

Jun Rao commented on KAFKA-3083:


[~fpj] suggested that the reason this could happen is the misuse of ZK in the
Kafka controller.

> a soft failure in controller may leave a topic partition in an inconsistent 
> state
> --
>
> Key: KAFKA-3083
> URL: https://issues.apache.org/jira/browse/KAFKA-3083
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>
> The following sequence can happen.
> 1. Broker A is the controller and is in the middle of processing a broker 
> change event. As part of this process, let's say it's about to shrink the isr 
> of a partition.
> 2. Then broker A's session expires and broker B takes over as the new 
> controller. Broker B sends the initial leaderAndIsr request to all brokers.
> 3. Broker A continues by shrinking the isr of the partition in ZK and sends 
> the new leaderAndIsr request to the broker (say C) that leads the partition. 
> Broker C will reject this leaderAndIsr since the request comes from a 
> controller with an older epoch. Now we could be in a situation that Broker C 
> thinks the isr has all replicas, but the isr stored in ZK is different.





[jira] [Commented] (KAFKA-3064) Improve resync method to waste less time and data transfer

2016-01-08 Thread Michael Graff (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089449#comment-15089449
 ] 

Michael Graff commented on KAFKA-3064:
--

Example log messages indicating this issue is occurring:

Jan  8 16:22:25 broker2 kafka [ReplicaFetcherThread-3-1] WARN [ReplicaFetcherThread-3-1], Replica 2 for partition [nom-dns-base-text,3] reset its fetch offset from 372718324 to current leader 1's start offset 372718324 (kafka.server.ReplicaFetcherThread)
Jan  8 16:22:25 broker2 kafka [ReplicaFetcherThread-3-1] ERROR [ReplicaFetcherThread-3-1], Current offset 372712344 for partition [nom-dns-base-text,3] out of range; reset offset to 372718324 (kafka.server.ReplicaFetcherThread)


> Improve resync method to waste less time and data transfer
> --
>
> Key: KAFKA-3064
> URL: https://issues.apache.org/jira/browse/KAFKA-3064
> Project: Kafka
>  Issue Type: Improvement
>  Components: controller, network
>Reporter: Michael Graff
>Assignee: Neha Narkhede
>
> We have several topics which are large (65 GB per partition) with 12 
> partitions.  Data rates into each topic vary, but in general each one has its 
> own rate.
> After a raid rebuild, we are pulling all the data over to the newly rebuild 
> raid.  This takes forever, and has yet to complete after nearly 8 hours.
> Here are my observations:
> (1)  The Kafka broker seems to pull from all topics on all partitions at the 
> same time, starting at the oldest message.
> (2)  When you divide total disk bandwidth available across all partitions 
> (really, only 48 of which have significant amounts of data, about 65 * 12 = 
> 780 GB each topic) the ingest rate of 36 out of 48 of them is higher than the 
> available bandwidth.
> (3)  The effect of (2) is that one topic SLOWLY catches up, while the other 4 
> topics continue to retrieve data at 75% of the bandwidth, just to toss it 
> away because the source broker has discarded it already.
> (4)  Eventually that one topic catches up, and the remaining bandwidth is 
> then divided into the remaining 36 partitions, one group of which starts to 
> catch up again.
> What I want to see is a way to say “don’t transfer more than X partitions at 
> the same time” and ideally a priority rule that says, “Transfer partitions 
> you are responsible for first, then transfer ones you are not.  Also, 
> transfer these first, then those, but no more than 1 topic at a time”
> What I REALLY want is for Kafka to track the new data (track the head of the 
> log) and then ask for the tail in chunks.  Ideally this would request from 
> the source, “what is the next logical older starting point?” and then start 
> there.  This way, the transfer basically becomes a file transfer of the log 
> stored on the source’s disk. Once that block is retrieved, it moves on to the 
> next oldest.  This way, there is almost zero waste as both the head and tail 
> grow, but the tail runs the risk of losing the final chunk only.  Thus, 
> bandwidth is not significantly wasted.
> All this changes the ISR check to be is “am I caught up on head AND tail?” 
> when the tail part is implied right now.
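
A minimal sketch of the first request above, i.e. bounding how many
partitions resync at the same time. The scheduler and its names are
hypothetical, not Kafka internals: partitions wait in a queue until one of
a bounded number of resync slots frees up.

import java.util.ArrayDeque;
import java.util.Queue;

final class ResyncScheduler {
    private final int maxConcurrent;
    private final Queue<String> pending = new ArrayDeque<>();
    private int active = 0;

    ResyncScheduler(int maxConcurrent) {
        this.maxConcurrent = maxConcurrent;
    }

    synchronized void enqueue(String topicPartition) {
        pending.add(topicPartition);
        maybeStart();
    }

    synchronized void onCaughtUp(String topicPartition) {
        active--; // a slot frees up once a partition is back in sync
        maybeStart();
    }

    private void maybeStart() {
        while (active < maxConcurrent && !pending.isEmpty()) {
            String next = pending.poll();
            active++;
            startFetching(next);
        }
    }

    private void startFetching(String topicPartition) {
        // Kick off the replica fetch for this partition here.
    }
}
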





Re: TMR should not create topic if not existing.

2016-01-08 Thread Grant Henke
Hi Mayuresh,

I am working on KIP-4 and the create topic wire protocol is a part of that.
I have a patch, but it still needs to be changed to be synchronous on the
server side. You can see the discussion on the pull request.
That work can be tracked here:
https://issues.apache.org/jira/browse/KAFKA-2945

Once the create topic work is done, clients can then create missing topics
themselves and we won't depend on the server auto-creating them. Auto
creation could be baked into the client, or users could use the error
codes to make the call themselves; that work and its details have not been
vetted yet.
There is a jira to track that here:
https://issues.apache.org/jira/browse/KAFKA-2410

Hope that helps.

Thanks,
Grant

On Thu, Jan 7, 2016 at 6:27 PM, Mayuresh Gharat 
wrote:

> Hi,
>
> I might have missed out the discussions on this :(
>
> I had some more questions :
>
> 1) We need a config in KafkaProducer for this. So when the KafkaProducer
> issues a TMR for a topic and receives a response that the topic does not
> exist, depending on the value of this config it should use the CreateTopic
> request to create the topic or exit. Is there a ticket for this?
>
> 2) We need a config in KafkaConsumer for this. So when the
> KafkaConsumer(old and new) issues a TMR for a topic and receives a response
> that the topic does not exist, depending on the value of this config it
> should use the CreateTopic request to create the topic or exit. Is there a
> ticket for this?
>
> Thanks,
>
> Mayuresh
>
> On Thu, Jan 7, 2016 at 3:27 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com>
> wrote:
>
> > Hi Jason,
> >
> > Thanks. I had found this ticket before, but KAFKA-1507 also has some
> > context about this and I was confused about exactly which patch is
> > going to go in.
> > Thanks a lot for confirming.
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Thu, Jan 7, 2016 at 3:08 PM, Jason Gustafson 
> > wrote:
> >
> >> Hey Mayuresh,
> >>
> >> The ticket that Grant Henke has been working on is here:
> >> https://issues.apache.org/jira/browse/KAFKA-2945. Is that what you were
> >> looking for?
> >>
> >> -Jason
> >>
> >> On Thu, Jan 7, 2016 at 1:50 PM, Mayuresh Gharat <
> >> gharatmayures...@gmail.com>
> >> wrote:
> >>
> >> > + dev
> >> >
> >> > Hi
> >> >
> >> > There has been discussion on the ticket :
> >> > https://issues.apache.org/jira/browse/KAFKA-2887, that we are going
> to
> >> > deprecate auto-creation of topic or at least turn it off by default
> >> once we
> >> > have the CreateTopics API. It also says the patch is available for
> this.
> >> >
> >> > The only other ticket that I came across on the ticket is
> >> > https://issues.apache.org/jira/browse/KAFKA-1507.
> >> >
> >> > I wanted to confirm that
> >> https://issues.apache.org/jira/browse/KAFKA-1507
> >> > is the ticket that has the CreateTopics API patch.
> >> >
> >> >
> >> > --
> >> > -Regards,
> >> > Mayuresh R. Gharat
> >> > (862) 250-7125
> >> >
> >>
> >
> >
> >
> > --
> > -Regards,
> > Mayuresh R. Gharat
> > (862) 250-7125
> >
>
>
>
> --
> -Regards,
> Mayuresh R. Gharat
> (862) 250-7125
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
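
For illustration, a sketch of the client-side flow Grant describes, once a
synchronous create-topic API exists. MetadataClient, topicExists and
createTopic are hypothetical stand-ins, not the actual API from
KAFKA-2945/KAFKA-2410:

interface MetadataClient {
    boolean topicExists(String topic);
    void createTopic(String topic, int partitions, int replicationFactor);
}

final class EnsureTopic {
    static void ensure(MetadataClient client, String topic, boolean autoCreate) {
        if (client.topicExists(topic)) {
            return;
        }
        if (!autoCreate) {
            // The error-code route: surface the problem to the caller.
            throw new IllegalStateException("Topic does not exist: " + topic);
        }
        // The config route: create the missing topic, blocking until the
        // server reports success or a timeout.
        client.createTopic(topic, 1, 1);
    }
}
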


ScalaStyle Feedback

2016-01-08 Thread Grant Henke
A while back I did some preliminary work on KAFKA-2423: Introduce
ScalaStyle. There is a pull request here:
https://github.com/apache/kafka/pull/560. The important content for review
is the rule set defined in scalastyle_config.xml.

I know complete consensus on something like this may not be feasible, but I
think there is value in defining some basic style rules to expedite the
patch review process and improve code readability, clarity, and quality.
The Java Checkstyle rules added to Kafka have worked well, and I think
having similar rules for Scala is important.

If there is no major opposition to the rules, I would like to first:

   1. Add an optional build task to check the code for ScalaStyle rule
   violations
   2. Fix the very basic violations (whitespace, line length, etc)
   3. Over time fix more impactful violations

During this time period we can also identify if the rules are too strict,
annoying or need to be adjusted. Once the issues are fixed we can then run
ScalaStyle checks as a part of the build to be sure no new violations
occur.

Do we agree these checks would be valuable? Do the rules and approach
proposed in my pull request seem reasonable?

Thanks,
Grant
-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: ScalaStyle Feedback

2016-01-08 Thread Ismael Juma
Hey Grant,

As you know I think this is definitely valuable. My take:

1. The choice of rules is really important and it's a good plan to be
conservative initially (as you suggested)
2. We should only use errors, not warnings
3. To make the previous one feasible, the rules we enable must not have
false positives (or they must be _really_ rare and with a reasonable
workaround).

Ismael
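
For reference, rules of this kind are declared in scalastyle_config.xml
roughly as follows. The rule selection here is illustrative, not
necessarily what the pull request contains; note level="error" rather
than "warn", per point 2 above:

<scalastyle>
  <name>Illustrative rule set</name>
  <!-- level="error" rather than "warn", per point 2 above -->
  <check level="error" class="org.scalastyle.file.FileLineLengthChecker" enabled="true">
    <parameters>
      <parameter name="maxLineLength">120</parameter>
    </parameters>
  </check>
  <check level="error" class="org.scalastyle.file.FileTabChecker" enabled="true"/>
  <check level="error" class="org.scalastyle.file.WhitespaceEndOfLineChecker" enabled="true"/>
</scalastyle>
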

On Fri, Jan 8, 2016 at 5:18 PM, Grant Henke  wrote:

> A while back I did some preliminary work on KAFKA-2423: Introduce
> ScalaStyle. There is a pull request here:
> https://github.com/apache/kafka/pull/560. The important content for review
> is the rule set defined in scalastyle_config.xml.
>
> I know complete consensus on something like this may not be feasible, but I
> think there is value in defining some basic style rules to expedite the
> patch review process and improve code readability, clarity, and quality.
> The Java Checkstyle rules added to Kafka have worked well, and I think
> having similar rules for Scala is important.
>
> If there is no major opposition to the rules, I would like to first:
>
>1. Add an optional build task to check the code for ScalaStyle rule
>violations
>2. Fix the very basic violations (whitespace, line length, etc)
>3. Over time fix more impactful violations
>
> During this time period we can also identify if the rules are too strict,
> annoying or need to be adjusted. Once the issues are fixed we can then run
> ScalaStyle checks as a part of the build to be sure no new violations
> occur.
>
> Do we agree these checks would be valuable? Do the rules and approach
> proposed in my pull request seem reasonable?
>
> Thanks,
> Grant
> --
> Grant Henke
> Software Engineer | Cloudera
> gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>


Re: TMR should not create topic if not existing.

2016-01-08 Thread Mayuresh Gharat
Hi Grant,

Thanks a lot for the explanation. This is great.
It was a little confusing with KIP-4 and KAFKA-1507 both talking about
roughly the same thing, and KAFKA-1507 has been lying around since last
August without being checked in or iterated on.

Here is what I understand:
1) KAFKA-2945 will implement the CreateTopic Request that will block on the
server until the topic is successfully created. This will return success, or
failure on timeout.
2) Once this is done, whether we want to have a config in Kafka Producer
and Consumer OR we want to let the users create the topic explicitly using
the createTopicRequest API is still under discussion.

Am I right on both the above points?

The reason I want to confirm this is that it is very important for us
when we want to delete topics in a pipeline while there are several clients
producing and consuming, and mirror-makers and monitoring/auditing systems
running at the same time.


Thanks,

Mayuresh

On Fri, Jan 8, 2016 at 8:41 AM, Grant Henke  wrote:

> Hi Mayuresh,
>
> I am working on KIP-4 and the create topic wire protocol is a part of that.
> I have a patch, but it still needs to be changed to be synchronous on the
> server side. You can see the discussion on the pull request.
> That work can be tracked here:
> https://issues.apache.org/jira/browse/KAFKA-2945
>
> Once the create topic work is done, clients can then create missing topics
> themselves and we won't depend on the server auto-creating them. Auto
> creation could be baked into the client, or users could use the error
> codes to make the call themselves; that work and its details have not been
> vetted yet.
> There is a jira to track that here:
> https://issues.apache.org/jira/browse/KAFKA-2410
>
> Hope that helps.
>
> Thanks,
> Grant
>
> On Thu, Jan 7, 2016 at 6:27 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com>
> wrote:
>
> > Hi,
> >
> > I might have missed out the discussions on this :(
> >
> > I had some more questions :
> >
> > 1) We need a config in KafkaProducer for this. So when the KafkaProducer
> > issues a TMR for a topic and receives a response that the topic does not
> > exist, depending on the value of this config it should use the
> CreateTopic
> > request to create the topic or exit. Is there a ticket for this?
> >
> > 2) We need a config in KafkaConsumer for this. So when the
> > KafkaConsumer(old and new) issues a TMR for a topic and receives a
> response
> > that the topic does not exist, depending on the value of this config it
> > should use the CreateTopic request to create the topic or exit. Is there
> a
> > ticket for this?
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Thu, Jan 7, 2016 at 3:27 PM, Mayuresh Gharat <
> > gharatmayures...@gmail.com>
> > wrote:
> >
> > > Hi Jason,
> > >
> > > Thanks. I had found this ticket before, but KAFKA-1507 also has some
> > > context about this and I was confused about exactly which patch is
> > > going to go in.
> > > Thanks a lot for confirming.
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > > On Thu, Jan 7, 2016 at 3:08 PM, Jason Gustafson 
> > > wrote:
> > >
> > >> Hey Mayuresh,
> > >>
> > >> The ticket that Grant Henke has been working on is here:
> > >> https://issues.apache.org/jira/browse/KAFKA-2945. Is that what you
> were
> > >> looking for?
> > >>
> > >> -Jason
> > >>
> > >> On Thu, Jan 7, 2016 at 1:50 PM, Mayuresh Gharat <
> > >> gharatmayures...@gmail.com>
> > >> wrote:
> > >>
> > >> > + dev
> > >> >
> > >> > Hi
> > >> >
> > >> > There has been discussion on the ticket :
> > >> > https://issues.apache.org/jira/browse/KAFKA-2887, that we are going
> > to
> > >> > deprecate auto-creation of topic or at least turn it off by default
> > >> once we
> > >> > have the CreateTopics API. It also says the patch is available for
> > this.
> > >> >
> > >> > The only other ticket that I came across on the ticket is
> > >> > https://issues.apache.org/jira/browse/KAFKA-1507.
> > >> >
> > >> > I wanted to confirm that
> > >> https://issues.apache.org/jira/browse/KAFKA-1507
> > >> > is the ticket that has the CreateTopics API patch.
> > >> >
> > >> >
> > >> > --
> > >> > -Regards,
> > >> > Mayuresh R. Gharat
> > >> > (862) 250-7125
> > >> >
> > >>
> > >
> > >
> > >
> > > --
> > > -Regards,
> > > Mayuresh R. Gharat
> > > (862) 250-7125
> > >
> >
> >
> >
> > --
> > -Regards,
> > Mayuresh R. Gharat
> > (862) 250-7125
> >
>
>
>
> --
> Grant Henke
> Software Engineer | Cloudera
> gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>



-- 
-Regards,
Mayuresh R. Gharat
(862) 250-7125


Re: ScalaStyle Feedback

2016-01-08 Thread Ismael Juma
One more thing: once we apply this PR, backporting to 0.9.0 will become
much harder. This is one of the reasons (the other being that I'd like to
see the impact of the rules on the existing codebase and haven't had the
time to check) that I haven't commented on your PR in more detail. We are
still backporting PRs to 0.9.0 regularly, but maybe this will change in the
near future as the important bugs are tackled.

Ismael

On Fri, Jan 8, 2016 at 5:33 PM, Ismael Juma  wrote:

> Hey Grant,
>
> As you know I think this is definitely valuable. My take:
>
> 1. The choice of rules is really important and it's a good plan to be
> conservative initially (as you suggested)
> 2. We should only use errors, not warnings
> 3. To make the previous one feasible, the rules we enable must not have
> false positives (or they must be _really_ rare and with a reasonable
> workaround).
>
> Ismael
>
> On Fri, Jan 8, 2016 at 5:18 PM, Grant Henke  wrote:
>
>> A while back I did some preliminary work on KAFKA-2423: Introduce
>> ScalaStyle. There is a pull request here:
>> https://github.com/apache/kafka/pull/560. The important content for
>> review
>> is the rule set defined in scalastyle_config.xml.
>>
>> I know complete consensus on something like this may not be feasible, but
>> I
>> think there is value in defining some basic style rules to expedite the
>> patch review process and improve code readability, clarity, and quality.
>> The Java Checkstyle rules added to Kafka have worked well, and I think
>> having similar rules for Scala is important.
>>
>> If there is no major opposition to the rules, I would like to first:
>>
>>1. Add an optional build task to check the code for ScalaStyle rule
>>violations
>>2. Fix the very basic violations (whitespace, line length, etc)
>>3. Over time fix more impactful violations
>>
>> During this time period we can also identify if the rules are too strict,
>> annoying or need to be adjusted. Once the issues are fixed we can then run
>> ScalaStyle checks as a part of the build to be sure no new violations
>> occur.
>>
>> Do we agree these checks would be valuable? Do the rules and approach
>> proposed in my pull request seem reasonable?
>>
>> Thanks,
>> Grant
>> --
>> Grant Henke
>> Software Engineer | Cloudera
>> gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>>
>
>


Re: KIP-41: KafkaConsumer Max Records

2016-01-08 Thread Jason Gustafson
Hi Aarti,

Thanks for the feedback. I think the concern about memory overhead is
valid. As Guozhang mentioned, the problem already exists in the current
consumer, so this probably deserves consideration outside of this KIP. That
said, it's a good question whether our prefetching strategy makes it more
difficult to control the memory overhead. The approach we've proposed for
prefetching is basically the following: fetch all partitions whenever the
number of retained messages is less than max.poll.records. In the worst
case, this increases the maximum memory used by the consumer by the size of
those retained messages. As you've pointed out, messages could be very
large. We could reduce this requirement with a slight change: instead of
fetching all partitions, we could fetch only those with no retained data.
That would reduce the worst-case overhead to #partitions *
max.partition.fetch.bytes, which matches the existing memory overhead.
Would that address your concern?

A couple other points worth mentioning is that users have the option not to
use max.poll.records, in which case the behavior will be the same as in the
current consumer. Additionally, the implementation can be changed over time
without affecting users, so we can adjust it in particular when we address
memory concerns in KAFKA-2045.

On a side note, I'm wondering if it would be useful to extend this KIP to
include a max.poll.bytes? For some use cases, it may make more sense to
control the processing time by the size of data instead of the number of
records. Not that I'm anxious to draw this out, but if we'll need this
setting eventually, we may as well do it now. Thoughts?


-Jason
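
A rough model of the two prefetch policies being compared above; the map
of retained (buffered but not yet delivered) records per partition is a
stand-in for the consumer's internal state, not an actual Kafka class:

import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class PrefetchPolicy {

    // As proposed: fetch ALL partitions whenever the total number of
    // retained records drops below max.poll.records. Worst case keeps
    // the retained records in memory on top of the newly fetched data.
    static Set<String> fetchAllWhenLow(Map<String, Deque<String>> retained,
                                       int maxPollRecords) {
        int totalRetained = 0;
        for (Deque<String> d : retained.values()) {
            totalRetained += d.size();
        }
        return totalRetained < maxPollRecords
            ? new HashSet<>(retained.keySet())
            : new HashSet<>();
    }

    // The variation suggested above: fetch only partitions holding no
    // retained data, capping worst-case memory at
    // #partitions * max.partition.fetch.bytes.
    static Set<String> fetchOnlyEmpty(Map<String, Deque<String>> retained) {
        Set<String> toFetch = new HashSet<>();
        for (Map.Entry<String, Deque<String>> e : retained.entrySet()) {
            if (e.getValue().isEmpty()) {
                toFetch.add(e.getKey());
            }
        }
        return toFetch;
    }
}
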

On Fri, Jan 8, 2016 at 1:03 AM, Jens Rantil  wrote:

> Hi,
>
> I just publicly wanted to thank Jason for the work he's done with the KIP
> and say that I've been in touch with him privately back and forth to work
> out of some of its details. Thanks!
>
> Since it feels like I initiated this KIP a bit I also want to say that I'm
> happy with it and that its proposal solves the initial issue I reported in
> https://issues.apache.org/jira/browse/KAFKA-2986. That said, I am open for a
> [VOTE] on my behalf. I propose Jason decides when voting starts.
>
> Cheers and keep up the good work,
> Jens
>
> On Tue, Jan 5, 2016 at 8:32 PM, Jason Gustafson 
> wrote:
>
> > I've updated the KIP with some implementation details. I also added more
> > discussion on the heartbeat() alternative. The short answer for why we
> > rejected this API is that it doesn't seem to work well with offset
> commits.
> > This would tend to make correct usage complicated and difficult to
> explain.
> > Additionally, we don't see any clear advantages over having a way to set
> > the max records. For example, using max.records=1 would be equivalent to
> > invoking heartbeat() on each iteration of the message processing loop.
> >
> > Going back to the discussion on whether we should use a configuration
> value
> > or overload poll(), I'm leaning toward the configuration option mainly
> for
> > compatibility and to keep the KafkaConsumer API from getting any more
> > complex. Also, as others have mentioned, it seems reasonable to want to
> > tune this setting in the same place that the session timeout and
> heartbeat
> > interval are configured. I still feel a little uncomfortable with the
> need
> > to do a lot of configuration tuning to get the consumer working for a
> > particular environment, but hopefully the defaults are conservative
> enough
> > that most users won't need to. However, if it remains a problem, then we
> > could still look into better options for managing the size of batches
> > including overloading poll() with a max records argument or possibly by
> > implementing a batch scaling algorithm internally.
> >
> > -Jason
> >
> >
> > On Mon, Jan 4, 2016 at 12:18 PM, Jason Gustafson 
> > wrote:
> >
> > > Hi Cliff,
> > >
> > > I think we're all agreed that the current contract of poll() should be
> > > kept. The consumer wouldn't wait for max messages to become available
> in
> > > this proposal; it would only ensure that it never returns more than max
> > > messages.
> > >
> > > -Jason
> > >
> > > On Mon, Jan 4, 2016 at 11:52 AM, Cliff Rhyne  wrote:
> > >
> > >> Instead of a heartbeat, I'd prefer poll() to return whatever messages
> > the
> > >> client has.  Either a) I don't care if I get less than my max message
> > >> limit
> > >> or b) I do care and will set a larger timeout.  Case B is less common
> > than
> > >> A and is fairly easy to handle in the application's code.
> > >>
> > >> On Mon, Jan 4, 2016 at 1:47 PM, Gwen Shapira 
> wrote:
> > >>
> > >> > 1. Agree that TCP window style scaling will be cool. I'll try to
> think
> > >> of a
> > >> > good excuse to use it ;)
> > >> >
> > >> > 2. I'm very concerned about the challenges of getting the timeouts,
> > >> > hearbeats and max messages right.
> > >> >
> > >> > Another option could be to expose "heartbeat" API to consumers. If
> m

[jira] [Commented] (KAFKA-3083) a soft failure in controller may leave a topic partition in an inconsistent state

2016-01-08 Thread Mayuresh Gharat (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089586#comment-15089586
 ] 

Mayuresh Gharat commented on KAFKA-3083:


[~fpj] can you shed some light on what you meant by misuse?
[~junrao] if this is actually an issue, I would like to give it a shot.

> a soft failure in controller may leave a topic partition in an inconsistent 
> state
> --
>
> Key: KAFKA-3083
> URL: https://issues.apache.org/jira/browse/KAFKA-3083
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>
> The following sequence can happen.
> 1. Broker A is the controller and is in the middle of processing a broker 
> change event. As part of this process, let's say it's about to shrink the isr 
> of a partition.
> 2. Then broker A's session expires and broker B takes over as the new 
> controller. Broker B sends the initial leaderAndIsr request to all brokers.
> 3. Broker A continues by shrinking the isr of the partition in ZK and sends 
> the new leaderAndIsr request to the broker (say C) that leads the partition. 
> Broker C will reject this leaderAndIsr since the request comes from a 
> controller with an older epoch. Now we could be in a situation that Broker C 
> thinks the isr has all replicas, but the isr stored in ZK is different.





[GitHub] kafka pull request: MINOR: Fix bug in `waitUntilLeaderIsElectedOrC...

2016-01-08 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/743

MINOR: Fix bug in `waitUntilLeaderIsElectedOrChanged` and simplify result 
type

The bug was for the following case:

`leader.isDefined && oldLeaderOpt.isEmpty && newLeaderOpt.isDefined && 
newLeaderOpt.get != leader.get`

We would consider it a successful election/change even though we should not.

I also changed the result type as we never return `None` (we throw an 
exception instead).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
fix-wait-until-leader-is-elected-or-changed

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/743.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #743


commit 4b17f8501d44e3b54dff2b1a4f86673b1e514776
Author: Ismael Juma 
Date:   2016-01-08T16:48:47Z

Fix bug in `waitUntilLeaderIsElectedOrChanged` and simplify result type

The bug was for the following case:

`leader.isDefined && oldLeaderOpt.isEmpty && newLeaderOpt.isDefined && 
newLeaderOpt.get != leader.get`

We would consider it a successful election even though we should not.

I also changed the result type as we never return `None` (we throw an 
exception instead).
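
An illustrative model of the case being fixed (not the actual test-utils
code); the point is that a newly elected leader that differs from the
expected one must not count as success. The "success otherwise" fallback
is an assumption of this sketch:

import java.util.Optional;

final class LeaderElectionCheck {
    static boolean electionSucceeded(Optional<Integer> expectedLeader,
                                     Optional<Integer> oldLeader,
                                     Optional<Integer> newLeader) {
        if (expectedLeader.isPresent()
                && !oldLeader.isPresent()
                && newLeader.isPresent()
                && !newLeader.get().equals(expectedLeader.get())) {
            return false; // previously mis-reported as a successful election
        }
        // For this sketch, success otherwise means some leader exists.
        return oldLeader.isPresent() || newLeader.isPresent();
    }
}
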






Re: TMR should not create topic if not existing.

2016-01-08 Thread Grant Henke
Hi Mayuresh,

Your understanding aligns with mine.

Thanks,
Grant

On Fri, Jan 8, 2016 at 11:38 AM, Mayuresh Gharat  wrote:

> Hi Grant,
>
> Thanks a lot for the explanation. This is great.
> It was a little confusing with KIP-4 and KAFKA-1507 both talking about
> roughly the same thing, and KAFKA-1507 has been lying around since last
> August without being checked in or iterated on.
>
> Here is what I understand:
> 1) KAFKA-2945 will implement the CreateTopic Request that will block on the
> server until the topic is successfully created. This will return success, or
> failure on timeout.
> 2) Once this is done, whether we want to have a config in Kafka Producer
> and Consumer OR we want to let the users create the topic explicitly using
> the createTopicRequest API is still under discussion.
>
> Am I right on both the above points?
>
> The reason I want to confirm this is that it is very important for us
> when we want to delete topics in a pipeline while there are several clients
> producing and consuming, and mirror-makers and monitoring/auditing systems
> running at the same time.
>
>
> Thanks,
>
> Mayuresh
>
> On Fri, Jan 8, 2016 at 8:41 AM, Grant Henke  wrote:
>
> > Hi Mayuresh,
> >
> > I am working on KIP-4 and the create topic wire protocol is a part of
> > that. I have a patch, but it still needs to be changed to be synchronous
> > on the server side. You can see the discussion on the pull request.
> > That work can be tracked here:
> > https://issues.apache.org/jira/browse/KAFKA-2945
> >
> > Once the create topic work is done, clients can then create missing
> > topics themselves and we won't depend on the server auto-creating them.
> > Auto creation could be baked into the client, or users could use the
> > error codes to make the call themselves; that work and its details have
> > not been vetted yet.
> > There is a jira to track that here:
> > https://issues.apache.org/jira/browse/KAFKA-2410
> >
> > Hope that helps.
> >
> > Thanks,
> > Grant
> >
> > On Thu, Jan 7, 2016 at 6:27 PM, Mayuresh Gharat <
> > gharatmayures...@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > I might have missed out the discussions on this :(
> > >
> > > I had some more questions :
> > >
> > > 1) We need a config in KafkaProducer for this. So when the
> KafkaProducer
> > > issues a TMR for a topic and receives a response that the topic does
> not
> > > exist, depending on the value of this config it should use the
> > CreateTopic
> > > request to create the topic or exit. Is there a ticket for this?
> > >
> > > 2) We need a config in KafkaConsumer for this. So when the
> > > KafkaConsumer(old and new) issues a TMR for a topic and receives a
> > response
> > > that the topic does not exist, depending on the value of this config it
> > > should use the CreateTopic request to create the topic or exit. Is
> there
> > a
> > > ticket for this?
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > > On Thu, Jan 7, 2016 at 3:27 PM, Mayuresh Gharat <
> > > gharatmayures...@gmail.com>
> > > wrote:
> > >
> > > > Hi Jason,
> > > >
> > > > Thanks. I had found this ticket before, but KAFKA-1507 also has some
> > > > context about this and I was confused about exactly which patch is
> > > > going to go in.
> > > > Thanks a lot for confirming.
> > > >
> > > > Thanks,
> > > >
> > > > Mayuresh
> > > >
> > > > On Thu, Jan 7, 2016 at 3:08 PM, Jason Gustafson 
> > > > wrote:
> > > >
> > > >> Hey Mayuresh,
> > > >>
> > > >> The ticket that Grant Henke has been working on is here:
> > > >> https://issues.apache.org/jira/browse/KAFKA-2945. Is that what you
> > were
> > > >> looking for?
> > > >>
> > > >> -Jason
> > > >>
> > > >> On Thu, Jan 7, 2016 at 1:50 PM, Mayuresh Gharat <
> > > >> gharatmayures...@gmail.com>
> > > >> wrote:
> > > >>
> > > >> > + dev
> > > >> >
> > > >> > Hi
> > > >> >
> > > >> > There has been discussion on the ticket :
> > > >> > https://issues.apache.org/jira/browse/KAFKA-2887, that we are
> going
> > > to
> > > >> > deprecate auto-creation of topic or at least turn it off by
> default
> > > >> once we
> > > >> > have the CreateTopics API. It also says the patch is available for
> > > this.
> > > >> >
> > > >> > The only other ticket that I came across on the ticket is
> > > >> > https://issues.apache.org/jira/browse/KAFKA-1507.
> > > >> >
> > > >> > I wanted to confirm that
> > > >> https://issues.apache.org/jira/browse/KAFKA-1507
> > > >> > is the ticket that has the CreateTopics API patch.
> > > >> >
> > > >> >
> > > >> > --
> > > >> > -Regards,
> > > >> > Mayuresh R. Gharat
> > > >> > (862) 250-7125
> > > >> >
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > -Regards,
> > > > Mayuresh R. Gharat
> > > > (862) 250-7125
> > > >
> > >
> > >
> > >
> > > --
> > > -Regards,
> > > Mayuresh R. Gharat
> > > (862) 250-7125
> > >
> >
> >
> >
> > --
> > Grant Henke
> > Software Engineer | Cloudera
> > gr...@cloudera.com | twitter.com/gchenke | linkedin.com/i

[jira] [Commented] (KAFKA-3079) org.apache.kafka.common.KafkaException: java.lang.SecurityException: Configuration Error:

2016-01-08 Thread Mohit Anchlia (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089625#comment-15089625
 ] 

Mohit Anchlia commented on KAFKA-3079:
--

KafkaServer {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/mnt/kafka/kafka/kafka.keytab"
principal="kafka/10.24.251@example.com";
};

# Zookeeper client authentication
Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/mnt/kafka/kafka/kafka.keytab"
principal="kafka/10.24.251@example.com";
};
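
The parse error in this issue ("Line 8: expected [{], found [Zookeeper]")
is consistent with the "# Zookeeper client authentication" line above:
JAAS configuration files accept only // and /* */ comments, so a # line
is read as tokens of a new application entry rather than skipped. A
sketch of the same file with a JAAS-style comment (keytab and principal
kept exactly as posted):

KafkaServer {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/mnt/kafka/kafka/kafka.keytab"
principal="kafka/10.24.251@example.com";
};

// Zookeeper client authentication
Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/mnt/kafka/kafka/kafka.keytab"
principal="kafka/10.24.251@example.com";
};
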


> org.apache.kafka.common.KafkaException: java.lang.SecurityException: 
> Configuration Error:
> -
>
> Key: KAFKA-3079
> URL: https://issues.apache.org/jira/browse/KAFKA-3079
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.9.0.0
> Environment: RHEL 6
>Reporter: Mohit Anchlia
>
> After enabling security I am seeing the following error even though the JAAS file 
> has no mention of "Zookeeper". I used the following steps:
> http://docs.confluent.io/2.0.0/kafka/sasl.html
> [2016-01-07 19:05:15,329] FATAL Fatal error during KafkaServer startup. 
> Prepare to shutdown (kafka.server.KafkaServer)
> org.apache.kafka.common.KafkaException: java.lang.SecurityException: 
> Configuration Error:
> Line 8: expected [{], found [Zookeeper]
> at 
> org.apache.kafka.common.security.JaasUtils.isZkSecurityEnabled(JaasUtils.java:102)
> at kafka.server.KafkaServer.initZk(KafkaServer.scala:262)
> at kafka.server.KafkaServer.startup(KafkaServer.scala:168)
> at 
> kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
> at kafka.Kafka$.main(Kafka.scala:67)
> at kafka.Kafka.main(Kafka.scala)
> Caused by: java.lang.SecurityException: Configuration Error:
> Line 8: expected [{], found [Zookeeper]
> at com.sun.security.auth.login.ConfigFile.<init>(ConfigFile.java:110)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at java.lang.Class.newInstance(Class.java:374)
> at 
> javax.security.auth.login.Configuration$2.run(Configuration.java:258)
> at 
> javax.security.auth.login.Configuration$2.run(Configuration.java:250)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> javax.security.auth.login.Configuration.getConfiguration(Configuration.java:249)
> at 
> org.apache.kafka.common.security.JaasUtils.isZkSecurityEnabled(JaasUtils.java:99)
> ... 5 more
> Caused by: java.io.IOException: Configuration Error:



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (KAFKA-3079) org.apache.kafka.common.KafkaException: java.lang.SecurityException: Configuration Error:

2016-01-08 Thread Mohit Anchlia (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Anchlia updated KAFKA-3079:
-
Comment: was deleted

(was: KafkaServer {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/mnt/kafka/kafka/kafka.keytab"
principal="kafka/10.24.251@example.com";
};

# Zookeeper client authentication
Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
keyTab="/mnt/kafka/kafka/kafka.keytab"
principal="kafka/10.24.251@example.com";
};
)

> org.apache.kafka.common.KafkaException: java.lang.SecurityException: 
> Configuration Error:
> -
>
> Key: KAFKA-3079
> URL: https://issues.apache.org/jira/browse/KAFKA-3079
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.9.0.0
> Environment: RHEL 6
>Reporter: Mohit Anchlia
>
> After enabling security I am seeing the following error even though the JAAS 
> file has no mention of "Zookeeper". I used the following steps:
> http://docs.confluent.io/2.0.0/kafka/sasl.html
> [2016-01-07 19:05:15,329] FATAL Fatal error during KafkaServer startup. 
> Prepare to shutdown (kafka.server.KafkaServer)
> org.apache.kafka.common.KafkaException: java.lang.SecurityException: 
> Configuration Error:
> Line 8: expected [{], found [Zookeeper]
> at 
> org.apache.kafka.common.security.JaasUtils.isZkSecurityEnabled(JaasUtils.java:102)
> at kafka.server.KafkaServer.initZk(KafkaServer.scala:262)
> at kafka.server.KafkaServer.startup(KafkaServer.scala:168)
> at 
> kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
> at kafka.Kafka$.main(Kafka.scala:67)
> at kafka.Kafka.main(Kafka.scala)
> Caused by: java.lang.SecurityException: Configuration Error:
> Line 8: expected [{], found [Zookeeper]
> at com.sun.security.auth.login.ConfigFile.<init>(ConfigFile.java:110)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at java.lang.Class.newInstance(Class.java:374)
> at 
> javax.security.auth.login.Configuration$2.run(Configuration.java:258)
> at 
> javax.security.auth.login.Configuration$2.run(Configuration.java:250)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> javax.security.auth.login.Configuration.getConfiguration(Configuration.java:249)
> at 
> org.apache.kafka.common.security.JaasUtils.isZkSecurityEnabled(JaasUtils.java:99)
> ... 5 more
> Caused by: java.io.IOException: Configuration Error:



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3079) org.apache.kafka.common.KafkaException: java.lang.SecurityException: Configuration Error:

2016-01-08 Thread Mohit Anchlia (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Anchlia updated KAFKA-3079:
-
Attachment: kafka_server_jaas.conf

JAAS file attached

> org.apache.kafka.common.KafkaException: java.lang.SecurityException: 
> Configuration Error:
> -
>
> Key: KAFKA-3079
> URL: https://issues.apache.org/jira/browse/KAFKA-3079
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.9.0.0
> Environment: RHEL 6
>Reporter: Mohit Anchlia
> Attachments: kafka_server_jaas.conf
>
>
> After enabling security I am seeing the following error even though the JAAS 
> file has no mention of "Zookeeper". I used the following steps:
> http://docs.confluent.io/2.0.0/kafka/sasl.html
> [2016-01-07 19:05:15,329] FATAL Fatal error during KafkaServer startup. 
> Prepare to shutdown (kafka.server.KafkaServer)
> org.apache.kafka.common.KafkaException: java.lang.SecurityException: 
> Configuration Error:
> Line 8: expected [{], found [Zookeeper]
> at 
> org.apache.kafka.common.security.JaasUtils.isZkSecurityEnabled(JaasUtils.java:102)
> at kafka.server.KafkaServer.initZk(KafkaServer.scala:262)
> at kafka.server.KafkaServer.startup(KafkaServer.scala:168)
> at 
> kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
> at kafka.Kafka$.main(Kafka.scala:67)
> at kafka.Kafka.main(Kafka.scala)
> Caused by: java.lang.SecurityException: Configuration Error:
> Line 8: expected [{], found [Zookeeper]
> at com.sun.security.auth.login.ConfigFile.<init>(ConfigFile.java:110)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at java.lang.Class.newInstance(Class.java:374)
> at 
> javax.security.auth.login.Configuration$2.run(Configuration.java:258)
> at 
> javax.security.auth.login.Configuration$2.run(Configuration.java:250)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> javax.security.auth.login.Configuration.getConfiguration(Configuration.java:249)
> at 
> org.apache.kafka.common.security.JaasUtils.isZkSecurityEnabled(JaasUtils.java:99)
> ... 5 more
> Caused by: java.io.IOException: Configuration Error:



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: MINOR: Fix typos in Kafka website page

2016-01-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/742


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2410) Implement "Auto Topic Creation" client side and remove support from Broker side

2016-01-08 Thread Mayuresh Gharat (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089686#comment-15089686
 ] 

Mayuresh Gharat commented on KAFKA-2410:


I would like to bump up discussion on this. 
The way I understand this is that we have 2 options:

1) Have a config on the producer and consumer.
   a) TMR is disabled on the broker side.
   b) Producer and consumer have this config set to TRUE.
   c) The producer does the send(); it issues a TMR and finds that the topic 
does not exist. It will check this config and issue a CreateTopicRequest. When 
the CreateTopicRequest returns success, it will start producing the actual 
data.
   d) Same goes for the consumer: if the topic does not exist, the consumer 
will create it if the config is enabled. This is not a hard requirement on the 
consumer side.

2) No config; topic creation is left to the user explicitly.

The problem with 2) is that, if we have mirror makers running, we would have 
to do the topic creation in the producer, in some sort of callback maybe, so 
it would be better to go with 1) in this scenario. A sketch of option 1) is 
below.

+ Joel [~jjkoshy] 
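
A rough sketch of the producer-side flow in option 1). All names here are 
illustrative stand-ins; the real config name and the CreateTopicRequest API 
would come out of KAFKA-2945 and the related KIP discussion, not from this 
sketch:

public class AutoCreateSketch {
    // Hypothetical client-side config corresponding to option 1) above.
    static final boolean AUTO_CREATE_TOPICS = true;

    public static void main(String[] args) {
        String topic = "my-topic";
        // send() first issues a TMR; if the topic is missing, either create
        // it (config enabled) or fail fast (config disabled).
        if (!topicExists(topic)) {
            if (AUTO_CREATE_TOPICS)
                createTopic(topic);  // issue CreateTopicRequest, wait for success
            else
                throw new IllegalStateException("Topic does not exist: " + topic);
        }
        send(topic, "hello");        // produce only once the topic exists
    }

    // Stubs standing in for the metadata, create-topic and produce round trips.
    static boolean topicExists(String topic) { return false; }
    static void createTopic(String topic) { System.out.println("created " + topic); }
    static void send(String topic, String value) { System.out.println("sent " + value); }
}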



> Implement "Auto Topic Creation" client side and remove support from Broker 
> side
> ---
>
> Key: KAFKA-2410
> URL: https://issues.apache.org/jira/browse/KAFKA-2410
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, core
>Affects Versions: 0.8.2.1
>Reporter: Grant Henke
>Assignee: Grant Henke
>
> Auto topic creation on the broker has caused pain in the past, and today it 
> still causes unusual error handling requirements on the client side, added 
> complexity in the broker, mixed responsibility of the TopicMetadataRequest, 
> and limits configuration of the option to be cluster-wide. In the future, 
> having it broker-side will also make features such as authorization very 
> difficult. 
> There have been discussions in the past of implementing this feature client 
> side. 
> [example|https://issues.apache.org/jira/browse/KAFKA-689?focusedCommentId=13548746&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13548746]
> This Jira is to track that discussion and implementation once the necessary 
> protocol support exists: KAFKA-2229



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-2410) Implement "Auto Topic Creation" client side and remove support from Broker side

2016-01-08 Thread Mayuresh Gharat (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089686#comment-15089686
 ] 

Mayuresh Gharat edited comment on KAFKA-2410 at 1/8/16 6:48 PM:


I would like to bump up discussion on this. 
The way I understand this is that we have 2 options:

1) Have a config on the producer and consumer.
   -> TMR is disabled on the broker side.
   -> Producer and consumer have this config set to TRUE.
   -> The producer does the send(); it issues a TMR and finds that the topic 
does not exist. It will check this config and issue a CreateTopicRequest. When 
the CreateTopicRequest returns success, it will start producing the actual 
data.
   -> Same goes for the consumer: if the topic does not exist, the consumer 
will create it if the config is enabled. This is not a hard requirement on the 
consumer side.

2) No config; topic creation is left to the user explicitly.

The problem with 2) is that, if we have mirror makers running, we would have 
to do the topic creation in the producer, in some sort of callback maybe, so 
it would be better to go with 1) in this scenario.

+ Joel [~jjkoshy] 




was (Author: mgharat):
I would like to bump up discussion on this. 
The way I understand this is that we have 2 options:

1) Have a config on the producer and consumer.
   a) TMR is disabled on the broker side.
   b) Producer and consumer have this config set to TRUE.
   c) The producer does the send(); it issues a TMR and finds that the topic 
does not exist. It will check this config and issue a CreateTopicRequest. When 
the CreateTopicRequest returns success, it will start producing the actual 
data.
   d) Same goes for the consumer: if the topic does not exist, the consumer 
will create it if the config is enabled. This is not a hard requirement on the 
consumer side.

2) No config; topic creation is left to the user explicitly.

The problem with 2) is that, if we have mirror makers running, we would have 
to do the topic creation in the producer, in some sort of callback maybe, so 
it would be better to go with 1) in this scenario.

+ Joel [~jjkoshy] 



> Implement "Auto Topic Creation" client side and remove support from Broker 
> side
> ---
>
> Key: KAFKA-2410
> URL: https://issues.apache.org/jira/browse/KAFKA-2410
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, core
>Affects Versions: 0.8.2.1
>Reporter: Grant Henke
>Assignee: Grant Henke
>
> Auto topic creation on the broker has caused pain in the past, and today it 
> still causes unusual error handling requirements on the client side, added 
> complexity in the broker, mixed responsibility of the TopicMetadataRequest, 
> and limits configuration of the option to be cluster-wide. In the future, 
> having it broker-side will also make features such as authorization very 
> difficult. 
> There have been discussions in the past of implementing this feature client 
> side. 
> [example|https://issues.apache.org/jira/browse/KAFKA-689?focusedCommentId=13548746&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13548746]
> This Jira is to track that discussion and implementation once the necessary 
> protocol support exists: KAFKA-2229



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: KIP-41: KafkaConsumer Max Records

2016-01-08 Thread Jason Gustafson
Thanks Jens for all of your work as well! Unless there are any more
concerns, perhaps we can open the vote early next week.

As a quick summary for newcomers to this thread, the problem we're trying
to solve in this KIP is how to give users more predictable control over the
message processing loop. Because the new consumer is single-threaded, the
poll() API must be called frequently enough to ensure that the consumer can
send heartbeats before its session timeout expires. Typically we recommend
setting the session timeout large enough to make expiration unlikely, but
that can be difficult advice to follow in practice when either the number
of partitions is unknown or increases over time. In some cases, such as in
Jens' initial bug report, the processing time does not even depend directly
on the size of the total data to be processed.

To address this problem, we have proposed to offer a new configuration
option "max.poll.records" which sets an upper bound on the number of
records returned in a single call to poll(). The point is to give users a
way to limit message processing time so that the session timeout can be set
without risking unexpected rebalances. This change is backward compatible
with the current API and users only need to change their configuration to
take advantage of it. As a bonus, it provides an easy mechanism to
implement commit policies which ensure commits at least as often as every N
records.
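
To make that concrete, here is a sketch of a poll loop using the proposed 
setting (the "max.poll.records" name is as proposed in the KIP and not yet in 
a released client; the rest is the standard 0.9 consumer API):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class MaxPollRecordsExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "default");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("session.timeout.ms", "30000");
        // Proposed by KIP-41: bound the records returned per poll() so the
        // time spent processing between polls stays below the session timeout.
        props.put("max.poll.records", "100");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                for (ConsumerRecord<String, String> record : records)
                    System.out.println(record.offset() + ": " + record.value());
                // Committing here also gives a commit at least every 100 records.
                consumer.commitSync();
            }
        }
    }
}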

As a final subject for consideration, it may make sense to also add a
configuration "max.poll.bytes," which places an upper bound on the total
size of the data returned in a call to poll(). This would solve the problem
more generally since some use cases may actually have processing time which
is more dependent on the total size of the data than the number of records.
Others might require a mix of the two.

-Jason

On Fri, Jan 8, 2016 at 9:42 AM, Jason Gustafson  wrote:

> Hi Aarti,
>
> Thanks for the feedback. I think the concern about memory overhead is
> valid. As Guozhang mentioned, the problem already exists in the current
> consumer, so this probably deserves consideration outside of this KIP. That
> said, it's a good question whether our prefetching strategy makes it more
> difficult to control the memory overhead. The approach we've proposed for
> prefetching is basically the following: fetch all partitions whenever the
> number of retained messages is less than max.poll.records. In the worst
> case, this increases the maximum memory used by the consumer by the size of
> those retained messages. As you've pointed out, messages could be very
> large. We could reduce this requirement with a slight change: instead of
> fetching all partitions, we could fetch only those with no retained data.
> That would reduce the worst-case overhead to #no partitions *
> max.partition.fetch.bytes, which matches the existing memory overhead.
> Would that address your concern?
>
> A couple other points worth mentioning is that users have the option not
> to use max.poll.records, in which case the behavior will be the same as in
> the current consumer. Additionally, the implementation can be changed over
> time without affecting users, so we can adjust it in particular when we
> address memory concerns in KAFKA-2045.
>
> On a side note, I'm wondering if it would be useful to extend this KIP to
> include a max.poll.bytes? For some use cases, it may make more sense to
> control the processing time by the size of data instead of the number of
> records. Not that I'm in anxious to draw this out, but if we'll need this
> setting eventually, we may as well do it now. Thoughts?
>
>
> -Jason
>
> On Fri, Jan 8, 2016 at 1:03 AM, Jens Rantil  wrote:
>
>> Hi,
>>
>> I just publicly wanted to thank Jason for the work he's done with the KIP
>> and say that I've been in touch with him privately back and forth to work
>> out of some of its details. Thanks!
>>
>> Since it feels like I initiated this KIP a bit I also want to say that I'm
>> happy with it and that its proposal solves the initial issue I reported in
>> https://issues.apache.org/jira/browse/KAFKA-2986. That said, I open for a
>> [VOTE] on my behalf. I propose Jason decides when voting starts.
>>
>> Cheers and keep up the good work,
>> Jens
>>
>> On Tue, Jan 5, 2016 at 8:32 PM, Jason Gustafson 
>> wrote:
>>
>> > I've updated the KIP with some implementation details. I also added more
>> > discussion on the heartbeat() alternative. The short answer for why we
>> > rejected this API is that it doesn't seem to work well with offset
>> commits.
>> > This would tend to make correct usage complicated and difficult to
>> explain.
>> > Additionally, we don't see any clear advantages over having a way to set
>> > the max records. For example, using max.records=1 would be equivalent to
>> > invoking heartbeat() on each iteration of the message processing loop.
>> >
>> > Going back to the discussion on whether we should use a configuration
>> value
>> > or overload poll(), I'

Re: [VOTE] KIP-32 Add CreateTime and LogAppendTime to Kafka message.

2016-01-08 Thread Joel Koshy
+1 from me

Looking through this thread it seems there was some confusion on the
migration discussion. This discussion in fact happened in the KIP-31
discuss thread, not so much in the KIP hangout. There is considerable
overlap in discussions between KIP-3[1,2,3] so it makes sense to
cross-reference all of these.

I'm finding the Apache list archive a little cumbersome to use (e.g., the
current link in KIP-31 points to the beginning of September archives) but
the emails discussing migration were in October:
http://mail-archives.apache.org/mod_mbox/kafka-dev/201510.mbox/thread

Markmail has a better interface but interestingly it has not indexed any of
the emails from August, September and early October (
http://markmail.org/search/?q=list%3Aorg.apache.incubator.kafka-dev+date%3A201509-201511+order%3Adate-backward).
Perhaps KIPs should include a permalink to the first message of the DISCUSS
thread. E.g.,
http://mail-archives.apache.org/mod_mbox/kafka-dev/201509.mbox/%3CCAHrRUm5jvL_dPeZWnfBD-vONgSZWOq1VL1Ss8OSUOCPXmtg8rQ%40mail.gmail.com%3E

Also, just to clarify Jay's comments on the content of KIPs: I think having
a pseudo-code spec/implementation guide is useful (especially for
client-side KIPs). While the motivation should definitely capture “why we
are doing the KIP” it probably shouldn’t have to exhaustively capture “why
we are doing the KIP *this way*”. i.e., some of the discussions are
extremely nuanced and in this case spans multiple KIPs so links to other
KIPs and the discuss threads and KIP hangout recordings are perhaps
sufficient to fill this gap - or maybe a new section that summarizes the
discussions.

Thanks,

Joel

On Wed, Jan 6, 2016 at 9:29 AM, Jun Rao  wrote:

> Hi, Jiangjie,
>
> 52. Replacing MessageSet with o.a.k.common.record will be ideal.
> Unfortunately, we use MessageSet in SimpleConsumer, which is part of the
> public api. Replacing MessageSet with o.a.k.common.record will be an
> incompatible api change. So, we probably should do this after we deprecate
> SimpleConsumer.
>
> My original question is actually whether we just bump up magic byte in
> Message once to incorporate both the offset and the timestamp change. It
> seems that the answer is yes. Could you reflect that in the KIP?
>
> Thanks,
>
> Jun
>
>
> On Wed, Jan 6, 2016 at 7:01 AM, Becket Qin  wrote:
>
> > Thanks a lot for the careful reading, Jun.
> > Please see inline replies.
> >
> >
> > > On Jan 6, 2016, at 3:24 AM, Jun Rao  wrote:
> > >
> > > Jiangjie,
> > >
> > > Thanks for the updated KIP. Overall, a +1 on the proposal. A few minor
> > > comments on the KIP.
> > >
> > > KIP-32:
> > > 50. 6.c says "The log rolling has to depend on the earliest timestamp",
> > > which is inconsistent with KIP-33.
> > Corrected.
> > >
> > > 51. 8.b "If the time difference threshold is set to 0. The timestamp in
> > the
> > > message is equivalent to LogAppendTime." If the time difference is 0
> and
> > > CreateTime is used, all messages will likely be rejected in this
> > proposal.
> > > So, it's not equivalent to LogAppendTime.
> > Corrected.
> > >
> > > 52. Could you include the new value of magic byte in message format
> > change?
> > > Also, do we have a single new message format that includes both the
> > offset
> > > change (relative offset for inner messages) and the addition of
> > timestamp?
> > I am actually thinking about this when I am writing the patch.
> > The timestamp will be added to the o.a.k.common.record.Record and
> > Kafka.message.Message. The offset change is in
> > o.a.k.common.record.MemoryRecords and Kafka.message.MessageSet. To avoid
> > unnecessary changes, my current patch did not merge them together but
> > simply make sure the version of  Record(Message) and
> > MemoryRecords(MessageSet) matches.
> >
> > Currently new clients use classes in o.a.k.common.record, and the broker
> > and old clients use classes in kafka.message.
> > I am thinking about doing the followings:
> > 1. Migrate broker to use o.a.k.common.record.
> > 2. Add message format V0 and V1 to o.a.k.common.protocol.Protocols.
> > Ideally we should be able to define all the wire protocols in
> > o.a.k.common.protocol.Protocol. So instead of having Record class to
> parse
> > byte arrays by itself, we can use Schema to parse the records.
> >
> > Would that be better?
> > >
> > > 53. Could you document the changes in ProducerRequest V2 and
> FetchRequest
> > > V2 (and the responses)?
> > Done.
> > >
> > > 54. In migration phase 1, step 2, does internal ApiVersion mean
> > > inter.broker.protocol.version?
> > Yes.
> > >
> > > 55. In canary step 2.b, it says "It will only see
> > > ProduceRequest/FetchRequest V1 from other brokers and clietns.". But in
> > > phase 2, a broker will receive FetchRequest V2 from other brokers.
> > I meant when we canary a broker in phase 2, there will be only one broker
> > entering phase 2, the other brokers will remain at phase 1.
> > >
> > >
> > > KIP-33:
> > > 60. The KIP still says maintaining index at "at 

[jira] [Created] (KAFKA-3084) Topic existence checks in topic commands (create, alter, delete)

2016-01-08 Thread Grant Henke (JIRA)
Grant Henke created KAFKA-3084:
--

 Summary: Topic existence checks in topic commands (create, alter, 
delete)
 Key: KAFKA-3084
 URL: https://issues.apache.org/jira/browse/KAFKA-3084
 Project: Kafka
  Issue Type: Improvement
Reporter: Grant Henke
Assignee: Grant Henke


In Kafka 0.9.0 error codes were added to the topic commands. However, often 
users only want to perform an action based on the existence of a topic, and 
they don't want an error if the topic does or does not exist.

Adding an --if-exists option for the topic delete and alter commands and an 
--if-not-exists option for the create command allows users to build scripts 
that can handle this expected state without error codes. Example invocations 
are sketched below.
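
For illustration (flag names as in the proposed patch; subject to review):

bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic my-topic \
    --partitions 1 --replication-factor 1 --if-not-exists
bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic my-topic --if-exists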



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3084: Topic existence checks in topic co...

2016-01-08 Thread granthenke
GitHub user granthenke opened a pull request:

https://github.com/apache/kafka/pull/744

KAFKA-3084: Topic existence checks in topic commands (create, alter, …

…delete)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/granthenke/kafka exists-checks

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/744.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #744


commit d85b01e84cf3c5aadd0eaeba696e309873ab5bb7
Author: Grant Henke 
Date:   2016-01-08T20:21:12Z

KAFKA-3084: Topic existence checks in topic commands (create, alter, delete)




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3084) Topic existence checks in topic commands (create, alter, delete)

2016-01-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089874#comment-15089874
 ] 

ASF GitHub Bot commented on KAFKA-3084:
---

GitHub user granthenke opened a pull request:

https://github.com/apache/kafka/pull/744

KAFKA-3084: Topic existence checks in topic commands (create, alter, …

…delete)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/granthenke/kafka exists-checks

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/744.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #744


commit d85b01e84cf3c5aadd0eaeba696e309873ab5bb7
Author: Grant Henke 
Date:   2016-01-08T20:21:12Z

KAFKA-3084: Topic existence checks in topic commands (create, alter, delete)




> Topic existence checks in topic commands (create, alter, delete)
> 
>
> Key: KAFKA-3084
> URL: https://issues.apache.org/jira/browse/KAFKA-3084
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Grant Henke
>Assignee: Grant Henke
>
> In Kafka 0.9.0 error codes were added to the topic commands. However, often 
> users only want to perform an action based on the existence of a topic, and 
> they don't want an error if the topic does or does not exist.
> Adding an --if-exists option for the topic delete and alter commands and an 
> --if-not-exists option for the create command allows users to build scripts 
> that can handle this expected state without error codes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3084) Topic existence checks in topic commands (create, alter, delete)

2016-01-08 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KAFKA-3084:
---
Status: Patch Available  (was: Open)

> Topic existence checks in topic commands (create, alter, delete)
> 
>
> Key: KAFKA-3084
> URL: https://issues.apache.org/jira/browse/KAFKA-3084
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Grant Henke
>Assignee: Grant Henke
>
> In Kafka 0.9.0 error codes were added to the topic commands. However, often 
> users only want to perform an action based on the existence of a topic, and 
> they don't want an error if the topic does or does not exist.
> Adding an --if-exists option for the topic delete and alter commands and an 
> --if-not-exists option for the create command allows users to build scripts 
> that can handle this expected state without error codes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] KIP-32 Add CreateTime and LogAppendTime to Kafka message.

2016-01-08 Thread Anna Povzner
Hi Becket and everyone,

Could we please add the following functionality to this KIP? I think it
would be very useful for the broker to return the timestamp in the ack to
the producer (in the response: a timestamp per partition) and propagate it
back to the client in RecordMetadata. This way, if the timestamp type is
LogAppendTime, the producer client will see what timestamp was actually
set -- and it would match the timestamp that the consumer sees. Returning
the timestamp in RecordMetadata is also useful for timestamp type =
CreateTime, since the timestamp could also be set in KafkaProducer (if the
client set the timestamp in ProducerRecord to 0).

Since this requires a protocol change as well, it will be better to
implement this as part of KIP-32, rather than proposing a new KIP.
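
For illustration, a producer callback under this proposal might look like the 
sketch below, where RecordMetadata.timestamp() is the hypothetical accessor 
being proposed here (everything else is the existing 0.9 producer API):

import java.util.Properties;
import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class TimestampAckSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.send(new ProducerRecord<>("my-topic", "key", "value"), new Callback() {
            @Override
            public void onCompletion(RecordMetadata metadata, Exception exception) {
                if (exception == null)
                    // With timestamp type = LogAppendTime this would match
                    // the timestamp that consumers later see.
                    System.out.println("acked with timestamp " + metadata.timestamp());
            }
        });
        producer.close();
    }
}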

Thanks,
Anna


On Fri, Jan 8, 2016 at 12:53 PM, Joel Koshy  wrote:

> +1 from me
>
> Looking through this thread it seems there was some confusion on the
> migration discussion. This discussion in fact happened in the KIP-31
> discuss thread, not so much in the KIP hangout. There is considerable
> overlap in discussions between KIP-3[1,2,3] so it makes sense to
> cross-reference all of these.
>
> I'm finding the Apache list archive a little cumbersome to use (e.g., the
> current link in KIP-31 points to the beginning of September archives) but
> the emails discussing migration were in October:
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201510.mbox/thread
>
> Markmail has a better interface but interestingly it has not indexed any of
> the emails from August, September and early October (
>
> http://markmail.org/search/?q=list%3Aorg.apache.incubator.kafka-dev+date%3A201509-201511+order%3Adate-backward
> ).
> Perhaps KIPs should include a permalink to the first message of the DISCUSS
> thread. E.g.,
>
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201509.mbox/%3CCAHrRUm5jvL_dPeZWnfBD-vONgSZWOq1VL1Ss8OSUOCPXmtg8rQ%40mail.gmail.com%3E
>
> Also, just to clarify Jay's comments on the content of KIPs: I think having
> a pseudo-code spec/implementation guide is useful (especially for
> client-side KIPs). While the motivation should definitely capture “why we
> are doing the KIP” it probably shouldn’t have to exhaustively capture “why
> we are doing the KIP *this way*”. i.e., some of the discussions are
> extremely nuanced and in this case spans multiple KIPs so links to other
> KIPs and the discuss threads and KIP hangout recordings are perhaps
> sufficient to fill this gap - or maybe a new section that summarizes the
> discussions.
>
> Thanks,
>
> Joel
>
> On Wed, Jan 6, 2016 at 9:29 AM, Jun Rao  wrote:
>
> > Hi, Jiangjie,
> >
> > 52. Replacing MessageSet with o.a.k.common.record will be ideal.
> > Unfortunately, we use MessageSet in SimpleConsumer, which is part of the
> > public api. Replacing MessageSet with o.a.k.common.record will be an
> > incompatible api change. So, we probably should do this after we
> deprecate
> > SimpleConsumer.
> >
> > My original question is actually whether we just bump up magic byte in
> > Message once to incorporate both the offset and the timestamp change. It
> > seems that the answer is yes. Could you reflect that in the KIP?
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Wed, Jan 6, 2016 at 7:01 AM, Becket Qin  wrote:
> >
> > > Thanks a lot for the careful reading, Jun.
> > > Please see inline replies.
> > >
> > >
> > > > On Jan 6, 2016, at 3:24 AM, Jun Rao  wrote:
> > > >
> > > > Jiangjie,
> > > >
> > > > Thanks for the updated KIP. Overall, a +1 on the proposal. A few
> minor
> > > > comments on the KIP.
> > > >
> > > > KIP-32:
> > > > 50. 6.c says "The log rolling has to depend on the earliest
> timestamp",
> > > > which is inconsistent with KIP-33.
> > > Corrected.
> > > >
> > > > 51. 8.b "If the time difference threshold is set to 0. The timestamp
> in
> > > the
> > > > message is equivalent to LogAppendTime." If the time difference is 0
> > and
> > > > CreateTime is used, all messages will likely be rejected in this
> > > proposal.
> > > > So, it's not equivalent to LogAppendTime.
> > > Corrected.
> > > >
> > > > 52. Could you include the new value of magic byte in message format
> > > change?
> > > > Also, do we have a single new message format that includes both the
> > > offset
> > > > change (relative offset for inner messages) and the addition of
> > > timestamp?
> > > I am actually thinking about this when I am writing the patch.
> > > The timestamp will be added to the o.a.k.common.record.Record and
> > > Kafka.message.Message. The offset change is in
> > > o.a.k.common.record.MemoryRecords and Kafka.message.MessageSet. To
> avoid
> > > unnecessary changes, my current patch did not merge them together but
> > > simply make sure the version of  Record(Message) and
> > > MemoryRecords(MessageSet) matches.
> > >
> > > Currently new clients use classes in o.a.k.common.record, and the
> broker
> > > and old clients use classes in kafka.message.
> > > I am thinking about doing the followings:
> > > 1. 

Jenkins build is back to normal : kafka-trunk-jdk7 #947

2016-01-08 Thread Apache Jenkins Server
See 



[jira] [Commented] (KAFKA-2985) Consumer group stuck in rebalancing state

2016-01-08 Thread Cosmin Marginean (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089962#comment-15089962
 ] 

Cosmin Marginean commented on KAFKA-2985:
-

I can confirm that I can also reproduce this with a very similar setup.
This is not the case, though, when using the old (pre-0.9.0.0) consumer on a 
0.9.0.0 server.

> Consumer group stuck in rebalancing state
> -
>
> Key: KAFKA-2985
> URL: https://issues.apache.org/jira/browse/KAFKA-2985
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.9.0.0
> Environment: Kafka 0.9.0.0.
> Kafka Java consumer 0.9.0.0
> 2 Java producers.
> 3 Java consumers using the new consumer API.
> 2 kafka brokers.
>Reporter: Jens Rantil
>Assignee: Jason Gustafson
>
> We've been doing some load testing on Kafka. _After_ the load test, when our 
> consumers were done, we have two times now seen Kafka become stuck in 
> consumer group rebalancing. This is after all our consumers are done 
> consuming and essentially polling periodically without getting any records.
> The brokers list the consumer group (named "default"), but I can't query the 
> offsets:
> {noformat}
> jrantil@queue-0:/srv/kafka/kafka$ ./bin/kafka-consumer-groups.sh 
> --new-consumer --bootstrap-server localhost:9092 --list
> default
> jrantil@queue-0:/srv/kafka/kafka$ ./bin/kafka-consumer-groups.sh 
> --new-consumer --bootstrap-server localhost:9092 --describe --group 
> default|sort
> Consumer group `default` does not exist or is rebalancing.
> {noformat}
> Retrying to query the offsets for 15 minutes or so still said it was 
> rebalancing. After restarting our first broker, the group immediately started 
> rebalancing. That broker was logging this before restart:
> {noformat}
> [2015-12-12 13:09:48,517] INFO [Group Metadata Manager on Broker 0]: Removed 
> 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
> [2015-12-12 13:10:16,139] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 16 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:10:16,141] INFO [GroupCoordinator 0]: Assignment received from 
> leader for group default for generation 16 
> (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:10:16,575] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group default with old generation 16 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:11:15,141] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 17 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:11:15,143] INFO [GroupCoordinator 0]: Assignment received from 
> leader for group default for generation 17 
> (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:11:15,314] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group default with old generation 17 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:12:14,144] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 18 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:12:14,145] INFO [GroupCoordinator 0]: Assignment received from 
> leader for group default for generation 18 
> (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:12:14,340] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group default with old generation 18 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:13:13,146] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 19 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:13:13,148] INFO [GroupCoordinator 0]: Assignment received from 
> leader for group default for generation 19 
> (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:13:13,238] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group default with old generation 19 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:14:12,148] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 20 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:14:12,149] INFO [GroupCoordinator 0]: Assignment received from 
> leader for group default for generation 20 
> (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:14:12,360] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group default with old generation 20 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:15:11,150] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 21 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:15:11,152] INFO [GroupCoordinator 0]: Assignment received from 
> leader for group default for generation 21 
> (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:15:11,217] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group default with old generation 21 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:16:10,152] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 22 (kafka.coordinator.GroupCoordinator)
> [2015-12-1

Re: [VOTE] KIP-32 Add CreateTime and LogAppendTime to Kafka message.

2016-01-08 Thread Joel Koshy
Hi Anna,

That sounds good to me - Becket/others any thoughts?

Thanks,

Joel

On Fri, Jan 8, 2016 at 12:41 PM, Anna Povzner  wrote:

> Hi Becket and everyone,
>
> Could we please add the following functionality to this KIP. I think it
> would be very useful for the broker to return the timestamp in the ack to
> the producer (in response: timestamp per partition) and propagate it back
> to client in RecordMetadata. This way, if timestamp type is LogAppendTime,
> the producer client will see what timestamp was actually set -- and it
> would match the timestamp that consumer sees. Also, returning the timestamp
> in RecordMetadata is also useful for timestamp type = CreateTime, since
> timestamp could be also set in KafkaProducer (if client set timestamp in
> ProducerRecord to 0).
>
> Since this requires protocol change as well, it will be better to implement
> this as part of KIP-32, rather than proposing a new KIP.
>
> Thanks,
> Anna
>
>
> On Fri, Jan 8, 2016 at 12:53 PM, Joel Koshy  wrote:
>
> > +1 from me
> >
> > Looking through this thread it seems there was some confusion on the
> > migration discussion. This discussion in fact happened in the KIP-31
> > discuss thread, not so much in the KIP hangout. There is considerable
> > overlap in discussions between KIP-3[1,2,3] so it makes sense to
> > cross-reference all of these.
> >
> > I'm finding the Apache list archive a little cumbersome to use (e.g., the
> > current link in KIP-31 points to the beginning of September archives) but
> > the emails discussing migration were in October:
> > http://mail-archives.apache.org/mod_mbox/kafka-dev/201510.mbox/thread
> >
> > Markmail has a better interface but interestingly it has not indexed any
> of
> > the emails from August, September and early October (
> >
> >
> http://markmail.org/search/?q=list%3Aorg.apache.incubator.kafka-dev+date%3A201509-201511+order%3Adate-backward
> > ).
> > Perhaps KIPs should include a permalink to the first message of the
> DISCUSS
> > thread. E.g.,
> >
> >
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201509.mbox/%3CCAHrRUm5jvL_dPeZWnfBD-vONgSZWOq1VL1Ss8OSUOCPXmtg8rQ%40mail.gmail.com%3E
> >
> > Also, just to clarify Jay's comments on the content of KIPs: I think
> having
> > a pseudo-code spec/implementation guide is useful (especially for
> > client-side KIPs). While the motivation should definitely capture “why we
> > are doing the KIP” it probably shouldn’t have to exhaustively capture
> “why
> > we are doing the KIP *this way*”. i.e., some of the discussions are
> > extremely nuanced and in this case spans multiple KIPs so links to other
> > KIPs and the discuss threads and KIP hangout recordings are perhaps
> > sufficient to fill this gap - or maybe a new section that summarizes the
> > discussions.
> >
> > Thanks,
> >
> > Joel
> >
> > On Wed, Jan 6, 2016 at 9:29 AM, Jun Rao  wrote:
> >
> > > Hi, Jiangjie,
> > >
> > > 52. Replacing MessageSet with o.a.k.common.record will be ideal.
> > > Unfortunately, we use MessageSet in SimpleConsumer, which is part of
> the
> > > public api. Replacing MessageSet with o.a.k.common.record will be an
> > > incompatible api change. So, we probably should do this after we
> > deprecate
> > > SimpleConsumer.
> > >
> > > My original question is actually whether we just bump up magic byte in
> > > Message once to incorporate both the offset and the timestamp change.
> It
> > > seems that the answer is yes. Could you reflect that in the KIP?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > >
> > > On Wed, Jan 6, 2016 at 7:01 AM, Becket Qin 
> wrote:
> > >
> > > > Thanks a lot for the careful reading, Jun.
> > > > Please see inline replies.
> > > >
> > > >
> > > > > On Jan 6, 2016, at 3:24 AM, Jun Rao  wrote:
> > > > >
> > > > > Jiangjie,
> > > > >
> > > > > Thanks for the updated KIP. Overall, a +1 on the proposal. A few
> > minor
> > > > > comments on the KIP.
> > > > >
> > > > > KIP-32:
> > > > > 50. 6.c says "The log rolling has to depend on the earliest
> > timestamp",
> > > > > which is inconsistent with KIP-33.
> > > > Corrected.
> > > > >
> > > > > 51. 8.b "If the time difference threshold is set to 0. The
> timestamp
> > in
> > > > the
> > > > > message is equivalent to LogAppendTime." If the time difference is
> 0
> > > and
> > > > > CreateTime is used, all messages will likely be rejected in this
> > > > proposal.
> > > > > So, it's not equivalent to LogAppendTime.
> > > > Corrected.
> > > > >
> > > > > 52. Could you include the new value of magic byte in message format
> > > > change?
> > > > > Also, do we have a single new message format that includes both the
> > > > offset
> > > > > change (relative offset for inner messages) and the addition of
> > > > timestamp?
> > > > I am actually thinking about this when I am writing the patch.
> > > > The timestamp will be added to the o.a.k.common.record.Record and
> > > > Kafka.message.Message. The offset change is in
> > > > o.a.k.common.record.MemoryRecords and Kafka.message.MessageSet.

[jira] [Created] (KAFKA-3085) BrokerChangeListener computes inconsistent live/dead broker list

2016-01-08 Thread Jun Rao (JIRA)
Jun Rao created KAFKA-3085:
--

 Summary: BrokerChangeListener computes inconsistent live/dead 
broker list
 Key: KAFKA-3085
 URL: https://issues.apache.org/jira/browse/KAFKA-3085
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.9.0.0
Reporter: Jun Rao


On a broker change ZK event, BrokerChangeListener gets the current broker list 
from ZK. It then computes a new broker list, a dead broker list, and a live 
broker list with more detailed broker info. The new and live broker lists are 
computed by reading the value associated with each of the current brokers 
twice. If a broker is de-registered in between, these two lists will not be 
consistent.
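
A minimal sketch of the consistent alternative (illustrative names only, not 
the actual controller code): read each broker's registration exactly once and 
derive every list from that single snapshot.

import java.util.*;

public class BrokerSnapshotSketch {
    // Stand-in for the ZK read; returns null if the broker de-registered
    // between the child listing and this read.
    static String readBrokerInfo(int id) { return id % 2 == 0 ? "broker-" + id : null; }

    public static void main(String[] args) {
        List<Integer> currentIds = Arrays.asList(1, 2, 3, 4);
        Map<Integer, String> snapshot = new LinkedHashMap<>();
        for (int id : currentIds) {
            String info = readBrokerInfo(id);   // single read per broker
            if (info != null)
                snapshot.put(id, info);
        }
        // Both lists come from the same snapshot, so they cannot disagree.
        Set<Integer> liveIds = snapshot.keySet();
        Collection<String> liveBrokers = snapshot.values();
        System.out.println(liveIds + " / " + liveBrokers);
    }
}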



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: MINOR: Security doc fixes

2016-01-08 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/745

MINOR: Security doc fixes

Simple fixes that have tripped users.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka security-doc-improvements

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/745.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #745


commit 0417c5fdcfbb2046a1dcde3cea2607462207d649
Author: Ismael Juma 
Date:   2016-01-08T21:31:35Z

Minor security doc fixes




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2985) Consumer group stuck in rebalancing state

2016-01-08 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089983#comment-15089983
 ] 

Jason Gustafson commented on KAFKA-2985:


[~cosmin.marginean] Have you tested with the 0.9.0 branch?

> Consumer group stuck in rebalancing state
> -
>
> Key: KAFKA-2985
> URL: https://issues.apache.org/jira/browse/KAFKA-2985
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.9.0.0
> Environment: Kafka 0.9.0.0.
> Kafka Java consumer 0.9.0.0
> 2 Java producers.
> 3 Java consumers using the new consumer API.
> 2 kafka brokers.
>Reporter: Jens Rantil
>Assignee: Jason Gustafson
>
> We've been doing some load testing on Kafka. _After_ the load test, when our 
> consumers were done, we have two times now seen Kafka become stuck in 
> consumer group rebalancing. This is after all our consumers are done 
> consuming and essentially polling periodically without getting any records.
> The brokers list the consumer group (named "default"), but I can't query the 
> offsets:
> {noformat}
> jrantil@queue-0:/srv/kafka/kafka$ ./bin/kafka-consumer-groups.sh 
> --new-consumer --bootstrap-server localhost:9092 --list
> default
> jrantil@queue-0:/srv/kafka/kafka$ ./bin/kafka-consumer-groups.sh 
> --new-consumer --bootstrap-server localhost:9092 --describe --group 
> default|sort
> Consumer group `default` does not exist or is rebalancing.
> {noformat}
> Retrying to query the offsets for 15 minutes or so still said it was 
> rebalancing. After restarting our first broker, the group immediately started 
> rebalancing. That broker was logging this before restart:
> {noformat}
> [2015-12-12 13:09:48,517] INFO [Group Metadata Manager on Broker 0]: Removed 
> 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
> [2015-12-12 13:10:16,139] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 16 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:10:16,141] INFO [GroupCoordinator 0]: Assignment received from 
> leader for group default for generation 16 
> (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:10:16,575] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group default with old generation 16 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:11:15,141] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 17 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:11:15,143] INFO [GroupCoordinator 0]: Assignment received from 
> leader for group default for generation 17 
> (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:11:15,314] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group default with old generation 17 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:12:14,144] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 18 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:12:14,145] INFO [GroupCoordinator 0]: Assignment received from 
> leader for group default for generation 18 
> (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:12:14,340] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group default with old generation 18 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:13:13,146] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 19 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:13:13,148] INFO [GroupCoordinator 0]: Assignment received from 
> leader for group default for generation 19 
> (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:13:13,238] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group default with old generation 19 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:14:12,148] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 20 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:14:12,149] INFO [GroupCoordinator 0]: Assignment received from 
> leader for group default for generation 20 
> (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:14:12,360] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group default with old generation 20 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:15:11,150] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 21 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:15:11,152] INFO [GroupCoordinator 0]: Assignment received from 
> leader for group default for generation 21 
> (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:15:11,217] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group default with old generation 21 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:16:10,152] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 22 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:16:10,154] INFO [GroupCoordinator 0]: Assignment received from 
> leader for group default for generat

[jira] [Assigned] (KAFKA-3076) BrokerChangeListener should log the brokers in order

2016-01-08 Thread Konrad Kalita (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konrad Kalita reassigned KAFKA-3076:


Assignee: Konrad Kalita

> BrokerChangeListener should log the brokers in order
> 
>
> Key: KAFKA-3076
> URL: https://issues.apache.org/jira/browse/KAFKA-3076
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Konrad Kalita
>  Labels: newbie
>
> Currently, in BrokerChangeListener, we log the full, new and deleted broker 
> set in random order. It would be better if we log them in sorted order.
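
For example, a sketch of the idea (not the actual patch):

import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

public class SortedBrokerLogging {
    public static void main(String[] args) {
        // A TreeSet iterates in ascending order, so the logged set is sorted.
        Set<Integer> newBrokers = new TreeSet<>(Arrays.asList(7, 2, 5));
        System.out.println("Newly added brokers: " + newBrokers); // [2, 5, 7]
    }
}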



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2985) Consumer group stuck in rebalancing state

2016-01-08 Thread Cosmin Marginean (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089993#comment-15089993
 ] 

Cosmin Marginean commented on KAFKA-2985:
-

I haven't yet, no, as I'm struggling to build a complete server distro on that 
(and reverting now to the old consumer to solve the immediate issue was a 
priority).


> Consumer group stuck in rebalancing state
> -
>
> Key: KAFKA-2985
> URL: https://issues.apache.org/jira/browse/KAFKA-2985
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.9.0.0
> Environment: Kafka 0.9.0.0.
> Kafka Java consumer 0.9.0.0
> 2 Java producers.
> 3 Java consumers using the new consumer API.
> 2 kafka brokers.
>Reporter: Jens Rantil
>Assignee: Jason Gustafson
>
> We've been doing some load testing on Kafka. _After_ the load test, when our 
> consumers were done, we have two times now seen Kafka become stuck in 
> consumer group rebalancing. This is after all our consumers are done 
> consuming and essentially polling periodically without getting any records.
> The brokers list the consumer group (named "default"), but I can't query the 
> offsets:
> {noformat}
> jrantil@queue-0:/srv/kafka/kafka$ ./bin/kafka-consumer-groups.sh 
> --new-consumer --bootstrap-server localhost:9092 --list
> default
> jrantil@queue-0:/srv/kafka/kafka$ ./bin/kafka-consumer-groups.sh 
> --new-consumer --bootstrap-server localhost:9092 --describe --group 
> default|sort
> Consumer group `default` does not exist or is rebalancing.
> {noformat}
> Retrying to query the offsets for 15 minutes or so still said it was 
> rebalancing. After restarting our first broker, the group immediately started 
> rebalancing. That broker was logging this before restart:
> {noformat}
> [2015-12-12 13:09:48,517] INFO [Group Metadata Manager on Broker 0]: Removed 
> 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
> [2015-12-12 13:10:16,139] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 16 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:10:16,141] INFO [GroupCoordinator 0]: Assignment received from 
> leader for group default for generation 16 
> (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:10:16,575] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group default with old generation 16 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:11:15,141] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 17 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:11:15,143] INFO [GroupCoordinator 0]: Assignment received from 
> leader for group default for generation 17 
> (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:11:15,314] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group default with old generation 17 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:12:14,144] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 18 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:12:14,145] INFO [GroupCoordinator 0]: Assignment received from 
> leader for group default for generation 18 
> (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:12:14,340] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group default with old generation 18 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:13:13,146] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 19 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:13:13,148] INFO [GroupCoordinator 0]: Assignment received from 
> leader for group default for generation 19 
> (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:13:13,238] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group default with old generation 19 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:14:12,148] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 20 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:14:12,149] INFO [GroupCoordinator 0]: Assignment received from 
> leader for group default for generation 20 
> (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:14:12,360] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group default with old generation 20 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:15:11,150] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 21 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:15:11,152] INFO [GroupCoordinator 0]: Assignment received from 
> leader for group default for generation 21 
> (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:15:11,217] INFO [GroupCoordinator 0]: Preparing to restabilize 
> group default with old generation 21 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 13:16:10,152] INFO [GroupCoordinator 0]: Stabilized group default 
> generation 22 (kafka.coordinator.GroupCoordinator)
> [2015-12-12 1

[jira] [Created] (KAFKA-3086) unused handle method in MirrorMakerMessageHandler

2016-01-08 Thread Jun Rao (JIRA)
Jun Rao created KAFKA-3086:
--

 Summary: unused handle method in MirrorMakerMessageHandler
 Key: KAFKA-3086
 URL: https://issues.apache.org/jira/browse/KAFKA-3086
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 0.9.0.0
Reporter: Jun Rao


The following method is never used by MirrorMaker.

  trait MirrorMakerMessageHandler {
    def handle(record: MessageAndMetadata[Array[Byte], Array[Byte]]): util.List[ProducerRecord[Array[Byte], Array[Byte]]]
  }




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: MINOR: speed up connect startup when full conn...

2016-01-08 Thread hachikuji
GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka/pull/746

MINOR: speed up connect startup when full connector class name is provided



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka worker-startup-improvement

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/746.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #746


commit 1bd4182d5690c47df8878c3a930af6a542a6a77a
Author: Jason Gustafson 
Date:   2016-01-08T22:00:33Z

MINOR: speed up connect startup when full connector class name is provided




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [VOTE] KIP-32 Add CreateTime and LogAppendTime to Kafka message.

2016-01-08 Thread Gwen Shapira
Sounds good to me too. Seems pretty easy to add and can be useful for
producers.

On Fri, Jan 8, 2016 at 1:22 PM, Joel Koshy  wrote:

> Hi Anna,
>
> That sounds good to me - Becket/others any thoughts?
>
> Thanks,
>
> Joel
>
> On Fri, Jan 8, 2016 at 12:41 PM, Anna Povzner  wrote:
>
> > Hi Becket and everyone,
> >
> > Could we please add the following functionality to this KIP. I think it
> > would be very useful for the broker to return the timestamp in the ack to
> > the producer (in response: timestamp per partition) and propagate it back
> > to client in RecordMetadata. This way, if timestamp type is
> LogAppendTime,
> > the producer client will see what timestamp was actually set -- and it
> > would match the timestamp that consumer sees. Also, returning the
> timestamp
> > in RecordMetadata is also useful for timestamp type = CreateTime, since
> > timestamp could be also set in KafkaProducer (if client set timestamp in
> > ProducerRecord to 0).
> >
> > Since this requires protocol change as well, it will be better to
> implement
> > this as part of KIP-32, rather than proposing a new KIP.
> >
> > Thanks,
> > Anna
> >
> >
> > On Fri, Jan 8, 2016 at 12:53 PM, Joel Koshy  wrote:
> >
> > > +1 from me
> > >
> > > Looking through this thread it seems there was some confusion on the
> > > migration discussion. This discussion in fact happened in the KIP-31
> > > discuss thread, not so much in the KIP hangout. There is considerable
> > > overlap in discussions between KIP-3[1,2,3] so it makes sense to
> > > cross-reference all of these.
> > >
> > > I'm finding the Apache list archive a little cumbersome to use (e.g.,
> the
> > > current link in KIP-31 points to the beginning of September archives)
> but
> > > the emails discussing migration were in October:
> > > http://mail-archives.apache.org/mod_mbox/kafka-dev/201510.mbox/thread
> > >
> > > Markmail has a better interface but interestingly it has not indexed
> any
> > of
> > > the emails from August, September and early October (
> > >
> > >
> >
> http://markmail.org/search/?q=list%3Aorg.apache.incubator.kafka-dev+date%3A201509-201511+order%3Adate-backward
> > > ).
> > > Perhaps KIPs should include a permalink to the first message of the
> > DISCUSS
> > > thread. E.g.,
> > >
> > >
> >
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201509.mbox/%3CCAHrRUm5jvL_dPeZWnfBD-vONgSZWOq1VL1Ss8OSUOCPXmtg8rQ%40mail.gmail.com%3E
> > >
> > > Also, just to clarify Jay's comments on the content of KIPs: I think
> > having
> > > a pseudo-code spec/implementation guide is useful (especially for
> > > client-side KIPs). While the motivation should definitely capture “why
> we
> > > are doing the KIP” it probably shouldn’t have to exhaustively capture
> > “why
> > > we are doing the KIP *this way*”. i.e., some of the discussions are
> > > extremely nuanced and in this case spans multiple KIPs so links to
> other
> > > KIPs and the discuss threads and KIP hangout recordings are perhaps
> > > sufficient to fill this gap - or maybe a new section that summarizes
> the
> > > discussions.
> > >
> > > Thanks,
> > >
> > > Joel
> > >
> > > On Wed, Jan 6, 2016 at 9:29 AM, Jun Rao  wrote:
> > >
> > > > Hi, Jiangjie,
> > > >
> > > > 52. Replacing MessageSet with o.a.k.common.record will be ideal.
> > > > Unfortunately, we use MessageSet in SimpleConsumer, which is part of
> > the
> > > > public api. Replacing MessageSet with o.a.k.common.record will be an
> > > > incompatible api change. So, we probably should do this after we
> > > deprecate
> > > > SimpleConsumer.
> > > >
> > > > My original question is actually whether we just bump up magic byte
> in
> > > > Message once to incorporate both the offset and the timestamp change.
> > It
> > > > seems that the answer is yes. Could you reflect that in the KIP?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > >
> > > > On Wed, Jan 6, 2016 at 7:01 AM, Becket Qin 
> > wrote:
> > > >
> > > > > Thanks a lot for the careful reading, Jun.
> > > > > Please see inline replies.
> > > > >
> > > > >
> > > > > > On Jan 6, 2016, at 3:24 AM, Jun Rao  wrote:
> > > > > >
> > > > > > Jiangjie,
> > > > > >
> > > > > > Thanks for the updated KIP. Overall, a +1 on the proposal. A few
> > > minor
> > > > > > comments on the KIP.
> > > > > >
> > > > > > KIP-32:
> > > > > > 50. 6.c says "The log rolling has to depend on the earliest
> > > timestamp",
> > > > > > which is inconsistent with KIP-33.
> > > > > Corrected.
> > > > > >
> > > > > > 51. 8.b "If the time difference threshold is set to 0. The
> > timestamp
> > > in
> > > > > the
> > > > > > message is equivalent to LogAppendTime." If the time difference
> is
> > 0
> > > > and
> > > > > > CreateTime is used, all messages will likely be rejected in this
> > > > > proposal.
> > > > > > So, it's not equivalent to LogAppendTime.
> > > > > Corrected.
> > > > > >
> > > > > > 52. Could you include the new value of magic byte in message
> format
> > > > > change?
> > > > > > Also, do we have a single 

Re: Consider increasing the default reserved.broker.max.id

2016-01-08 Thread Grant Henke
I agree that many people number their brokers differently and increasing the
default will only handle a subset of those schemes. That said, I think
increasing it to some reasonable value could still reduce collisions
drastically.

I also think some longer-term fix that avoids collisions altogether would be
nice, though I am not sure what that long-term solution is. We would need to
introduce a reserved range that a configured broker id is not allowed to
use. Any ideas?

I also wanted to note here that while investigating this I found some
interesting special cases/rules for the reserved.broker.max.id config.

1. Because a zookeeper sequence value is added to reserved.broker.max.id to
generate the unique ids, the value, once configured and used, *cannot be
decreased* while still guaranteeing no collisions.

2. Because generated ids live above reserved.broker.max.id, they can never be
manually set in the config. Therefore, if you need to stand up a new machine
with the same broker id (perhaps for recovery), you can't set this value in
server.properties. The workaround would be to set the value in the
meta.properties file of all the log directories, as sketched below. (note: I
haven't fully vetted this yet)
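
For reference, a sketch of that workaround, assuming the meta.properties
layout that 0.9 brokers write out; the broker id value here is a made-up
example:

# <log.dir>/meta.properties, repeated for every directory in log.dirs
version=0
broker.id=1001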




On Wed, Dec 23, 2015 at 5:25 PM, Ewen Cheslack-Postava 
wrote:

> Which other numbering schemes do we want to be able to un-break by
> increasing this default? For example, I know some people use the IP address
> with dots removed -- we'd have to use a very large # to make sure that
> worked. Before making another change, it'd be good to know what other
> schemes people are using and that we'd really be fixing the issue for
> someone.
>
> -Ewen
>
> On Fri, Dec 18, 2015 at 9:37 AM, Ismael Juma  wrote:
>
> > On Fri, Dec 18, 2015 at 4:44 PM, Grant Henke 
> wrote:
> >
> > > There is some discussion on KAFKA-1070
> > >  around the design
> > > choice
> > > and compatibility. The value 1000 was thrown out as a quick example but
> > it
> > > was never discussed beyond that. The discussion also cites a few cases
> > > where a value of 1000 would cause issues.
> > >
> >
> > Thanks for digging that up. Also worth noting that Jay said:
> >
> > "I think we can get around the problem you point out by just defaulting
> the
> > node id sequence to 1000. This could theoretically conflict but most
> people
> > number from 0 or 1 and we can discuss this in the release notes. Our plan
> > will be to release with support for both configured node ids and assigned
> > node ids for compatibility. After a couple of releases we will remove the
> > config."
> >
> > Ismael
> >
>
>
>
> --
> Thanks,
> Ewen
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: [VOTE] KIP-32 Add CreateTime and LogAppendTime to Kafka message.

2016-01-08 Thread Aditya Auradkar
Anna,

That sounds good to me as well.

Aditya

On Fri, Jan 8, 2016 at 2:11 PM, Gwen Shapira  wrote:

> Sounds good to me too. Seems pretty easy to add and can be useful for
> producers.
>
> On Fri, Jan 8, 2016 at 1:22 PM, Joel Koshy  wrote:
>
> > Hi Anna,
> >
> > That sounds good to me - Becket/others any thoughts?
> >
> > Thanks,
> >
> > Joel
> >
> > On Fri, Jan 8, 2016 at 12:41 PM, Anna Povzner  wrote:
> >
> > > Hi Becket and everyone,
> > >
> > > Could we please add the following functionality to this KIP. I think it
> > > would be very useful for the broker to return the timestamp in the ack
> to
> > > the producer (in response: timestamp per partition) and propagate it
> back
> > > to client in RecordMetadata. This way, if timestamp type is
> > LogAppendTime,
> > > the producer client will see what timestamp was actually set -- and it
> > > would match the timestamp that consumer sees. Also, returning the
> > timestamp
> > > in RecordMetadata is also useful for timestamp type = CreateTime, since
> > > timestamp could be also set in KafkaProducer (if client set timestamp
> in
> > > ProducerRecord to 0).
> > >
> > > Since this requires protocol change as well, it will be better to
> > implement
> > > this as part of KIP-32, rather than proposing a new KIP.
> > >
> > > Thanks,
> > > Anna
> > >
> > >
> > > On Fri, Jan 8, 2016 at 12:53 PM, Joel Koshy 
> wrote:
> > >
> > > > +1 from me
> > > >
> > > > Looking through this thread it seems there was some confusion on the
> > > > migration discussion. This discussion in fact happened in the KIP-31
> > > > discuss thread, not so much in the KIP hangout. There is considerable
> > > > overlap in discussions between KIP-3[1,2,3] so it makes sense to
> > > > cross-reference all of these.
> > > >
> > > > I'm finding the Apache list archive a little cumbersome to use (e.g.,
> > the
> > > > current link in KIP-31 points to the beginning of September archives)
> > but
> > > > the emails discussing migration were in October:
> > > >
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201510.mbox/thread
> > > >
> > > > Markmail has a better interface but interestingly it has not indexed
> > any
> > > of
> > > > the emails from August, September and early October (
> > > >
> > > >
> > >
> >
> http://markmail.org/search/?q=list%3Aorg.apache.incubator.kafka-dev+date%3A201509-201511+order%3Adate-backward
> > > > ).
> > > > Perhaps KIPs should include a permalink to the first message of the
> > > DISCUSS
> > > > thread. E.g.,
> > > >
> > > >
> > >
> >
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201509.mbox/%3CCAHrRUm5jvL_dPeZWnfBD-vONgSZWOq1VL1Ss8OSUOCPXmtg8rQ%40mail.gmail.com%3E
> > > >
> > > > Also, just to clarify Jay's comments on the content of KIPs: I think
> > > having
> > > > a pseudo-code spec/implementation guide is useful (especially for
> > > > client-side KIPs). While the motivation should definitely capture
> “why
> > we
> > > > are doing the KIP” it probably shouldn’t have to exhaustively capture
> > > “why
> > > > we are doing the KIP *this way*”. i.e., some of the discussions are
> > > > extremely nuanced and in this case spans multiple KIPs so links to
> > other
> > > > KIPs and the discuss threads and KIP hangout recordings are perhaps
> > > > sufficient to fill this gap - or maybe a new section that summarizes
> > the
> > > > discussions.
> > > >
> > > > Thanks,
> > > >
> > > > Joel
> > > >
> > > > On Wed, Jan 6, 2016 at 9:29 AM, Jun Rao  wrote:
> > > >
> > > > > Hi, Jiangjie,
> > > > >
> > > > > 52. Replacing MessageSet with o.a.k.common.record will be ideal.
> > > > > Unfortunately, we use MessageSet in SimpleConsumer, which is part
> of
> > > the
> > > > > public api. Replacing MessageSet with o.a.k.common.record will be
> an
> > > > > incompatible api change. So, we probably should do this after we
> > > > deprecate
> > > > > SimpleConsumer.
> > > > >
> > > > > My original question is actually whether we just bump up magic byte
> > in
> > > > > Message once to incorporate both the offset and the timestamp
> change.
> > > It
> > > > > seems that the answer is yes. Could you reflect that in the KIP?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > >
> > > > > On Wed, Jan 6, 2016 at 7:01 AM, Becket Qin 
> > > wrote:
> > > > >
> > > > > > Thanks a lot for the careful reading, Jun.
> > > > > > Please see inline replies.
> > > > > >
> > > > > >
> > > > > > > On Jan 6, 2016, at 3:24 AM, Jun Rao  wrote:
> > > > > > >
> > > > > > > Jiangjie,
> > > > > > >
> > > > > > > Thanks for the updated KIP. Overall, a +1 on the proposal. A
> few
> > > > minor
> > > > > > > comments on the KIP.
> > > > > > >
> > > > > > > KIP-32:
> > > > > > > 50. 6.c says "The log rolling has to depend on the earliest
> > > > timestamp",
> > > > > > > which is inconsistent with KIP-33.
> > > > > > Corrected.
> > > > > > >
> > > > > > > 51. 8.b "If the time difference threshold is set to 0. The
> > > timestamp
> > > > in
> > > > > > the
> > > > > > > message
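
For concreteness, a minimal sketch of what Anna's proposal would look like
from the producer side, assuming the returned timestamp is exposed via a
RecordMetadata.timestamp() accessor (the accessor name is an assumption; the
thread above only describes the protocol change):

import java.util.Properties;
import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class TimestampAckExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        Producer<String, String> producer = new KafkaProducer<>(props);
        producer.send(new ProducerRecord<>("test", "key", "value"), new Callback() {
            @Override
            public void onCompletion(RecordMetadata metadata, Exception exception) {
                if (exception != null)
                    return;
                // Hypothetical accessor per the proposal: broker-assigned for
                // LogAppendTime, producer-set for CreateTime.
                System.out.println("offset=" + metadata.offset() + " timestamp=" + metadata.timestamp());
            }
        });
        producer.close();
    }
}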

[GitHub] kafka pull request: MINOR: speed up connect startup when full conn...

2016-01-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/746


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request: MINOR: Add property to configure showing of st...

2016-01-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/739


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request: Kafka 3078

2016-01-08 Thread SinghAsDev
GitHub user SinghAsDev opened a pull request:

https://github.com/apache/kafka/pull/747

Kafka 3078



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/SinghAsDev/kafka KAFKA-3078

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/747.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #747


commit 52a9e37f7cee7b6d565dbf9da28595ce6de85c74
Author: Ashish Singh 
Date:   2016-01-07T22:32:51Z

KAFKA-3077: Enable KafkaLog4jAppender to work with SASL enabled brokers

commit ddd1e1d0644eb6dc1e6fbb834b237e01a943ed8d
Author: Ashish Singh 
Date:   2016-01-09T00:01:58Z

KAFKA-3078: Add ducktape tests for KafkaLog4jAppender producing to SASL 
enabled Kafka cluster




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request: MINOR: Security doc fixes

2016-01-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/745


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: KIP-41: KafkaConsumer Max Records

2016-01-08 Thread Aarti Gupta
Hi Jason,

+1 on the idea of adding max.poll.bytes as an optional configuration (a
default of -1 would mean that the setting does not come into play).
The pre-fetching optimization (prefetch again only those partitions with no
retained data) seems slightly better, and matches what we have in production
today, at preventing a massive build-up of prefetched messages in memory in
the interim before KAFKA-2045 lands; a sketch of both variants follows below.
Some perf testing with variable message sizes and JVM profiling of both
variants of the algorithm might tell us whether it actually matters. I can
help work on these perf results with you as we get the JIRA rolled out.

thanks
aarti
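
To make the two prefetch variants concrete, a rough sketch in Java
(illustrative class and method names only, not taken from any patch):

import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.common.TopicPartition;

public class PrefetchPolicies {
    private final Map<TopicPartition, Integer> retained = new HashMap<>();
    private final int maxPollRecords;

    public PrefetchPolicies(int maxPollRecords) {
        this.maxPollRecords = maxPollRecords;
    }

    // Variant A (as first proposed): fetch all partitions whenever the total
    // number of retained records drops below max.poll.records.
    public boolean shouldFetchAll() {
        int total = 0;
        for (int count : retained.values())
            total += count;
        return total < maxPollRecords;
    }

    // Variant B (discussed above): fetch only partitions with no retained
    // data, bounding memory at #partitions * max.partition.fetch.bytes.
    public boolean shouldFetch(TopicPartition tp) {
        Integer count = retained.get(tp);
        return count == null || count == 0;
    }
}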


On Fri, Jan 8, 2016 at 11:50 AM, Jason Gustafson  wrote:

> Thanks Jens for all of your work as well! Unless there are any more
> concerns, perhaps we can open the vote early next week.
>
> As a quick summary for newcomers to this thread, the problem we're trying
> to solve in this KIP is how to give users more predictable control over the
> message processing loop. Because the new consumer is single-threaded, the
> poll() API must be called frequently enough to ensure that the consumer can
> send heartbeats before its session timeout expires. Typically we recommend
> setting the session timeout large enough to make expiration unlikely, but
> that can be difficult advice to follow in practice when either the number
> of partitions is unknown or increases over time. In some cases, such as in
> Jens' initial bug report, the processing time does not even depend directly
> on the size of the total data to be processed.
>
> To address this problem, we have proposed to offer a new configuration
> option "max.poll.records" which sets an upper bound on the number of
> records returned in a single call to poll(). The point is to give users a
> way to limit message processing time so that the session timeout can be set
> without risking unexpected rebalances. This change is backward compatible
> with the current API and users only need to change their configuration to
> take advantage of it. As a bonus, it provides an easy mechanism to
> implement commit policies which ensure commits at least as often as every N
> records.
>
> As a final subject for consideration, it may make sense to also add a
> configuration "max.poll.bytes," which places an upper bound on the total
> size of the data returned in a call to poll(). This would solve the problem
> more generally since some use cases may actually have processing time which
> is more dependent on the total size of the data than the number of records.
> Others might require a mix of the two.
>
> -Jason
>
> On Fri, Jan 8, 2016 at 9:42 AM, Jason Gustafson 
> wrote:
>
> > Hi Aarti,
> >
> > Thanks for the feedback. I think the concern about memory overhead is
> > valid. As Guozhang mentioned, the problem already exists in the current
> > consumer, so this probably deserves consideration outside of this KIP.
> That
> > said, it's a good question whether our prefetching strategy makes it more
> > difficult to control the memory overhead. The approach we've proposed for
> > prefetching is basically the following: fetch all partitions whenever the
> > number of retained messages is less than max.poll.records. In the worst
> > case, this increases the maximum memory used by the consumer by the size
> of
> > those retained messages. As you've pointed out, messages could be very
> > large. We could reduce this requirement with a slight change: instead of
> > fetching all partitions, we could fetch only those with no retained data.
> > That would reduce the worst-case overhead to #no partitions *
> > max.partition.fetch.bytes, which matches the existing memory overhead.
> > Would that address your concern?
> >
> > A couple other points worth mentioning is that users have the option not
> > to use max.poll.records, in which case the behavior will be the same as
> in
> > the current consumer. Additionally, the implementation can be changed
> over
> > time without affecting users, so we can adjust it in particular when we
> > address memory concerns in KAFKA-2045.
> >
> > On a side note, I'm wondering if it would be useful to extend this KIP to
> > include a max.poll.bytes? For some use cases, it may make more sense to
> > control the processing time by the size of data instead of the number of
> > records. Not that I'm in anxious to draw this out, but if we'll need this
> > setting eventually, we may as well do it now. Thoughts?
> >
> >
> > -Jason
> >
> > On Fri, Jan 8, 2016 at 1:03 AM, Jens Rantil  wrote:
> >
> >> Hi,
> >>
> >> I just publicly wanted to thank Jason for the work he's done with the
> KIP
> >> and say that I've been in touch with him privately back and forth to
> work
> >> out of some of its details. Thanks!
> >>
> >> Since it feels like I initiated this KIP a bit I also want to say that
> I'm
> >> happy with it and that its proposal solves the initial issue I reported
> in
> >> https://issues.apache.org/jira
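
To make the proposal concrete, a minimal sketch of the intended usage,
assuming the max.poll.records name from the KIP draft (the setting had not
shipped at the time of this thread):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class BoundedPollLoop {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "default");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // Proposed by KIP-41: cap the number of records a single poll() may return.
        props.put("max.poll.records", "100");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("test"));
        while (true) {
            // At most 100 records per iteration keeps processing time, and
            // therefore the gap between heartbeats, predictable.
            ConsumerRecords<String, String> records = consumer.poll(1000);
            for (ConsumerRecord<String, String> record : records)
                System.out.println(record.offset() + ": " + record.value());
            // As a bonus, this commits at least once every 100 records.
            consumer.commitSync();
        }
    }
}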

Build failed in Jenkins: kafka-trunk-jdk7 #948

2016-01-08 Thread Apache Jenkins Server
See 

Changes:

[me] MINOR: speed up connect startup when full connector class name is

[me] MINOR: Add property to configure showing of standard streams in Gradle

--
[...truncated 3714 lines...]
kafka.log.BrokerCompressionTest > testBrokerSideCompression[0] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[1] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[2] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[3] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[4] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[5] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[6] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[7] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[8] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[9] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[10] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[11] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[12] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[13] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[14] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[15] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[16] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[17] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[18] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[19] PASSED

kafka.log.OffsetMapTest > testClear PASSED

kafka.log.OffsetMapTest > testBasicValidation PASSED

kafka.log.LogManagerTest > testCleanupSegmentsToMaintainSize PASSED

kafka.log.LogManagerTest > testRecoveryDirectoryMappingWithRelativeDirectory 
PASSED

kafka.log.LogManagerTest > testGetNonExistentLog PASSED

kafka.log.LogManagerTest > testTwoLogManagersUsingSameDirFails PASSED

kafka.log.LogManagerTest > testLeastLoadedAssignment PASSED

kafka.log.LogManagerTest > testCleanupExpiredSegments PASSED

kafka.log.LogManagerTest > testCheckpointRecoveryPoints PASSED

kafka.log.LogManagerTest > testTimeBasedFlush PASSED

kafka.log.LogManagerTest > testCreateLog PASSED

kafka.log.LogManagerTest > testRecoveryDirectoryMappingWithTrailingSlash PASSED

kafka.log.LogSegmentTest > testRecoveryWithCorruptMessage PASSED

kafka.log.LogSegmentTest > testRecoveryFixesCorruptIndex PASSED

kafka.log.LogSegmentTest > testReadFromGap PASSED

kafka.log.LogSegmentTest > testTruncate PASSED

kafka.log.LogSegmentTest > testReadBeforeFirstOffset PASSED

kafka.log.LogSegmentTest > testCreateWithInitFileSizeAppendMessage PASSED

kafka.log.LogSegmentTest > testChangeFileSuffixes PASSED

kafka.log.LogSegmentTest > testMaxOffset PASSED

kafka.log.LogSegmentTest > testNextOffsetCalculation PASSED

kafka.log.LogSegmentTest > testReadOnEmptySegment PASSED

kafka.log.LogSegmentTest > testReadAfterLast PASSED

kafka.log.LogSegmentTest > testCreateWithInitFileSizeClearShutdown PASSED

kafka.log.LogSegmentTest > testTruncateFull PASSED

kafka.log.FileMessageSetTest > testTruncate PASSED

kafka.log.FileMessageSetTest > testIterationOverPartialAndTruncation PASSED

kafka.log.FileMessageSetTest > testRead PASSED

kafka.log.FileMessageSetTest > testFileSize PASSED

kafka.log.FileMessageSetTest > testIteratorWithLimits PASSED

kafka.log.FileMessageSetTest > testPreallocateTrue PASSED

kafka.log.FileMessageSetTest > testIteratorIsConsistent PASSED

kafka.log.FileMessageSetTest > testIterationDoesntChangePosition PASSED

kafka.log.FileMessageSetTest > testWrittenEqualsRead PASSED

kafka.log.FileMessageSetTest > testWriteTo PASSED

kafka.log.FileMessageSetTest > testPreallocateFalse PASSED

kafka.log.FileMessageSetTest > testPreallocateClearShutdown PASSED

kafka.log.FileMessageSetTest > testSearch PASSED

kafka.log.FileMessageSetTest > testSizeInBytes PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[0] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[1] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[2] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[3] PASSED

kafka.log.LogTest > testParseTopicPartitionNameForMissingTopic PASSED

kafka.log.LogTest > testIndexRebuild PASSED

kafka.log.LogTest > testLogRolls PASSED

kafka.log.LogTest > testMessageSizeCheck PASSED

kafka.log.LogTest > testAsyncDelete PASSED

kafka.log.LogTest > testReadOutOfRange PASSED

kafka.log.LogTest > testReadAtLogGap PASSED

kafka.log.LogTest > testTimeBasedLogRoll PASSED

kafka.log.LogTest > testLoadEmptyLog PASSED

kafka.log.LogTest > testMessageSetSizeCheck PASSED

kafka.log.LogTest > testIndexResizingAtTruncation PASSED

kafka.log.LogTest > testCompactedTopicConstraints PASSED

kafka.log.LogTest > testThatGarbageCollectingSegmentsDoesntChangeOff

Build failed in Jenkins: kafka-trunk-jdk7 #949

2016-01-08 Thread Apache Jenkins Server
See 

Changes:

[me] MINOR: Security doc fixes

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on H11 (Ubuntu ubuntu) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url 
 > https://git-wip-us.apache.org/repos/asf/kafka.git # timeout=10
Fetching upstream changes from https://git-wip-us.apache.org/repos/asf/kafka.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > https://git-wip-us.apache.org/repos/asf/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/trunk^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/trunk^{commit} # timeout=10
Checking out Revision 36f5c46a5cf510001a4990db6199beb37f215007 
(refs/remotes/origin/trunk)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 36f5c46a5cf510001a4990db6199beb37f215007
 > git rev-list 8d674857996667bce52a804feb6108a6903eb39b # timeout=10
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
Setting 
JDK_1_7U51_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk-1.7u51
[kafka-trunk-jdk7] $ /bin/bash -xe /tmp/hudson6049214791192854326.sh
+ 
/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2/bin/gradle
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
http://gradle.org/docs/2.4-rc-2/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:downloadWrapper

BUILD SUCCESSFUL

Total time: 15.495 secs
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
Setting 
JDK_1_7U51_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk-1.7u51
[kafka-trunk-jdk7] $ /bin/bash -xe /tmp/hudson2834116553348547707.sh
+ export GRADLE_OPTS=-Xmx1024m
+ GRADLE_OPTS=-Xmx1024m
+ ./gradlew -Dorg.gradle.project.maxParallelForks=1 clean jarAll testAll
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
https://docs.gradle.org/2.10/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:clean UP-TO-DATE
:clients:clean
:connect:clean UP-TO-DATE
:core:clean
:examples:clean
:log4j-appender:clean
:streams:clean
:tools:clean
:connect:api:clean
:connect:file:clean
:connect:json:clean
:connect:runtime:clean
:jar_core_2_10
Building project 'core' with Scala version 2.10.6
:kafka-trunk-jdk7:clients:compileJavaNote: 

 uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

:kafka-trunk-jdk7:clients:processResources UP-TO-DATE
:kafka-trunk-jdk7:clients:classes
:kafka-trunk-jdk7:clients:determineCommitId UP-TO-DATE
:kafka-trunk-jdk7:clients:createVersionFile
:kafka-trunk-jdk7:clients:jar
:kafka-trunk-jdk7:core:compileJava UP-TO-DATE
:kafka-trunk-jdk7:core:compileScala
:79:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.

org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP
 ^
:36:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
 commitTimestamp: Long = 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP,

  ^
:37:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
 expireTimestamp: Long = 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP) {

  ^
:394:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
   

Jenkins build is back to normal : kafka_0.9.0_jdk7 #83

2016-01-08 Thread Apache Jenkins Server
See 



Re: [VOTE] KIP-32 Add CreateTime and LogAppendTime to Kafka message.

2016-01-08 Thread Neha Narkhede
Anna - Good suggestion. Sounds good to me as well

On Fri, Jan 8, 2016 at 2:32 PM, Aditya Auradkar <
aaurad...@linkedin.com.invalid> wrote:

> Anna,
>
> That sounds good to me as well.
>
> Aditya
>
> On Fri, Jan 8, 2016 at 2:11 PM, Gwen Shapira  wrote:
>
> > Sounds good to me too. Seems pretty easy to add and can be useful for
> > producers.
> >
> > On Fri, Jan 8, 2016 at 1:22 PM, Joel Koshy  wrote:
> >
> > > Hi Anna,
> > >
> > > That sounds good to me - Becket/others any thoughts?
> > >
> > > Thanks,
> > >
> > > Joel
> > >
> > > On Fri, Jan 8, 2016 at 12:41 PM, Anna Povzner 
> wrote:
> > >
> > > > Hi Becket and everyone,
> > > >
> > > > Could we please add the following functionality to this KIP. I think
> it
> > > > would be very useful for the broker to return the timestamp in the
> ack
> > to
> > > > the producer (in response: timestamp per partition) and propagate it
> > back
> > > > to client in RecordMetadata. This way, if timestamp type is
> > > LogAppendTime,
> > > > the producer client will see what timestamp was actually set -- and
> it
> > > > would match the timestamp that consumer sees. Also, returning the
> > > timestamp
> > > > in RecordMetadata is also useful for timestamp type = CreateTime,
> since
> > > > timestamp could be also set in KafkaProducer (if client set timestamp
> > in
> > > > ProducerRecord to 0).
> > > >
> > > > Since this requires protocol change as well, it will be better to
> > > implement
> > > > this as part of KIP-32, rather than proposing a new KIP.
> > > >
> > > > Thanks,
> > > > Anna
> > > >
> > > >
> > > > On Fri, Jan 8, 2016 at 12:53 PM, Joel Koshy 
> > wrote:
> > > >
> > > > > +1 from me
> > > > >
> > > > > Looking through this thread it seems there was some confusion on
> the
> > > > > migration discussion. This discussion in fact happened in the
> KIP-31
> > > > > discuss thread, not so much in the KIP hangout. There is
> considerable
> > > > > overlap in discussions between KIP-3[1,2,3] so it makes sense to
> > > > > cross-reference all of these.
> > > > >
> > > > > I'm finding the Apache list archive a little cumbersome to use
> (e.g.,
> > > the
> > > > > current link in KIP-31 points to the beginning of September
> archives)
> > > but
> > > > > the emails discussing migration were in October:
> > > > >
> > http://mail-archives.apache.org/mod_mbox/kafka-dev/201510.mbox/thread
> > > > >
> > > > > Markmail has a better interface but interestingly it has not
> indexed
> > > any
> > > > of
> > > > > the emails from August, September and early October (
> > > > >
> > > > >
> > > >
> > >
> >
> http://markmail.org/search/?q=list%3Aorg.apache.incubator.kafka-dev+date%3A201509-201511+order%3Adate-backward
> > > > > ).
> > > > > Perhaps KIPs should include a permalink to the first message of the
> > > > DISCUSS
> > > > > thread. E.g.,
> > > > >
> > > > >
> > > >
> > >
> >
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201509.mbox/%3CCAHrRUm5jvL_dPeZWnfBD-vONgSZWOq1VL1Ss8OSUOCPXmtg8rQ%40mail.gmail.com%3E
> > > > >
> > > > > Also, just to clarify Jay's comments on the content of KIPs: I
> think
> > > > having
> > > > > a pseudo-code spec/implementation guide is useful (especially for
> > > > > client-side KIPs). While the motivation should definitely capture
> > “why
> > > we
> > > > > are doing the KIP” it probably shouldn’t have to exhaustively
> capture
> > > > “why
> > > > > we are doing the KIP *this way*”. i.e., some of the discussions are
> > > > > extremely nuanced and in this case spans multiple KIPs so links to
> > > other
> > > > > KIPs and the discuss threads and KIP hangout recordings are perhaps
> > > > > sufficient to fill this gap - or maybe a new section that
> summarizes
> > > the
> > > > > discussions.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Joel
> > > > >
> > > > > On Wed, Jan 6, 2016 at 9:29 AM, Jun Rao  wrote:
> > > > >
> > > > > > Hi, Jiangjie,
> > > > > >
> > > > > > 52. Replacing MessageSet with o.a.k.common.record will be ideal.
> > > > > > Unfortunately, we use MessageSet in SimpleConsumer, which is part
> > of
> > > > the
> > > > > > public api. Replacing MessageSet with o.a.k.common.record will be
> > an
> > > > > > incompatible api change. So, we probably should do this after we
> > > > > deprecate
> > > > > > SimpleConsumer.
> > > > > >
> > > > > > My original question is actually whether we just bump up magic
> byte
> > > in
> > > > > > Message once to incorporate both the offset and the timestamp
> > change.
> > > > It
> > > > > > seems that the answer is yes. Could you reflect that in the KIP?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > >
> > > > > > On Wed, Jan 6, 2016 at 7:01 AM, Becket Qin  >
> > > > wrote:
> > > > > >
> > > > > > > Thanks a lot for the careful reading, Jun.
> > > > > > > Please see inline replies.
> > > > > > >
> > > > > > >
> > > > > > > > On Jan 6, 2016, at 3:24 AM, Jun Rao 
> wrote:
> > > > > > > >
> > > > > > > > Jiangjie,
> > > > > > > >
> > > > > > > > Thanks

[GitHub] kafka-site pull request: Minor updates to api, design, security an...

2016-01-08 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka-site/pull/7

Minor updates to api, design, security and upgrade pages

Changes copied from 0.9.0 branch of the kafka repo.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka-site 
update-0.9.0-docs-from-kafka-2016-01-09

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka-site/pull/7.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #7


commit 662f2c646e52312ea2bf42a88e9539cc1cee30ac
Author: Ismael Juma 
Date:   2016-01-09T02:26:23Z

Minor updates to api, design, security and upgrade pages

Changes copied from 0.9.0 branch of the kafka repo.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka-site pull request: MINOR: Simple updates to api, design, sec...

2016-01-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka-site/pull/7


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2120) Add a request timeout to NetworkClient

2016-01-08 Thread Nitin Padalia (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15090458#comment-15090458
 ] 

Nitin Padalia commented on KAFKA-2120:
--

Has this resolution been backported to older versions, i.e. 0.8.2, as well?

> Add a request timeout to NetworkClient
> --
>
> Key: KAFKA-2120
> URL: https://issues.apache.org/jira/browse/KAFKA-2120
> Project: Kafka
>  Issue Type: New Feature
>Reporter: Jiangjie Qin
>Assignee: Mayuresh Gharat
>Priority: Blocker
> Fix For: 0.9.0.0
>
> Attachments: KAFKA-2120.patch, KAFKA-2120_2015-07-27_15:31:19.patch, 
> KAFKA-2120_2015-07-29_15:57:02.patch, KAFKA-2120_2015-08-10_19:55:18.patch, 
> KAFKA-2120_2015-08-12_10:59:09.patch, KAFKA-2120_2015-09-03_15:12:02.patch, 
> KAFKA-2120_2015-09-04_17:49:01.patch, KAFKA-2120_2015-09-09_16:45:44.patch, 
> KAFKA-2120_2015-09-09_18:56:18.patch, KAFKA-2120_2015-09-10_21:38:55.patch, 
> KAFKA-2120_2015-09-11_14:54:15.patch, KAFKA-2120_2015-09-15_18:57:20.patch, 
> KAFKA-2120_2015-09-18_19:27:48.patch, KAFKA-2120_2015-09-28_16:13:02.patch
>
>
> Currently NetworkClient does not have a timeout setting for requests. So if 
> no response is received for a request due to reasons such as broker is down, 
> the request will never be completed.
> Request timeout will also be used as implicit timeout for some methods such 
> as KafkaProducer.flush() and kafkaProducer.close().
> KIP-19 is created for this public interface change.
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-19+-+Add+a+request+timeout+to+NetworkClient
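
For reference, a minimal sketch of the client-side setting introduced by
KIP-19; per the fix version above it first shipped in 0.9.0.0, so an 0.8.2
client would not recognize it:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;

public class RequestTimeoutConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // KIP-19: fail a request if no response arrives within this many
        // milliseconds (e.g. the broker is down), instead of waiting forever.
        props.put("request.timeout.ms", "30000");
        Producer<String, String> producer = new KafkaProducer<>(props);
        producer.close();
    }
}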



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)