[jira] [Commented] (KAFKA-2189) Snappy compression of message batches less efficient in 0.8.2.1

2015-05-28 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562470#comment-14562470
 ] 

Ismael Juma commented on KAFKA-2189:


Given the positive feedback, it seems like it would be good to get this merged 
so that more people can test it before the final release?

> Snappy compression of message batches less efficient in 0.8.2.1
> ---
>
> Key: KAFKA-2189
> URL: https://issues.apache.org/jira/browse/KAFKA-2189
> Project: Kafka
>  Issue Type: Bug
>  Components: build, compression, log
>Affects Versions: 0.8.2.1
>Reporter: Olson,Andrew
>Assignee: Ismael Juma
>Priority: Blocker
>  Labels: trivial
> Fix For: 0.8.3
>
> Attachments: KAFKA-2189.patch
>
>
> We are using snappy compression and noticed a fairly substantial increase 
> (about 2.25x) in log filesystem space consumption after upgrading a Kafka 
> cluster from 0.8.1.1 to 0.8.2.1. We found that this is caused by messages 
> being seemingly recompressed individually (or possibly with a much smaller 
> buffer or dictionary?) instead of as a batch as sent by producers. We 
> eventually tracked down the change in compression ratio/scope to this [1] 
> commit that updated the snappy version from 1.0.5 to 1.1.1.3. The Kafka 
> client version does not appear to be relevant as we can reproduce this with 
> both the 0.8.1.1 and 0.8.2.1 Producer.
> Here are the log files from our troubleshooting that contain the same set of 
> 1000 messages, for batch sizes of 1, 10, 100, and 1000. f9d9b was the last 
> commit with 0.8.1.1-like behavior prior to f5ab8 introducing the issue.
> {noformat}
> -rw-rw-r-- 1 kafka kafka 404967 May 12 11:45 
> /var/kafka2/f9d9b-batch-1-0/.log
> -rw-rw-r-- 1 kafka kafka 119951 May 12 11:45 
> /var/kafka2/f9d9b-batch-10-0/.log
> -rw-rw-r-- 1 kafka kafka  89645 May 12 11:45 
> /var/kafka2/f9d9b-batch-100-0/.log
> -rw-rw-r-- 1 kafka kafka  88279 May 12 11:45 
> /var/kafka2/f9d9b-batch-1000-0/.log
> -rw-rw-r-- 1 kafka kafka 402837 May 12 11:41 
> /var/kafka2/f5ab8-batch-1-0/.log
> -rw-rw-r-- 1 kafka kafka 382437 May 12 11:41 
> /var/kafka2/f5ab8-batch-10-0/.log
> -rw-rw-r-- 1 kafka kafka 364791 May 12 11:41 
> /var/kafka2/f5ab8-batch-100-0/.log
> -rw-rw-r-- 1 kafka kafka 380693 May 12 11:41 
> /var/kafka2/f5ab8-batch-1000-0/.log
> {noformat}
> [1] 
> https://github.com/apache/kafka/commit/f5ab8e1780cf80f267906e3259ad4f9278c32d28
>  
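Snappy itself is not bundled with the JDK, but the effect described above can be illustrated with the JDK's `Deflater` as a rough stand-in: compressing 1000 messages one at a time pays the per-stream overhead 1000 times and never exploits redundancy across messages, while compressing them as one batch does. This is a hypothetical illustration of the ratio collapse, not a reproduction of the snappy-java change.

```java
import java.util.zip.Deflater;

public class BatchCompressionDemo {

    // Compress input with DEFLATE and return the compressed size in bytes.
    static int compressedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buf = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buf);
        }
        deflater.end();
        return total;
    }

    public static void main(String[] args) {
        byte[] message = "a fairly repetitive log line with common fields ".getBytes();
        int n = 1000;

        // Per-message: each message is compressed on its own, so every message
        // pays the full stream overhead and shares no dictionary with its peers.
        int perMessage = n * compressedSize(message);

        // Batched: all messages are compressed together, so redundancy
        // across messages is exploited.
        byte[] batch = new byte[n * message.length];
        for (int i = 0; i < n; i++) {
            System.arraycopy(message, 0, batch, i * message.length, message.length);
        }
        int batched = compressedSize(batch);

        System.out.println("per-message total: " + perMessage + " bytes");
        System.out.println("batched total:     " + batched + " bytes");
    }
}
```

The batched total comes out far smaller, mirroring the gap between the f9d9b (batched) and f5ab8 (effectively per-message) log sizes listed above.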



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2185) Update to Gradle 2.4

2015-05-28 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562472#comment-14562472
 ] 

Ismael Juma commented on KAFKA-2185:


Anything I can do to help move this along? People should appreciate faster 
builds after all. :)

> Update to Gradle 2.4
> 
>
> Key: KAFKA-2185
> URL: https://issues.apache.org/jira/browse/KAFKA-2185
> Project: Kafka
>  Issue Type: Improvement
>  Components: build
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Minor
> Attachments: KAFKA-2185_2015-05-11_20:55:08.patch
>
>
> Gradle 2.4 has been released recently while Kafka is still using Gradle 2.0. 
> There have been a large number of improvements over the various releases 
> (including performance improvements):
> https://gradle.org/docs/2.1/release-notes
> https://gradle.org/docs/2.2/release-notes
> https://gradle.org/docs/2.3/release-notes
> http://gradle.org/docs/current/release-notes





[jira] [Commented] (KAFKA-2187) Introduce merge-kafka-pr.py script

2015-05-28 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562478#comment-14562478
 ] 

Ismael Juma commented on KAFKA-2187:


[~nehanarkhede], please let me know if there's anything I can do to make it 
easier for you to review and test this.

> Introduce merge-kafka-pr.py script
> --
>
> Key: KAFKA-2187
> URL: https://issues.apache.org/jira/browse/KAFKA-2187
> Project: Kafka
>  Issue Type: New Feature
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Minor
> Attachments: KAFKA-2187.patch, KAFKA-2187_2015-05-20_23:14:05.patch
>
>
> This script will be used to merge GitHub pull requests and it will pull from 
> the Apache Git repo to the current branch, squash and merge the PR, push the 
> commit to trunk, close the PR (via commit message) and close the relevant 
> JIRA issue (via JIRA API).
> Spark has a script that does most (if not all) of this and that will be used 
> as the starting point:
> https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py





Re: Review Request 34734: Patch for KAFKA-2226

2015-05-28 Thread Onur Karaman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34734/#review85530
---


It seems like you're concerned about adding/removing a TimerTaskEntry that 
already exists in another TimerTaskList. Can you explain how that can happen? 
My understanding of the timing wheel stuff is only so-so.

- Onur Karaman


On May 27, 2015, 9 p.m., Yasuhiro Matsuda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34734/
> ---
> 
> (Updated May 27, 2015, 9 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2226
> https://issues.apache.org/jira/browse/KAFKA-2226
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> fix a race condition in TimerTaskEntry.remove
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/utils/timer/TimerTaskList.scala 
> e7a96570ddc2367583d6d5590628db7e7f6ba39b 
> 
> Diff: https://reviews.apache.org/r/34734/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Yasuhiro Matsuda
> 
>
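A minimal sketch of the hazard this review discusses, with hypothetical names rather than Kafka's actual `TimerTaskList`/`TimerTaskEntry` code: the timing wheel can migrate an entry from one list to another concurrently with a cancellation, so removal has to re-check which list owns the entry and loop until none does.

```java
// Which list currently owns the entry; the reinsertion path rewrites this.
class TimerEntry {
    volatile TaskList list;
}

class TaskList {
    // Remove succeeds only if this list still owns the entry.
    synchronized boolean remove(TimerEntry e) {
        if (e.list == this) {
            e.list = null;
            return true;
        }
        return false;
    }
}

public class TimerRaceDemo {
    // Between reading e.list and calling remove(), the wheel may have moved
    // the entry to another list; loop until no list owns it.
    static void cancel(TimerEntry e) {
        TaskList current = e.list;
        while (current != null) {
            current.remove(e);
            current = e.list;  // re-read: non-null means the entry moved
        }
    }

    public static void main(String[] args) {
        TaskList wheelA = new TaskList();
        TaskList wheelB = new TaskList();
        TimerEntry e = new TimerEntry();
        e.list = wheelA;
        e.list = wheelB;  // simulate a concurrent move from wheelA to wheelB
        cancel(e);
        System.out.println("entry detached: " + (e.list == null)); // prints entry detached: true
    }
}
```

A single unguarded `e.list.remove(e)` would silently fail (or detach the wrong list) whenever the move wins the race, which is the condition the patch's retry loop closes.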



Re: Review Request 33049: Patch for KAFKA-2084

2015-05-28 Thread Manikumar Reddy O

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33049/#review85545
---



core/src/main/scala/kafka/server/ClientQuotaMetrics.scala


Are we using clientID as a unique key? As of now, clientID is not 
mandatory, and it need not be unique across different producers and consumers.


- Manikumar Reddy O


On May 26, 2015, 6:53 p.m., Aditya Auradkar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33049/
> ---
> 
> (Updated May 26, 2015, 6:53 p.m.)
> 
> 
> Review request for kafka, Joel Koshy and Jun Rao.
> 
> 
> Bugs: KAFKA-2084
> https://issues.apache.org/jira/browse/KAFKA-2084
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Updated patch for quotas. This patch does the following: 
> 1. Add per-client metrics for both producer and consumers 
> 2. Add configuration for quotas 
> 3. Compute delay times in the metrics package and return the delay times in 
> QuotaViolationException 
> 4. Add a DelayQueue in KafkaApi's that can be used to throttle any type of 
> request. Implemented request throttling for produce and fetch requests. 
> 5. Added unit and integration test cases.
> 6. This doesn't include a system test. There is a separate ticket for that
> 
> 
> Diffs
> -
> 
>   clients/src/main/java/org/apache/kafka/common/metrics/MetricConfig.java 
> dfa1b0a11042ad9d127226f0e0cec8b1d42b8441 
>   clients/src/main/java/org/apache/kafka/common/metrics/Quota.java 
> d82bb0c055e631425bc1ebbc7d387baac76aeeaa 
>   
> clients/src/main/java/org/apache/kafka/common/metrics/QuotaViolationException.java
>  a451e5385c9eca76b38b425e8ac856b2715fcffe 
>   clients/src/main/java/org/apache/kafka/common/metrics/Sensor.java 
> ca823fd4639523018311b814fde69b6177e73b97 
>   clients/src/test/java/org/apache/kafka/common/utils/MockTime.java  
>   core/src/main/scala/kafka/server/ClientQuotaMetrics.scala PRE-CREATION 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> 387e387998fc3a6c9cb585dab02b5f77b0381fbf 
>   core/src/main/scala/kafka/server/KafkaConfig.scala 
> 9efa15ca5567b295ab412ee9eea7c03eb4cdc18b 
>   core/src/main/scala/kafka/server/KafkaServer.scala 
> e66710d2368334ece66f70d55f57b3f888262620 
>   core/src/main/scala/kafka/server/ReplicaManager.scala 
> 59c9bc3ac3a8afc07a6f8c88c5871304db588d17 
>   core/src/main/scala/kafka/server/ThrottledRequest.scala PRE-CREATION 
>   core/src/main/scala/kafka/utils/ShutdownableThread.scala 
> fc226c863095b7761290292cd8755cd7ad0f155c 
>   core/src/test/scala/integration/kafka/api/QuotasTest.scala PRE-CREATION 
>   core/src/test/scala/unit/kafka/server/ClientQuotaMetricsTest.scala 
> PRE-CREATION 
>   core/src/test/scala/unit/kafka/server/KafkaConfigConfigDefTest.scala 
> 8014a5a6c362785539f24eb03d77278434614fe6 
>   core/src/test/scala/unit/kafka/server/ThrottledRequestExpirationTest.scala 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/33049/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Aditya Auradkar
> 
>
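Item 4 of the quoted description (a `DelayQueue` that throttles arbitrary request types) can be sketched as follows. The names are illustrative stand-ins for the patch's `ThrottledRequest`, not its actual code: a quota violation parks the response in the queue, and a purgatory-style thread completes it only once the computed delay has elapsed.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

public class ThrottleDemo {

    // A request whose response is withheld until its quota delay expires.
    static class ThrottledRequest implements Delayed {
        final long dueNanos;
        final Runnable completeResponse;

        ThrottledRequest(long delayMs, Runnable completeResponse) {
            this.dueNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
            this.completeResponse = completeResponse;
        }

        @Override
        public long getDelay(TimeUnit unit) {
            // Remaining delay; DelayQueue releases the element once this <= 0.
            return unit.convert(dueNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                                other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    public static void main(String[] args) throws InterruptedException {
        DelayQueue<ThrottledRequest> throttled = new DelayQueue<>();
        // A produce request violated its quota; the metrics layer computed
        // a 100 ms delay, so the response is parked instead of sent.
        throttled.add(new ThrottledRequest(100, () -> System.out.println("response sent")));
        // An expiration thread takes requests as their delays elapse.
        throttled.take().completeResponse.run();
    }
}
```

The same queue can hold produce and fetch responses alike, which is what makes the mechanism request-type agnostic.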



[jira] [Created] (KAFKA-2227) Phase 1: Requests and KafkaApis

2015-05-28 Thread Andrii Biletskyi (JIRA)
Andrii Biletskyi created KAFKA-2227:
---

 Summary: Phase 1: Requests and KafkaApis
 Key: KAFKA-2227
 URL: https://issues.apache.org/jira/browse/KAFKA-2227
 Project: Kafka
  Issue Type: Sub-task
Reporter: Andrii Biletskyi








[jira] [Created] (KAFKA-2228) Phase 1: Requests and KafkaApis

2015-05-28 Thread Andrii Biletskyi (JIRA)
Andrii Biletskyi created KAFKA-2228:
---

 Summary: Phase 1: Requests and KafkaApis
 Key: KAFKA-2228
 URL: https://issues.apache.org/jira/browse/KAFKA-2228
 Project: Kafka
  Issue Type: Sub-task
Reporter: Andrii Biletskyi
Assignee: Andrii Biletskyi








[jira] [Created] (KAFKA-2229) Phase 1: Requests and KafkaApis

2015-05-28 Thread Andrii Biletskyi (JIRA)
Andrii Biletskyi created KAFKA-2229:
---

 Summary: Phase 1: Requests and KafkaApis
 Key: KAFKA-2229
 URL: https://issues.apache.org/jira/browse/KAFKA-2229
 Project: Kafka
  Issue Type: Sub-task
Reporter: Andrii Biletskyi
Assignee: Andrii Biletskyi








[jira] [Updated] (KAFKA-2229) Phase 1: Requests and KafkaApis

2015-05-28 Thread Andrii Biletskyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrii Biletskyi updated KAFKA-2229:

Attachment: KAFKA-2229.patch

> Phase 1: Requests and KafkaApis
> ---
>
> Key: KAFKA-2229
> URL: https://issues.apache.org/jira/browse/KAFKA-2229
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Andrii Biletskyi
>Assignee: Andrii Biletskyi
> Fix For: 0.8.3
>
> Attachments: KAFKA-2229.patch
>
>






Review Request 34766: Patch for KAFKA-2229

2015-05-28 Thread Andrii Biletskyi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34766/
---

Review request for kafka.


Bugs: KAFKA-2229
https://issues.apache.org/jira/browse/KAFKA-2229


Repository: kafka


Description
---

KIP-4 Admin tools - Phase 1


Diffs
-

  checkstyle/import-control.xml f2e6cec267e67ce8e261341e373718e14a8e8e03 
  clients/src/main/java/org/apache/kafka/common/ConfigEntry.java PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/PartitionReplicaAssignment.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/TopicConfigInfo.java 
PRE-CREATION 
  clients/src/main/java/org/apache/kafka/common/protocol/ApiKeys.java 
b39e9bb53dd51f1ce931691e23aabc9ebaebf1e7 
  clients/src/main/java/org/apache/kafka/common/protocol/Errors.java 
5b898c8f8ad5d0469f469b600c4b2eb13d1fc662 
  clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 
3dc8b015afd2347a41c9a9dbc02b8e367da5f75f 
  clients/src/main/java/org/apache/kafka/common/requests/AbstractRequest.java 
5e5308ec0e333179a9abbf4f3b750ea25ab57967 
  
clients/src/main/java/org/apache/kafka/common/requests/admin/AlterTopicRequest.java
 PRE-CREATION 
  
clients/src/main/java/org/apache/kafka/common/requests/admin/AlterTopicResponse.java
 PRE-CREATION 
  
clients/src/main/java/org/apache/kafka/common/requests/admin/CreateTopicRequest.java
 PRE-CREATION 
  
clients/src/main/java/org/apache/kafka/common/requests/admin/CreateTopicResponse.java
 PRE-CREATION 
  
clients/src/main/java/org/apache/kafka/common/requests/admin/DeleteTopicRequest.java
 PRE-CREATION 
  
clients/src/main/java/org/apache/kafka/common/requests/admin/DeleteTopicResponse.java
 PRE-CREATION 
  
clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java 
e3cc1967e407b64cc734548c19e30de700b64ba8 
  core/src/main/scala/kafka/admin/AdminUtils.scala 
f06edf41c732a7b794e496d0048b0ce6f897e72b 
  core/src/main/scala/kafka/api/RequestKeys.scala 
ef7a86ec3324028496d6bb7c7c6fec7d7d19d64e 
  core/src/main/scala/kafka/common/InvalidPartitionsException.scala 
PRE-CREATION 
  core/src/main/scala/kafka/common/InvalidReplicaAssignmentException.scala 
PRE-CREATION 
  core/src/main/scala/kafka/common/InvalidReplicationFactorException.scala 
PRE-CREATION 
  core/src/main/scala/kafka/common/InvalidTopicConfigurationException.scala 
PRE-CREATION 
  core/src/main/scala/kafka/common/ReassignPartitionsInProgressException.scala 
PRE-CREATION 
  core/src/main/scala/kafka/server/KafkaApis.scala 
387e387998fc3a6c9cb585dab02b5f77b0381fbf 
  core/src/main/scala/kafka/server/TopicCommandHelper.scala PRE-CREATION 
  core/src/test/scala/unit/kafka/admin/AdminTest.scala 
efb2f8e79b3faef78722774b951fea828cd50374 
  core/src/test/scala/unit/kafka/server/TopicCommandHelperTest.scala 
PRE-CREATION 

Diff: https://reviews.apache.org/r/34766/diff/


Testing
---


Thanks,

Andrii Biletskyi



[jira] [Commented] (KAFKA-2229) Phase 1: Requests and KafkaApis

2015-05-28 Thread Andrii Biletskyi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562877#comment-14562877
 ] 

Andrii Biletskyi commented on KAFKA-2229:
-

Created reviewboard https://reviews.apache.org/r/34766/diff/
 against branch origin/trunk

> Phase 1: Requests and KafkaApis
> ---
>
> Key: KAFKA-2229
> URL: https://issues.apache.org/jira/browse/KAFKA-2229
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Andrii Biletskyi
>Assignee: Andrii Biletskyi
> Fix For: 0.8.3
>
> Attachments: KAFKA-2229.patch
>
>






[jira] [Updated] (KAFKA-2229) Phase 1: Requests and KafkaApis

2015-05-28 Thread Andrii Biletskyi (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrii Biletskyi updated KAFKA-2229:

Status: Patch Available  (was: Open)

> Phase 1: Requests and KafkaApis
> ---
>
> Key: KAFKA-2229
> URL: https://issues.apache.org/jira/browse/KAFKA-2229
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Andrii Biletskyi
>Assignee: Andrii Biletskyi
> Fix For: 0.8.3
>
> Attachments: KAFKA-2229.patch
>
>






[jira] [Commented] (KAFKA-2223) Improve distribution of data when using hash-based partitioning

2015-05-28 Thread Gabriel Reid (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563030#comment-14563030
 ] 

Gabriel Reid commented on KAFKA-2223:
-

Ok, thanks for taking a look [~jkreps]. 

I hadn't even caught that a new partitioner was being used since 0.8.2 (we were 
bitten by this issue pre-0.8.2). Agree that it doesn't make much sense to 
change the scala version due to the backwards compat issue, so I'll just close 
this ticket as will not fix unless someone else has an opinion on it.

> Improve distribution of data when using hash-based partitioning
> ---
>
> Key: KAFKA-2223
> URL: https://issues.apache.org/jira/browse/KAFKA-2223
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Gabriel Reid
> Attachments: KAFKA-2223.patch
>
>
> Both the DefaultPartitioner and ByteArrayPartitioner base themselves on the 
> hash code of keys modulo the number of partitions, along the lines of 
> {code}partition = key.hashCode() % numPartitions{code} (converting to 
> absolute value is omitted here)
> This approach is entirely dependent on the _lower bits_ of the hash code 
> being uniformly distributed in order to get good distribution of records over 
> multiple partitions. If the lower bits of the key hash code are not uniformly 
> distributed, the key space will not be uniformly distributed over the 
> partitions.
> It can be surprisingly easy to get a very poor distribution. As a simple 
> example, if the keys are integer values and are all divisible by 2, then only 
> half of the partitions will receive data (as the hash code of an integer is 
> the integer value itself).
> This can even be a problem in situations where you would really not expect 
> it. For example, taking the 8-byte big-endian byte-array representation of 
> longs for each timestamp of each second over a period of 24 hours (at 
> millisecond granularity) and partitioning it over 50 partitions results in 34 
> of the 50 partitions not getting any data at all.
> The easiest way to resolve this is to have a custom HashPartitioner that 
> applies a supplementary hash function to the return value of the key's 
> hashCode method. This same approach is taken in java.util.HashMap for the 
> exact same reason.
> One potential issue for a change like this to the default partitioner could 
> be backward compatibility, if there is some kind of logic expecting that a 
> given key would be sent to a given partition.
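The pathology and the proposed fix can both be demonstrated in a few lines. This is a sketch, not the attached patch: `spreadPartition` applies the same bit-spreading step `java.util.HashMap` uses, mixing the high bits of the hash into the low bits before taking the modulo.

```java
import java.util.HashSet;
import java.util.Set;

public class PartitionSpreadDemo {
    static final int NUM_PARTITIONS = 50;

    // The scheme described above: partition on the raw hash code.
    static int naivePartition(int hashCode) {
        return Math.abs(hashCode % NUM_PARTITIONS);
    }

    // Supplementary hash: HashMap-style spreading of high bits into low bits.
    static int spreadPartition(int hashCode) {
        int h = hashCode ^ (hashCode >>> 16);
        return Math.abs(h % NUM_PARTITIONS);
    }

    public static void main(String[] args) {
        Set<Integer> naive = new HashSet<>();
        Set<Integer> spread = new HashSet<>();
        // Keys all divisible by 2, as in the "simple example" above; an
        // Integer's hash code is the integer itself, so every raw hash is even.
        for (int key = 0; key < 100_000; key += 2) {
            naive.add(naivePartition(Integer.hashCode(key)));
            spread.add(spreadPartition(Integer.hashCode(key)));
        }
        System.out.println("naive partitions used:  " + naive.size());  // 25 of 50
        System.out.println("spread partitions used: " + spread.size()); // 50 of 50
    }
}
```

With the naive scheme an even hash can only ever land on an even residue modulo an even partition count, so half the partitions stay permanently empty; the supplementary hash restores full coverage.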





Mesos Community Networking Hangout

2015-05-28 Thread Joe Stein
Hi, a few folks from a few different companies (Cisco, Elodina, Mesosphere,
Moz)  are going to be having a hangout
https://plus.google.com/hangouts/_/stealth.ly/mesos-network every Friday
11am PT / 2pm ET and wanted to invite any other interested parties to join
too. We have already talked with folks from a few other companies; whether or
not we have chatted yet, it would be good for you to join if you have something
you want to contribute (requirements, code, whatever).

The goal is to have a single solution for Mesos networking so that all of
the different requirements around "floating ip" can be achieved with
different implementations and use cases: what goes into Mesos, what is a
module, and what is a plugin point that others can help define interfaces
for, and so on.

If you are interested let me know and I can add you to the invite, thanks!

~ Joe Stein
- - - - - - - - - - - - - - - - -

  http://www.stealth.ly
- - - - - - - - - - - - - - - - -


Re: Review Request 33049: Patch for KAFKA-2084

2015-05-28 Thread Dong Lin


> On May 28, 2015, 12:58 p.m., Manikumar Reddy O wrote:
> > core/src/main/scala/kafka/server/ClientQuotaMetrics.scala, line 145
> > 
> >
> > Are we using clientID as a unique key? As of now, clientID is not 
> > mandatory, and it need not be unique across different producers and consumers.

Hey Manikumar, the clientID doesn't have to be unique. Clients of the same 
clientID will share the same quota.

If the quota management is merged into trunk and enabled in a given company, 
kafka clients in that company should explicitly specify their clientID.


- Dong
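The sharing Dong describes falls out of keying the metrics map by clientId. A minimal sketch, with a hypothetical API rather than the patch's actual `ClientQuotaMetrics`:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class ClientQuotaDemo {
    // One shared sensor per clientId, regardless of how many connections use it.
    private final ConcurrentHashMap<String, AtomicLong> bytesByClientId =
            new ConcurrentHashMap<>();

    // Record produced bytes against the client's (shared) sensor and
    // return the running total, which the quota check would compare
    // against the configured limit.
    long record(String clientId, long bytes) {
        return bytesByClientId
                .computeIfAbsent(clientId, id -> new AtomicLong())
                .addAndGet(bytes);
    }

    public static void main(String[] args) {
        ClientQuotaDemo quotas = new ClientQuotaDemo();
        quotas.record("producer-1", 100);              // one connection
        long total = quotas.record("producer-1", 200); // a second connection, same clientId
        System.out.println("shared total = " + total); // prints shared total = 300
    }
}
```

Two producers that present the same clientId accumulate into one total and are throttled as a unit, which is why explicit clientIds matter once quotas are enabled.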


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33049/#review85545
---


On May 26, 2015, 6:53 p.m., Aditya Auradkar wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33049/
> ---
> 
> (Updated May 26, 2015, 6:53 p.m.)
> 
> 
> Review request for kafka, Joel Koshy and Jun Rao.
> 
> 
> Bugs: KAFKA-2084
> https://issues.apache.org/jira/browse/KAFKA-2084
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> Updated patch for quotas. This patch does the following: 
> 1. Add per-client metrics for both producer and consumers 
> 2. Add configuration for quotas 
> 3. Compute delay times in the metrics package and return the delay times in 
> QuotaViolationException 
> 4. Add a DelayQueue in KafkaApi's that can be used to throttle any type of 
> request. Implemented request throttling for produce and fetch requests. 
> 5. Added unit and integration test cases.
> 6. This doesn't include a system test. There is a separate ticket for that
> 
> 
> Diffs
> -
> 
>   clients/src/main/java/org/apache/kafka/common/metrics/MetricConfig.java 
> dfa1b0a11042ad9d127226f0e0cec8b1d42b8441 
>   clients/src/main/java/org/apache/kafka/common/metrics/Quota.java 
> d82bb0c055e631425bc1ebbc7d387baac76aeeaa 
>   
> clients/src/main/java/org/apache/kafka/common/metrics/QuotaViolationException.java
>  a451e5385c9eca76b38b425e8ac856b2715fcffe 
>   clients/src/main/java/org/apache/kafka/common/metrics/Sensor.java 
> ca823fd4639523018311b814fde69b6177e73b97 
>   clients/src/test/java/org/apache/kafka/common/utils/MockTime.java  
>   core/src/main/scala/kafka/server/ClientQuotaMetrics.scala PRE-CREATION 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> 387e387998fc3a6c9cb585dab02b5f77b0381fbf 
>   core/src/main/scala/kafka/server/KafkaConfig.scala 
> 9efa15ca5567b295ab412ee9eea7c03eb4cdc18b 
>   core/src/main/scala/kafka/server/KafkaServer.scala 
> e66710d2368334ece66f70d55f57b3f888262620 
>   core/src/main/scala/kafka/server/ReplicaManager.scala 
> 59c9bc3ac3a8afc07a6f8c88c5871304db588d17 
>   core/src/main/scala/kafka/server/ThrottledRequest.scala PRE-CREATION 
>   core/src/main/scala/kafka/utils/ShutdownableThread.scala 
> fc226c863095b7761290292cd8755cd7ad0f155c 
>   core/src/test/scala/integration/kafka/api/QuotasTest.scala PRE-CREATION 
>   core/src/test/scala/unit/kafka/server/ClientQuotaMetricsTest.scala 
> PRE-CREATION 
>   core/src/test/scala/unit/kafka/server/KafkaConfigConfigDefTest.scala 
> 8014a5a6c362785539f24eb03d77278434614fe6 
>   core/src/test/scala/unit/kafka/server/ThrottledRequestExpirationTest.scala 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/33049/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Aditya Auradkar
> 
>



Re: Mesos Community Networking Hangout

2015-05-28 Thread Joe Stein
sorry, wrong email list :)

~ Joe Stein
- - - - - - - - - - - - - - - - -

  http://www.stealth.ly
- - - - - - - - - - - - - - - - -

On Thu, May 28, 2015 at 11:41 AM, Joe Stein  wrote:

> Hi, a few folks from a few different companies (Cisco, Elodina,
> Mesosphere, Moz)  are going to be having a hangout
> https://plus.google.com/hangouts/_/stealth.ly/mesos-network every Friday
> 11am PT / 2pm ET and wanted to invite any other interested parties to join
> too. We have already talked with folks from a few other companies so if we
> chatted or not would be good for you to join if you have something you want
> to contribute (requirements, code, whatever).
>
> The goal is to have a single solution for Mesos networking so that all of
> the different requirements around "floating ip" can be achieved with
> different implementations and use cases. What goes into mesos, what is a
> module, what is something to plugin that others can help define interfaces
> for, etc, etc.
>
> If you are interested let me know and I can add you to the invite, thanks!
>
> ~ Joe Stein
> - - - - - - - - - - - - - - - - -
>
>   http://www.stealth.ly
> - - - - - - - - - - - - - - - - -
>


[jira] [Commented] (KAFKA-2168) New consumer poll() can block other calls like position(), commit(), and close() indefinitely

2015-05-28 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563212#comment-14563212
 ] 

Jason Gustafson commented on KAFKA-2168:


Talked with Neha, Ewen, and Jay last night. Consensus was to remove the 
synchronization of KafkaConsumer and provide a wakeup() method which can be 
used to interrupt a long poll. This should solve the issue from this ticket, 
though it may hinge on KAFKA-1894, which removes polling loops from the current 
consumer. Note that this explicitly makes the consumer unsafe for 
multi-threaded access, though we will provide a thread-safe close() method 
which can be called (for example) from a shutdown hook. I'm going to update 
this ticket to reflect this change and submit a patch.

> New consumer poll() can block other calls like position(), commit(), and 
> close() indefinitely
> -
>
> Key: KAFKA-2168
> URL: https://issues.apache.org/jira/browse/KAFKA-2168
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Ewen Cheslack-Postava
>Assignee: Jason Gustafson
>
> The new consumer is currently using very coarse-grained synchronization. For 
> most methods this isn't a problem since they finish quickly once the lock is 
> acquired, but poll() might run for a long time (and commonly will since 
> polling with long timeouts is a normal use case). This means any operations 
> invoked from another thread may block until the poll() call completes.
> Some example use cases where this can be a problem:
> * A shutdown hook is registered to trigger shutdown and invokes close(). It 
> gets invoked from another thread and blocks indefinitely.
> * User wants to manage offset commit themselves in a background thread. If 
> the commit policy is not purely time based, it's not currently possible to 
> make sure the call to commit() will be processed promptly.
> Two possible solutions to this:
> 1. Make sure a lock is not held during the actual select call. Since we have 
> multiple layers (KafkaConsumer -> NetworkClient -> Selector -> nio Selector) 
> this is probably hard to make work cleanly since locking is currently only 
> performed at the KafkaConsumer level and we'd want it unlocked around a 
> single line of code in Selector.
> 2. Wake up the selector before synchronizing for certain operations. This 
> would require some additional coordination to make sure the caller of 
> wakeup() is able to acquire the lock promptly (instead of, e.g., the poll() 
> thread being woken up and then promptly reacquiring the lock with a 
> subsequent long poll() call).
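Option 2 above rests on a property of `java.nio.channels.Selector`: `wakeup()` may be called from any thread, without any lock, and forces a blocked `select()` to return immediately. A minimal sketch of just that mechanism (not the consumer's actual code):

```java
import java.io.IOException;
import java.nio.channels.Selector;

public class WakeupDemo {
    public static void main(String[] args) throws IOException, InterruptedException {
        final Selector selector = Selector.open();

        Thread poller = new Thread(() -> {
            try {
                // Blocks indefinitely (like a long consumer poll) until an
                // I/O event or a wakeup() arrives.
                selector.select();
                System.out.println("poll returned after wakeup");
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        });
        poller.start();

        Thread.sleep(200);  // let the poller enter select()
        selector.wakeup();  // e.g. from close() in a shutdown hook
        poller.join(5000);
        selector.close();
        System.out.println("poller alive: " + poller.isAlive()); // prints poller alive: false
    }
}
```

The remaining coordination problem the ticket mentions is fairness: after the wakeup, the waker still has to win the lock before the poll loop re-enters `select()` and blocks again.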





[jira] [Updated] (KAFKA-2189) Snappy compression of message batches less efficient in 0.8.2.1

2015-05-28 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-2189:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the patch. +1 and committed to trunk.

> Snappy compression of message batches less efficient in 0.8.2.1
> ---
>
> Key: KAFKA-2189
> URL: https://issues.apache.org/jira/browse/KAFKA-2189
> Project: Kafka
>  Issue Type: Bug
>  Components: build, compression, log
>Affects Versions: 0.8.2.1
>Reporter: Olson,Andrew
>Assignee: Ismael Juma
>Priority: Blocker
>  Labels: trivial
> Fix For: 0.8.3
>
> Attachments: KAFKA-2189.patch
>
>
> We are using snappy compression and noticed a fairly substantial increase 
> (about 2.25x) in log filesystem space consumption after upgrading a Kafka 
> cluster from 0.8.1.1 to 0.8.2.1. We found that this is caused by messages 
> being seemingly recompressed individually (or possibly with a much smaller 
> buffer or dictionary?) instead of as a batch as sent by producers. We 
> eventually tracked down the change in compression ratio/scope to this [1] 
> commit that updated the snappy version from 1.0.5 to 1.1.1.3. The Kafka 
> client version does not appear to be relevant as we can reproduce this with 
> both the 0.8.1.1 and 0.8.2.1 Producer.
> Here are the log files from our troubleshooting that contain the same set of 
> 1000 messages, for batch sizes of 1, 10, 100, and 1000. f9d9b was the last 
> commit with 0.8.1.1-like behavior prior to f5ab8 introducing the issue.
> {noformat}
> -rw-rw-r-- 1 kafka kafka 404967 May 12 11:45 
> /var/kafka2/f9d9b-batch-1-0/.log
> -rw-rw-r-- 1 kafka kafka 119951 May 12 11:45 
> /var/kafka2/f9d9b-batch-10-0/.log
> -rw-rw-r-- 1 kafka kafka  89645 May 12 11:45 
> /var/kafka2/f9d9b-batch-100-0/.log
> -rw-rw-r-- 1 kafka kafka  88279 May 12 11:45 
> /var/kafka2/f9d9b-batch-1000-0/.log
> -rw-rw-r-- 1 kafka kafka 402837 May 12 11:41 
> /var/kafka2/f5ab8-batch-1-0/.log
> -rw-rw-r-- 1 kafka kafka 382437 May 12 11:41 
> /var/kafka2/f5ab8-batch-10-0/.log
> -rw-rw-r-- 1 kafka kafka 364791 May 12 11:41 
> /var/kafka2/f5ab8-batch-100-0/.log
> -rw-rw-r-- 1 kafka kafka 380693 May 12 11:41 
> /var/kafka2/f5ab8-batch-1000-0/.log
> {noformat}
> [1] 
> https://github.com/apache/kafka/commit/f5ab8e1780cf80f267906e3259ad4f9278c32d28
>  





RE: [VOTE] KIP-21 Dynamic Configuration

2015-05-28 Thread Aditya Auradkar
bump


From: Aditya Auradkar
Sent: Tuesday, May 26, 2015 1:16 PM
To: dev@kafka.apache.org
Subject: RE: [VOTE] KIP-21 Dynamic Configuration

Hey everyone,

Completed the changes to KIP-4. After today's hangout, there doesn't appear to 
be anything remaining to discuss on this KIP.
Please vote so we can formally close this.

Thanks,
Aditya


From: Aditya Auradkar
Sent: Thursday, May 21, 2015 11:26 AM
To: dev@kafka.apache.org
Subject: RE: [VOTE] KIP-21 Dynamic Configuration

I think we should remove the config part in TopicMetadataResponse. It's 
probably cleaner if Alter and Describe are the only way to view and modify 
configs but I don't feel very strongly about it.

Re-summarizing the proposed changes to KIP-4:
- Change AlterTopic to not allow setting configs. Config changes will flow 
through AlterConfig. CreateTopic will still allow setting configs as it is nice 
to be able to specify configs while creating the topic.
> - TopicMetadataResponse shouldn't return config for the topic. DescribeConfig is 
the way to go.
- Change "InvalidTopicConfiguration" error code to "InvalidEntityConfig" as 
proposed in KIP-21.

Aditya


From: Jun Rao [j...@confluent.io]
Sent: Thursday, May 21, 2015 10:50 AM
To: dev@kafka.apache.org
Subject: Re: [VOTE] KIP-21 Dynamic Configuration

What about TopicMetadataResponse in KIP-4? Do we remove the config part in
it?

Thanks,

Jun

On Thu, May 21, 2015 at 10:25 AM, Aditya Auradkar <
aaurad...@linkedin.com.invalid> wrote:

> Hey Jun,
>
> I've added a section on error codes on the KIP-21 wiki.
>
> Here are the proposed changes to KIP-4. I'll update the wiki shortly.
> - Change AlterTopic to not allow setting configs. Config changes will flow
> through AlterConfig. CreateTopic will still allow setting configs as it is
> nice to be able to specify configs while creating the topic.
> - Change "InvalidTopicConfiguration" error code to "InvalidEntityConfig"
> as proposed in KIP-21.
>
>
> Thanks,
> Aditya
>
> 
> From: Jun Rao [j...@confluent.io]
> Sent: Thursday, May 21, 2015 8:41 AM
> To: dev@kafka.apache.org
> Subject: Re: [VOTE] KIP-21 Dynamic Configuration
>
> Aditya,
>
> For completeness, could you list the set of error codes in the wiki? Also,
> could you summarize the changes that are needed for the requests listed in
> KIP-4 and update the wiki accordingly?
>
> Thanks,
>
> Jun
>
> On Tue, May 19, 2015 at 10:33 PM, Aditya Auradkar <
> aaurad...@linkedin.com.invalid> wrote:
>
> > Thanks Andrii. I'll make the changes.
> >
> > I've also updated KIP-21 to include the new config requests. Take a look
> > and vote.
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration
> >
> > Aditya
> > 
> > From: Andrii Biletskyi [andrii.bilets...@stealth.ly]
> > Sent: Tuesday, May 19, 2015 2:26 PM
> > To: dev@kafka.apache.org
> > Subject: Re: [VOTE] KIP-21 Dynamic Configuration
> >
> > Hi,
> >
> > Sorry I wasn't able to participate. I don't have objections about
> removing
> > config changes from AlterTopic (as I understand both AddedConfig and
> > DeletedConfig) - you are welcome to update the KIP page.
> >
> > Thanks,
> > Andrii Biletskyi
> >
> > On Tue, May 19, 2015 at 11:40 PM, Aditya Auradkar <
> > aaurad...@linkedin.com.invalid> wrote:
> >
> > > Updating the discussion with the latest comments.
> > >
> > > 1. We discussed adding 2 new API's (AlterConfig and DescribeConfig).
> I'll
> > > update KIP-21 with details on these.
> > > 2. Discussed during the KIP hangout. We are in agreement.
> > >
> > > (1) has a dependency on KIP-4 being completed. Rest of the work in the
> > KIP
> > > can be implemented independently. Any concerns if we tackle it as two
> > > separate work items implementation wise?
> > >
> > > We also discussed changing the AlterTopic command in KIP-4 to not
> include
> > > config changes. Instead, all config changes will pass through the newly
> > > proposed AlterConfig. If no-one objects, I can make some changes to
> KIP-4
> > > to reflect this.
> > >
> > > Aditya
> > >
> > > 
> > > From: Jay Kreps [jay.kr...@gmail.com]
> > > Sent: Tuesday, May 19, 2015 10:51 AM
> > > To: dev@kafka.apache.org
> > > Subject: Re: [VOTE] KIP-21 Dynamic Configuration
> > >
> > > Hey Aditya,
> > >
> > > Two comments:
> > >
> > > 1. Yeah we need to reconcile this with the APIs in KIP-4. I think it
> does
> > > make sense to allow setting config during topic creation. I agree with
> > your
> > > summary that having alter topic and alter config may be confusing, but
> > > there are also some non-config changes such as replication factor and
> > > partition count that alter topic can carry out. What is the final state
> > you
> > > are proposing?
> > >
> > > 2. This is implementation related so probably can be removed from the
> KIP
> 

Re: [VOTE] KIP-21 Dynamic Configuration

2015-05-28 Thread Todd Palino
+1 (non binding)

> On May 28, 2015, at 11:41 AM, Aditya Auradkar 
>  wrote:
> 
> bump
> 
> 
> From: Aditya Auradkar
> Sent: Tuesday, May 26, 2015 1:16 PM
> To: dev@kafka.apache.org
> Subject: RE: [VOTE] KIP-21 Dynamic Configuration
> 
> Hey everyone,
> 
> Completed the changes to KIP-4. After today's hangout, there doesn't appear 
> to be anything remaining to discuss on this KIP.
> Please vote so we can formally close this.
> 
> Thanks,
> Aditya
> 
> 
> From: Aditya Auradkar
> Sent: Thursday, May 21, 2015 11:26 AM
> To: dev@kafka.apache.org
> Subject: RE: [VOTE] KIP-21 Dynamic Configuration
> 
> I think we should remove the config part in TopicMetadataResponse. It's 
> probably cleaner if Alter and Describe are the only way to view and modify 
> configs but I don't feel very strongly about it.
> 
> Re-summarizing the proposed changes to KIP-4:
> - Change AlterTopic to not allow setting configs. Config changes will flow 
> through AlterConfig. CreateTopic will still allow setting configs as it is 
> nice to be able to specify configs while creating the topic.
> - TopicMetadataResponse shouldn't return config for the topic. DescribeConfig 
> is the way to go.
> - Change "InvalidTopicConfiguration" error code to "InvalidEntityConfig" as 
> proposed in KIP-21.
> 
> Aditya
> 
> 
> From: Jun Rao [j...@confluent.io]
> Sent: Thursday, May 21, 2015 10:50 AM
> To: dev@kafka.apache.org
> Subject: Re: [VOTE] KIP-21 Dynamic Configuration
> 
> What about TopicMetadataResponse in KIP-4? Do we remove the config part in
> it?
> 
> Thanks,
> 
> Jun
> 
> On Thu, May 21, 2015 at 10:25 AM, Aditya Auradkar <
> aaurad...@linkedin.com.invalid> wrote:
> 
>> Hey Jun,
>> 
>> I've added a section on error codes on the KIP-21 wiki.
>> 
>> Here are the proposed changes to KIP-4. I'll update the wiki shortly.
>> - Change AlterTopic to not allow setting configs. Config changes will flow
>> through AlterConfig. CreateTopic will still allow setting configs as it is
>> nice to be able to specify configs while creating the topic.
>> - Change "InvalidTopicConfiguration" error code to "InvalidEntityConfig"
>> as proposed in KIP-21.
>> 
>> 
>> Thanks,
>> Aditya
>> 
>> 
>> From: Jun Rao [j...@confluent.io]
>> Sent: Thursday, May 21, 2015 8:41 AM
>> To: dev@kafka.apache.org
>> Subject: Re: [VOTE] KIP-21 Dynamic Configuration
>> 
>> Aditya,
>> 
>> For completeness, could you list the set of error codes in the wiki? Also,
>> could you summarize the changes that are needed for the requests listed in
>> KIP-4 and update the wiki accordingly?
>> 
>> Thanks,
>> 
>> Jun
>> 
>> On Tue, May 19, 2015 at 10:33 PM, Aditya Auradkar <
>> aaurad...@linkedin.com.invalid> wrote:
>> 
>>> Thanks Andrii. I'll make the changes.
>>> 
>>> I've also updated KIP-21 to include the new config requests. Take a look
>>> and vote.
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration
>>> 
>>> Aditya
>>> 
>>> From: Andrii Biletskyi [andrii.bilets...@stealth.ly]
>>> Sent: Tuesday, May 19, 2015 2:26 PM
>>> To: dev@kafka.apache.org
>>> Subject: Re: [VOTE] KIP-21 Dynamic Configuration
>>> 
>>> Hi,
>>> 
>>> Sorry I wasn't able to participate. I don't have objections about
>> removing
>>> config changes from AlterTopic (as I understand both AddedConfig and
>>> DeletedConfig) - you are welcome to update the KIP page.
>>> 
>>> Thanks,
>>> Andrii Biletskyi
>>> 
>>> On Tue, May 19, 2015 at 11:40 PM, Aditya Auradkar <
>>> aaurad...@linkedin.com.invalid> wrote:
>>> 
 Updating the discussion with the latest comments.
 
 1. We discussed adding 2 new API's (AlterConfig and DescribeConfig).
>> I'll
 update KIP-21 with details on these.
 2. Discussed during the KIP hangout. We are in agreement.
 
 (1) has a dependency on KIP-4 being completed. Rest of the work in the
>>> KIP
 can be implemented independently. Any concerns if we tackle it as two
 separate work items implementation wise?
 
 We also discussed changing the AlterTopic command in KIP-4 to not
>> include
 config changes. Instead, all config changes will pass through the newly
 proposed AlterConfig. If no-one objects, I can make some changes to
>> KIP-4
 to reflect this.
 
 Aditya
 
 
 From: Jay Kreps [jay.kr...@gmail.com]
 Sent: Tuesday, May 19, 2015 10:51 AM
 To: dev@kafka.apache.org
 Subject: Re: [VOTE] KIP-21 Dynamic Configuration
 
 Hey Aditya,
 
 Two comments:
 
 1. Yeah we need to reconcile this with the APIs in KIP-4. I think it
>> does
 make sense to allow setting config during topic creation. I agree with
>>> your
 summary that having alter topic and alter config may be confusing, but
 there are also some non-config changes such as replication factor and partition count that alter topic can carry out. What is the final state you are proposing?

Re: Review Request 34734: Patch for KAFKA-2226

2015-05-28 Thread Jun Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34734/#review85596
---


Thanks for the patch. A couple of questions below.

Also, in TimerTask.setTimerTaskEntry(), the comment suggests that the TaskEntry 
can change for a TimerTask. However, it seems that we set the entry for the 
task only during entry construction time. So, can the TaskEntry in the 
TimerTask ever change?


core/src/main/scala/kafka/utils/timer/TimerTaskList.scala


Could you explain a bit why this is needed? It seems that we can add the 
entry either when it's created for the first time or when it's removed from the 
current list and needs to be added to a new list during reinsert. In both 
cases, the list in the entry will be null and there is no need to remove the 
entry from the list.



core/src/main/scala/kafka/utils/timer/TimerTaskList.scala


So, I guess the race condition is the following. The expiration thread 
moves a TimerTaskEntry from one TimerTaskList to another during reinsert. At 
the same time, another thread can complete an operation and try to remove the 
entry from the list. Is that right?

With the patch, it seems when TimerTask.cancel tries to remove an 
from the list, the following can happen (1) line 133 tests that list in entry 
is not null, (2) a reinsert process happens and the entry is removed from list 
which sets list in the entry to null, (3) list.remove in 134 is called and the 
entry is not removed since list is now null, (4) line 133 is tested again and 
since list is now null, we quit the loop, (5) the reinsert process adds the 
entry to a new list. 

At this point, a completed entry still exists in the list.


- Jun Rao


On May 27, 2015, 9 p.m., Yasuhiro Matsuda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34734/
> ---
> 
> (Updated May 27, 2015, 9 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2226
> https://issues.apache.org/jira/browse/KAFKA-2226
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> fix a race condition in TimerTaskEntry.remove
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/utils/timer/TimerTaskList.scala 
> e7a96570ddc2367583d6d5590628db7e7f6ba39b 
> 
> Diff: https://reviews.apache.org/r/34734/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Yasuhiro Matsuda
> 
>



Re: Review Request 34734: Patch for KAFKA-2226

2015-05-28 Thread Yasuhiro Matsuda


> On May 28, 2015, 8:42 a.m., Onur Karaman wrote:
> > It seems like you're concerned about adding/removing a TimerTaskEntry that 
> > already exists in another TimerTaskList. Can you explain how that can 
> > happen? My understanding of the timing wheel stuff is only so-so.

For adding, a TimerTaskEntry should not exist in any list. If it does, removing 
it from the existing list keeps the structure consistent. That is why I added 
the remove call in the add method.
For removing, there is a race condition. When a bucket expires, an entry in the 
bucket is either expired or moved to a finer-grained wheel; TimerTaskEntry.remove 
is called then. The race condition happens if TimerTask.cancel is called 
at the same time. The remove operation is synchronized on a TimerTaskList 
instance, so in the synchronized block we have to double-check that the 
entry still belongs to the list. If the method doesn't remove the entry from the 
list due to the race condition, it will retry.
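The double-check-and-retry described above can be sketched in simplified form. The classes below are illustrative stand-ins, not Kafka's actual TimerTaskList/TimerTaskEntry (which also synchronize on the entry and manage wheel buckets); lock ordering between lists is ignored for brevity:

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch: removal synchronizes on the list the entry currently
// belongs to and double-checks membership inside the lock, retrying if a
// concurrent reinsert moved the entry to another list in the meantime.
class TaskList {
    private final Set<TaskEntry> entries = new HashSet<>();

    synchronized boolean remove(TaskEntry e) {
        // Double-check under the lock: only remove if the entry still
        // belongs to this list.
        if (e.list.get() == this && entries.remove(e)) {
            e.list.set(null);
            return true;
        }
        return false;
    }

    synchronized void add(TaskEntry e) {
        // If the entry still sits on another list, take it off first to keep
        // the structure consistent (mirrors the remove call added to add()).
        TaskList current = e.list.get();
        if (current != null && current != this) current.remove(e);
        entries.add(e);
        e.list.set(this);
    }

    synchronized int size() { return entries.size(); }
}

class TaskEntry {
    final AtomicReference<TaskList> list = new AtomicReference<>();

    void removeFromAnyList() {
        // Read the field into a local variable so a concurrent move cannot
        // null it between the check and the remove call; retry until no list
        // claims the entry.
        TaskList current = list.get();
        while (current != null) {
            current.remove(this); // no-op if the entry already moved on
            current = list.get();
        }
    }
}
```

The retry loop terminates once no list claims the entry, which is the invariant the cancel path needs.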


- Yasuhiro


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34734/#review85530
---





Re: Review Request 34734: Patch for KAFKA-2226

2015-05-28 Thread Yasuhiro Matsuda


> On May 28, 2015, 7:10 p.m., Jun Rao wrote:
> > Thanks for the patch. A couple of questions below.
> > 
> > Also, in TimerTask.setTimerTaskEntry(), the comment suggests that the 
> > TaskEntry can change for a TimerTask. However, it seems that we set the 
> > entry for the task only during entry construction time. So, can the 
> > TaskEntry in the TimerTask ever change?

It can happen if a TimerTask already in a timer is submitted again or submitted 
to another timer. We never do such a thing, but the code handles such uses just 
in case.


> On May 28, 2015, 7:10 p.m., Jun Rao wrote:
> > core/src/main/scala/kafka/utils/timer/TimerTaskList.scala, lines 132-135
> > 
> >
> > So, I guess the race condition is the following. The expiration thread 
> > moves a TimerTaskEntry from one TimerTaskList to another during reinsert. 
> > At the same time, another thread can complete an operation and try to 
> > remove the entry from the list. Is that right?
> > 
> > With the patch, it seems when TimerTask.cancel tries to remove an 
> > entry from the list, the following can happen (1) line 133 tests that list 
> > in entry is not null, (2) a reinsert process happens and the entry is 
> > removed from list which sets list in the entry to null, (3) list.remove in 
> > 134 is called and the entry is not removed since list is now null, (4) line 
> > 133 is tested again and since list is now null, we quit the loop, (5) the 
> > reinsert process adds the entry to a new list. 
> > 
> > At this point, a completed entry still exists in the list.

You are right. It should be rare, but a completed entry can remain in the list 
until expiration. The completion flag in DelayedOperation prevents excess 
executions. So, it is not too bad.


- Yasuhiro


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34734/#review85596
---





Re: Review Request 34734: Patch for KAFKA-2226

2015-05-28 Thread Jun Rao


> On May 28, 2015, 7:10 p.m., Jun Rao wrote:
> > core/src/main/scala/kafka/utils/timer/TimerTaskList.scala, lines 132-135
> > 
> >
> > So, I guess the race condition is the following. The expiration thread 
> > moves a TimerTaskEntry from one TimerTaskList to another during reinsert. 
> > At the same time, another thread can complete an operation and try to 
> > remove the entry from the list. Is that right?
> > 
> > With the patch, it seems when TimerTask.cancel tries to remove an 
> > entry from the list, the following can happen (1) line 133 tests that list 
> > in entry is not null, (2) a reinsert process happens and the entry is 
> > removed from list which sets list in the entry to null, (3) list.remove in 
> > 134 is called and the entry is not removed since list is now null, (4) line 
> > 133 is tested again and since list is now null, we quit the loop, (5) the 
> > reinsert process adds the entry to a new list. 
> > 
> > At this point, a completed entry still exists in the list.
> 
> Yasuhiro Matsuda wrote:
> You are right. It should be rare, but a completed entry can remain in the 
> list until expiration. The completion flag in DelayedOperation prevents 
> excess executions. So, it is not too bad.

Thanks for the clarification. Thinking about this a bit more, could we hit a 
NullPointerException in step (3)? At that point, list could be null when we 
call remove.


- Jun


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34734/#review85596
---





Re: Review Request 32869: Patch for KAFKA-2091

2015-05-28 Thread Joel Koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32869/#review85608
---

Ship it!


Ship It!

- Joel Koshy


On May 27, 2015, 10:50 p.m., Sriharsha Chintalapani wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32869/
> ---
> 
> (Updated May 27, 2015, 10:50 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2091
> https://issues.apache.org/jira/browse/KAFKA-2091
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-2091. Expose a Partitioner interface in the new producer.
> 
> 
> Diffs
> -
> 
>   clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 
> 8e336a3aa96c73f52beaeb56b931baf4b026cf21 
>   clients/src/main/java/org/apache/kafka/clients/producer/MockProducer.java 
> 3c34610dbc8a68e4561e7264e0b545de3d93cef2 
>   clients/src/main/java/org/apache/kafka/clients/producer/Partitioner.java 
> PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java 
> 187d0004c8c46b6664ddaffecc6166d4b47351e5 
>   
> clients/src/main/java/org/apache/kafka/clients/producer/internals/Partitioner.java
>  93e799105fb6cc5c49a129c0db099a3a973b2ab3 
>   
> clients/src/test/java/org/apache/kafka/clients/producer/internals/PartitionerTest.java
>  5dadd0e3554577ad6be28a18ff5ab08f8b31050f 
> 
> Diff: https://reviews.apache.org/r/32869/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sriharsha Chintalapani
> 
>



Re: Review Request 34734: Patch for KAFKA-2226

2015-05-28 Thread Onur Karaman


> On May 28, 2015, 7:10 p.m., Jun Rao wrote:
> > core/src/main/scala/kafka/utils/timer/TimerTaskList.scala, lines 132-135
> > 
> >
> > So, I guess the race condition is the following. The expiration thread 
> > moves a TimerTaskEntry from one TimerTaskList to another during reinsert. 
> > At the same time, another thread can complete an operation and try to 
> > remove the entry from the list. Is that right?
> > 
> > With the patch, it seems when TimerTask.cancel tries to remove an 
> > entry from the list, the following can happen (1) line 133 tests that list 
> > in entry is not null, (2) a reinsert process happens and the entry is 
> > removed from list which sets list in the entry to null, (3) list.remove in 
> > 134 is called and the entry is not removed since list is now null, (4) line 
> > 133 is tested again and since list is now null, we quit the loop, (5) the 
> > reinsert process adds the entry to a new list. 
> > 
> > At this point, a completed entry still exists in the list.
> 
> Yasuhiro Matsuda wrote:
> You are right. It should be rare, but a completed entry can remain in the 
> list until expiration. The completion flag in DelayedOperation prevents 
> excess executions. So, it is not too bad.
> 
> Jun Rao wrote:
> Thanks for the clarification. Thinking about this a bit more, could we 
> hit a NullPointerException in step (3)? At that point, list could be null 
> when we call remove.

Yeah that check-then-act should probably be done atomically. Maybe all 
changes/check-then-acts to TimerTaskEntry just need to be guarded by the 
TimerTaskEntry's intrinsic lock?


- Onur


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34734/#review85596
---





[jira] [Commented] (KAFKA-2224) kafka-consumer-offset-checker when kafka offset storage is used

2015-05-28 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563652#comment-14563652
 ] 

Jiangjie Qin commented on KAFKA-2224:
-

Seems to be duplicate of KAFKA-1951.

> kafka-consumer-offset-checker when kafka offset storage is used
> ---
>
> Key: KAFKA-2224
> URL: https://issues.apache.org/jira/browse/KAFKA-2224
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 0.8.2.0
>Reporter: Sadek
>
> Hi,
> It looks like kafka-consumer-offset-checker will only work when offset data 
> is stored in zookeeper.
> Thanks,
> Sadek



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-2224) kafka-consumer-offset-checker when kafka offset storage is used

2015-05-28 Thread Jiangjie Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangjie Qin resolved KAFKA-2224.
-
Resolution: Duplicate

Duplicate of KAFKA-1951

> kafka-consumer-offset-checker when kafka offset storage is used
> ---
>
> Key: KAFKA-2224
> URL: https://issues.apache.org/jira/browse/KAFKA-2224
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 0.8.2.0
>Reporter: Sadek
>
> Hi,
> It looks like kafka-consumer-offset-checker will only work when offset data 
> is stored in zookeeper.
> Thanks,
> Sadek





[jira] [Updated] (KAFKA-2091) Expose a Partitioner interface in the new producer

2015-05-28 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy updated KAFKA-2091:
--

Thanks for the patch - committed to trunk.

> Expose a Partitioner interface in the new producer
> --
>
> Key: KAFKA-2091
> URL: https://issues.apache.org/jira/browse/KAFKA-2091
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jay Kreps
>Assignee: Sriharsha Chintalapani
> Attachments: KAFKA-2091.patch, KAFKA-2091_2015-05-27_15:50:18.patch
>
>
> In the new producer you can pass in a key or hard code the partition as part 
> of ProducerRecord.
> Internally we are using a class
> {code}
> class Partitioner {
>     public int partition(String topic, byte[] key, Integer partition, Cluster cluster) {...}
> }
> {code}
> This class uses the specified partition if there is one; uses a hash of the 
> key if there isn't a partition but there is a key; and simply chooses a 
> partition round robin if there is neither a partition nor a key.
> However there are several partitioning strategies that could be useful that 
> we don't support out of the box. 
> An example would be having each producer periodically choose a random 
> partition. This tends to be the most efficient since all data goes to one 
> server and uses the fewest TCP connections; however, it only produces good 
> load balancing if there are many producers.
> Of course a user can do this now by just setting the partition manually, but 
> that is a bit inconvenient if you need to do that across a bunch of apps 
> since each will need to remember to set the partition every time.
> The idea would be to expose a configuration to set the partitioner 
> implementation like
> {code}
> partitioner.class=org.apache.kafka.producer.DefaultPartitioner
> {code}
> This would default to the existing partitioner implementation.
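The partitioning logic the ticket describes (explicit partition wins, then key hash, then round robin) can be sketched standalone. The class name and signature below are simplified stand-ins — the real producer interface also receives topic and Cluster arguments:

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch of the default partitioning behavior: use the explicit
// partition if given, hash the key if present, otherwise round-robin.
class SketchPartitioner {
    private final AtomicInteger counter = new AtomicInteger();

    int partition(byte[] key, Integer partition, int numPartitions) {
        if (partition != null)
            return partition;                          // caller pinned a partition
        if (key != null)                               // keyed: stable hash
            return (Arrays.hashCode(key) & 0x7fffffff) % numPartitions;
        // unkeyed: rotate through partitions (mask keeps the index non-negative
        // after counter overflow)
        return (counter.getAndIncrement() & 0x7fffffff) % numPartitions;
    }
}
```

A pluggable `partitioner.class` would let users swap this class for, e.g., the "sticky random partition" strategy mentioned in the ticket.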





[jira] [Updated] (KAFKA-2186) Follow-up patch of KAFKA-1650

2015-05-28 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy updated KAFKA-2186:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the patch - committed to trunk

> Follow-up patch of KAFKA-1650
> -
>
> Key: KAFKA-2186
> URL: https://issues.apache.org/jira/browse/KAFKA-2186
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Attachments: KAFKA-2186.patch, KAFKA-2186.patch
>
>
> Offsets commit with a map was added in KAFKA-1650. It should be added to 
> consumer connector java API also.





Re: Review Request 34734: Patch for KAFKA-2226

2015-05-28 Thread Yasuhiro Matsuda


> On May 28, 2015, 7:10 p.m., Jun Rao wrote:
> > core/src/main/scala/kafka/utils/timer/TimerTaskList.scala, lines 132-135
> > 
> >
> > So, I guess the race condition is the following. The expiration thread 
> > moves a TimerTaskEntry from one TimerTaskList to another during reinsert. 
> > At the same time, another thread can complete an operation and try to 
> > remove the entry from the list. Is that right?
> > 
> > With the patch, it seems when TimerTask.cancel tries to remove an 
> > entry from the list, the following can happen (1) line 133 tests that list 
> > in entry is not null, (2) a reinsert process happens and the entry is 
> > removed from list which sets list in the entry to null, (3) list.remove in 
> > 134 is called and the entry is not removed since list is now null, (4) line 
> > 133 is tested again and since list is now null, we quit the loop, (5) the 
> > reinsert process adds the entry to a new list. 
> > 
> > At this point, a completed entry still exists in the list.
> 
> Yasuhiro Matsuda wrote:
> You are right. It should be rare, but a completed entry can remain in the 
> list until expiration. The completion flag in DelayedOperation prevents 
> excess executions. So, it is not too bad.
> 
> Jun Rao wrote:
> Thanks for the clarification. Thinking about this a bit more, could we 
> hit a NullPointerException in step (3)? At that point, list could be null 
> when we call remove.
> 
> Onur Karaman wrote:
> Yeah that check-then-act should probably be done atomically. Maybe all 
> changes/check-then-acts to TimerTaskEntry just need to be guarded by the 
> TimerTaskEntry's intrinsic lock?

Oops. The value of TimerTaskEntry.list should be saved to a local variable.
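The fix described here — copying the field into a local before the null check — can be illustrated with a toy class (names are made up for the sketch, not Kafka's):

```java
// Tiny illustration of the check-then-act hazard on a field that another
// thread may null, and the local-variable fix.
class Holder {
    volatile Object list = "x";

    // Hazard: 'list' is read twice; between the null check and the method
    // call another thread may set it to null -> NullPointerException.
    int unsafeHash() {
        return list != null ? list.hashCode() : -1;
    }

    // Fix: read once into a local; the local cannot change underneath us.
    int safeHash() {
        Object current = list;
        return current != null ? current.hashCode() : -1;
    }
}
```

Single-threaded the two methods behave identically; only under a concurrent writer does the double read of the volatile field become an NPE risk.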


- Yasuhiro


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34734/#review85596
---





[jira] [Commented] (KAFKA-2225) Allow fetching from ISR

2015-05-28 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563732#comment-14563732
 ] 

Jun Rao commented on KAFKA-2225:


[~iconara], just to clarify: you are using the private IP to communicate with 
the brokers?

> Allow fetching from ISR
> ---
>
> Key: KAFKA-2225
> URL: https://issues.apache.org/jira/browse/KAFKA-2225
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Theo Hultberg
>Assignee: Parth Brahmbhatt
>
> Currently clients are not allowed to fetch from replicas, even if they are in 
> sync with the master. If I'm not missing anything significant, there shouldn't 
> be any difference between fetching from the leader or an ISR, besides maybe 
> some extra latency.
> For our use case it would be very beneficial to be able to fetch from 
> replicas instead of just the leader. We run Kafka clusters that replicate 
> across EC2 availability zones, and each byte sent between zones costs money. 
> This bandwidth usage costs us about the same as it costs to run the instances. 
> If we could fetch from a replica in the same zone as the client we could 
> avoid some of this cost.





Re: Review Request 34734: Patch for KAFKA-2226

2015-05-28 Thread Jun Rao


> On May 28, 2015, 7:10 p.m., Jun Rao wrote:
> > core/src/main/scala/kafka/utils/timer/TimerTaskList.scala, lines 132-135
> > 
> >
> > So, I guess the race condition is the following. The expiration thread 
> > moves a TimerTaskEntry from one TimerTaskList to another during reinsert. 
> > At the same time, another thread can complete an operation and try to 
> > remove the entry from the list. Is that right?
> > 
> > With the patch, it seems when TimerTask.cancel tries to remove an 
> > entry from the list, the following can happen (1) line 133 tests that list 
> > in entry is not null, (2) a reinsert process happens and the entry is 
> > removed from list which sets list in the entry to null, (3) list.remove in 
> > 134 is called and the entry is not removed since list is now null, (4) line 
> > 133 is tested again and since list is now null, we quit the loop, (5) the 
> > reinsert process adds the entry to a new list. 
> > 
> > At this point, a completed entry still exists in the list.
> 
> Yasuhiro Matsuda wrote:
> You are right. It should be rare, but a completed entry can remain in the 
> list until expiration. The completion flag in DelayedOperation prevents 
> excess executions. So, it is not too bad.
> 
> Jun Rao wrote:
> Thanks for the clarification. Thinking about this a bit more, could we 
> hit a NullPointerException in step (3)? At that point, list could be null 
> when we call remove.
> 
> Onur Karaman wrote:
> Yeah that check-then-act should probably be done atomically. Maybe all 
> changes/check-then-acts to TimerTaskEntry just need to be guarded by the 
> TimerTaskEntry's intrinsic lock?
> 
> Yasuhiro Matsuda wrote:
> Oops. The value of TimerTaskEntry.list should be saved to a local variable.

Ok, thanks. Perhaps we can also add some comments explaining why we need the 
while loop and the rare possibility of not removing a completed operation from 
the Timer.


- Jun


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34734/#review85596
---





[jira] [Commented] (KAFKA-2168) New consumer poll() can block other calls like position(), commit(), and close() indefinitely

2015-05-28 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563753#comment-14563753
 ] 

Guozhang Wang commented on KAFKA-2168:
--

I would also prefer to stick with single-threaded consumer usage, and I agree 
that KAFKA-2123 would be important to have then.

> New consumer poll() can block other calls like position(), commit(), and 
> close() indefinitely
> -
>
> Key: KAFKA-2168
> URL: https://issues.apache.org/jira/browse/KAFKA-2168
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Ewen Cheslack-Postava
>Assignee: Jason Gustafson
>
> The new consumer is currently using very coarse-grained synchronization. For 
> most methods this isn't a problem since they finish quickly once the lock is 
> acquired, but poll() might run for a long time (and commonly will since 
> polling with long timeouts is a normal use case). This means any operations 
> invoked from another thread may block until the poll() call completes.
> Some example use cases where this can be a problem:
> * A shutdown hook is registered to trigger shutdown and invokes close(). It 
> gets invoked from another thread and blocks indefinitely.
> * Users want to manage offset commits themselves in a background thread. If 
> the commit policy is not purely time based, it's not currently possible to 
> make sure the call to commit() will be processed promptly.
> Two possible solutions to this:
> 1. Make sure a lock is not held during the actual select call. Since we have 
> multiple layers (KafkaConsumer -> NetworkClient -> Selector -> nio Selector) 
> this is probably hard to make work cleanly since locking is currently only 
> performed at the KafkaConsumer level and we'd want it unlocked around a 
> single line of code in Selector.
> 2. Wake up the selector before synchronizing for certain operations. This 
> would require some additional coordination to make sure the caller of 
> wakeup() is able to acquire the lock promptly (instead of, e.g., the poll() 
> thread being woken up and then promptly reacquiring the lock with a 
> subsequent long poll() call).
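Option 2 relies on java.nio's Selector.wakeup(), which makes a blocked select() return immediately. A minimal, self-contained sketch of that mechanism (the class name and flow are illustrative, not from any Kafka patch):

```java
import java.io.IOException;
import java.nio.channels.Selector;

// A poller thread blocked in a long select() is unblocked by another thread
// calling Selector.wakeup() before that thread tries to acquire the lock.
public class WakeupDemo {
    public static boolean run() throws IOException, InterruptedException {
        Selector selector = Selector.open();
        Thread poller = new Thread(() -> {
            try {
                selector.select();   // blocks: no channels registered, no timeout
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        });
        poller.start();
        Thread.sleep(200);           // give the poller time to block in select()
        selector.wakeup();           // the close()/commit() caller unblocks it
        poller.join(5000);           // the poller should now exit promptly
        boolean exited = !poller.isAlive();
        selector.close();
        return exited;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("poller exited promptly: " + run());
    }
}
```

The remaining coordination problem described above is making sure the waking thread gets the lock before the poll() thread re-enters a long select().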



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1367) Broker topic metadata not kept in sync with ZooKeeper

2015-05-28 Thread Ashish K Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563795#comment-14563795
 ] 

Ashish K Singh commented on KAFKA-1367:
---

[~jjkoshy] pinging you for your confirmation on this.

> Broker topic metadata not kept in sync with ZooKeeper
> -
>
> Key: KAFKA-1367
> URL: https://issues.apache.org/jira/browse/KAFKA-1367
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.0, 0.8.1
>Reporter: Ryan Berdeen
>Assignee: Ashish K Singh
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: KAFKA-1367.txt
>
>
> When a broker is restarted, the topic metadata responses from the brokers 
> will be incorrect (different from ZooKeeper) until a preferred replica leader 
> election.
> In the metadata, it looks like leaders are correctly removed from the ISR 
> when a broker disappears, but followers are not. Then, when a broker 
> reappears, the ISR is never updated.
> I used a variation of the Vagrant setup created by Joe Stein to reproduce 
> this with latest from the 0.8.1 branch: 
> https://github.com/also/kafka/commit/dba36a503a5e22ea039df0f9852560b4fb1e067c





Re: Review Request 34734: Patch for KAFKA-2226

2015-05-28 Thread Guozhang Wang


> On May 28, 2015, 7:10 p.m., Jun Rao wrote:
> > core/src/main/scala/kafka/utils/timer/TimerTaskList.scala, lines 132-135
> > 
> >
> > So, I guess the race condition is the following. The expiration thread 
> > moves a TimerTaskEntry from one TimerTaskList to another during reinsert. 
> > At the same time, another thread can complete an operation and try to 
> > remove the entry from the list. Is that right?
> > 
> > With the patch, it seems when TimerTask.cancel tries to remove an 
> > entry from the list, the following can happen (1) line 133 tests that list 
> > in entry is not null, (2) a reinsert process happens and the entry is 
> > removed from list which sets list in the entry to null, (3) list.remove in 
> > 134 is called and the entry is not removed since list is now null, (4) line 
> > 133 is tested again and since list is now null, we quit the loop, (5) the 
> > reinsert process adds the entry to a new list. 
> > 
> > At this point, a completed entry still exists in the list.
> 
> Yasuhiro Matsuda wrote:
> You are right. It should be rare, but a completed entry can remain in the 
> list until expiration. The completion flag in DelayedOperation prevents 
> excess executions. So, it is not too bad.
> 
> Jun Rao wrote:
> Thanks for the clarification. Thinking about this a bit more, could we 
> hit a NullPointerException in step (3)? At that point, list could be null 
> when we call remove.
> 
> Onur Karaman wrote:
> Yeah that check-then-act should probably be done atomically. Maybe all 
> changes/check-then-acts to TimerTaskEntry just need to be guarded by the 
> TimerTaskEntry's intrinsic lock?
> 
> Yasuhiro Matsuda wrote:
> Oops. The value of TimerTaskEntry.list should be saved to a local variable.
> 
> Jun Rao wrote:
> Ok, thanks. Perhaps we can also add some comments explaining why we need 
> the while loop and the rare possibility of not removing a completed operation 
> from the Timer.

Could it happen that concurrent threads are calling TimerTaskList.add(entry) 
and TimerTaskList.remove(entry) on different lists for the same entry? Since we 
do not synchronize on the entry object, this could cause a race condition in 
the next/prev/list reference manipulation.
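The fix discussed above (save the volatile list field to a local before calling remove, and loop) can be paraphrased as follows. This is a hedged Java sketch of the Scala code under review; the field and class names follow the Kafka source, but the bodies are simplified stand-ins:

```java
// Minimal stand-in for the doubly-linked list: just detach the entry.
class TimerTaskList {
    synchronized void remove(TimerTaskEntry entry) {
        if (entry.list == this) entry.list = null;
    }
}

class TimerTaskEntry {
    volatile TimerTaskList list;  // null when the entry is in no list

    void remove() {
        TimerTaskList currentList = list;  // local copy avoids the NPE in step (3)
        while (currentList != null) {
            currentList.remove(this);      // no-op if a reinsert moved us elsewhere
            currentList = list;            // re-check: a reinsert may have re-added us
        }
    }
}
```

The loop handles the rare case where a concurrent reinsert moves the entry to a new list between the remove call and the re-check; as noted above, a completed entry can still briefly remain in a list until expiration.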


- Guozhang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34734/#review85596
---


On May 27, 2015, 9 p.m., Yasuhiro Matsuda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34734/
> ---
> 
> (Updated May 27, 2015, 9 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2226
> https://issues.apache.org/jira/browse/KAFKA-2226
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> fix a race condition in TimerTaskEntry.remove
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/utils/timer/TimerTaskList.scala 
> e7a96570ddc2367583d6d5590628db7e7f6ba39b 
> 
> Diff: https://reviews.apache.org/r/34734/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Yasuhiro Matsuda
> 
>



Review Request 34789: Patch for KAFKA-2168

2015-05-28 Thread Jason Gustafson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34789/
---

Review request for kafka.


Bugs: KAFKA-2168
https://issues.apache.org/jira/browse/KAFKA-2168


Repository: kafka


Description
---

KAFKA-2168; Remove synchronization of KafkaConsumer to enable non-blocking close


Diffs
-

  
clients/src/main/java/org/apache/kafka/clients/consumer/ClosedConsumerException.java
 PRE-CREATION 
  clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
d301be4709f7b112e1f3a39f3c04cfa65f00fa60 
  
clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
 b2764df11afa7a99fce46d1ff48960d889032d14 
  
clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java 
ef9dd5238fbc771496029866ece1d85db6d7b7a5 

Diff: https://reviews.apache.org/r/34789/diff/


Testing
---


Thanks,

Jason Gustafson



[jira] [Updated] (KAFKA-2168) New consumer poll() can block other calls like position(), commit(), and close() indefinitely

2015-05-28 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson updated KAFKA-2168:
---
Attachment: KAFKA-2168.patch

> New consumer poll() can block other calls like position(), commit(), and 
> close() indefinitely
> -
>
> Key: KAFKA-2168
> URL: https://issues.apache.org/jira/browse/KAFKA-2168
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Ewen Cheslack-Postava
>Assignee: Jason Gustafson
> Attachments: KAFKA-2168.patch
>
>





[jira] [Commented] (KAFKA-2168) New consumer poll() can block other calls like position(), commit(), and close() indefinitely

2015-05-28 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563832#comment-14563832
 ] 

Jason Gustafson commented on KAFKA-2168:


Created reviewboard https://reviews.apache.org/r/34789/diff/
 against branch upstream/trunk

> New consumer poll() can block other calls like position(), commit(), and 
> close() indefinitely
> -
>
> Key: KAFKA-2168
> URL: https://issues.apache.org/jira/browse/KAFKA-2168
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Ewen Cheslack-Postava
>Assignee: Jason Gustafson
> Attachments: KAFKA-2168.patch
>
>





[jira] [Updated] (KAFKA-2168) New consumer poll() can block other calls like position(), commit(), and close() indefinitely

2015-05-28 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson updated KAFKA-2168:
---
Status: Patch Available  (was: Open)

> New consumer poll() can block other calls like position(), commit(), and 
> close() indefinitely
> -
>
> Key: KAFKA-2168
> URL: https://issues.apache.org/jira/browse/KAFKA-2168
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Ewen Cheslack-Postava
>Assignee: Jason Gustafson
> Attachments: KAFKA-2168.patch
>
>





[jira] [Commented] (KAFKA-2168) New consumer poll() can block other calls like position(), commit(), and close() indefinitely

2015-05-28 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563843#comment-14563843
 ] 

Jason Gustafson commented on KAFKA-2168:


I've added a patch which removes synchronization and allows a prompt close 
(using the wakeup call on the underlying selector). It does not expose the 
wakeup call in the consumer interface, however, since that seems to be a bit 
trickier. I think we may want to move that to a separate ticket.

> New consumer poll() can block other calls like position(), commit(), and 
> close() indefinitely
> -
>
> Key: KAFKA-2168
> URL: https://issues.apache.org/jira/browse/KAFKA-2168
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Ewen Cheslack-Postava
>Assignee: Jason Gustafson
> Attachments: KAFKA-2168.patch
>
>





[jira] [Commented] (KAFKA-1367) Broker topic metadata not kept in sync with ZooKeeper

2015-05-28 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563858#comment-14563858
 ] 

Joel Koshy commented on KAFKA-1367:
---

Hi [~singhashish] - sorry I missed your pings. Yes that is the approach we are 
planning to take, i.e., remove the ISR from the TMR. As mentioned in KIP-4, the 
ISR will be removed in v1 of the TMR.

> Broker topic metadata not kept in sync with ZooKeeper
> -
>
> Key: KAFKA-1367
> URL: https://issues.apache.org/jira/browse/KAFKA-1367
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.0, 0.8.1
>Reporter: Ryan Berdeen
>Assignee: Ashish K Singh
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: KAFKA-1367.txt
>
>





Re: [DISCUSSION] Can we move the ack timeout in ProducerRequest to broker?

2015-05-28 Thread Jun Rao
Hi, Jiangjie,

The replication time may vary a bit for different partitions. For example,
a partition with more replicas may take a bit more time to propagate the
messages. Also, the replication time depends on network latency. If you
have a Kafka cluster across availability zones, data will be replicated to
nodes within the same zone a bit faster than those outside of the zone. So,
I am not sure if it's better to just reason about the replication time as a
single timeout on the broker side. In theory, different producers may want
to pick different replication time depending on the topics being sent.

Thanks,

Jun

On Tue, May 26, 2015 at 4:46 PM, Jiangjie Qin 
wrote:

> Hi,
>
> I am updating the wiki for KIP-19 and wondering why we have a replication
> timeout on producer side and in producer request?
>
> From what I understand this is a server side setting and the reasons we
> need this replication timeout is because we want to control the purgatory
> size. If that is the case should we just have the replication timeout as a
> broker configuration?
> The downside of having it on server side might be that producer could have
> a request timeout/socket timeout smaller than replication timeout. In this
> case we can put request timeout in producer request and if the request
> timeout is smaller than replication timeout on server side, we return a
> misconfiguration exception.
>
> So we can have a producer request V1 which removes ack timeout but adds
> request timeout. This will give user a cleaner timeout configurations on
> producer side as well.
>
> What do people think about this?
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
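The broker-side check Becket proposes above (reject a produce request whose client request timeout is shorter than the replication timeout) could be sketched like this. The method and message are illustrative, not actual Kafka API:

```java
// Validate a produce request's timeout against the broker-configured
// replication timeout; a mismatch is a client misconfiguration.
class TimeoutValidator {
    static void validate(int requestTimeoutMs, int replicationTimeoutMs) {
        if (requestTimeoutMs < replicationTimeoutMs)
            throw new IllegalArgumentException(
                "request timeout (" + requestTimeoutMs + " ms) must be >= " +
                "broker replication timeout (" + replicationTimeoutMs + " ms)");
    }
}
```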


Re: [VOTE] KIP-21 Dynamic Configuration

2015-05-28 Thread Jun Rao
+1.

Thanks,

Jun

On Thu, May 28, 2015 at 11:41 AM, Aditya Auradkar <
aaurad...@linkedin.com.invalid> wrote:

> bump
>
> 
> From: Aditya Auradkar
> Sent: Tuesday, May 26, 2015 1:16 PM
> To: dev@kafka.apache.org
> Subject: RE: [VOTE] KIP-21 Dynamic Configuration
>
> Hey everyone,
>
> Completed the changes to KIP-4. After today's hangout, there doesn't
> appear to be anything remaining to discuss on this KIP.
> Please vote so we can formally close this.
>
> Thanks,
> Aditya
>
> 
> From: Aditya Auradkar
> Sent: Thursday, May 21, 2015 11:26 AM
> To: dev@kafka.apache.org
> Subject: RE: [VOTE] KIP-21 Dynamic Configuration
>
> I think we should remove the config part in TopicMetadataResponse. It's
> probably cleaner if Alter and Describe are the only way to view and modify
> configs but I don't feel very strongly about it.
>
> Re-summarizing the proposed changes to KIP-4:
> - Change AlterTopic to not allow setting configs. Config changes will flow
> through AlterConfig. CreateTopic will still allow setting configs as it is
> nice to be able to specify configs while creating the topic.
> - TopicMetadataResponse shouldn't return config for the topic. 
> DescribeConfig is the way to go.
> - Change "InvalidTopicConfiguration" error code to "InvalidEntityConfig"
> as proposed in KIP-21.
>
> Aditya
>
> 
> From: Jun Rao [j...@confluent.io]
> Sent: Thursday, May 21, 2015 10:50 AM
> To: dev@kafka.apache.org
> Subject: Re: [VOTE] KIP-21 Dynamic Configuration
>
> What about TopicMetadataResponse in KIP-4? Do we remove the config part in
> it?
>
> Thanks,
>
> Jun
>
> On Thu, May 21, 2015 at 10:25 AM, Aditya Auradkar <
> aaurad...@linkedin.com.invalid> wrote:
>
> > Hey Jun,
> >
> > I've added a section on error codes on the KIP-21 wiki.
> >
> > Here are the proposed changes to KIP-4. I'll update the wiki shortly.
> > - Change AlterTopic to not allow setting configs. Config changes will
> flow
> > through AlterConfig. CreateTopic will still allow setting configs as it
> is
> > nice to be able to specify configs while creating the topic.
> > - Change "InvalidTopicConfiguration" error code to "InvalidEntityConfig"
> > as proposed in KIP-21.
> >
> >
> > Thanks,
> > Aditya
> >
> > 
> > From: Jun Rao [j...@confluent.io]
> > Sent: Thursday, May 21, 2015 8:41 AM
> > To: dev@kafka.apache.org
> > Subject: Re: [VOTE] KIP-21 Dynamic Configuration
> >
> > Aditya,
> >
> > For completeness, could you list the set of error codes in the wiki?
> Also,
> > could you summarize the changes that are needed for the requests listed
> in
> > KIP-4 and update the wiki accordingly?
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, May 19, 2015 at 10:33 PM, Aditya Auradkar <
> > aaurad...@linkedin.com.invalid> wrote:
> >
> > > Thanks Andrii. I'll make the changes.
> > >
> > > I've also updated KIP-21 to include the new config requests. Take a
> look
> > > and vote.
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration
> > >
> > > Aditya
> > > 
> > > From: Andrii Biletskyi [andrii.bilets...@stealth.ly]
> > > Sent: Tuesday, May 19, 2015 2:26 PM
> > > To: dev@kafka.apache.org
> > > Subject: Re: [VOTE] KIP-21 Dynamic Configuration
> > >
> > > Hi,
> > >
> > > Sorry I wasn't able to participate. I don't have objections about
> > removing
> > > config changes from AlterTopic (as I understand both AddedConfig and
> > > DeletedConfig) - you are welcome to update the KIP page.
> > >
> > > Thanks,
> > > Andrii Biletskyi
> > >
> > > On Tue, May 19, 2015 at 11:40 PM, Aditya Auradkar <
> > > aaurad...@linkedin.com.invalid> wrote:
> > >
> > > > Updating the discussion with the latest comments.
> > > >
> > > > 1. We discussed adding 2 new API's (AlterConfig and DescribeConfig).
> > I'll
> > > > update KIP-21 with details on these.
> > > > 2. Discussed during the KIP hangout. We are in agreement.
> > > >
> > > > (1) has a dependency on KIP-4 being completed. Rest of the work in
> the
> > > KIP
> > > > can be implemented independently. Any concerns if we tackle it as two
> > > > separate work items implementation wise?
> > > >
> > > > We also discussed changing the AlterTopic command in KIP-4 to not
> > include
> > > > config changes. Instead, all config changes will pass through the
> newly
> > > > proposed AlterConfig. If no-one objects, I can make some changes to
> > KIP-4
> > > > to reflect this.
> > > >
> > > > Aditya
> > > >
> > > > 
> > > > From: Jay Kreps [jay.kr...@gmail.com]
> > > > Sent: Tuesday, May 19, 2015 10:51 AM
> > > > To: dev@kafka.apache.org
> > > > Subject: Re: [VOTE] KIP-21 Dynamic Configuration
> > > >
> > > > Hey Aditya,
> > > >
> > > > Two comments:
> > > >
> > > > 1. Yeah we need to reconcile this with the APIs in KIP-4. I think it
> > does
> > > > make sense to allow sett

Re: [DISCUSSION] Can we move the ack timeout in ProducerRequest to broker?

2015-05-28 Thread Joel Koshy
> have a Kafka cluster across availability zones, data will be replicated to
...
> single timeout on the broker side. In theory, different producers may want
> to pick different replication time depending on the topics being sent.

I think Becket raises a good point here: the above configurations are best 
known by the Kafka cluster operators, not necessarily by the users (producers). 
Right now users either have to know such details about the deployment, when in 
fact they should be set by the people who (perhaps manually) assign partitions 
to brokers, or they have to guess the timeouts or be content with the defaults.

Actually this would end up being a LogConfig which would be per-topic
- i.e., it won't necessarily be a single timeout on the broker.

Thanks,

Joel

On Thu, May 28, 2015 at 04:17:08PM -0700, Jun Rao wrote:
> Hi, Jiangjie,
> 
> The replication time may vary a bit for different partitions. For example,
> a partition with more replicas may take a bit more time to propagate the
> messages. Also, the replication time depends on network latency. If you
> have a Kafka cluster across availability zones, data will be replicated to
> nodes within the same zone a bit faster than those outside of the zone. So,
> I am not sure if it's better to just reason about the replication time as a
> single timeout on the broker side. In theory, different producers may want
> to pick different replication time depending on the topics being sent.
> 
> Thanks,
> 
> Jun
> 
> On Tue, May 26, 2015 at 4:46 PM, Jiangjie Qin 
> wrote:
> 
> > Hi,
> >
> > I am updating the wiki for KIP-19 and wondering why we have a replication
> > timeout on producer side and in producer request?
> >
> > From what I understand this is a server side setting and the reasons we
> > need this replication timeout is because we want to control the purgatory
> > size. If that is the case should we just have the replication timeout as a
> > broker configuration?
> > The downside of having it on server side might be that producer could have
> > a request timeout/socket timeout smaller than replication timeout. In this
> > case we can put request timeout in producer request and if the request
> > timeout is smaller than replication timeout on server side, we return a
> > misconfiguration exception.
> >
> > So we can have a producer request V1 which removes ack timeout but adds
> > request timeout. This will give user a cleaner timeout configurations on
> > producer side as well.
> >
> > What do people think about this?
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> >



[jira] [Resolved] (KAFKA-948) ISR list in LeaderAndISR path not updated for partitions when Broker (which is not leader) is down

2015-05-28 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-948.
---
Resolution: Cannot Reproduce

Closing this since it's no longer active.

> ISR list in LeaderAndISR path not updated for partitions when Broker (which 
> is not leader) is down
> --
>
> Key: KAFKA-948
> URL: https://issues.apache.org/jira/browse/KAFKA-948
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 0.8.0
>Reporter: Dibyendu Bhattacharya
>Assignee: Neha Narkhede
>
> When the broker which is the leader for a partition is down, the ISR list in 
> the LeaderAndISR path is updated. But if a broker that is not the leader 
> of the partition is down, the ISR list is not updated. This is an 
> issue because the ISR list then contains a stale entry.
> This issue I found in kafka-0.8.0-beta1-candidate1





Re: Review Request 34789: Patch for KAFKA-2168

2015-05-28 Thread Ewen Cheslack-Postava

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34789/#review85644
---


I think close() isn't quite right, and is probably harder than just wakeup(). 
Also, I think there are other cases where NetworkClient.poll() is called in a 
loop that aren't handled, e.g. NetworkClient.completeAll. I'm not sure these 
can be handled without pushing the closed flag into NetworkClient (maybe 
changing the name to "closing" to allow some operations to continue normally so 
code using NetworkClient can finish up whatever it was doing).


clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java


Missed removing one of the synchronized keywords.



clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java


I think this requires a bit more coordination between the threads. As 
written, this wakes up the selector, but couldn't this thread then continue 
running and close the metrics/client/serializers before the other thread is 
done with them?

This gets confusing if we need to support close() both from the poll() 
thread and from another thread, I guess -- in one case you need to wait for 
the other thread to finish, while in the other you can proceed immediately.
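One way to get that coordination (my assumption, not the actual patch) is a closing flag plus a latch: close() wakes the poller and then waits until the poll thread has actually exited before shared resources are torn down. A real KafkaConsumer would call selector.wakeup() where noted:

```java
import java.util.concurrent.CountDownLatch;

// Coordinate a close() issued from another thread with a running poll loop.
class PollCloseCoordinator {
    private final CountDownLatch pollerDone = new CountDownLatch(1);
    private volatile boolean closing = false;

    void pollLoop() {
        try {
            while (!closing) {
                // selector.select(timeout) would block here until wakeup()
                Thread.yield();
            }
        } finally {
            pollerDone.countDown();  // poller no longer touches shared state
        }
    }

    void close() throws InterruptedException {
        closing = true;
        // selector.wakeup() would go here to break a blocked select()
        pollerDone.await();          // now safe to free metrics/serializers
    }
}
```

Note this sketch only covers close() from a second thread; calling close() from the poll thread itself would deadlock on the latch, which is exactly the two-case distinction above.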



clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java


Why do these all have ensureNotClosed(), but the KafkaConsumer methods 
don't all have it?


- Ewen Cheslack-Postava


On May 28, 2015, 10:58 p.m., Jason Gustafson wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34789/
> ---
> 
> (Updated May 28, 2015, 10:58 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2168
> https://issues.apache.org/jira/browse/KAFKA-2168
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-2168; Remove synchronization of KafkaConsumer to enable non-blocking 
> close
> 
> 
> Diffs
> -
> 
>   
> clients/src/main/java/org/apache/kafka/clients/consumer/ClosedConsumerException.java
>  PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
> d301be4709f7b112e1f3a39f3c04cfa65f00fa60 
>   
> clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
>  b2764df11afa7a99fce46d1ff48960d889032d14 
>   
> clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
>  ef9dd5238fbc771496029866ece1d85db6d7b7a5 
> 
> Diff: https://reviews.apache.org/r/34789/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jason Gustafson
> 
>



[jira] [Commented] (KAFKA-1367) Broker topic metadata not kept in sync with ZooKeeper

2015-05-28 Thread Ashish K Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563904#comment-14563904
 ] 

Ashish K Singh commented on KAFKA-1367:
---

Thanks [~jjkoshy]! I will update the KIP accordingly.

> Broker topic metadata not kept in sync with ZooKeeper
> -
>
> Key: KAFKA-1367
> URL: https://issues.apache.org/jira/browse/KAFKA-1367
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.0, 0.8.1
>Reporter: Ryan Berdeen
>Assignee: Ashish K Singh
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: KAFKA-1367.txt
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 34524: Fix KAFKA-2208

2015-05-28 Thread Onur Karaman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34524/#review85641
---



clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java


Should we split this up into two checks? If the response is 
UNKNOWN_CONSUMER_ID, you might want to additionally reset the consumer id here.

I was thinking something like this:
```
} else if (response.errorCode() == Errors.ILLEGAL_GENERATION.code()) {
  subscriptions.needReassignment();
} else if (response.errorCode() == Errors.UNKNOWN_CONSUMER_ID.code()) {
  this.consumerId = JoinGroupRequest.UNKNOWN_CONSUMER_ID;
  subscriptions.needReassignment();
}
```



core/src/main/scala/kafka/coordinator/ConsumerCoordinator.scala


Let's say a consumer sends a JoinGroupRequest for a new group g and 
provides its own non-unknown consumer id (it could be a faulty implementation 
of the new consumer).

The Coordinator would notice group == null, create the group, notice that the 
group doesn't contain the non-unknown consumer id, and then reply with 
UNKNOWN_CONSUMER_ID.

So basically an empty, stable group has been created.
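
The guard Onur is suggesting can be sketched in memory (all names here are hypothetical, not the actual `ConsumerCoordinator` API): verify the consumer id before materializing a new group, so a faulty client sending a non-unknown id for an unknown group cannot leave an empty group behind.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Minimal in-memory sketch of the join guard: a new group may only be
// created by a request carrying the UNKNOWN consumer id.
public class JoinGuard {
    public static final String UNKNOWN_CONSUMER_ID = "";
    private final Map<String, Set<String>> groups = new HashMap<>();

    // Returns an error name, or null on success.
    public String handleJoin(String groupId, String consumerId) {
        Set<String> members = groups.get(groupId);
        if (members == null) {
            // New group: only an UNKNOWN id may create it, so the faulty
            // client above gets an error and no empty group is made.
            if (!consumerId.equals(UNKNOWN_CONSUMER_ID))
                return "UNKNOWN_CONSUMER_ID";
            members = new HashSet<>();
            groups.put(groupId, members);
        } else if (!consumerId.equals(UNKNOWN_CONSUMER_ID)
                && !members.contains(consumerId)) {
            return "UNKNOWN_CONSUMER_ID";
        }
        String assigned = consumerId.equals(UNKNOWN_CONSUMER_ID)
                ? groupId + "-" + (members.size() + 1) : consumerId;
        members.add(assigned);
        return null;
    }

    public boolean groupExists(String groupId) { return groups.containsKey(groupId); }
}
```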



core/src/main/scala/kafka/coordinator/ConsumerCoordinator.scala


Is returning NOT_COORDINATOR_FOR_CONSUMER right? By this point in 
handleJoinGroup, we've already verified that we are the coordinator for the 
group.



core/src/main/scala/kafka/coordinator/ConsumerCoordinator.scala


Is returning NOT_COORDINATOR_FOR_CONSUMER right? By this point in 
handleHeartbeat, we've already verified that we are the coordinator for the 
group.


- Onur Karaman


On May 21, 2015, 2:15 a.m., Guozhang Wang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34524/
> ---
> 
> (Updated May 21, 2015, 2:15 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2208
> https://issues.apache.org/jira/browse/KAFKA-2208
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> 1. Add error handling on consumer; 2. Add the max / min consumer session 
> timeout to kafka server configs; 3. Fixed some consumer bouncing tests
> 
> 
> Diffs
> -
> 
>   
> clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
>  b2764df11afa7a99fce46d1ff48960d889032d14 
>   
> clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
>  ef9dd5238fbc771496029866ece1d85db6d7b7a5 
>   
> clients/src/main/java/org/apache/kafka/common/requests/HeartbeatResponse.java 
> f548cd0ef70929b35ac887f8fccb7b24c3e2c11a 
>   
> clients/src/main/java/org/apache/kafka/common/requests/JoinGroupResponse.java 
> fd9c545c99058ad3fbe3b2c55ea8b6ea002f5a51 
>   core/src/main/scala/kafka/coordinator/ConsumerCoordinator.scala 
> af06ad45cdc46ac3bc27898ebc1a5bd5b1c7b19e 
>   core/src/main/scala/kafka/coordinator/ConsumerGroupMetadata.scala 
> 47bdfa7cc86fd4e841e2b1d6bfd40f1508e643bd 
>   core/src/main/scala/kafka/network/RequestChannel.scala 
> 1d0024c8f0c2ab0efa6d8cfca6455877a6ed8026 
>   core/src/main/scala/kafka/server/KafkaConfig.scala 
> 9efa15ca5567b295ab412ee9eea7c03eb4cdc18b 
>   core/src/test/scala/integration/kafka/api/ConsumerBounceTest.scala 
> 5c4cca653b3801df3494003cc40a56ae60a789a6 
>   core/src/test/scala/integration/kafka/api/ConsumerTest.scala 
> a1eed965a148eb19d9a6cefbfce131f58aaffc24 
>   core/src/test/scala/unit/kafka/server/KafkaConfigConfigDefTest.scala 
> 8014a5a6c362785539f24eb03d77278434614fe6 
> 
> Diff: https://reviews.apache.org/r/34524/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Guozhang Wang
> 
>



Re: Review Request 34789: Patch for KAFKA-2168

2015-05-28 Thread Jason Gustafson


> On May 28, 2015, 11:30 p.m., Ewen Cheslack-Postava wrote:
> > clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java, 
> > line 857
> > 
> >
> > I think this requires a bit more coordination between the threads. As 
> > written, won't this wakeup the selector, but this thread could continue 
> > running and close metrics/client/serializers before the other thread is 
> > done with them?
> > 
> > This gets confusing if we need to support both close() from the poll() 
> > thread and from another I guess -- in one case you need to wait for another 
> > thread to finish, in the other you can proceed immediately.

Good point. That makes me think we should bring the synchronization back for 
everything except close. Then we could write close like this:

```
if (client != null) client.wakeup();
synchronized (this) {
  // Shutdown client, serializers, etc.
}
```
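
A self-contained sketch of this lock ordering, using java.nio's `Selector` as a stand-in for the consumer's network client (class and method names are hypothetical, not the actual `KafkaConsumer` code): the closing thread wakes the selector *before* taking the lock, so a thread blocked in poll exits `select()`, releases the monitor, and only then does close tear down shared state.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;

public class WakeupClose {
    private final Selector selector;
    private volatile boolean closed = false;

    public WakeupClose() {
        try { this.selector = Selector.open(); }
        catch (IOException e) { throw new UncheckedIOException(e); }
    }

    // Simulates one pass of the poll loop: blocks in select() while
    // holding the consumer's monitor.
    public synchronized int pollOnce(long timeoutMs) {
        if (closed) throw new IllegalStateException("consumer is closed");
        try { return selector.select(timeoutMs); }
        catch (IOException e) { throw new UncheckedIOException(e); }
    }

    // wakeup() runs *outside* the lock so a thread blocked in pollOnce()
    // can return and release the monitor before we acquire it here.
    public void close() {
        selector.wakeup();
        synchronized (this) {
            closed = true;
            try { selector.close(); }
            catch (IOException e) { throw new UncheckedIOException(e); }
        }
    }

    public boolean isClosed() { return closed; }
}
```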


- Jason


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34789/#review85644
---


On May 28, 2015, 10:58 p.m., Jason Gustafson wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34789/
> ---
> 
> (Updated May 28, 2015, 10:58 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2168
> https://issues.apache.org/jira/browse/KAFKA-2168
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-2168; Remove synchronization of KafkaConsumer to enable non-blocking 
> close
> 
> 
> Diffs
> -
> 
>   
> clients/src/main/java/org/apache/kafka/clients/consumer/ClosedConsumerException.java
>  PRE-CREATION 
>   clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java 
> d301be4709f7b112e1f3a39f3c04cfa65f00fa60 
>   
> clients/src/main/java/org/apache/kafka/clients/consumer/internals/Coordinator.java
>  b2764df11afa7a99fce46d1ff48960d889032d14 
>   
> clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
>  ef9dd5238fbc771496029866ece1d85db6d7b7a5 
> 
> Diff: https://reviews.apache.org/r/34789/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jason Gustafson
> 
>



Re: [DISCUSSION] Can we move the ack timeout in ProducerRequest to broker?

2015-05-28 Thread Jay Kreps
The purpose of the replication timeout was not to control the purgatory
size but rather to bound the time you block on replication to other servers.

We discussed making it a server-side config, but the rationale for making it a
client configuration is that these timeouts are fundamentally about how
long the client app is willing to wait, which isn't something the server can
know, and isn't something that has a single value for all apps using a
topic.

Personally I think this rationale still makes sense and I don't think
we should change it.

I think the question is whether we can just set this timeout automatically
off whatever higher level timeout the client configures, at least as a
default.

-Jay

On Tue, May 26, 2015 at 4:46 PM, Jiangjie Qin 
wrote:

> Hi,
>
> I am updating the wiki for KIP-19 and wondering why we have a replication
> timeout on producer side and in producer request?
>
> From what I understand this is a server-side setting, and the reason we
> need this replication timeout is that we want to control the purgatory
> size. If that is the case, should we just have the replication timeout as a
> broker configuration?
> The downside of having it on the server side might be that the producer could
> have a request timeout/socket timeout smaller than the replication timeout. In
> this case we can put the request timeout in the producer request, and if the
> request timeout is smaller than the replication timeout on the server side, we
> return a misconfiguration exception.
>
> So we can have a producer request V1 which removes the ack timeout but adds
> a request timeout. This will give users a cleaner timeout configuration on
> the producer side as well.
>
> What do people think about this?
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
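
The fail-fast check Jiangjie describes could look like the following hedged sketch (the method and config names are hypothetical, not actual broker code): if the client's request timeout is smaller than the broker's replication timeout, reject the request immediately rather than letting it time out ambiguously.

```java
// Hypothetical server-side validation of a producer request V1 that
// carries the client's request timeout.
public class TimeoutCheck {
    public static boolean compatible(int requestTimeoutMs, int replicationTimeoutMs) {
        // A request timeout shorter than the replication timeout means the
        // client could give up while the broker is still legitimately waiting
        // on followers, so we treat it as a misconfiguration.
        return requestTimeoutMs >= replicationTimeoutMs;
    }

    public static void validate(int requestTimeoutMs, int replicationTimeoutMs) {
        if (!compatible(requestTimeoutMs, replicationTimeoutMs))
            throw new IllegalArgumentException(
                "request timeout (" + requestTimeoutMs + " ms) must be >= "
                + "replication timeout (" + replicationTimeoutMs + " ms)");
    }
}
```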


[jira] [Commented] (KAFKA-1367) Broker topic metadata not kept in sync with ZooKeeper

2015-05-28 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563935#comment-14563935
 ] 

Jun Rao commented on KAFKA-1367:


There is actually a reasonable use case of the ISR in KAFKA-2225. Basically, for 
cost reasons, we may want to let a consumer fetch from a replica in the ISR 
that's in the same zone. In order to support that, it will be convenient to 
have TMR return the correct ISR for the consumer to choose from.

Implementation-wise, one way to address the concern with too many watchers is 
to do something similar to changing topic configs. Basically, when the leader changes 
the ISR, in addition to writing the new ISR in the partition state in ZK, it 
also writes the change as a sequential node under a new isrChangeNotification 
path in ZK. The controller listens to child changes on the 
isrChangeNotification path. On a child change, the controller reads the new ISR 
and broadcasts it through an UpdateMetadataRequest to every broker.
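
The mechanism Jun describes can be sketched without real ZooKeeper calls, as a minimal in-memory analogue (all names hypothetical): the leader appends each ISR change under a monotonically increasing key, mirroring a PERSISTENT_SEQUENTIAL znode, and the controller drains all pending children in order when its watch fires.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// In-memory sketch of the sequential-notification scheme; in the real
// design the TreeMap would be children of the isrChangeNotification path.
public class IsrChangeLog {
    private final TreeMap<Long, String> notifications = new TreeMap<>();
    private long nextSeq = 0;

    // Leader side: record the ISR change as the next sequential entry.
    public long append(String isrChangeJson) {
        long seq = nextSeq++;
        notifications.put(seq, isrChangeJson);
        return seq;
    }

    // Controller side: on a child-change watch firing, consume everything
    // seen so far (in sequence order) and hand it to the metadata
    // broadcaster; processed entries are deleted.
    public List<String> drain() {
        List<String> batch = new ArrayList<>(notifications.values());
        notifications.clear();
        return batch;
    }
}
```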

> Broker topic metadata not kept in sync with ZooKeeper
> -
>
> Key: KAFKA-1367
> URL: https://issues.apache.org/jira/browse/KAFKA-1367
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.0, 0.8.1
>Reporter: Ryan Berdeen
>Assignee: Ashish K Singh
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: KAFKA-1367.txt
>
>
> When a broker is restarted, the topic metadata responses from the brokers 
> will be incorrect (different from ZooKeeper) until a preferred replica leader 
> election.
> In the metadata, it looks like leaders are correctly removed from the ISR 
> when a broker disappears, but followers are not. Then, when a broker 
> reappears, the ISR is never updated.
> I used a variation of the Vagrant setup created by Joe Stein to reproduce 
> this with latest from the 0.8.1 branch: 
> https://github.com/also/kafka/commit/dba36a503a5e22ea039df0f9852560b4fb1e067c





Re: [VOTE] KIP-21 Dynamic Configuration

2015-05-28 Thread Jay Kreps
I still have a couple of questions:
1. Are we introducing a new Java API for the config change protocol and if
so where will that appear? Is that going to be part of the java api in the
admin api kip? Let's document that.
2. The proposed JSON format uses camel case for field names, is that what
we've used for other JSON in zookeeper?
3. This changes the format of the notifications, right? How will we
grandfather in the old format? Clusters will have existing change
notifications in the old format so I think the new code will need to be
able to read those?

-Jay

On Thu, May 28, 2015 at 11:41 AM, Aditya Auradkar <
aaurad...@linkedin.com.invalid> wrote:

> bump
>
> 
> From: Aditya Auradkar
> Sent: Tuesday, May 26, 2015 1:16 PM
> To: dev@kafka.apache.org
> Subject: RE: [VOTE] KIP-21 Dynamic Configuration
>
> Hey everyone,
>
> Completed the changes to KIP-4. After today's hangout, there doesn't
> appear to be anything remaining to discuss on this KIP.
> Please vote so we can formally close this.
>
> Thanks,
> Aditya
>
> 
> From: Aditya Auradkar
> Sent: Thursday, May 21, 2015 11:26 AM
> To: dev@kafka.apache.org
> Subject: RE: [VOTE] KIP-21 Dynamic Configuration
>
> I think we should remove the config part in TopicMetadataResponse. It's
> probably cleaner if Alter and Describe are the only way to view and modify
> configs but I don't feel very strongly about it.
>
> Re-summarizing the proposed changes to KIP-4:
> - Change AlterTopic to not allow setting configs. Config changes will flow
> through AlterConfig. CreateTopic will still allow setting configs as it is
> nice to be able to specify configs while creating the topic.
> - TopicMetadataResponse shoudn't return config for the topic.
> DescribeConfig is the way to go.
> - Change "InvalidTopicConfiguration" error code to "InvalidEntityConfig"
> as proposed in KIP-21.
>
> Aditya
>
> 
> From: Jun Rao [j...@confluent.io]
> Sent: Thursday, May 21, 2015 10:50 AM
> To: dev@kafka.apache.org
> Subject: Re: [VOTE] KIP-21 Dynamic Configuration
>
> What about TopicMetadataResponse in KIP-4? Do we remove the config part in
> it?
>
> Thanks,
>
> Jun
>
> On Thu, May 21, 2015 at 10:25 AM, Aditya Auradkar <
> aaurad...@linkedin.com.invalid> wrote:
>
> > Hey Jun,
> >
> > I've added a section on error codes on the KIP-21 wiki.
> >
> > Here are the proposed changes to KIP-4. I'll update the wiki shortly.
> > - Change AlterTopic to not allow setting configs. Config changes will
> flow
> > through AlterConfig. CreateTopic will still allow setting configs as it
> is
> > nice to be able to specify configs while creating the topic.
> > - Change "InvalidTopicConfiguration" error code to "InvalidEntityConfig"
> > as proposed in KIP-21.
> >
> >
> > Thanks,
> > Aditya
> >
> > 
> > From: Jun Rao [j...@confluent.io]
> > Sent: Thursday, May 21, 2015 8:41 AM
> > To: dev@kafka.apache.org
> > Subject: Re: [VOTE] KIP-21 Dynamic Configuration
> >
> > Aditya,
> >
> > For completeness, could you list the set of error codes in the wiki?
> Also,
> > could you summarize the changes that are needed for the requests listed
> in
> > KIP-4 and update the wiki accordingly?
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, May 19, 2015 at 10:33 PM, Aditya Auradkar <
> > aaurad...@linkedin.com.invalid> wrote:
> >
> > > Thanks Andrii. I'll make the changes.
> > >
> > > I've also updated KIP-21 to include the new config requests. Take a
> look
> > > and vote.
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration
> > >
> > > Aditya
> > > 
> > > From: Andrii Biletskyi [andrii.bilets...@stealth.ly]
> > > Sent: Tuesday, May 19, 2015 2:26 PM
> > > To: dev@kafka.apache.org
> > > Subject: Re: [VOTE] KIP-21 Dynamic Configuration
> > >
> > > Hi,
> > >
> > > Sorry I wasn't able to participate. I don't have objections about
> > removing
> > > config changes from AlterTopic (as I understand both AddedConfig and
> > > DeletedConfig) - you are welcome to update the KIP page.
> > >
> > > Thanks,
> > > Andrii Biletskyi
> > >
> > > On Tue, May 19, 2015 at 11:40 PM, Aditya Auradkar <
> > > aaurad...@linkedin.com.invalid> wrote:
> > >
> > > > Updating the discussion with the latest comments.
> > > >
> > > > 1. We discussed adding 2 new API's (AlterConfig and DescribeConfig).
> > I'll
> > > > update KIP-21 with details on these.
> > > > 2. Discussed during the KIP hangout. We are in agreement.
> > > >
> > > > (1) has a dependency on KIP-4 being completed. Rest of the work in
> the
> > > KIP
> > > > can be implemented independently. Any concerns if we tackle it as two
> > > > separate work items implementation wise?
> > > >
> > > > We also discussed changing the AlterTopic command in KIP-4 to not
> > include
> > > > config changes. Instead, all config changes will pass through the
> newly
> > > 


Re: Review Request 34734: Patch for KAFKA-2226

2015-05-28 Thread Yasuhiro Matsuda


> On May 28, 2015, 7:10 p.m., Jun Rao wrote:
> > core/src/main/scala/kafka/utils/timer/TimerTaskList.scala, lines 132-135
> > 
> >
> > So, I guess the race condition is the following. The expiration thread 
> > moves a TimerTaskEntry from one TimerTaskList to another during reinsert. 
> > At the same time, another thread can complete an operation and try to 
> > remove the entry from the list. Is that right?
> > 
> > With the patch, it seems when TimerTask.cancel tries to remove an 
> > entry from the list, the following can happen: (1) line 133 tests that the list 
> > in entry is not null, (2) a reinsert process happens and the entry is 
> > removed from list which sets list in the entry to null, (3) list.remove in 
> > 134 is called and the entry is not removed since list is now null, (4) line 
> > 133 is tested again and since list is now null, we quit the loop, (5) the 
> > reinsert process adds the entry to a new list. 
> > 
> > At this point, a completed entry still exists in the list.
> 
> Yasuhiro Matsuda wrote:
> You are right. It should be rare, but a completed entry can remain in the 
> list until expiration. The completion flag in DelayedOperation prevents 
> excess executions. So, it is not too bad.
> 
> Jun Rao wrote:
> Thanks for the clarification. Thinking about this a bit more, could we 
> hit a NullPointerException in step (3)? At that point, list could be null 
> when we call remove.
> 
> Onur Karaman wrote:
> Yeah that check-then-act should probably be done atomically. Maybe all 
> changes/check-then-acts to TimerTaskEntry just need to be guarded by the 
> TimerTaskEntry's intrinsic lock?
> 
> Yasuhiro Matsuda wrote:
> Oops. The value of TimerTaskEntry.list should be saved to a local variable.
> 
> Jun Rao wrote:
> Ok, thanks. Perhaps we can also add some comments explaining why we need 
> the while loop and the rare possibility of not removing a completed operation 
> from the Timer.
> 
> Guozhang Wang wrote:
> Could it happen that concurrent threads are calling 
> TimerTaskList.add(entry) and TimerTaskList.remove(entry) on different lists 
> for the same entry? Since we do not synchronize on the entry object this 
> could cause race condition on next/prev/list reference manipulation.

I will add a cancellation flag to TimerTaskEntry so that a canceled task won't 
be reinserted. This should reduce the chance of leaving a completed task in a 
list.

Also, I will add synchronization on the TimerTaskEntry to TimerTaskList.{add|remove}. 
This makes one operation synchronize on both the TimerTaskList and the TimerTaskEntry. 
We have to be careful about lock ordering, otherwise it may cause a deadlock. I hope 
I did it right. Please review my next patch carefully.
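
The local-variable fix Yasuhiro mentions can be sketched with simplified stand-in classes (these are not the real `kafka.utils.timer` classes): reading the volatile `list` field into a local before acting on it means a concurrent reinsert that nulls the field between the check and the remove cannot cause a NullPointerException, and the loop retries until the entry is detached from whichever list it was moved to.

```java
public class TimerEntryRemoval {
    public static class TimerTaskList {
        // Simplified: detach the entry if it still belongs to this list.
        void remove(TimerTaskEntry entry) {
            if (entry.list == this) entry.list = null;
        }
    }

    public static class TimerTaskEntry {
        volatile TimerTaskList list;
    }

    // Retry until the entry is off every list; the local copy can never be
    // nulled out from under us between the null check and the remove call.
    public static void remove(TimerTaskEntry entry) {
        TimerTaskList currentList = entry.list;
        while (currentList != null) {
            currentList.remove(entry);
            currentList = entry.list;  // re-read: a reinsert may have moved it
        }
    }
}
```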


- Yasuhiro


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34734/#review85596
---


On May 27, 2015, 9 p.m., Yasuhiro Matsuda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34734/
> ---
> 
> (Updated May 27, 2015, 9 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2226
> https://issues.apache.org/jira/browse/KAFKA-2226
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> fix a race condition in TimerTaskEntry.remove
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/utils/timer/TimerTaskList.scala 
> e7a96570ddc2367583d6d5590628db7e7f6ba39b 
> 
> Diff: https://reviews.apache.org/r/34734/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Yasuhiro Matsuda
> 
>



Re: [VOTE] KIP-21 Dynamic Configuration

2015-05-28 Thread Guozhang Wang
For the sequential config/changes/config_change_XX znodes, do we have any
mechanism for cleaning up, to keep the change log from growing
indefinitely?

Guozhang

On Thu, May 28, 2015 at 5:02 PM, Jay Kreps  wrote:

> I still have a couple of questions:
> 1. Are we introducing a new Java API for the config change protocol and if
> so where will that appear? Is that going to be part of the java api in the
> admin api kip? Let's document that.
> 2. The proposed JSON format uses camel case for field names, is that what
> we've used for other JSON in zookeeper?
> 3. This changes the format of the notifications, right? How will we
> grandfather in the old format? Clusters will have existing change
> notifications in the old format so I think the new code will need to be
> able to read those?
>
> -Jay
>
> On Thu, May 28, 2015 at 11:41 AM, Aditya Auradkar <
> aaurad...@linkedin.com.invalid> wrote:
>
> > bump
> >
> > 
> > From: Aditya Auradkar
> > Sent: Tuesday, May 26, 2015 1:16 PM
> > To: dev@kafka.apache.org
> > Subject: RE: [VOTE] KIP-21 Dynamic Configuration
> >
> > Hey everyone,
> >
> > Completed the changes to KIP-4. After today's hangout, there doesn't
> > appear to be anything remaining to discuss on this KIP.
> > Please vote so we can formally close this.
> >
> > Thanks,
> > Aditya
> >
> > 
> > From: Aditya Auradkar
> > Sent: Thursday, May 21, 2015 11:26 AM
> > To: dev@kafka.apache.org
> > Subject: RE: [VOTE] KIP-21 Dynamic Configuration
> >
> > I think we should remove the config part in TopicMetadataResponse. It's
> > probably cleaner if Alter and Describe are the only way to view and
> modify
> > configs but I don't feel very strongly about it.
> >
> > Re-summarizing the proposed changes to KIP-4:
> > - Change AlterTopic to not allow setting configs. Config changes will
> flow
> > through AlterConfig. CreateTopic will still allow setting configs as it
> is
> > nice to be able to specify configs while creating the topic.
> > - TopicMetadataResponse shoudn't return config for the topic.
> > DescribeConfig is the way to go.
> > - Change "InvalidTopicConfiguration" error code to "InvalidEntityConfig"
> > as proposed in KIP-21.
> >
> > Aditya
> >
> > 
> > From: Jun Rao [j...@confluent.io]
> > Sent: Thursday, May 21, 2015 10:50 AM
> > To: dev@kafka.apache.org
> > Subject: Re: [VOTE] KIP-21 Dynamic Configuration
> >
> > What about TopicMetadataResponse in KIP-4? Do we remove the config part
> in
> > it?
> >
> > Thanks,
> >
> > Jun
> >
> > On Thu, May 21, 2015 at 10:25 AM, Aditya Auradkar <
> > aaurad...@linkedin.com.invalid> wrote:
> >
> > > Hey Jun,
> > >
> > > I've added a section on error codes on the KIP-21 wiki.
> > >
> > > Here are the proposed changes to KIP-4. I'll update the wiki shortly.
> > > - Change AlterTopic to not allow setting configs. Config changes will
> > flow
> > > through AlterConfig. CreateTopic will still allow setting configs as it
> > is
> > > nice to be able to specify configs while creating the topic.
> > > - Change "InvalidTopicConfiguration" error code to
> "InvalidEntityConfig"
> > > as proposed in KIP-21.
> > >
> > >
> > > Thanks,
> > > Aditya
> > >
> > > 
> > > From: Jun Rao [j...@confluent.io]
> > > Sent: Thursday, May 21, 2015 8:41 AM
> > > To: dev@kafka.apache.org
> > > Subject: Re: [VOTE] KIP-21 Dynamic Configuration
> > >
> > > Aditya,
> > >
> > > For completeness, could you list the set of error codes in the wiki?
> > Also,
> > > could you summarize the changes that are needed for the requests listed
> > in
> > > KIP-4 and update the wiki accordingly?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Tue, May 19, 2015 at 10:33 PM, Aditya Auradkar <
> > > aaurad...@linkedin.com.invalid> wrote:
> > >
> > > > Thanks Andrii. I'll make the changes.
> > > >
> > > > I've also updated KIP-21 to include the new config requests. Take a
> > look
> > > > and vote.
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration
> > > >
> > > > Aditya
> > > > 
> > > > From: Andrii Biletskyi [andrii.bilets...@stealth.ly]
> > > > Sent: Tuesday, May 19, 2015 2:26 PM
> > > > To: dev@kafka.apache.org
> > > > Subject: Re: [VOTE] KIP-21 Dynamic Configuration
> > > >
> > > > Hi,
> > > >
> > > > Sorry I wasn't able to participate. I don't have objections about
> > > removing
> > > > config changes from AlterTopic (as I understand both AddedConfig and
> > > > DeletedConfig) - you are welcome to update the KIP page.
> > > >
> > > > Thanks,
> > > > Andrii Biletskyi
> > > >
> > > > On Tue, May 19, 2015 at 11:40 PM, Aditya Auradkar <
> > > > aaurad...@linkedin.com.invalid> wrote:
> > > >
> > > > > Updating the discussion with the latest comments.
> > > > >
> > > > > 1. We discussed adding 2 new API's (AlterConfig and
> DescribeConfig).
> > > I'll
> > >
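
One plausible answer to Guozhang's cleanup question, sketched in memory with hypothetical names (not actual controller code): the controller purges notification entries older than a retention cutoff each time it processes the change log.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

// Sketch of a retention-based purge for sequential change-notification
// nodes: seq -> creation timestamp (ms), iterated in sequence order.
public class NotificationPurger {
    private final TreeMap<Long, Long> entries = new TreeMap<>();

    public void add(long seq, long createdMs) { entries.put(seq, createdMs); }

    // Remove every entry created before (nowMs - retentionMs); returns the
    // number purged. The early break assumes creation timestamps are
    // non-decreasing in sequence order, which sequential nodes give us.
    public int purgeOlderThan(long nowMs, long retentionMs) {
        int purged = 0;
        Iterator<Map.Entry<Long, Long>> it = entries.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() < nowMs - retentionMs) {
                it.remove();
                purged++;
            } else {
                break;
            }
        }
        return purged;
    }

    public int size() { return entries.size(); }
}
```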

[jira] [Commented] (KAFKA-1367) Broker topic metadata not kept in sync with ZooKeeper

2015-05-28 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563960#comment-14563960
 ] 

Joel Koshy commented on KAFKA-1367:
---

Jun - that is a good point. That sounds like a good approach to address the 
concern with watcher counts. Another way is to just allow brokers to send an 
equivalent update metadata (or similar) request to the controller to notify it 
of an ISR change - or even allow leaders to broadcast update metadata requests 
for ISR changes. We currently don't allow this, but maybe we should consider a 
generic broker-broker communication component. Given the use-case that Theo 
raised on the list yesterday, it appears we may want to keep the ISR even in 
TMR v1. It may make sense to discuss this at an upcoming hangout especially 
since it affects KIP-4.

> Broker topic metadata not kept in sync with ZooKeeper
> -
>
> Key: KAFKA-1367
> URL: https://issues.apache.org/jira/browse/KAFKA-1367
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.0, 0.8.1
>Reporter: Ryan Berdeen
>Assignee: Ashish K Singh
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: KAFKA-1367.txt
>
>
> When a broker is restarted, the topic metadata responses from the brokers 
> will be incorrect (different from ZooKeeper) until a preferred replica leader 
> election.
> In the metadata, it looks like leaders are correctly removed from the ISR 
> when a broker disappears, but followers are not. Then, when a broker 
> reappears, the ISR is never updated.
> I used a variation of the Vagrant setup created by Joe Stein to reproduce 
> this with latest from the 0.8.1 branch: 
> https://github.com/also/kafka/commit/dba36a503a5e22ea039df0f9852560b4fb1e067c





[jira] [Created] (KAFKA-2230) Add a wakeup() call to the new consumer

2015-05-28 Thread Jay Kreps (JIRA)
Jay Kreps created KAFKA-2230:


 Summary: Add a wakeup() call to the new consumer
 Key: KAFKA-2230
 URL: https://issues.apache.org/jira/browse/KAFKA-2230
 Project: Kafka
  Issue Type: Improvement
Reporter: Jay Kreps


The consumer is meant to be used in an event loop where you poll with some 
timeout. There are sometimes instances where you want the consumer to stop 
waiting on I/O and wake up and do something (like shut down or tell you the 
offset or whatever). It would be good to add an API like
  wakeup()
to the consumer to allow the client to do this.
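
A minimal sketch of what such a call can be built on (this is not the consumer's actual implementation): java.nio's `Selector.wakeup()` makes a blocking `select(timeout)` return immediately, and a wakeup issued while no select is in progress is remembered, so the next select also returns right away. That is exactly the hook an event loop needs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;

public class EventLoop {
    private final Selector selector;
    private volatile boolean running = true;

    public EventLoop() {
        try { this.selector = Selector.open(); }
        catch (IOException e) { throw new UncheckedIOException(e); }
    }

    // One pass of the loop: block for up to timeoutMs waiting for I/O;
    // returns how long we actually waited, in milliseconds.
    public long pollOnce(long timeoutMs) {
        long start = System.nanoTime();
        try { selector.select(timeoutMs); }
        catch (IOException e) { throw new UncheckedIOException(e); }
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Callable from any thread: unblocks a pending (or the next) select().
    public void wakeup() {
        running = false;
        selector.wakeup();
    }

    public boolean isRunning() { return running; }
}
```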





[jira] [Updated] (KAFKA-2226) NullPointerException in TestPurgatoryPerformance

2015-05-28 Thread Yasuhiro Matsuda (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yasuhiro Matsuda updated KAFKA-2226:

Attachment: KAFKA-2226_2015-05-28_17:18:55.patch

> NullPointerException in TestPurgatoryPerformance
> 
>
> Key: KAFKA-2226
> URL: https://issues.apache.org/jira/browse/KAFKA-2226
> Project: Kafka
>  Issue Type: Bug
>Reporter: Onur Karaman
>Assignee: Yasuhiro Matsuda
> Attachments: KAFKA-2226.patch, KAFKA-2226_2015-05-28_17:18:55.patch
>
>
> A NullPointerException sometimes shows up in TimerTaskList.remove while 
> running TestPurgatoryPerformance. I’m on the HEAD of trunk.
> {code}
> > ./bin/kafka-run-class.sh kafka.TestPurgatoryPerformance --key-space-size 
> > 10 --keys 3 --num 10 --pct50 50 --pct75 75 --rate 1000 --size 50 
> > --timeout 20
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/Users/okaraman/code/kafka/core/build/dependant-libs-2.10.5/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/Users/okaraman/code/kafka/core/build/dependant-libs-2.10.5/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> [2015-05-27 10:02:14,782] ERROR [completion thread], Error due to  
> (kafka.TestPurgatoryPerformance$CompletionQueue$$anon$1)
> java.lang.NullPointerException
>   at kafka.utils.timer.TimerTaskList.remove(TimerTaskList.scala:80)
>   at kafka.utils.timer.TimerTaskEntry.remove(TimerTaskList.scala:128)
>   at kafka.utils.timer.TimerTask$class.cancel(TimerTask.scala:27)
>   at kafka.server.DelayedOperation.cancel(DelayedOperation.scala:50)
>   at kafka.server.DelayedOperation.forceComplete(DelayedOperation.scala:71)
>   at 
> kafka.TestPurgatoryPerformance$CompletionQueue$$anon$1.doWork(TestPurgatoryPerformance.scala:263)
>   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
> {code}





[jira] [Commented] (KAFKA-2226) NullPointerException in TestPurgatoryPerformance

2015-05-28 Thread Yasuhiro Matsuda (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563965#comment-14563965
 ] 

Yasuhiro Matsuda commented on KAFKA-2226:
-

Updated reviewboard https://reviews.apache.org/r/34734/diff/
 against branch origin/trunk

> NullPointerException in TestPurgatoryPerformance
> 
>
> Key: KAFKA-2226
> URL: https://issues.apache.org/jira/browse/KAFKA-2226
> Project: Kafka
>  Issue Type: Bug
>Reporter: Onur Karaman
>Assignee: Yasuhiro Matsuda
> Attachments: KAFKA-2226.patch, KAFKA-2226_2015-05-28_17:18:55.patch
>
>
> A NullPointerException sometimes shows up in TimerTaskList.remove while 
> running TestPurgatoryPerformance. I’m on the HEAD of trunk.
> {code}
> > ./bin/kafka-run-class.sh kafka.TestPurgatoryPerformance --key-space-size 
> > 10 --keys 3 --num 10 --pct50 50 --pct75 75 --rate 1000 --size 50 
> > --timeout 20
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/Users/okaraman/code/kafka/core/build/dependant-libs-2.10.5/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/Users/okaraman/code/kafka/core/build/dependant-libs-2.10.5/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> [2015-05-27 10:02:14,782] ERROR [completion thread], Error due to  
> (kafka.TestPurgatoryPerformance$CompletionQueue$$anon$1)
> java.lang.NullPointerException
>   at kafka.utils.timer.TimerTaskList.remove(TimerTaskList.scala:80)
>   at kafka.utils.timer.TimerTaskEntry.remove(TimerTaskList.scala:128)
>   at kafka.utils.timer.TimerTask$class.cancel(TimerTask.scala:27)
>   at kafka.server.DelayedOperation.cancel(DelayedOperation.scala:50)
>   at kafka.server.DelayedOperation.forceComplete(DelayedOperation.scala:71)
>   at 
> kafka.TestPurgatoryPerformance$CompletionQueue$$anon$1.doWork(TestPurgatoryPerformance.scala:263)
>   at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
> {code}





Re: Review Request 34734: Patch for KAFKA-2226

2015-05-28 Thread Yasuhiro Matsuda

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34734/
---

(Updated May 29, 2015, 12:19 a.m.)


Review request for kafka.


Bugs: KAFKA-2226
https://issues.apache.org/jira/browse/KAFKA-2226


Repository: kafka


Description
---

fix a race condition in TimerTaskEntry.remove
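The race being fixed can be sketched as follows (in Java with simplified, hypothetical names; the real TimerTaskList/TimerTaskEntry code is Scala and maintains doubly-linked pointers): a timer entry can migrate between bucket lists while remove() runs, so remove must re-read which list holds the entry and retry until it belongs to none.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the retry-on-migration fix.
class TaskEntry {
    volatile TaskList list; // which bucket currently holds this entry, if any

    // Retry until the entry is observed in no list: another thread may move the
    // entry to a different bucket between reading `list` and removing from it.
    void remove() {
        TaskList current = list;
        while (current != null) {
            current.remove(this);
            current = list; // re-read; a concurrent reinsert may have changed it
        }
    }
}

class TaskList {
    private final List<TaskEntry> entries = new ArrayList<>();

    synchronized void add(TaskEntry e) {
        entries.add(e);
        e.list = this;
    }

    // Only unlink the entry if it still belongs to this list.
    synchronized void remove(TaskEntry e) {
        if (e.list == this) {
            entries.remove(e);
            e.list = null;
        }
    }

    synchronized int size() {
        return entries.size();
    }
}
```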


Diffs (updated)
-

  core/src/main/scala/kafka/utils/timer/Timer.scala 
b8cde820a770a4e894804f1c268b24b529940650 
  core/src/main/scala/kafka/utils/timer/TimerTask.scala 
3407138115d579339ffb6b00e32e38c984ac5d6e 
  core/src/main/scala/kafka/utils/timer/TimerTaskList.scala 
e7a96570ddc2367583d6d5590628db7e7f6ba39b 
  core/src/main/scala/kafka/utils/timer/TimingWheel.scala 
e92aba3844dbf3372182e14536a5d98cf3366d73 

Diff: https://reviews.apache.org/r/34734/diff/


Testing
---


Thanks,

Yasuhiro Matsuda



[jira] [Commented] (KAFKA-2168) New consumer poll() can block other calls like position(), commit(), and close() indefinitely

2015-05-28 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563968#comment-14563968
 ] 

Jason Gustafson commented on KAFKA-2168:


This patch is a work in progress. Using the wakeup() method in order to invoke 
close() on NetworkClient does not work as easily as I thought it would due to 
the completeAll() methods which invoke poll() in a loop.

> New consumer poll() can block other calls like position(), commit(), and 
> close() indefinitely
> -
>
> Key: KAFKA-2168
> URL: https://issues.apache.org/jira/browse/KAFKA-2168
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Ewen Cheslack-Postava
>Assignee: Jason Gustafson
> Attachments: KAFKA-2168.patch
>
>
> The new consumer is currently using very coarse-grained synchronization. For 
> most methods this isn't a problem since they finish quickly once the lock is 
> acquired, but poll() might run for a long time (and commonly will since 
> polling with long timeouts is a normal use case). This means any operations 
> invoked from another thread may block until the poll() call completes.
> Some example use cases where this can be a problem:
> * A shutdown hook is registered to trigger shutdown and invokes close(). It 
> gets invoked from another thread and blocks indefinitely.
> * User wants to manage offset commits themselves in a background thread. If 
> the commit policy is not purely time based, it's not currently possible to 
> make sure the call to commit() will be processed promptly.
> Two possible solutions to this:
> 1. Make sure a lock is not held during the actual select call. Since we have 
> multiple layers (KafkaConsumer -> NetworkClient -> Selector -> nio Selector) 
> this is probably hard to make work cleanly since locking is currently only 
> performed at the KafkaConsumer level and we'd want it unlocked around a 
> single line of code in Selector.
> 2. Wake up the selector before synchronizing for certain operations. This 
> would require some additional coordination to make sure the caller of 
> wakeup() is able to acquire the lock promptly (instead of, e.g., the poll() 
> thread being woken up and then promptly reacquiring the lock with a 
> subsequent long poll() call).
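Solution 2 above leans on a real java.nio property: Selector.wakeup() causes a blocked select() to return immediately, and if no select is in progress, the *next* select() returns without blocking. A small demonstration of that mechanism (helper name is made up):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;

class SelectorWakeupDemo {
    // Returns how long select(timeoutMs) actually blocked, in milliseconds.
    // If wakeFirst is set, wakeup() is called before select(), which makes the
    // next select() return immediately -- the property solution 2 relies on.
    static long blockedMillis(long timeoutMs, boolean wakeFirst) {
        try (Selector selector = Selector.open()) {
            if (wakeFirst) {
                selector.wakeup();
            }
            long start = System.nanoTime();
            selector.select(timeoutMs);
            return (System.nanoTime() - start) / 1_000_000;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A thread that wants the consumer lock would call wakeup() first, then block on the lock; the polling thread's select returns early and releases the lock. The coordination problem noted in the ticket (the poller re-acquiring the lock before the waker) is not addressed by this snippet.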





[jira] [Updated] (KAFKA-2185) Update to Gradle 2.4

2015-05-28 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-2185:
---
   Resolution: Fixed
Fix Version/s: 0.8.3
   Status: Resolved  (was: Patch Available)

Thanks for the patch. +1 and committed to trunk.

> Update to Gradle 2.4
> 
>
> Key: KAFKA-2185
> URL: https://issues.apache.org/jira/browse/KAFKA-2185
> Project: Kafka
>  Issue Type: Improvement
>  Components: build
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Minor
> Fix For: 0.8.3
>
> Attachments: KAFKA-2185_2015-05-11_20:55:08.patch
>
>
> Gradle 2.4 has been released recently while Kafka is still using Gradle 2.0. 
> There have been a large number of improvements over the various releases 
> (including performance improvements):
> https://gradle.org/docs/2.1/release-notes
> https://gradle.org/docs/2.2/release-notes
> https://gradle.org/docs/2.3/release-notes
> http://gradle.org/docs/current/release-notes





Re: [VOTE] KIP-21 Dynamic Configuration

2015-05-28 Thread Jay Kreps
That is handled now so I am assuming the same mechanism carries over?

-Jay
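The existing mechanism Jay alludes to can be sketched as a periodic purge of expired sequential change-notification znodes. The 15-minute expiration and all names below are assumptions for illustration, not the actual broker code:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: brokers purge sequential config-change znodes once they are older
// than an expiration interval, keeping the change log from growing forever.
class ChangeNotificationPurger {
    static final long EXPIRATION_MS = 15 * 60 * 1000L; // assumed default

    // Given parallel lists of znode names and creation times, return the
    // znode names that are due for deletion as of nowMs.
    static List<String> expired(List<String> names, List<Long> createTimesMs, long nowMs) {
        List<String> toDelete = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            if (nowMs - createTimesMs.get(i) >= EXPIRATION_MS) {
                toDelete.add(names.get(i));
            }
        }
        return toDelete;
    }
}
```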

On Thu, May 28, 2015 at 5:12 PM, Guozhang Wang  wrote:

> For the sequential config/changes/config_change_XX znode, do we have any
> manners to do cleaning in order to avoid the change-log from growing
> indefinitely?
>
> Guozhang
>
> On Thu, May 28, 2015 at 5:02 PM, Jay Kreps  wrote:
>
> > I still have a couple of questions:
> > 1. Are we introducing a new Java API for the config change protocol and
> if
> > so where will that appear? Is that going to be part of the java api in
> the
> > admin api kip? Let's document that.
> > 2. The proposed JSON format uses camel case for field names, is that what
> > we've used for other JSON in zookeeper?
> > 3. This changes the format of the notifications, right? How will we
> > grandfather in the old format? Clusters will have existing change
> > notifications in the old format so I think the new code will need to be
> > able to read those?
> >
> > -Jay
> >
> > On Thu, May 28, 2015 at 11:41 AM, Aditya Auradkar <
> > aaurad...@linkedin.com.invalid> wrote:
> >
> > > bump
> > >
> > > 
> > > From: Aditya Auradkar
> > > Sent: Tuesday, May 26, 2015 1:16 PM
> > > To: dev@kafka.apache.org
> > > Subject: RE: [VOTE] KIP-21 Dynamic Configuration
> > >
> > > Hey everyone,
> > >
> > > Completed the changes to KIP-4. After today's hangout, there doesn't
> > > appear to be anything remaining to discuss on this KIP.
> > > Please vote so we can formally close this.
> > >
> > > Thanks,
> > > Aditya
> > >
> > > 
> > > From: Aditya Auradkar
> > > Sent: Thursday, May 21, 2015 11:26 AM
> > > To: dev@kafka.apache.org
> > > Subject: RE: [VOTE] KIP-21 Dynamic Configuration
> > >
> > > I think we should remove the config part in TopicMetadataResponse. It's
> > > probably cleaner if Alter and Describe are the only way to view and
> > modify
> > > configs but I don't feel very strongly about it.
> > >
> > > Re-summarizing the proposed changes to KIP-4:
> > > - Change AlterTopic to not allow setting configs. Config changes will
> > flow
> > > through AlterConfig. CreateTopic will still allow setting configs as it
> > is
> > > nice to be able to specify configs while creating the topic.
> > > - TopicMetadataResponse shouldn't return config for the topic.
> > > DescribeConfig is the way to go.
> > > - Change "InvalidTopicConfiguration" error code to
> "InvalidEntityConfig"
> > > as proposed in KIP-21.
> > >
> > > Aditya
> > >
> > > 
> > > From: Jun Rao [j...@confluent.io]
> > > Sent: Thursday, May 21, 2015 10:50 AM
> > > To: dev@kafka.apache.org
> > > Subject: Re: [VOTE] KIP-21 Dynamic Configuration
> > >
> > > What about TopicMetadataResponse in KIP-4? Do we remove the config part
> > in
> > > it?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Thu, May 21, 2015 at 10:25 AM, Aditya Auradkar <
> > > aaurad...@linkedin.com.invalid> wrote:
> > >
> > > > Hey Jun,
> > > >
> > > > I've added a section on error codes on the KIP-21 wiki.
> > > >
> > > > Here are the proposed changes to KIP-4. I'll update the wiki shortly.
> > > > - Change AlterTopic to not allow setting configs. Config changes will
> > > flow
> > > > through AlterConfig. CreateTopic will still allow setting configs as
> it
> > > is
> > > > nice to be able to specify configs while creating the topic.
> > > > - Change "InvalidTopicConfiguration" error code to
> > "InvalidEntityConfig"
> > > > as proposed in KIP-21.
> > > >
> > > >
> > > > Thanks,
> > > > Aditya
> > > >
> > > > 
> > > > From: Jun Rao [j...@confluent.io]
> > > > Sent: Thursday, May 21, 2015 8:41 AM
> > > > To: dev@kafka.apache.org
> > > > Subject: Re: [VOTE] KIP-21 Dynamic Configuration
> > > >
> > > > Aditya,
> > > >
> > > > For completeness, could you list the set of error codes in the wiki?
> > > Also,
> > > > could you summarize the changes that are needed for the requests
> listed
> > > in
> > > > KIP-4 and update the wiki accordingly?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Tue, May 19, 2015 at 10:33 PM, Aditya Auradkar <
> > > > aaurad...@linkedin.com.invalid> wrote:
> > > >
> > > > > Thanks Andrii. I'll make the changes.
> > > > >
> > > > > I've also updated KIP-21 to include the new config requests. Take a
> > > look
> > > > > and vote.
> > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration
> > > > >
> > > > > Aditya
> > > > > 
> > > > > From: Andrii Biletskyi [andrii.bilets...@stealth.ly]
> > > > > Sent: Tuesday, May 19, 2015 2:26 PM
> > > > > To: dev@kafka.apache.org
> > > > > Subject: Re: [VOTE] KIP-21 Dynamic Configuration
> > > > >
> > > > > Hi,
> > > > >
> > > > > Sorry I wasn't able to participate. I don't have objections about
> > > > removing
> > > > > config changes from AlterTopic (as I unders

Re: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)

2015-05-28 Thread Jun Rao
There is a reasonable use case of ISR in KAFKA-2225. Basically, for
economical reasons, we may want to let a consumer fetch from a replica in
ISR that's in the same zone. In order to support that, it will be
convenient to have TMR return the correct ISR for the consumer to choose.

So, perhaps it's worth fixing the ISR inconsistency issue in KAFKA-1367
(there is some new discussion there on what it takes to fix this). If we do
that, we can leave TMR unchanged.

Thanks,

Jun

On Tue, May 26, 2015 at 1:13 PM, Aditya Auradkar <
aaurad...@linkedin.com.invalid> wrote:

> Andryii,
>
> I made a few edits to this document as discussed in the KIP-21 thread.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations
>
> With these changes, the only difference between TopicMetadataResponse_V1
> and V0 is the removal of the ISR field. I've altered the KIP with the
> assumption that this is a good enough reason by itself to evolve the
> request/response protocol. Any concerns there?
>
> Thanks,
> Aditya
>
> 
> From: Mayuresh Gharat [gharatmayures...@gmail.com]
> Sent: Thursday, May 21, 2015 8:29 PM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-4 - Command line and centralized administrative
> operations (Thread 2)
>
> Hi Jun,
>
> Thanks a lot. I get it now.
> Point 4) will actually enable clients that don't want a topic auto-created
> with default partitions: if the topic does not exist, they can manually create
> the topic with their own configs (#partitions).
>
> Thanks,
>
> Mayuresh
>
> On Thu, May 21, 2015 at 6:16 PM, Jun Rao  wrote:
>
> > Mayuresh,
> >
> > The current plan is the following.
> >
> > 1. Add TMR v1, which still triggers auto topic creation.
> > 2. Change the consumer client to TMR v1. Change the producer client to
> use
> > TMR v1 and on UnknownTopicException, issue TopicCreateRequest to
> explicitly
> > create the topic with the default server side partitions and replicas.
> > 3. At some later time after the new clients are released and deployed,
> > disable auto topic creation in TMR v1. This will make sure consumers
> never
> > create new topics.
> > 4. If needed, we can add a new config in the producer to control whether
> > TopicCreateRequest should be issued or not on UnknownTopicException. If
> > this is disabled and the topic doesn't exist, send will fail and the user
> > is expected to create the topic manually.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Thu, May 21, 2015 at 5:27 PM, Mayuresh Gharat <
> > gharatmayures...@gmail.com
> > > wrote:
> >
> > > Hi,
> > > I had a question about TopicMetadata Request.
> > > Currently the way it works is :
> > >
> > > 1) Suppose a topic T1 does not exist.
> > > 2) Client wants to produce data to T1 using producer P1.
> > > 3) Since T1 does not exist, P1 issues a TopicMetadata request to kafka.
> > > This in turn creates the default number of partitions. The number of
> > > partitions is a cluster-wide config.
> > > 4) Same goes for a consumer: if the topic does not exist, a new topic will
> > > be created when the consumer issues a TopicMetadata request.
> > >
> > > Here are 2 use cases where it might not be suited :
> > >
> > > The auto creation flag for topics  is turned  ON.
> > >
> > > a) Some clients might not want to create a topic with the default number
> > > of partitions but with a lower number. Currently, in a multi-tenant
> > > environment this is not possible without changing the cluster-wide
> > > default config.
> > >
> > > b) Some clients might want to just check whether the topic exists, but
> > > currently the topic gets created automatically using the default number
> > > of partitions.
> > >
> > > Here are some ideas to address this :
> > >
> > > 1) The way this can be addressed is that the TopicMetadata request should
> > > have a way to specify whether it should only check if the topic exists, or
> > > check and create the topic with a given number of partitions. If the
> > > number of partitions is not specified, use the default cluster-wide
> > > config.
> > >
> > > OR
> > >
> > > 2) We should only allow TopicMetadata Request to get the metadata
> > > explicitly and not allow it to create a new topic. We should have
> another
> > > Request that takes in config parameters from the user regarding how
> > he/she
> > > wants the topic to be created. This request can be used if we get an
> > empty
> > > TopicMetadata Response.
> > >
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > >
> > > On Thu, May 14, 2015 at 10:22 AM, Jun Rao  wrote:
> > >
> > > > For ListTopics, we decided not to add a ListTopics request for now
> and
> > > just
> > > > rely on passing in an empty list to TMR. We can revisit this in the
> > > future
> > > > if it becomes an issue.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Wed, May 13, 2015 at 3:31 PM, Joel Koshy 
> > wrote:
> > > >
> > > > > Just had a few minor questions before I join 

[jira] [Commented] (KAFKA-1367) Broker topic metadata not kept in sync with ZooKeeper

2015-05-28 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564002#comment-14564002
 ] 

Jun Rao commented on KAFKA-1367:


It's probably better to always let the controller propagate metadata changes to 
the brokers. If the metadata change can be sent from both the controller and 
other brokers, we need additional logic to reason about ordering.

Having the broker send the change to the controller is possible. The 
implication is that there is another thread that can exercise the controller 
logic, instead of just the ZK watcher thread. So, we may need to deal with more 
concurrency issues.

> Broker topic metadata not kept in sync with ZooKeeper
> -
>
> Key: KAFKA-1367
> URL: https://issues.apache.org/jira/browse/KAFKA-1367
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.0, 0.8.1
>Reporter: Ryan Berdeen
>Assignee: Ashish K Singh
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: KAFKA-1367.txt
>
>
> When a broker is restarted, the topic metadata responses from the brokers 
> will be incorrect (different from ZooKeeper) until a preferred replica leader 
> election.
> In the metadata, it looks like leaders are correctly removed from the ISR 
> when a broker disappears, but followers are not. Then, when a broker 
> reappears, the ISR is never updated.
> I used a variation of the Vagrant setup created by Joe Stein to reproduce 
> this with latest from the 0.8.1 branch: 
> https://github.com/also/kafka/commit/dba36a503a5e22ea039df0f9852560b4fb1e067c





[jira] [Commented] (KAFKA-1367) Broker topic metadata not kept in sync with ZooKeeper

2015-05-28 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564012#comment-14564012
 ] 

Joel Koshy commented on KAFKA-1367:
---

One minor issue with depending on the TMR for KAFKA-2225 even if we fix this is 
that the consumer would need to periodically refresh its metadata in case the 
ISR changes after it starts reading from a follower in ISR.

Another approach for KAFKA-2225 is to add the ISR information to the fetch 
response. The followers will then have the current ISR information and so will 
the consumers. There are at least two concerns though: first, it depends on a 
live replica fetcher thread; second, it's a bit hacky to add ISR to fetch 
response as it is more associated with metadata.


> Broker topic metadata not kept in sync with ZooKeeper
> -
>
> Key: KAFKA-1367
> URL: https://issues.apache.org/jira/browse/KAFKA-1367
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.0, 0.8.1
>Reporter: Ryan Berdeen
>Assignee: Ashish K Singh
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: KAFKA-1367.txt
>
>
> When a broker is restarted, the topic metadata responses from the brokers 
> will be incorrect (different from ZooKeeper) until a preferred replica leader 
> election.
> In the metadata, it looks like leaders are correctly removed from the ISR 
> when a broker disappears, but followers are not. Then, when a broker 
> reappears, the ISR is never updated.
> I used a variation of the Vagrant setup created by Joe Stein to reproduce 
> this with latest from the 0.8.1 branch: 
> https://github.com/also/kafka/commit/dba36a503a5e22ea039df0f9852560b4fb1e067c





[jira] [Commented] (KAFKA-1367) Broker topic metadata not kept in sync with ZooKeeper

2015-05-28 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564029#comment-14564029
 ] 

Jun Rao commented on KAFKA-1367:


Well, in that approach, you still have the problem on the very first fetch 
request. If ISR is not returned in TMR, the first fetch request has to go to 
the leader. Then the consumer has to switch to another broker on a subsequent 
request, which seems more complicated.

I am not sure if we need to rely on periodic metadata refresh to detect whether 
a replica is out of sync. Basically, as long as the fetch offset is less than 
HW, the replica can serve the request. If the fetch offset is larger than HW 
(an indication that the replica is out of sync), the consumer will get an 
OffsetOutOfRangeException and has to refresh the metadata and pick a new broker 
to fetch from.
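The serving rule described above can be stated as a tiny predicate (a hypothetical helper following the wording above; the real broker fetch path is more involved, and behavior exactly at the HW is glossed over here):

```java
// Sketch of the follower-serving rule: a replica in ISR can serve data it
// knows is committed, i.e. below its high watermark; past that, signal the
// consumer to refresh metadata and possibly pick another broker.
class ReplicaFetchRule {
    static String handle(long fetchOffset, long highWatermark) {
        if (fetchOffset < highWatermark) {
            return "SERVE";
        }
        return "OFFSET_OUT_OF_RANGE";
    }
}
```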

> Broker topic metadata not kept in sync with ZooKeeper
> -
>
> Key: KAFKA-1367
> URL: https://issues.apache.org/jira/browse/KAFKA-1367
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.0, 0.8.1
>Reporter: Ryan Berdeen
>Assignee: Ashish K Singh
>  Labels: newbie++
> Fix For: 0.8.3
>
> Attachments: KAFKA-1367.txt
>
>
> When a broker is restarted, the topic metadata responses from the brokers 
> will be incorrect (different from ZooKeeper) until a preferred replica leader 
> election.
> In the metadata, it looks like leaders are correctly removed from the ISR 
> when a broker disappears, but followers are not. Then, when a broker 
> reappears, the ISR is never updated.
> I used a variation of the Vagrant setup created by Joe Stein to reproduce 
> this with latest from the 0.8.1 branch: 
> https://github.com/also/kafka/commit/dba36a503a5e22ea039df0f9852560b4fb1e067c





Re: Review Request 34734: Patch for KAFKA-2226

2015-05-28 Thread Jun Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34734/#review85665
---


Thanks for the new patch. A few more clarification questions below.


core/src/main/scala/kafka/utils/timer/TimerTaskList.scala


Could you explain a bit how the extra sync helps?



core/src/main/scala/kafka/utils/timer/TimerTaskList.scala


Since this is under synchronized, it seems that remove should always return 
true?



core/src/main/scala/kafka/utils/timer/TimerTaskList.scala


With this canceled flag, the logic is a bit more complicated since a few 
other places need to check this flag. Not sure how much this helps in reducing 
the probability of having a cancelled operation reinserted into the list. Do 
you think it's worth doing this?



core/src/main/scala/kafka/utils/timer/TimerTaskList.scala


Not sure if I follow this comment.


- Jun Rao


On May 29, 2015, 12:19 a.m., Yasuhiro Matsuda wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34734/
> ---
> 
> (Updated May 29, 2015, 12:19 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-2226
> https://issues.apache.org/jira/browse/KAFKA-2226
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> fix a race condition in TimerTaskEntry.remove
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/utils/timer/Timer.scala 
> b8cde820a770a4e894804f1c268b24b529940650 
>   core/src/main/scala/kafka/utils/timer/TimerTask.scala 
> 3407138115d579339ffb6b00e32e38c984ac5d6e 
>   core/src/main/scala/kafka/utils/timer/TimerTaskList.scala 
> e7a96570ddc2367583d6d5590628db7e7f6ba39b 
>   core/src/main/scala/kafka/utils/timer/TimingWheel.scala 
> e92aba3844dbf3372182e14536a5d98cf3366d73 
> 
> Diff: https://reviews.apache.org/r/34734/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Yasuhiro Matsuda
> 
>



Re: [DISCUSSION] Can we move the ack timeout in ProducerRequest to broker?

2015-05-28 Thread Jiangjie Qin
Jun, Jay and Joel, thanks a lot for the explanations.

Just to summarize what I learned - please correct me if I understand it wrong:
1. The actual replication time is determined by inter-broker
(intra-cluster) communication.
2. The actual replication time for different topics may be different.
3. The replication timeout config means how long a broker wants to wait
for replication. From the producer's point of view, this prevents the producer
from waiting too long for the response, assuming the broker will ever respond.

I agree with Jay that replication timeout should be a part of request
timeout. I actually still think we can remove it from producer side.

As mentioned in (3), before we had a request timeout, the replication timeout
implicitly became a way to keep the producer from waiting too
long, assuming the broker will ever respond. After a producer sends
out a request, it either receives a response or it does not. So the
producer can only control how long it will wait for the
response, and that is the request timeout. In this sense the request timeout
does better because it works even when the broker does not respond.
Hence it looks to me that, from the producer's point of view, the request
timeout covers everything the replication timeout does. So if we have a request
timeout, the replication timeout on the producer side is no longer needed.
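The broker-side check Becket originally proposed (returning a misconfiguration error when the producer's request timeout is smaller than the broker's replication timeout) could look like the following sketch; the method and config names are assumptions:

```java
// Sketch: a produce request whose request timeout is smaller than the broker's
// replication timeout cannot be fully waited out for replicated acks, so the
// broker rejects it as misconfigured rather than responding after the producer
// has already given up.
class TimeoutCheck {
    static void validate(long requestTimeoutMs, long brokerReplicationTimeoutMs) {
        if (requestTimeoutMs < brokerReplicationTimeoutMs) {
            throw new IllegalArgumentException(
                "request timeout (" + requestTimeoutMs + " ms) must be >= the broker's "
                + "replication timeout (" + brokerReplicationTimeoutMs + " ms)");
        }
    }
}
```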

In addition, the replication timeout on the producer side also looks a little
bit awkward. Does it mean different producers can set different replication
timeouts for the same topic, even though the actual replication time of
that topic would be the same for all producers? If so, the
producer-side setting is only reasonable when it is the actual replication
time plus some safety buffer. So this configuration does not really
mean "what the client is willing to wait", but "what the client has to wait if
it wants to produce data". Does that make replication timeout a producer-side
config that the producer does not really have control over if it
wants to produce data?


Please correct me if I miss something.

Thanks,

Jiangjie (Becket) Qin
 


On 5/28/15, 4:26 PM, "Joel Koshy"  wrote:

>> have a Kafka cluster across availability zones, data will be replicated
>>to
>...
>> single timeout on the broker side. In theory, different producers may
>>want
>> to pick different replication time depending on the topics being sent.
>
>I think Becket raises a good point here in that the above
>configurations are best known by the Kafka cluster operators and not
>necessarily by the users (producers). So right now users end up having
>to either know such details about the deployment when in fact it
>should be set by the people who (may manually) assign partitions to
>brokers; or they have to "guess" the timeouts or be content with
>defaults.
>
>Actually this would end up being a LogConfig which would be per-topic
>- i.e., it won't necessarily be a single timeout on the broker.
>
>Thanks,
>
>Joel
>
>On Thu, May 28, 2015 at 04:17:08PM -0700, Jun Rao wrote:
>> Hi, Jiangjie,
>> 
>> The replication time may vary a bit for different partitions. For
>>example,
>> a partition with more replicas may take a bit more time to propagate the
>> messages. Also, the replication time depends on network latency. If you
>> have a Kafka cluster across availability zones, data will be replicated
>>to
>> nodes within the same zone a bit faster than those outside of the zone.
>>So,
>> I am not sure if it's better to just reason about the replication time
>>as a
>> single timeout on the broker side. In theory, different producers may
>>want
>> to pick different replication time depending on the topics being sent.
>> 
>> Thanks,
>> 
>> Jun
>> 
>> On Tue, May 26, 2015 at 4:46 PM, Jiangjie Qin
>>
>> wrote:
>> 
>> > Hi,
>> >
>> > I am updating the wiki for KIP-19 and wondering why we have a
>>replication
>> > timeout on producer side and in producer request?
>> >
>> > From what I understand this is a server side setting and the reasons
>>we
>> > need this replication timeout is because we want to control the
>>purgatory
>> > size. If that is the case should we just have the replication timeout
>>as a
>> > broker configuration?
>> > The downside of having it on server side might be that producer could
>>have
>> > a request timeout/socket timeout smaller than replication timeout. In
>>this
>> > case we can put request timeout in producer request and if the request
>> > timeout is smaller than replication timeout on server side, we return
>>a
>> > mis-cofiguration exception.
>> >
>> > So we can have a producer request V1 which removes ack timeout but
>>adds
>> > request timeout. This will give user a cleaner timeout configurations
>>on
>> > producer side as well.
>> >
>> > What do people think about this?
>> >
>> > Thanks,
>> >
>> > Jiangjie (Becket) Qin
>> >
>> >
>



[jira] [Commented] (KAFKA-2225) Allow fetching from ISR

2015-05-28 Thread Theo Hultberg (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564226#comment-14564226
 ] 

Theo Hultberg commented on KAFKA-2225:
--

Yes, we use the private IP.

> Allow fetching from ISR
> ---
>
> Key: KAFKA-2225
> URL: https://issues.apache.org/jira/browse/KAFKA-2225
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Theo Hultberg
>Assignee: Parth Brahmbhatt
>
> Currently clients are not allowed to fetch from replicas, even if they are in 
> sync with the master. If I'm not missing anything significant, there shouldn't 
> be any difference between fetching from the leader and fetching from an ISR 
> replica, besides maybe some extra latency.
> For our use case it would be very beneficial to be able to fetch from 
> replicas instead of just the leader. We run Kafka clusters that replicate 
> across EC2 availability zones, and each byte sent between zones costs money. 
> This bandwidth usage costs us about the same as it costs to run the instances. 
> If we could fetch from a replica in the same zone as the client we could 
> avoid some of this cost.





RE: [DISCUSS] KIP-4 - Command line and centralized administrative operations (Thread 2)

2015-05-28 Thread Aditya Auradkar
Thanks. Perhaps we should leave TMR unchanged for now. Should we discuss this 
during the next hangout?

Aditya


From: Jun Rao [j...@confluent.io]
Sent: Thursday, May 28, 2015 5:32 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-4 - Command line and centralized administrative 
operations (Thread 2)

There is a reasonable use case for this in KAFKA-2225. Basically, for
economic reasons, we may want to let a consumer fetch from an in-sync
replica that's in the same zone. In order to support that, it will be
convenient to have TMR return the correct ISR for the consumer to choose from.

So, perhaps it's worth fixing the ISR inconsistency issue in KAFKA-1367
(there is some new discussion there on what it takes to fix this). If we do
that, we can leave TMR unchanged.

Thanks,

Jun

On Tue, May 26, 2015 at 1:13 PM, Aditya Auradkar <
aaurad...@linkedin.com.invalid> wrote:

> Andryii,
>
> I made a few edits to this document as discussed in the KIP-21 thread.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations
>
> With these changes, the only difference between TopicMetadataResponse_V1
> and V0 is the removal of the ISR field. I've altered the KIP with the
> assumption that this is a good enough reason by itself to evolve the
> request/response protocol. Any concerns there?
>
> Thanks,
> Aditya
>
> 
> From: Mayuresh Gharat [gharatmayures...@gmail.com]
> Sent: Thursday, May 21, 2015 8:29 PM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-4 - Command line and centralized administrative
> operations (Thread 2)
>
> Hi Jun,
>
> Thanks a lot. I get it now.
>  Point 4) will actually enable clients who don't want to create a topic
> with default partitions: if the topic does not exist, they can manually
> create it with their own configs (#partitions).
>
> Thanks,
>
> Mayuresh
>
> On Thu, May 21, 2015 at 6:16 PM, Jun Rao  wrote:
>
> > Mayuresh,
> >
> > The current plan is the following.
> >
> > 1. Add TMR v1, which still triggers auto topic creation.
> > 2. Change the consumer client to use TMR v1. Change the producer client
> > to use TMR v1 and, on UnknownTopicException, issue a TopicCreateRequest
> > to explicitly create the topic with the default server-side partitions
> > and replicas.
> > 3. At some later time, after the new clients are released and deployed,
> > disable auto topic creation in TMR v1. This will make sure consumers
> > never create new topics.
> > 4. If needed, we can add a new config in the producer to control whether
> > TopicCreateRequest should be issued or not on UnknownTopicException. If
> > this is disabled and the topic doesn't exist, send will fail and the user
> > is expected to create the topic manually.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Thu, May 21, 2015 at 5:27 PM, Mayuresh Gharat <
> > gharatmayures...@gmail.com
> > > wrote:
> >
> > > Hi,
> > > I had a question about TopicMetadata Request.
> > > Currently the way it works is:
> > >
> > > 1) Suppose a topic T1 does not exist.
> > > 2) Client wants to produce data to T1 using producer P1.
> > > 3) Since T1 does not exist, P1 issues a TopicMetadata request to Kafka.
> > > This in turn creates the default number of partitions. The number of
> > > partitions is a cluster-wide config.
> > > 4) Same goes for a consumer: if the topic does not exist, a new topic
> > > will be created when the consumer issues a TopicMetadata request.
> > >
> > > Here are 2 use cases where it might not be well suited:
> > >
> > > The auto-creation flag for topics is turned ON.
> > >
> > > a) Some clients might not want to create a topic with the default
> > > number of partitions but with a lower number of partitions. Currently,
> > > in a multi-tenant environment this is not possible without changing
> > > the cluster-wide default config.
> > >
> > > b) Some clients might want to just check if the topic exists or not,
> > > but currently the topic gets created automatically using the default
> > > number of partitions.
> > >
> > > Here are some ideas to address this:
> > >
> > > 1) The way this can be addressed is that the TopicMetadata request
> > > should have a way to specify whether it should only check if the topic
> > > exists, or check for and create a topic with a given number of
> > > partitions. If the number of partitions is not specified, use the
> > > default cluster-wide config.
> > >
> > > OR
> > >
> > > 2) We should only allow the TopicMetadata request to get the metadata
> > > explicitly, and not allow it to create a new topic. We should have
> > > another request that takes in config parameters from the user regarding
> > > how he/she wants the topic to be created. This request can be used if
> > > we get an empty TopicMetadata response.
> > >
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > >
> > > On Thu, May 14, 2015 at 10:22 AM, Jun Rao  wrote:
> > >
> > > > For ListTopics, we decided not to a

RE: [VOTE] KIP-21 Dynamic Configuration

2015-05-28 Thread Aditya Auradkar
Yeah, the same cleaning mechanism will be carried over.

> 1. Are we introducing a new Java API for the config change protocol and if
> so where will that appear? Is that going to be part of the java api in the
> admin api kip? Let's document that.
Yeah, we need to introduce a new Java API for the config change protocol. It 
should be a part of the AdminClient API. I'll alter KIP-4 to reflect that since 
the API is being introduced there.

> 2. The proposed JSON format uses camel case for field names, is that what
> we've used for other JSON in zookeeper?
Actually, it looks like snake case is what's used for other JSON in ZooKeeper; 
for example, under the "brokers" znode, I found "jmx_port".

> 3. This changes the format of the notifications, right? How will we
> grandfather in the old format? Clusters will have existing change
> notifications in the old format so I think the new code will need to be
> able to read those?
Interesting — I figured the existing notifications were purged frequently by a 
cleaner thread. In that case we wouldn't need to grandfather any notifications, 
since we would only need to avoid making config changes for X minutes for there 
to be no changes left in ZK. But the old notifications are actually removed only 
when a new notification is received or the broker is bounced, so we do need to 
handle notifications in the old format. Should we simply ignore older changes, 
since they are only valid for a short period of time?
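(For concreteness, here is a hypothetical shape for a change-notification znode 
payload in the new format — the field names below are my guess for the sake of 
discussion, not the format KIP-21 actually specifies:)

```json
{
  "version": 1,
  "entity_type": "topics",
  "entity_name": "my-topic"
}
```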

Thanks,
Aditya

From: Jay Kreps [jay.kr...@gmail.com]
Sent: Thursday, May 28, 2015 5:25 PM
To: dev@kafka.apache.org
Subject: Re: [VOTE] KIP-21 Dynamic Configuration

That is handled now so I am assuming the same mechanism carries over?

-Jay

On Thu, May 28, 2015 at 5:12 PM, Guozhang Wang  wrote:

> For the sequential config/changes/config_change_XX znodes, do we have any
> mechanism to do cleaning, in order to keep the change log from growing
> indefinitely?
>
> Guozhang
>
> On Thu, May 28, 2015 at 5:02 PM, Jay Kreps  wrote:
>
> > I still have a couple of questions:
> > 1. Are we introducing a new Java API for the config change protocol and
> if
> > so where will that appear? Is that going to be part of the java api in
> the
> > admin api kip? Let's document that.
> > 2. The proposed JSON format uses camel case for field names, is that what
> > we've used for other JSON in zookeeper?
> > 3. This changes the format of the notifications, right? How will we
> > grandfather in the old format? Clusters will have existing change
> > notifications in the old format so I think the new code will need to be
> > able to read those?
> >
> > -Jay
> >
> > On Thu, May 28, 2015 at 11:41 AM, Aditya Auradkar <
> > aaurad...@linkedin.com.invalid> wrote:
> >
> > > bump
> > >
> > > 
> > > From: Aditya Auradkar
> > > Sent: Tuesday, May 26, 2015 1:16 PM
> > > To: dev@kafka.apache.org
> > > Subject: RE: [VOTE] KIP-21 Dynamic Configuration
> > >
> > > Hey everyone,
> > >
> > > Completed the changes to KIP-4. After today's hangout, there doesn't
> > > appear to be anything remaining to discuss on this KIP.
> > > Please vote so we can formally close this.
> > >
> > > Thanks,
> > > Aditya
> > >
> > > 
> > > From: Aditya Auradkar
> > > Sent: Thursday, May 21, 2015 11:26 AM
> > > To: dev@kafka.apache.org
> > > Subject: RE: [VOTE] KIP-21 Dynamic Configuration
> > >
> > > I think we should remove the config part in TopicMetadataResponse. It's
> > > probably cleaner if Alter and Describe are the only ways to view and
> > > modify configs, but I don't feel very strongly about it.
> > >
> > > Re-summarizing the proposed changes to KIP-4:
> > > - Change AlterTopic to not allow setting configs. Config changes will
> > > flow through AlterConfig. CreateTopic will still allow setting configs,
> > > as it is nice to be able to specify configs while creating the topic.
> > > - TopicMetadataResponse shouldn't return config for the topic.
> > > DescribeConfig is the way to go.
> > > - Change the "InvalidTopicConfiguration" error code to
> > > "InvalidEntityConfig" as proposed in KIP-21.
> > >
> > > Aditya
> > >
> > > 
> > > From: Jun Rao [j...@confluent.io]
> > > Sent: Thursday, May 21, 2015 10:50 AM
> > > To: dev@kafka.apache.org
> > > Subject: Re: [VOTE] KIP-21 Dynamic Configuration
> > >
> > > What about TopicMetadataResponse in KIP-4? Do we remove the config part
> > in
> > > it?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Thu, May 21, 2015 at 10:25 AM, Aditya Auradkar <
> > > aaurad...@linkedin.com.invalid> wrote:
> > >
> > > > Hey Jun,
> > > >
> > > > I've added a section on error codes on the KIP-21 wiki.
> > > >
> > > > Here are the proposed changes to KIP-4. I'll update the wiki shortly.
> > > > - Change AlterTopic to not allow setting configs. Config changes will
> > > flow
> > > > through AlterConfig. CreateTopic will still allow setting configs as
> it

Nagging - pending review requests :)

2015-05-28 Thread Jaikiran Pai
Could someone please look at these few review requests and let me know 
if any changes are needed:


https://reviews.apache.org/r/34394/ related to 
https://issues.apache.org/jira/browse/KAFKA-1907
https://reviews.apache.org/r/30403/ related to 
https://issues.apache.org/jira/browse/KAFKA-1906


There's also this one https://reviews.apache.org/r/34697/ for 
https://issues.apache.org/jira/browse/KAFKA-2221 but it's only been up 
for a couple of days and is a fairly minor one.


-Jaikiran


[jira] [Updated] (KAFKA-2223) Improve distribution of data when using hash-based partitioning

2015-05-28 Thread Gabriel Reid (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabriel Reid updated KAFKA-2223:

Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

Closing as Won't Fix because this issue is already resolved in the new Java 
producer, and this could introduce backwards compatibility issues (in terms of 
different data distribution) in the existing Scala producer.

> Improve distribution of data when using hash-based partitioning
> ---
>
> Key: KAFKA-2223
> URL: https://issues.apache.org/jira/browse/KAFKA-2223
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Gabriel Reid
> Attachments: KAFKA-2223.patch
>
>
> Both the DefaultPartitioner and ByteArrayPartitioner base themselves on the 
> hash code of keys modulo the number of partitions, along the lines of 
> {code}partition = key.hashCode() % numPartitions{code} (converting to 
> absolute value is omitted here)
> This approach is entirely dependent on the _lower bits_ of the hash code 
> being uniformly distributed in order to get good distribution of records over 
> multiple partitions. If the lower bits of the key hash code are not uniformly 
> distributed, the key space will not be uniformly distributed over the 
> partitions.
> It can be surprisingly easy to get a very poor distribution. As a simple 
> example, if the keys are integer values and are all divisible by 2, then only 
> half of the partitions will receive data (as the hash code of an integer is 
> the integer value itself).
> This can even be a problem in situations where you would really not expect 
> it. For example, taking the 8-byte big-endian byte-array representation of 
> longs for each timestamp of each second over a period of 24 hours (at 
> millisecond granularity) and partitioning it over 50 partitions results in 34 
> of the 50 partitions not getting any data at all.
> The easiest way to resolve this is to have a custom HashPartitioner that 
> applies a supplementary hash function to the return value of the key's 
> hashCode method. This same approach is taken in java.util.HashMap for the 
> exact same reason.
> One potential issue for a change like this to the default partitioner could 
> be backward compatibility, if there is some kind of logic expecting that a 
> given key would be sent to a given partition.
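To make the skew described above concrete, here is a small self-contained
sketch (the partition count and key choices are mine for illustration, not from
the attached patch; `spread` is the supplemental bit-spreading hash used by
`java.util.HashMap` in Java 7):

```java
import java.util.HashSet;
import java.util.Set;

public class PartitionSkewDemo {

    // Supplemental hash: the bit-spreading function used by java.util.HashMap
    // in Java 7, which mixes high-order bits into the low-order bits.
    static int spread(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Count how many of numPartitions actually receive data when partitioning
    // the even integer keys 0, 2, ..., 998 (Integer.hashCode() is the value itself).
    static int partitionsUsed(int numPartitions, boolean useSpread) {
        Set<Integer> used = new HashSet<>();
        for (int key = 0; key < 1000; key += 2) {       // every key divisible by 2
            int hash = useSpread ? spread(key) : key;
            used.add(Math.abs(hash) % numPartitions);   // same scheme as the partitioners
        }
        return used.size();
    }

    public static void main(String[] args) {
        // Plain hashCode % n: even keys can never land on an odd partition.
        System.out.println("plain:  " + partitionsUsed(8, false) + " of 8 partitions used");
        // With the supplemental hash, all 8 partitions receive data.
        System.out.println("spread: " + partitionsUsed(8, true) + " of 8 partitions used");
    }
}
```

With plain hashing only the 4 even partitions are used; the supplemental hash
restores full coverage, which is exactly why HashMap applies it.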



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Nagging - pending review requests :)

2015-05-28 Thread Joe Stein
see below

On Fri, May 29, 2015 at 2:25 AM, Jaikiran Pai 
wrote:

> Could someone please look at these few review requests and let me know if
> any changes are needed:
>
> https://reviews.apache.org/r/34394/ related to
> https://issues.apache.org/jira/browse/KAFKA-1907


I haven't looked at all the other changes that would be introduced from
their release, and that could break things between ZooKeeper and Kafka by
introducing a zkclient bump. A less operationally risky way to deal with
this might be to create a pluggable interface; then someone could use a
patched zkclient if they wanted, or Exhibitor, or Consul, or Akka, etc.


>
> https://reviews.apache.org/r/30403/ related to
> https://issues.apache.org/jira/browse/KAFKA-1906


I don't understand the patch and how it would fix the issue. I also don't
think there is necessarily an issue. It's a balance between the community
having a good out-of-the-box experience vs. taking defaults and rushing them
into production. No matter what we do we can't stop the latter from
happening, which will also cause issues.


>
>
> There's also this one https://reviews.apache.org/r/34697/ for
> https://issues.apache.org/jira/browse/KAFKA-2221 but it's only been up
> for a couple of days and is a fairly minor one.


Folks should start to transition in 0.8.3 to the new Java consumer (which
is on trunk). If this fix is so critical we should release it in 0.8.2.2;
otherwise, let's continue to try not to make changes to the existing Scala
consumer.


>
>
> -Jaikiran
>


RE: [VOTE] KIP-21 Dynamic Configuration

2015-05-28 Thread Aditya Auradkar
Minor edit: I meant that we should expect to see change notifications made 
earlier in the old format, but should perhaps ignore them. After the upgrade is 
done, older versions of AdminTools can no longer be used to make config changes.

Aditya


From: Aditya Auradkar
Sent: Thursday, May 28, 2015 11:22 PM
To: dev@kafka.apache.org
Subject: RE: [VOTE] KIP-21 Dynamic Configuration

Yeah, the same cleaning mechanism will be carried over.

> 1. Are we introducing a new Java API for the config change protocol and if
> so where will that appear? Is that going to be part of the java api in the
> admin api kip? Let's document that.
Yeah, we need to introduce a new Java API for the config change protocol. It 
should be a part of the AdminClient API. I'll alter KIP-4 to reflect that since 
the API is being introduced there.

> 2. The proposed JSON format uses camel case for field names, is that what
> we've used for other JSON in zookeeper?
Actually, it looks like snake case is what's used for other JSON in ZooKeeper; 
for example, under the "brokers" znode, I found "jmx_port".

> 3. This changes the format of the notifications, right? How will we
> grandfather in the old format? Clusters will have existing change
> notifications in the old format so I think the new code will need to be
> able to read those?
Interesting — I figured the existing notifications were purged frequently by a 
cleaner thread. In that case we wouldn't need to grandfather any notifications, 
since we would only need to avoid making config changes for X minutes for there 
to be no changes left in ZK. But the old notifications are actually removed only 
when a new notification is received or the broker is bounced, so we do need to 
handle notifications in the old format. Should we simply ignore older changes, 
since they are only valid for a short period of time?

Thanks,
Aditya

From: Jay Kreps [jay.kr...@gmail.com]
Sent: Thursday, May 28, 2015 5:25 PM
To: dev@kafka.apache.org
Subject: Re: [VOTE] KIP-21 Dynamic Configuration

That is handled now so I am assuming the same mechanism carries over?

-Jay

On Thu, May 28, 2015 at 5:12 PM, Guozhang Wang  wrote:

> For the sequential config/changes/config_change_XX znodes, do we have any
> mechanism to do cleaning, in order to keep the change log from growing
> indefinitely?
>
> Guozhang
>
> On Thu, May 28, 2015 at 5:02 PM, Jay Kreps  wrote:
>
> > I still have a couple of questions:
> > 1. Are we introducing a new Java API for the config change protocol and
> if
> > so where will that appear? Is that going to be part of the java api in
> the
> > admin api kip? Let's document that.
> > 2. The proposed JSON format uses camel case for field names, is that what
> > we've used for other JSON in zookeeper?
> > 3. This changes the format of the notifications, right? How will we
> > grandfather in the old format? Clusters will have existing change
> > notifications in the old format so I think the new code will need to be
> > able to read those?
> >
> > -Jay
> >
> > On Thu, May 28, 2015 at 11:41 AM, Aditya Auradkar <
> > aaurad...@linkedin.com.invalid> wrote:
> >
> > > bump
> > >
> > > 
> > > From: Aditya Auradkar
> > > Sent: Tuesday, May 26, 2015 1:16 PM
> > > To: dev@kafka.apache.org
> > > Subject: RE: [VOTE] KIP-21 Dynamic Configuration
> > >
> > > Hey everyone,
> > >
> > > Completed the changes to KIP-4. After today's hangout, there doesn't
> > > appear to be anything remaining to discuss on this KIP.
> > > Please vote so we can formally close this.
> > >
> > > Thanks,
> > > Aditya
> > >
> > > 
> > > From: Aditya Auradkar
> > > Sent: Thursday, May 21, 2015 11:26 AM
> > > To: dev@kafka.apache.org
> > > Subject: RE: [VOTE] KIP-21 Dynamic Configuration
> > >
> > > I think we should remove the config part in TopicMetadataResponse. It's
> > > probably cleaner if Alter and Describe are the only ways to view and
> > > modify configs, but I don't feel very strongly about it.
> > >
> > > Re-summarizing the proposed changes to KIP-4:
> > > - Change AlterTopic to not allow setting configs. Config changes will
> > > flow through AlterConfig. CreateTopic will still allow setting configs,
> > > as it is nice to be able to specify configs while creating the topic.
> > > - TopicMetadataResponse shouldn't return config for the topic.
> > > DescribeConfig is the way to go.
> > > - Change the "InvalidTopicConfiguration" error code to
> > > "InvalidEntityConfig" as proposed in KIP-21.
> > >
> > > Aditya
> > >
> > > 
> > > From: Jun Rao [j...@confluent.io]
> > > Sent: Thursday, May 21, 2015 10:50 AM
> > > To: dev@kafka.apache.org
> > > Subject: Re: [VOTE] KIP-21 Dynamic Configuration
> > >
> > > What about TopicMetadataResponse in KIP-4? Do we remove the config part
> > in
> > > it?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Thu, May 21, 2015 at 10:25 AM