Review Request 31007: Patch for KAFKA-1926

2015-02-13 Thread Tong Li

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31007/
---

Review request for kafka.


Bugs: KAFKA-1926
https://issues.apache.org/jira/browse/KAFKA-1926


Repository: kafka


Description
---

More patch sets will be submitted to address utils.java issues.
The following changes are made in this patch set:

1. Removed Crc32.java so that the core uses the same class
   defined in the client package.
2. Removed the Time trait defined in Time.scala in favor of the
   interface defined in the client Time module.
3. Rewrote the SystemTime object in Time.scala so that it uses
   the client SystemTime class.
4. Refactored all references to Time to use the interface
   defined in the client module.
5. .gitignore previously did not exclude the following folders and
   files created by the build process; they are now ignored:
  core/data/*
  gradle/wrapper/*


Diffs
-

  .gitignore 06a64184eaa531fcbf5586692b78bfd48e4176ba 
  clients/src/main/java/org/apache/kafka/common/utils/Time.java 
66c44de74521ed13cff75d556ebefa8846485b21 
  core/src/main/scala/kafka/cluster/Partition.scala 
e6ad8be5e33b6fb31c078ad78f8de709869ddc04 
  core/src/main/scala/kafka/cluster/Replica.scala 
bd13c20338ce3d73113224440e858a12814e5adb 
  core/src/main/scala/kafka/log/Log.scala 
846023bb98d0fa0603016466360c97071ac935ea 
  core/src/main/scala/kafka/log/LogCleaner.scala 
f8e7cd5fabce78c248a9027c4bb374a792508675 
  core/src/main/scala/kafka/log/LogManager.scala 
47d250af62c1aa53d11204a332d0684fb4217c8d 
  core/src/main/scala/kafka/log/LogSegment.scala 
ac9643423a28d189133705ba69b16cfce23f0049 
  core/src/main/scala/kafka/network/SocketServer.scala 
76ce41aed6e04ac5ba88395c4d5008aca17f9a73 
  core/src/main/scala/kafka/server/KafkaServer.scala 
7e5ddcb9be8fcef3df6ebc82a13ef44ef95f73ae 
  core/src/main/scala/kafka/server/ReplicaManager.scala 
fb948b9ab28c516e81dab14dcbe211dcd99842b6 
  core/src/main/scala/kafka/server/TopicConfigManager.scala 
47295d40131492aaac786273819b7bc6e22e5486 
  core/src/main/scala/kafka/utils/Crc32.java 
0e0e7bcb33886f47b9770a122d00d6eafdbb89a9 
  core/src/main/scala/kafka/utils/Throttler.scala 
d1a144d7882919426824799ff8e8a47f89c83158 
  core/src/main/scala/kafka/utils/Time.scala 
194cc1fa73b6a68914caf46b7dd7d415b2ff6a9f 
  core/src/main/scala/kafka/utils/Utils.scala 
738c1af9ef5de16fdf5130daab69757a14c48b5c 
  core/src/test/scala/unit/kafka/server/ISRExpirationTest.scala 
a703d2715048c5602635127451593903f8d20576 
  core/src/test/scala/unit/kafka/server/LogOffsetTest.scala 
c06ee756bf0fe07e5d3c92823a476c960b37afd6 
  core/src/test/scala/unit/kafka/server/OffsetCommitTest.scala 
5b93239cdc26b5be7696f4e7863adb9fbe5f0ed5 
  core/src/test/scala/unit/kafka/utils/MockScheduler.scala 
c6740782813cbf7c979bf1b94dd829945ef38458 
  core/src/test/scala/unit/kafka/utils/MockTime.scala 
ee65748afefd5e2d699e68fbd402b0ed9a659589 
  core/src/test/scala/unit/kafka/utils/TestUtils.scala 
21d0ed2cb7c9459261d3cdc7c21dece5e2079698 

Diff: https://reviews.apache.org/r/31007/diff/


Testing
---


Thanks,

Tong Li



[jira] [Updated] (KAFKA-1926) Replace kafka.utils.Utils with o.a.k.common.utils.Utils

2015-02-13 Thread Tong Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tong Li updated KAFKA-1926:
---
Attachment: KAFKA-1926.patch

> Replace kafka.utils.Utils with o.a.k.common.utils.Utils
> ---
>
> Key: KAFKA-1926
> URL: https://issues.apache.org/jira/browse/KAFKA-1926
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.8.2.0
>Reporter: Jay Kreps
>  Labels: newbie, patch
> Attachments: KAFKA-1926.patch, KAFKA-1926.patch
>
>
> There is currently a lot of duplication between the Utils class in common and 
> the one in core.
> Our plan has been to deprecate duplicate code in the server and replace it 
> with the new common code.
> As such we should evaluate each method in the Scala Utils and do one of the 
> following:
> 1. Migrate it to o.a.k.common.utils.Utils if it is a sensible general purpose 
> utility in active use that is not Kafka-specific. If we migrate it we should 
> really think about the API and make sure there is some test coverage. A few 
> things in there are kind of funky and we shouldn't just blindly copy them 
> over.
> 2. Create a new class ServerUtils or ScalaUtils in kafka.utils that will hold 
> any utilities that really need to make use of Scala features to be convenient.
> 3. Delete it if it is not used, or has a bad API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1926) Replace kafka.utils.Utils with o.a.k.common.utils.Utils

2015-02-13 Thread Tong Li (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320404#comment-14320404
 ] 

Tong Li commented on KAFKA-1926:


Created reviewboard https://reviews.apache.org/r/31007/diff/
 against branch origin/trunk






[jira] [Commented] (KAFKA-1926) Replace kafka.utils.Utils with o.a.k.common.utils.Utils

2015-02-13 Thread Tong Li (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320416#comment-14320416
 ] 

Tong Li commented on KAFKA-1926:


[~jkreps] Removing Time.scala causes a lot of code changes, since Time.scala 
defines the SystemTime object, which is used all over the place. There is 
another SystemTime class defined in the client utils package as a regular Java 
class. The core uses the SystemTime object with property-style access rather 
than method calls. I can rework that part, but it would mean a lot of code like 
this:

 SystemTime.milliseconds

being changed to:

new SystemTime().milliseconds()

Since the client common utils code defines SystemTime as a regular class, I 
feel that change would also cost quite a bit compared with sticking with the 
Scala SystemTime object and simply accessing its property, which avoids 
creating many instances. Let me know what you think and we can always improve.
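
A middle ground (sketched below with assumed names, not the actual client 
classes) would be to keep one shared instance of the Java SystemTime, so call 
sites stay allocation-free without keeping the Scala trait:

```java
// Sketch with assumed names: a single shared instance of the client-style
// SystemTime class plays the role of the Scala `SystemTime` object, so no
// new instance is created per call.
interface Time {
    long milliseconds();
}

class SystemTime implements Time {
    // One shared instance, analogous to a Scala object.
    static final SystemTime INSTANCE = new SystemTime();

    @Override
    public long milliseconds() {
        return System.currentTimeMillis();
    }
}
```

Call sites would then read SystemTime.INSTANCE.milliseconds(), which stays 
close to the property-style access the Scala object gives today.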

Thanks.






Re: Review Request 29912: Patch for KAFKA-1852

2015-02-13 Thread Eric Olander

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29912/#review72405
---



core/src/main/scala/kafka/server/KafkaApis.scala


This code could be done using map and getOrElse on the Option rather than 
using pattern matching/reflection.

logManager.getLog(topicAndPartition).map { log =>
  fetchOffsetsBefore(log, timestamp, maxNumOffsets)
}.getOrElse {
  if (timestamp == OffsetRequest.LatestTime || timestamp == OffsetRequest.EarliestTime)
    Seq(0L)
  else
    Nil
}


- Eric Olander


On Feb. 13, 2015, 12:46 a.m., Sriharsha Chintalapani wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29912/
> ---
> 
> (Updated Feb. 13, 2015, 12:46 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1852
> https://issues.apache.org/jira/browse/KAFKA-1852
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1852. OffsetCommitRequest can commit offset on unknown topic. Added 
> contains method to MetadataCache.
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> 6ee7d8819a9ef923f3a65c865a0a3d8ded8845f0 
>   core/src/main/scala/kafka/server/MetadataCache.scala 
> 4c70aa7e0157b85de5e24736ebf487239c4571d0 
>   core/src/main/scala/kafka/server/OffsetManager.scala 
> 83d52643028c5628057dc0aa29819becfda61fdb 
>   core/src/test/scala/unit/kafka/server/OffsetCommitTest.scala 
> 5b93239cdc26b5be7696f4e7863adb9fbe5f0ed5 
> 
> Diff: https://reviews.apache.org/r/29912/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sriharsha Chintalapani
> 
>



Re: Review Request 29912: Patch for KAFKA-1852

2015-02-13 Thread Sriharsha Chintalapani


> On Feb. 13, 2015, 6:24 p.m., Eric Olander wrote:
> > core/src/main/scala/kafka/server/KafkaApis.scala, line 303
> > 
> >
> > This code could be done using map and getOrElse on the Option rather 
> > than using pattern matching/reflection.
> > 
> > logManager.getLog(topicAndPartition).map{log => fetchOffsetsBefore(log, 
> > timestamp, maxNumOffsets)
> > }.getOrElse {
> > if (timestamp == OffsetRequest.LatestTime || timestamp == 
> > OffsetRequest.EarliestTime)
> >   Seq(0L)
> > else
> >   Nil
> > }

Eric,
Thanks for the review. But the above code is not added as part of this 
patch and is also not related to this JIRA, so I don't want to change it as 
part of this patch.


- Sriharsha


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29912/#review72405
---





Re: Review Request 29912: Patch for KAFKA-1852

2015-02-13 Thread Gwen Shapira


> On Feb. 13, 2015, 6:24 p.m., Eric Olander wrote:
> > core/src/main/scala/kafka/server/KafkaApis.scala, line 303
> > 
> >
> > This code could be done using map and getOrElse on the Option rather 
> > than using pattern matching/reflection.
> > 
> > logManager.getLog(topicAndPartition).map{log => fetchOffsetsBefore(log, 
> > timestamp, maxNumOffsets)
> > }.getOrElse {
> > if (timestamp == OffsetRequest.LatestTime || timestamp == 
> > OffsetRequest.EarliestTime)
> >   Seq(0L)
> > else
> >   Nil
> > }
> 
> Sriharsha Chintalapani wrote:
> Eric, 
> Thanks for the review. But the above code is not added as part of 
> this patch and also not releated to this JIRA so don't want to change it as 
> part of this patch.

Eric,

I noticed you have a lot of good comments on code style when reviewing patches. 
I think we are all happy to improve code we are touching as part of a patch, 
but we are also trying to keep the scope of patches mostly contained and not 
touch unrelated code. That way, if we accidentally break something, it will be 
easier to figure out what happened.

Feel free to open new JIRAs for those unrelated issues and submit patches.

Gwen


- Gwen


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29912/#review72405
---





Re: Review Request 29912: Patch for KAFKA-1852

2015-02-13 Thread Joel Koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29912/#review72413
---


Minor comment; looks good otherwise.


core/src/main/scala/kafka/server/KafkaApis.scala


(Personally I also prefer the map+getOrElse idiom as it is way more 
concise. It does take getting used to - I use it but some prefer case matching)
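
A minimal self-contained comparison of the two idioms (using Java's Optional 
as a stand-in for Scala's Option; lookup() is a made-up example, not 
KafkaApis code):

```java
import java.util.Optional;

class OptionIdiom {
    // Made-up lookup, standing in for logManager.getLog(...).
    static Optional<Integer> lookup(String key) {
        return key.equals("present") ? Optional.of(42) : Optional.empty();
    }

    // Explicit branching, analogous to Scala pattern matching:
    static int viaBranch(String key) {
        Optional<Integer> v = lookup(key);
        if (v.isPresent()) {
            return v.get() + 1;
        }
        return -1;
    }

    // map + orElse, analogous to Scala's map + getOrElse: same result,
    // more concise.
    static int viaMap(String key) {
        return lookup(key).map(x -> x + 1).orElse(-1);
    }
}
```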



core/src/main/scala/kafka/server/OffsetManager.scala


Minor comment. I think it may be better to pass this in to the OffsetManager.

We should even use it in loadOffsets to discard offsets that are from 
topics that have been deleted. We can do that in a separate JIRA - I don't 
think our handling for clearing out offsets on topic deletion is done yet - 
Onur Karaman did it for ZK-based offsets, but we need a separate JIRA to delete 
Kafka-based offsets.


- Joel Koshy





[jira] [Updated] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments

2015-02-13 Thread Dmitry Pekar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Pekar updated KAFKA-1792:

Attachment: KAFKA-1792_2015-02-13_21:07:06.patch

> change behavior of --generate to produce assignment config with fair replica 
> distribution and minimal number of reassignments
> -
>
> Key: KAFKA-1792
> URL: https://issues.apache.org/jira/browse/KAFKA-1792
> Project: Kafka
>  Issue Type: Sub-task
>  Components: tools
>Reporter: Dmitry Pekar
>Assignee: Dmitry Pekar
> Fix For: 0.8.3
>
> Attachments: KAFKA-1792.patch, KAFKA-1792_2014-12-03_19:24:56.patch, 
> KAFKA-1792_2014-12-08_13:42:43.patch, KAFKA-1792_2014-12-19_16:48:12.patch, 
> KAFKA-1792_2015-01-14_12:54:52.patch, KAFKA-1792_2015-01-27_19:09:27.patch, 
> KAFKA-1792_2015-02-13_21:07:06.patch, generate_alg_tests.txt, 
> rebalance_use_cases.txt
>
>
> The current implementation produces a fair replica distribution across the 
> specified list of brokers. Unfortunately, it doesn't take into account the 
> current replica assignment.
> So if we have, for instance, 3 brokers id=[0..2] and are going to add a 
> fourth broker id=3, generate will create an assignment config which 
> redistributes replicas fairly across brokers [0..3], just as if those 
> partitions were created from scratch. It will not take the current replica 
> assignment into consideration and accordingly will not try to minimize the 
> number of replica moves between brokers.
> As proposed by [~charmalloc] this should be improved. The new output of the 
> improved --generate algorithm should satisfy the following requirements:
> - fairness of replica distribution - every broker will have R or R+1 replicas 
> assigned;
> - minimum of reassignments - the number of replica moves between brokers will 
> be minimal.
> Example.
> Consider the following replica distribution across brokers [0..3] (we just 
> added brokers 2 and 3):
> - broker - 0, 1, 2, 3 
> - replicas - 7, 6, 0, 0
> The new algorithm will produce the following assignment:
> - broker - 0, 1, 2, 3 
> - replicas - 4, 3, 3, 3
> - moves - -3, -3, +3, +3
> It will be fair, and the number of moves will be 6, which is minimal for the 
> specified initial distribution.
> The scope of this issue is:
> - design an algorithm matching the above requirements;
> - implement this algorithm and unit tests;
> - test it manually using different initial assignments.
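
The fairness and minimal-moves requirements above can be sketched as follows 
(an illustration of the stated requirements, not the patch's actual algorithm):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FairAssignment {
    // Each broker ends with R or R+1 replicas, where R = total / brokerCount.
    // Giving the R+1 slots to the brokers that already hold the most replicas
    // minimizes the number of moves.
    static Map<Integer, Integer> targets(Map<Integer, Integer> current) {
        int total = current.values().stream().mapToInt(Integer::intValue).sum();
        int n = current.size();
        int r = total / n;
        int extras = total % n;
        List<Integer> byLoad = new ArrayList<>(current.keySet());
        byLoad.sort((a, b) -> current.get(b) - current.get(a)); // most loaded first
        Map<Integer, Integer> out = new HashMap<>();
        for (int i = 0; i < byLoad.size(); i++) {
            out.put(byLoad.get(i), i < extras ? r + 1 : r);
        }
        return out;
    }

    // Number of replica moves implied by going from `current` to `target`.
    static int moves(Map<Integer, Integer> current, Map<Integer, Integer> target) {
        int m = 0;
        for (Map.Entry<Integer, Integer> e : target.entrySet()) {
            m += Math.max(0, e.getValue() - current.get(e.getKey()));
        }
        return m;
    }
}
```

For the example above (replicas 7, 6, 0, 0 on brokers 0..3) this yields 
targets of 4, 3, 3, 3 and 6 moves, matching the expected output.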





Re: Review Request 28481: Patch for KAFKA-1792

2015-02-13 Thread Dmitry Pekar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28481/
---

(Updated Feb. 13, 2015, 7:07 p.m.)


Review request for kafka.


Bugs: KAFKA-1792
https://issues.apache.org/jira/browse/KAFKA-1792


Repository: kafka


Description (updated)
---

KAFKA-1792: CR


KAFKA-1792: CR2


KAFKA-1792: merge of KAFKA-1753


KAFKA-1792: generate renamed to rebalance


KAFKA-1792: --rebalance renamed back to --generate, removed 
--decommission-broker command


KAFKA-1792: added back --decommission-broker command


Diffs (updated)
-

  core/src/main/scala/kafka/admin/AdminUtils.scala 
28b12c7b89a56c113b665fbde1b95f873f8624a3 
  core/src/main/scala/kafka/admin/ReassignPartitionsCommand.scala 
979992b68af3723cd229845faff81c641123bb88 
  core/src/test/scala/unit/kafka/admin/AdminTest.scala 
e28979827110dfbbb92fe5b152e7f1cc973de400 
  topics.json ff011ed381e781b9a177036001d44dca3eac586f 

Diff: https://reviews.apache.org/r/28481/diff/


Testing
---


Thanks,

Dmitry Pekar



[jira] [Commented] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments

2015-02-13 Thread Dmitry Pekar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320584#comment-14320584
 ] 

Dmitry Pekar commented on KAFKA-1792:
-

Updated reviewboard https://reviews.apache.org/r/28481/diff/
 against branch origin/trunk





[jira] [Comment Edited] (KAFKA-1792) change behavior of --generate to produce assignment config with fair replica distribution and minimal number of reassignments

2015-02-13 Thread Dmitry Pekar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320584#comment-14320584
 ] 

Dmitry Pekar edited comment on KAFKA-1792 at 2/13/15 7:09 PM:
--

Updated reviewboard https://reviews.apache.org/r/28481/diff/
 against branch origin/trunk

This patch contains the restored --decommission-broker command.


was (Author: dmitry pekar):
Updated reviewboard https://reviews.apache.org/r/28481/diff/
 against branch origin/trunk






[jira] [Commented] (KAFKA-1659) Ability to cleanly abort the KafkaProducer

2015-02-13 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320673#comment-14320673
 ] 

Guozhang Wang commented on KAFKA-1659:
--

One big use case for this function call would be preserving ordering in the 
producer: say we have called send() for messages 1, 2 and 3; when we get an 
error in the callback for message 2, we do not want the producer to continue 
sending 3 after it finishes this callback, and one way to prevent that would be 
aborting the producer in the callback. Of course, an even better solution would 
be an API such as drop() that cleans up the messages for the corresponding 
topic/partition in the buffer.

If we are pursuing the abort() function, since it may be called in the 
callback, killing the io-thread right away may not work appropriately.

An alternative approach would be adding an aborted flag to Sender, and have

{code}
public void abort() {
  this.sender.abort();
  this.close();
}
{code}

and in the Sender class, after the main loop, check whether aborted == true; if 
so, clean up the buffer and return immediately, otherwise flush the buffer and 
return.
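
A runnable sketch of that approach (names assumed; this is not the actual 
Sender class):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class SenderSketch {
    // Volatile so abort() is safe to call from a callback thread.
    private volatile boolean aborted = false;
    private final Queue<String> buffer = new ArrayDeque<>(List.of("m1", "m2", "m3"));
    final List<String> sent = new ArrayList<>();

    void abort() {
        aborted = true;
    }

    // One pass of the I/O loop: on abort, drop the buffered records and
    // return immediately instead of flushing them.
    void runOnce() {
        if (aborted) {
            buffer.clear();
            return;
        }
        if (!buffer.isEmpty()) {
            sent.add(buffer.poll());
        }
    }
}
```

Calling abort() from the callback for message 2 would then stop message 3 
from being sent, which is the ordering-preservation use case described above.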

> Ability to cleanly abort the KafkaProducer
> --
>
> Key: KAFKA-1659
> URL: https://issues.apache.org/jira/browse/KAFKA-1659
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, producer 
>Affects Versions: 0.8.2.0
>Reporter: Andrew Stein
>Assignee: Jun Rao
> Fix For: 0.8.3
>
>
> I would like the ability to "abort" the Java Client's KafkaProducer. This 
> includes stopping the writing of buffered records.
> The motivation for this is described 
> [here|http://mail-archives.apache.org/mod_mbox/kafka-dev/201409.mbox/%3CCAOk4UxB7BJm6HSgLXrR01sksB2dOC3zdt0NHaKHz1EALR6%3DCTQ%40mail.gmail.com%3E].
> A sketch of this method is:
> {code}
> public void abort() {
> try {
> ioThread.interrupt();
> ioThread.stop(new ThreadDeath());
> } catch (IllegalAccessException e) {
> }
> }
> {code}
> but of course it is preferable to stop the {{ioThread}} by cooperation, 
> rather than use the deprecated {{Thread.stop(new ThreadDeath())}}.





[jira] [Commented] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior

2015-02-13 Thread Idcmp (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320746#comment-14320746
 ] 

Idcmp commented on KAFKA-1461:
--

This issue can be tickled on a multi-broker configuration by having brokers 
advertise host names that do not exist. 


> Replica fetcher thread does not implement any back-off behavior
> ---
>
> Key: KAFKA-1461
> URL: https://issues.apache.org/jira/browse/KAFKA-1461
> Project: Kafka
>  Issue Type: Improvement
>  Components: replication
>Affects Versions: 0.8.1.1
>Reporter: Sam Meder
>Assignee: Sriharsha Chintalapani
>  Labels: newbie++
> Fix For: 0.8.3
>
>
> The current replica fetcher thread will retry in a tight loop if any error 
> occurs during the fetch call. For example, we've seen cases where the fetch 
> continuously throws a connection refused exception leading to several replica 
> fetcher threads that spin in a pretty tight loop.
> To a much lesser degree this is also an issue in the consumer fetcher thread, 
> although the fact that erroring partitions are removed so a leader can be 
> re-discovered helps some.
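
A sketch of the back-off the issue asks for (parameter values are 
illustrative): instead of retrying in a tight loop, wait an exponentially 
growing, capped interval between failed fetch attempts.

```java
class Backoff {
    static long backoffMs(int attempt, long baseMs, long maxMs) {
        // Cap the shift so the value cannot overflow for large attempt counts.
        long uncapped = baseMs << Math.min(attempt, 20);
        return Math.min(maxMs, uncapped);
    }
}
```

With baseMs = 100 and maxMs = 30000, successive failed attempts would wait 
100, 200, 400, ... ms, leveling off at 30 seconds.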





[jira] [Comment Edited] (KAFKA-1461) Replica fetcher thread does not implement any back-off behavior

2015-02-13 Thread Idcmp (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320746#comment-14320746
 ] 

Idcmp edited comment on KAFKA-1461 at 2/13/15 9:04 PM:
---

This issue can be tickled on a multi-broker configuration by having brokers 
advertise host names that do not exist.  (Say for example you're running Kafka 
in docker containers with custom hostnames :-)


was (Author: idcmp):
This issue can be tickled on a multi-broker configuration by having brokers 
advertise host names that do not exist. 







[jira] [Updated] (KAFKA-1697) remove code related to ack>1 on the broker

2015-02-13 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy updated KAFKA-1697:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the patch. Pushed to trunk.

> remove code related to ack>1 on the broker
> --
>
> Key: KAFKA-1697
> URL: https://issues.apache.org/jira/browse/KAFKA-1697
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jun Rao
>Assignee: Gwen Shapira
> Fix For: 0.8.3
>
> Attachments: KAFKA-1697.patch, KAFKA-1697_2015-01-14_15:41:37.patch, 
> KAFKA-1697_2015-02-10_17:06:51.patch, KAFKA-1697_2015-02-11_18:45:42.patch, 
> KAFKA-1697_2015-02-11_18:47:53.patch, KAFKA-1697_2015-02-11_23:13:53.patch, 
> KAFKA-1697_2015-02-12_18:57:36.patch
>
>
> We removed the ack>1 support from the producer client in kafka-1555. We can 
> completely remove the code in the broker that supports ack>1.
> Also, we probably want to make NotEnoughReplicasAfterAppend a non-retriable 
> exception and let the client decide what to do.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1951) Consumer offset checker should handle corner cases for dual commit or no offsets.

2015-02-13 Thread Jiangjie Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangjie Qin updated KAFKA-1951:

Assignee: Jiangjie Qin
  Status: Patch Available  (was: Open)

> Consumer offset checker should handle corner cases for dual commit or no 
> offsets.
> -
>
> Key: KAFKA-1951
> URL: https://issues.apache.org/jira/browse/KAFKA-1951
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Attachments: KAFKA-1951.patch
>
>
> Currently, if consumer offsets exist in Kafka, the offset checker assumes 
> those are the latest offsets and skips checking ZooKeeper. It should print 
> the offsets in both places.
> When the offset checker checks the offsets for a non-existent consumer 
> group, it throws an exception complaining that the zk path does not exist. 
> It should respond with offset N/A.
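The desired behavior can be sketched as below; `OffsetReport` and the `Optional`-based lookups are hypothetical stand-ins, not the actual checker code:

```java
// Sketch: report offsets from both stores, printing "N/A" instead of
// throwing when a group has no offset in one (or either) place.
import java.util.Optional;

class OffsetReport {
    // kafkaOffset / zkOffset are empty when no offset exists in that store.
    static String format(Optional<Long> kafkaOffset, Optional<Long> zkOffset) {
        return "kafka=" + kafkaOffset.map(String::valueOf).orElse("N/A")
             + " zookeeper=" + zkOffset.map(String::valueOf).orElse("N/A");
    }

    public static void main(String[] args) {
        System.out.println(format(Optional.of(42L), Optional.empty()));
        // kafka=42 zookeeper=N/A
    }
}
```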



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1951) Consumer offset checker should handle corner cases for dual commit or no offsets.

2015-02-13 Thread Jiangjie Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangjie Qin updated KAFKA-1951:

Attachment: KAFKA-1951.patch

> Consumer offset checker should handle corner cases for dual commit or no 
> offsets.
> -
>
> Key: KAFKA-1951
> URL: https://issues.apache.org/jira/browse/KAFKA-1951
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jiangjie Qin
> Attachments: KAFKA-1951.patch
>
>
> Currently, if consumer offsets exist in Kafka, the offset checker assumes 
> those are the latest offsets and skips checking ZooKeeper. It should print 
> the offsets in both places.
> When the offset checker checks the offsets for a non-existent consumer 
> group, it throws an exception complaining that the zk path does not exist. 
> It should respond with offset N/A.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 31021: Patch for KAFKA-1951

2015-02-13 Thread Jiangjie Qin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31021/
---

Review request for kafka.


Bugs: KAFKA-1951
https://issues.apache.org/jira/browse/KAFKA-1951


Repository: kafka


Description
---

Fix for KAFKA-1951


Diffs
-

  core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala 
d1e7c434e77859d746b8dc68dd5d5a3740425e79 

Diff: https://reviews.apache.org/r/31021/diff/


Testing
---


Thanks,

Jiangjie Qin



[jira] [Commented] (KAFKA-1951) Consumer offset checker should handle corner cases for dual commit or no offsets.

2015-02-13 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320800#comment-14320800
 ] 

Jiangjie Qin commented on KAFKA-1951:
-

Created reviewboard https://reviews.apache.org/r/31021/diff/
 against branch origin/trunk

> Consumer offset checker should handle corner cases for dual commit or no 
> offsets.
> -
>
> Key: KAFKA-1951
> URL: https://issues.apache.org/jira/browse/KAFKA-1951
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jiangjie Qin
> Attachments: KAFKA-1951.patch
>
>
> Currently, if consumer offsets exist in Kafka, the offset checker assumes 
> those are the latest offsets and skips checking ZooKeeper. It should print 
> the offsets in both places.
> When the offset checker checks the offsets for a non-existent consumer 
> group, it throws an exception complaining that the zk path does not exist. 
> It should respond with offset N/A.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-1952) High CPU Usage in 0.8.2 release

2015-02-13 Thread Jay Kreps (JIRA)
Jay Kreps created KAFKA-1952:


 Summary: High CPU Usage in 0.8.2 release
 Key: KAFKA-1952
 URL: https://issues.apache.org/jira/browse/KAFKA-1952
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.2.0
Reporter: Jay Kreps
Assignee: Jun Rao
Priority: Critical
 Fix For: 0.8.2.0


Brokers with high partition count see increased CPU usage when migrating from 
0.8.1.1 to 0.8.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1659) Ability to cleanly abort the KafkaProducer

2015-02-13 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320981#comment-14320981
 ] 

Jay Kreps commented on KAFKA-1659:
--

But there would be a race condition between the sender sending 3 and the 
client calling abort(), right? I don't think you could actually depend on 
that working, could you? I think this ticket/API is equivalent to 
close(0, TimeUnit.MILLISECONDS), right?

> Ability to cleanly abort the KafkaProducer
> --
>
> Key: KAFKA-1659
> URL: https://issues.apache.org/jira/browse/KAFKA-1659
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, producer 
>Affects Versions: 0.8.2.0
>Reporter: Andrew Stein
>Assignee: Jun Rao
> Fix For: 0.8.3
>
>
> I would like the ability to "abort" the Java Client's KafkaProducer. This 
> includes stopping the writing of buffered records.
> The motivation for this is described 
> [here|http://mail-archives.apache.org/mod_mbox/kafka-dev/201409.mbox/%3CCAOk4UxB7BJm6HSgLXrR01sksB2dOC3zdt0NHaKHz1EALR6%3DCTQ%40mail.gmail.com%3E].
> A sketch of this method is:
> {code}
> public void abort() {
> try {
> ioThread.interrupt();
> ioThread.stop(new ThreadDeath());
> } catch (IllegalAccessException e) {
> }
> }
> {code}
> but of course it is preferable to stop the {{ioThread}} by cooperation, 
> rather than use the deprecated {{Thread.stop(new ThreadDeath())}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-1953) Disambiguate metrics from different purgatories

2015-02-13 Thread Joel Koshy (JIRA)
Joel Koshy created KAFKA-1953:
-

 Summary: Disambiguate metrics from different purgatories
 Key: KAFKA-1953
 URL: https://issues.apache.org/jira/browse/KAFKA-1953
 Project: Kafka
  Issue Type: Sub-task
Reporter: Joel Koshy
Assignee: Joel Koshy


After the purgatory refactoring, all the different purgatories map to the same 
metric names. We need to disambiguate.
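One straightforward fix, sketched here with made-up metric names (not Kafka's actual metrics code), is to tag each metric with its purgatory type so that, e.g., Produce and Fetch purgatories no longer collide:

```java
// Sketch: keep per-purgatory metrics apart by including the purgatory
// type in the metric key. All names here are illustrative only.
import java.util.HashMap;
import java.util.Map;

class PurgatoryMetrics {
    private final Map<String, Long> counters = new HashMap<>();

    private static String key(String purgatoryType, String metric) {
        return metric + ",delayedOperation=" + purgatoryType;
    }

    void increment(String purgatoryType, String metric) {
        counters.merge(key(purgatoryType, metric), 1L, Long::sum);
    }

    long get(String purgatoryType, String metric) {
        return counters.getOrDefault(key(purgatoryType, metric), 0L);
    }

    public static void main(String[] args) {
        PurgatoryMetrics m = new PurgatoryMetrics();
        m.increment("Produce", "PurgatorySize");
        m.increment("Fetch", "PurgatorySize");
        System.out.println(m.get("Produce", "PurgatorySize")); // 1
    }
}
```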



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1659) Ability to cleanly abort the KafkaProducer

2015-02-13 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320991#comment-14320991
 ] 

Guozhang Wang commented on KAFKA-1659:
--

Okay, I think we may need to treat the following two cases separately:

1. another caller thread needs to "abort" the producer process (including the 
io thread).
2. the io thread itself needs to abort the process, e.g. from the callback 
error handling.

What I was thinking about is case 2) above. As for case 1), I agree that 
close(timeout) will work, but I am wondering if we can have a single API that 
covers both.
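A single cooperative API could look like the sketch below: a volatile flag that any thread (including a send callback running on the io thread) may set, in place of the deprecated Thread.stop(). SenderLoop and its members are hypothetical, not the real producer internals.

```java
// Sketch: cooperative abort via a volatile flag. The io thread checks the
// flag on each iteration and drops buffered records once aborted.
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class SenderLoop implements Runnable {
    private final BlockingQueue<String> records = new LinkedBlockingQueue<>();
    private volatile boolean aborted = false;
    volatile int sent = 0;

    // Safe to call from any thread, including the io thread's own callbacks.
    void abort() { aborted = true; }

    void send(String record) { records.add(record); }

    @Override public void run() {
        while (!aborted) {
            try {
                String record = records.poll(10, TimeUnit.MILLISECONDS);
                if (record != null) sent++;   // stand-in for a network send
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
        records.clear();   // buffered records are dropped, not sent
    }

    public static void main(String[] args) throws InterruptedException {
        SenderLoop loop = new SenderLoop();
        Thread io = new Thread(loop);
        io.start();
        loop.send("record-1");
        loop.abort();
        io.join(1000);
        System.out.println("io thread alive: " + io.isAlive());
    }
}
```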


> Ability to cleanly abort the KafkaProducer
> --
>
> Key: KAFKA-1659
> URL: https://issues.apache.org/jira/browse/KAFKA-1659
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, producer 
>Affects Versions: 0.8.2.0
>Reporter: Andrew Stein
>Assignee: Jun Rao
> Fix For: 0.8.3
>
>
> I would like the ability to "abort" the Java Client's KafkaProducer. This 
> includes stopping the writing of buffered records.
> The motivation for this is described 
> [here|http://mail-archives.apache.org/mod_mbox/kafka-dev/201409.mbox/%3CCAOk4UxB7BJm6HSgLXrR01sksB2dOC3zdt0NHaKHz1EALR6%3DCTQ%40mail.gmail.com%3E].
> A sketch of this method is:
> {code}
> public void abort() {
> try {
> ioThread.interrupt();
> ioThread.stop(new ThreadDeath());
> } catch (IllegalAccessException e) {
> }
> }
> {code}
> but of course it is preferable to stop the {{ioThread}} by cooperation, 
> rather than use the deprecated {{Thread.stop(new ThreadDeath())}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1659) Ability to cleanly abort the KafkaProducer

2015-02-13 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321015#comment-14321015
 ] 

Jay Kreps commented on KAFKA-1659:
--

Hmm, but how useful is being able to shut down the producer from inside the 
callback? I don't think that actually works to guarantee no additional 
messages are sent, since they may already have been sent. And also that is a 
pretty heavy-handed way to accomplish it, no? Isn't the fix here really 
idempotent retries?

> Ability to cleanly abort the KafkaProducer
> --
>
> Key: KAFKA-1659
> URL: https://issues.apache.org/jira/browse/KAFKA-1659
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, producer 
>Affects Versions: 0.8.2.0
>Reporter: Andrew Stein
>Assignee: Jun Rao
> Fix For: 0.8.3
>
>
> I would like the ability to "abort" the Java Client's KafkaProducer. This 
> includes stopping the writing of buffered records.
> The motivation for this is described 
> [here|http://mail-archives.apache.org/mod_mbox/kafka-dev/201409.mbox/%3CCAOk4UxB7BJm6HSgLXrR01sksB2dOC3zdt0NHaKHz1EALR6%3DCTQ%40mail.gmail.com%3E].
> A sketch of this method is:
> {code}
> public void abort() {
> try {
> ioThread.interrupt();
> ioThread.stop(new ThreadDeath());
> } catch (IllegalAccessException e) {
> }
> }
> {code}
> but of course it is preferable to stop the {{ioThread}} by cooperation, 
> rather than use the deprecated {{Thread.stop(new ThreadDeath())}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1659) Ability to cleanly abort the KafkaProducer

2015-02-13 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321029#comment-14321029
 ] 

Guozhang Wang commented on KAFKA-1659:
--

Yes, it works only with max-inflight-request == 1 (this is the setting for 
ordering-preserving apps like mirror-maker, samza and database replication 
anyway), but the logic itself should be very simple.

I am not sure how idempotent producers can help here; can you elaborate?

> Ability to cleanly abort the KafkaProducer
> --
>
> Key: KAFKA-1659
> URL: https://issues.apache.org/jira/browse/KAFKA-1659
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, producer 
>Affects Versions: 0.8.2.0
>Reporter: Andrew Stein
>Assignee: Jun Rao
> Fix For: 0.8.3
>
>
> I would like the ability to "abort" the Java Client's KafkaProducer. This 
> includes stopping the writing of buffered records.
> The motivation for this is described 
> [here|http://mail-archives.apache.org/mod_mbox/kafka-dev/201409.mbox/%3CCAOk4UxB7BJm6HSgLXrR01sksB2dOC3zdt0NHaKHz1EALR6%3DCTQ%40mail.gmail.com%3E].
> A sketch of this method is:
> {code}
> public void abort() {
> try {
> ioThread.interrupt();
> ioThread.stop(new ThreadDeath());
> } catch (IllegalAccessException e) {
> }
> }
> {code}
> but of course it is preferable to stop the {{ioThread}} by cooperation, 
> rather than use the deprecated {{Thread.stop(new ThreadDeath())}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1659) Ability to cleanly abort the KafkaProducer

2015-02-13 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321071#comment-14321071
 ] 

Jay Kreps commented on KAFKA-1659:
--

I understood the original problem we were trying to solve to be ensuring we 
don't publish messages out of order, which the idempotent producer stuff would 
do.

> Ability to cleanly abort the KafkaProducer
> --
>
> Key: KAFKA-1659
> URL: https://issues.apache.org/jira/browse/KAFKA-1659
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, producer 
>Affects Versions: 0.8.2.0
>Reporter: Andrew Stein
>Assignee: Jun Rao
> Fix For: 0.8.3
>
>
> I would like the ability to "abort" the Java Client's KafkaProducer. This 
> includes stopping the writing of buffered records.
> The motivation for this is described 
> [here|http://mail-archives.apache.org/mod_mbox/kafka-dev/201409.mbox/%3CCAOk4UxB7BJm6HSgLXrR01sksB2dOC3zdt0NHaKHz1EALR6%3DCTQ%40mail.gmail.com%3E].
> A sketch of this method is:
> {code}
> public void abort() {
> try {
> ioThread.interrupt();
> ioThread.stop(new ThreadDeath());
> } catch (IllegalAccessException e) {
> }
> }
> {code}
> but of course it is preferable to stop the {{ioThread}} by cooperation, 
> rather than use the deprecated {{Thread.stop(new ThreadDeath())}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 31040: Patch for kafka-1952

2015-02-13 Thread Jun Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31040/
---

Review request for kafka.


Bugs: kafka-1952
https://issues.apache.org/jira/browse/kafka-1952


Repository: kafka


Description
---

KAFKA-1952; High CPU Usage in 0.8.2 release


Diffs
-

  core/src/main/scala/kafka/server/RequestPurgatory.scala 
9d76234bc2c810ec08621dc92bb4061b8e7cd993 

Diff: https://reviews.apache.org/r/31040/diff/


Testing
---


Thanks,

Jun Rao



[jira] [Updated] (KAFKA-1952) High CPU Usage in 0.8.2 release

2015-02-13 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-1952:
---
Attachment: kafka-1952.patch

> High CPU Usage in 0.8.2 release
> ---
>
> Key: KAFKA-1952
> URL: https://issues.apache.org/jira/browse/KAFKA-1952
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.0
>Reporter: Jay Kreps
>Assignee: Jun Rao
>Priority: Critical
> Fix For: 0.8.2.0
>
> Attachments: kafka-1952.patch
>
>
> Brokers with high partition count see increased CPU usage when migrating from 
> 0.8.1.1 to 0.8.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1952) High CPU Usage in 0.8.2 release

2015-02-13 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321131#comment-14321131
 ] 

Jun Rao commented on KAFKA-1952:


The issue is that after purgatory refactoring, we make a checkSatisfied() call 
on every key in the request. However, each checkSatisfied() goes through every 
partition in the request. So, if you have 3K partitions in the request (like 
the replica fetcher), we will be making 9 million calls to methods like 
getPartition(). That's what's burning the CPU. This issue seems to only happen 
in a cluster when there is no data being produced.

Attached a patch that addresses the issue.
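The quadratic blow-up described above can be illustrated with a small sketch (`PurgatoryCost` is illustrative, not the actual RequestPurgatory code):

```java
// Sketch: one checkSatisfied() call per key, where each call scans every
// partition in the request, costs keys * partitions method calls in total.
class PurgatoryCost {
    static long calls = 0;

    // Stand-in for checkSatisfied(): walks all partitions in the request.
    static void checkSatisfied(int partitionsInRequest) {
        for (int p = 0; p < partitionsInRequest; p++) calls++; // e.g. getPartition()
    }

    public static void main(String[] args) {
        int partitions = 3000;                 // 3K partitions, as in the report
        for (int key = 0; key < partitions; key++)
            checkSatisfied(partitions);        // one call per watched key
        System.out.println(calls);             // 9000000
    }
}
```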

> High CPU Usage in 0.8.2 release
> ---
>
> Key: KAFKA-1952
> URL: https://issues.apache.org/jira/browse/KAFKA-1952
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.0
>Reporter: Jay Kreps
>Assignee: Jun Rao
>Priority: Critical
> Fix For: 0.8.2.0
>
> Attachments: kafka-1952.patch
>
>
> Brokers with high partition count see increased CPU usage when migrating from 
> 0.8.1.1 to 0.8.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-1952) High CPU Usage in 0.8.2 release

2015-02-13 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-1952:
---
Status: Patch Available  (was: Open)

> High CPU Usage in 0.8.2 release
> ---
>
> Key: KAFKA-1952
> URL: https://issues.apache.org/jira/browse/KAFKA-1952
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.0
>Reporter: Jay Kreps
>Assignee: Jun Rao
>Priority: Critical
> Fix For: 0.8.2.0
>
> Attachments: kafka-1952.patch
>
>
> Brokers with high partition count see increased CPU usage when migrating from 
> 0.8.1.1 to 0.8.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1952) High CPU Usage in 0.8.2 release

2015-02-13 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321130#comment-14321130
 ] 

Jun Rao commented on KAFKA-1952:


Created reviewboard https://reviews.apache.org/r/31040/diff/
 against branch origin/0.8.2

> High CPU Usage in 0.8.2 release
> ---
>
> Key: KAFKA-1952
> URL: https://issues.apache.org/jira/browse/KAFKA-1952
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.0
>Reporter: Jay Kreps
>Assignee: Jun Rao
>Priority: Critical
> Fix For: 0.8.2.0
>
> Attachments: kafka-1952.patch
>
>
> Brokers with high partition count see increased CPU usage when migrating from 
> 0.8.1.1 to 0.8.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1926) Replace kafka.utils.Utils with o.a.k.common.utils.Utils

2015-02-13 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321134#comment-14321134
 ] 

Jay Kreps commented on KAFKA-1926:
--

Yes but the whole point of these code cleanups is to clean it up. So I think 
you should just rip and cut away.

I think the complex interaction that I am aware of is between Scheduler, Time, 
MockScheduler, and MockTime. I think if you move one you may have to move them 
all...

The use of SystemTime.milliseconds is silly, since once you hard-code 
SystemTime you lose the ability to plug in MockTime for unit testing, which 
is the whole point. So I think we should just move all of these to 
System.currentTimeMillis.
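The testability point can be sketched as follows; Time, MockTime, and Throttler here are minimal stand-ins for the client-side classes, not their actual implementations:

```java
// Sketch: a component that receives Time via its constructor can be tested
// deterministically with MockTime; one that hard-codes the clock cannot.
interface Time {
    long milliseconds();
}

class MockTime implements Time {
    private long nowMs;
    MockTime(long startMs) { nowMs = startMs; }
    public long milliseconds() { return nowMs; }
    void sleep(long ms) { nowMs += ms; }   // "sleeping" just advances the clock
}

class Throttler {
    private final Time time;
    private final long windowMs;
    private final long windowStart;

    Throttler(Time time, long windowMs) {
        this.time = time;
        this.windowMs = windowMs;
        this.windowStart = time.milliseconds();
    }

    boolean windowElapsed() {
        return time.milliseconds() - windowStart >= windowMs;
    }

    public static void main(String[] args) {
        MockTime clock = new MockTime(0);
        Throttler throttler = new Throttler(clock, 100);
        System.out.println(throttler.windowElapsed()); // false
        clock.sleep(100);
        System.out.println(throttler.windowElapsed()); // true
    }
}
```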

> Replace kafka.utils.Utils with o.a.k.common.utils.Utils
> ---
>
> Key: KAFKA-1926
> URL: https://issues.apache.org/jira/browse/KAFKA-1926
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.8.2.0
>Reporter: Jay Kreps
>  Labels: newbie, patch
> Attachments: KAFKA-1926.patch, KAFKA-1926.patch
>
>
> There is currently a lot of duplication between the Utils class in common and 
> the one in core.
> Our plan has been to deprecate duplicate code in the server and replace it 
> with the new common code.
> As such we should evaluate each method in the scala Utils and do one of the 
> following:
> 1. Migrate it to o.a.k.common.utils.Utils if it is a sensible general purpose 
> utility in active use that is not Kafka-specific. If we migrate it we should 
> really think about the API and make sure there is some test coverage. A few 
> things in there are kind of funky and we shouldn't just blindly copy them 
> over.
> 2. Create a new class ServerUtils or ScalaUtils in kafka.utils that will hold 
> any utilities that really need to make use of Scala features to be convenient.
> 3. Delete it if it is not used, or has a bad api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 31040: Patch for kafka-1952

2015-02-13 Thread Guozhang Wang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31040/#review72479
---

Ship it!



core/src/main/scala/kafka/server/RequestPurgatory.scala


"<= 0)" ?


- Guozhang Wang


On Feb. 14, 2015, 1:52 a.m., Jun Rao wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31040/
> ---
> 
> (Updated Feb. 14, 2015, 1:52 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: kafka-1952
> https://issues.apache.org/jira/browse/kafka-1952
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1952; High CPU Usage in 0.8.2 release
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/server/RequestPurgatory.scala 
> 9d76234bc2c810ec08621dc92bb4061b8e7cd993 
> 
> Diff: https://reviews.apache.org/r/31040/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jun Rao
> 
>



[jira] [Commented] (KAFKA-1659) Ability to cleanly abort the KafkaProducer

2015-02-13 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321158#comment-14321158
 ] 

Guozhang Wang commented on KAFKA-1659:
--

Jay, I think the idempotent producer is helpful for preserving order under 
the scenario of duplicates, but I am still not sure how it can preserve 
ordering during fatal errors, as in its proposal 
(https://cwiki.apache.org/confluence/display/KAFKA/Idempotent+Producer) we 
are not requiring the sequence number to be strictly incremental.

> Ability to cleanly abort the KafkaProducer
> --
>
> Key: KAFKA-1659
> URL: https://issues.apache.org/jira/browse/KAFKA-1659
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, producer 
>Affects Versions: 0.8.2.0
>Reporter: Andrew Stein
>Assignee: Jun Rao
> Fix For: 0.8.3
>
>
> I would like the ability to "abort" the Java Client's KafkaProducer. This 
> includes stopping the writing of buffered records.
> The motivation for this is described 
> [here|http://mail-archives.apache.org/mod_mbox/kafka-dev/201409.mbox/%3CCAOk4UxB7BJm6HSgLXrR01sksB2dOC3zdt0NHaKHz1EALR6%3DCTQ%40mail.gmail.com%3E].
> A sketch of this method is:
> {code}
> public void abort() {
> try {
> ioThread.interrupt();
> ioThread.stop(new ThreadDeath());
> } catch (IllegalAccessException e) {
> }
> }
> {code}
> but of course it is preferable to stop the {{ioThread}} by cooperation, 
> rather than use the deprecated {{Thread.stop(new ThreadDeath())}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: KafkaPreCommit #5

2015-02-13 Thread Apache Jenkins Server
See 

Changes:

[jjkoshy] KAFKA-1697; Remove support for producer ack > 1 on the broker; 
reviewed by Joel Koshy

--
[...truncated 1121 lines...]
kafka.api.test.ProducerFailureHandlingTest > testNotEnoughReplicas PASSED

kafka.api.test.ProducerFailureHandlingTest > testTooLargeRecordWithAckZero 
PASSED

kafka.api.test.ProducerFailureHandlingTest > testTooLargeRecordWithAckOne PASSED

kafka.api.test.ProducerFailureHandlingTest > testNonExistentTopic PASSED

kafka.api.test.ProducerFailureHandlingTest > testWrongBrokerList PASSED

kafka.api.test.ProducerFailureHandlingTest > testNoResponse PASSED

kafka.api.test.ProducerFailureHandlingTest > testSendAfterClosed PASSED

kafka.api.test.ProducerFailureHandlingTest > testBrokerFailure PASSED

kafka.api.test.ProducerFailureHandlingTest > testCannotSendToInternalTopic 
PASSED

kafka.api.test.ProducerFailureHandlingTest > 
testNotEnoughReplicasAfterBrokerShutdown PASSED

kafka.network.SocketServerTest > simpleRequest PASSED

kafka.network.SocketServerTest > tooBigRequestIsRejected PASSED

kafka.network.SocketServerTest > testNullResponse PASSED

kafka.network.SocketServerTest > testSocketsCloseOnShutdown PASSED

kafka.network.SocketServerTest > testMaxConnectionsPerIp PASSED

kafka.network.SocketServerTest > testMaxConnectionsPerIPOverrides PASSED

kafka.log.OffsetIndexTest > truncate PASSED

kafka.log.OffsetIndexTest > randomLookupTest PASSED

kafka.log.OffsetIndexTest > lookupExtremeCases PASSED

kafka.log.OffsetIndexTest > appendTooMany PASSED

kafka.log.OffsetIndexTest > appendOutOfOrder PASSED

kafka.log.OffsetIndexTest > testReopen PASSED

kafka.log.OffsetMapTest > testBasicValidation PASSED

kafka.log.OffsetMapTest > testClear PASSED

kafka.log.CleanerTest > testCleanSegments PASSED

kafka.log.CleanerTest > testCleaningWithDeletes PASSED

kafka.log.CleanerTest > testCleanSegmentsWithAbort PASSED

kafka.log.CleanerTest > testSegmentGrouping PASSED

kafka.log.CleanerTest > testBuildOffsetMap PASSED

kafka.log.FileMessageSetTest > testWrittenEqualsRead PASSED

kafka.log.FileMessageSetTest > testIteratorIsConsistent PASSED

kafka.log.FileMessageSetTest > testSizeInBytes PASSED

kafka.log.FileMessageSetTest > testWriteTo PASSED

kafka.log.FileMessageSetTest > testFileSize PASSED

kafka.log.FileMessageSetTest > testIterationOverPartialAndTruncation PASSED

kafka.log.FileMessageSetTest > testIterationDoesntChangePosition PASSED

kafka.log.FileMessageSetTest > testRead PASSED

kafka.log.FileMessageSetTest > testSearch PASSED

kafka.log.FileMessageSetTest > testIteratorWithLimits PASSED

kafka.log.FileMessageSetTest > testTruncate PASSED

kafka.log.LogTest > testTimeBasedLogRoll PASSED

kafka.log.LogTest > testTimeBasedLogRollJitter PASSED

kafka.log.LogTest > testSizeBasedLogRoll PASSED

kafka.log.LogTest > testLoadEmptyLog PASSED

kafka.log.LogTest > testAppendAndReadWithSequentialOffsets PASSED

kafka.log.LogTest > testAppendAndReadWithNonSequentialOffsets PASSED

kafka.log.LogTest > testReadAtLogGap PASSED

kafka.log.LogTest > testReadOutOfRange PASSED

kafka.log.LogTest > testLogRolls PASSED

kafka.log.LogTest > testCompressedMessages PASSED

kafka.log.LogTest > testThatGarbageCollectingSegmentsDoesntChangeOffset PASSED

kafka.log.LogTest > testMessageSetSizeCheck PASSED

kafka.log.LogTest > testMessageSizeCheck PASSED

kafka.log.LogTest > testLogRecoversToCorrectOffset PASSED

kafka.log.LogTest > testIndexRebuild PASSED

kafka.log.LogTest > testTruncateTo PASSED

kafka.log.LogTest > testIndexResizingAtTruncation PASSED

kafka.log.LogTest > testBogusIndexSegmentsAreRemoved PASSED

kafka.log.LogTest > testReopenThenTruncate PASSED

kafka.log.LogTest > testAsyncDelete PASSED

kafka.log.LogTest > testOpenDeletesObsoleteFiles PASSED

kafka.log.LogTest > testAppendMessageWithNullPayload PASSED

kafka.log.LogTest > testCorruptLog PASSED

kafka.log.LogTest > testCleanShutdownFile PASSED

kafka.log.LogTest > testParseTopicPartitionName PASSED

kafka.log.LogTest > testParseTopicPartitionNameForEmptyName PASSED

kafka.log.LogTest > testParseTopicPartitionNameForNull PASSED

kafka.log.LogTest > testParseTopicPartitionNameForMissingSeparator PASSED

kafka.log.LogTest > testParseTopicPartitionNameForMissingTopic PASSED

kafka.log.LogTest > testParseTopicPartitionNameForMissingPartition PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[0] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[1] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[2] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[3] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[4] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[5] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[6] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[7] PASSED

kafka.log.BrokerCompressionTest

Build failed in Jenkins: Kafka-trunk #392

2015-02-13 Thread Apache Jenkins Server
See 

Changes:

[jjkoshy] KAFKA-1697; Remove support for producer ack > 1 on the broker; 
reviewed by Joel Koshy

--
[...truncated 1373 lines...]
try {
^
:42:
 a pure expression does nothing in statement position; you may be omitting 
necessary parentheses
responseCallback
^
:206:
 a pure expression does nothing in statement position; you may be omitting 
necessary parentheses
ControllerStats.uncleanLeaderElectionRate
^
:207:
 a pure expression does nothing in statement position; you may be omitting 
necessary parentheses
ControllerStats.leaderElectionTimer
^
:28:
 object JSON in package json is deprecated: This object will be removed.
  JSON.globalNumberParser = myConversionFunc
  ^
:37:
 object JSON in package json is deprecated: This object will be removed.
JSON.parseFull(input)
^
:419:
 object Pair in object Predef is deprecated: Use built-in tuple syntax or 
Tuple2 instead
  Pair(s.substring(0,lio).trim, s.substring(lio + 1).trim)
  ^
:107:
 Reference to uninitialized value fetchWaitMaxMs
  require(fetchWaitMaxMs <= socketTimeoutMs, "socket.timeout.ms should always 
be at least fetch.wait.max.ms" +
  ^
:248:
 Reference to uninitialized value replicaFetchWaitMaxMs
  require(replicaFetchWaitMaxMs <= replicaSocketTimeoutMs, 
"replica.socket.timeout.ms should always be at least replica.fetch.wait.max.ms" 
+
  ^
there were 12 feature warning(s); re-run with -feature for details
12 warnings found
:core:processResources UP-TO-DATE
:core:classes
:core:compileTestJava UP-TO-DATE
:core:compileTestScala
:169:
 This catches all Throwables. If this is really intended, use `case ex : 
Throwable` to clear this warning.
  case ex => fail()
   ^
one warning found
:core:processTestResources UP-TO-DATE
:core:testClasses
:core:test

unit.kafka.KafkaTest > testGetKafkaConfigFromArgs PASSED

unit.kafka.KafkaTest > testGetKafkaConfigFromArgsWrongSetValue PASSED

unit.kafka.KafkaTest > testGetKafkaConfigFromArgsNonArgsAtTheEnd PASSED

unit.kafka.KafkaTest > testGetKafkaConfigFromArgsNonArgsOnly PASSED

unit.kafka.KafkaTest > testGetKafkaConfigFromArgsNonArgsAtTheBegging PASSED

unit.kafka.utils.ByteBoundedBlockingQueueTest > testByteBoundedBlockingQueue 
PASSED

unit.kafka.utils.CommandLineUtilsTest > testParseEmptyArg PASSED

unit.kafka.utils.CommandLineUtilsTest > testParseSingleArg PASSED

unit.kafka.utils.CommandLineUtilsTest > testParseArgs PASSED

unit.kafka.consumer.PartitionAssignorTest > testRoundRobinPartitionAssignor 
PASSED

unit.kafka.consumer.PartitionAssignorTest > testRangePartitionAssignor PASSED

unit.kafka.common.ConfigTest > testInvalidClientIds PASSED

unit.kafka.common.ConfigTest > testInvalidGroupIds PASSED

unit.kafka.common.TopicTest > testInvalidTopicNames PASSED

kafka.integration.FetcherTest > testFetcher PASSED

kafka.integration.UncleanLeaderElectionTest > testUncleanLeaderElectionEnabled 
PASSED

kafka.integration.UncleanLeaderElectionTest > testUncleanLeaderElectionDisabled 
PASSED

kafka.integration.UncleanLeaderElectionTest > 
testUncleanLeaderElectionEnabledByTopicOverride PASSED

kafka.integration.UncleanLeaderElectionTest > 
testCleanLeaderElectionDisabledByTopicOverride PASSED

kafka.integration.UncleanLeaderElectionTest > 
testUncleanLeaderElectionInvalidTopicOverride PASSED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooHigh 
PASSED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooLow 
PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooHigh 
PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow PASSED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize PASSED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest PASSED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch PASSED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression PA