Re: librdkafka 0.8.0 released

2013-11-25 Thread Magnus Edenhill
The following tests were using a single producer application
(examples/rdkafka_performance):

* Test1: 2 brokers, 2 partitions, required.acks=2, 100 byte messages:
850000 messages/second, 85 MB/second

* Test2: 1 broker, 1 partition, required.acks=0, 100 byte messages: 710000
messages/second, 71 MB/second

* Test3: 2 brokers, 2 partitions, required.acks=2, 100 byte messages,
snappy compression: 300000 messages/second, 30 MB/second

* Test4: 2 brokers, 2 partitions, required.acks=2, 100 byte messages, gzip
compression: 230000 messages/second, 23 MB/second


log.flush broker configuration was increased to avoid the disk being the
bottleneck.


/Magnus



2013/11/24 Neha Narkhede 

> So, a single producer's throughput is 80 MB/s? That seems pretty high. What
> was the number of acks setting? Thanks for sharing these numbers.
>
> On Sunday, November 24, 2013, Magnus Edenhill wrote:
>
> > Hi Neha,
> >
> > these tests were done using 100 byte messages. More information about the
> > producer performance tests can be found here:
> >
> >
> https://github.com/edenhill/librdkafka/blob/master/INTRODUCTION.md#performance-numbers
> >
> > The tests are indicative at best and in no way scientific, but I must say
> > that the Kafka broker performance is impressive.
> >
> > Regards,
> > Magnus
> >
> >
> >
> > 2013/11/22 Neha Narkhede >
> >
> > > Thanks for sharing this! What is the message size for the throughput
> > > numbers stated below?
> > >
> > > Thanks,
> > > Neha
> > > On Nov 22, 2013 6:59 AM, "Magnus Edenhill"  >
> > wrote:
> > >
> > > > This announces the 0.8.0 release of librdkafka - The Apache Kafka
> > client
> > > C
> > > > library - now with 0.8 protocol support.
> > > >
> > > > Features:
> > > > * Producer (~800K msgs/s)
> > > > * Consumer  (~3M msgs/s)
> > > > * Compression (Snappy, gzip)
> > > > * Proper failover and leader re-election support - no message is ever
> > > lost.
> > > > * Configuration properties compatible with official Apache Kafka.
> > > > * Stabilized ABI-safe API
> > > > * Mainline Debian package submitted
> > > > * Production quality
> > > >
> > > >
> > > > Home:
> > > > https://github.com/edenhill/librdkafka
> > > >
> > > > Introduction and performance numbers:
> > > > https://github.com/edenhill/librdkafka/blob/master/INTRODUCTION.md
> > > >
> > > > Have fun.
> > > >
> > > > Regards,
> > > > Magnus
> > > >
> > > > P.S.
> > > > Check out Wikimedia Foundation's varnishkafka daemon for a use case -
> > > > varnish log forwarding over Kafka:
> > > > https://github.com/wikimedia/varnishkafka
> > > >
> > >
> >
>


[jira] [Created] (KAFKA-1144) commitOffsets can be passed the offsets to commit

2013-11-25 Thread Imran Rashid (JIRA)
Imran Rashid created KAFKA-1144:
---

 Summary: commitOffsets can be passed the offsets to commit
 Key: KAFKA-1144
 URL: https://issues.apache.org/jira/browse/KAFKA-1144
 Project: Kafka
  Issue Type: Improvement
  Components: consumer
Reporter: Imran Rashid
Assignee: Neha Narkhede




This adds another version of commitOffsets that takes the offsets to commit as 
a parameter.

Without this change, getting correct user code is very hard. Despite kafka's 
at-least-once guarantees, most user code doesn't actually have that guarantee, 
and is almost certainly wrong if doing batch processing. Getting it right 
requires some very careful synchronization between all consumer threads, which 
is both:
1) painful to get right
2) slow b/c of the need to stop all workers during a commit.

This small change simplifies a lot of this. This was discussed extensively on 
the user mailing list, on the thread "are kafka consumer apps guaranteed to see 
msgs at least once?"

You can also see an example implementation of a user api which makes use of 
this, to get proper at-least-once guarantees by user code, even for batches:
https://github.com/quantifind/kafka-utils/pull/1

I'm open to any suggestions on how to add unit tests for this.
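
To make the shape of the change concrete, here is a rough Scala sketch of the 
kind of API and usage it enables (names and signature are illustrative only, 
not the exact code in the attached patches):

    // Illustrative sketch only -- not the exact code from the attached patches.
    case class TopicAndPartition(topic: String, partition: Int)

    trait OffsetCommitter {
      // existing behaviour: commit whatever the connector has consumed so far
      def commitOffsets(): Unit
      // proposed addition: commit exactly the offsets the application has finished processing
      def commitOffsets(offsets: Map[TopicAndPartition, Long]): Unit
    }

    // e.g. a worker commits only the offset of the last message it fully processed:
    //   committer.commitOffsets(Map(TopicAndPartition("events", 0) -> lastProcessedOffset))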




--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (KAFKA-1144) commitOffsets can be passed the offsets to commit

2013-11-25 Thread Imran Rashid (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Imran Rashid updated KAFKA-1144:


Attachment: 0002-add-protection-against-backward-commits.patch
0001-allow-committing-of-arbitrary-offsets-to-facilitate-.patch

> commitOffsets can be passed the offsets to commit
> -
>
> Key: KAFKA-1144
> URL: https://issues.apache.org/jira/browse/KAFKA-1144
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Affects Versions: 0.8
>Reporter: Imran Rashid
>Assignee: Neha Narkhede
> Attachments: 
> 0001-allow-committing-of-arbitrary-offsets-to-facilitate-.patch, 
> 0002-add-protection-against-backward-commits.patch
>
>
> This adds another version of commitOffsets that takes the offsets to commit 
> as a parameter.
> Without this change, getting correct user code is very hard. Despite kafka's 
> at-least-once guarantees, most user code doesn't actually have that 
> guarantee, and is almost certainly wrong if doing batch processing. Getting 
> it right requires some very careful synchronization between all consumer 
> threads, which is both:
> 1) painful to get right
> 2) slow b/c of the need to stop all workers during a commit.
> This small change simplifies a lot of this. This was discussed extensively on 
> the user mailing list, on the thread "are kafka consumer apps guaranteed to 
> see msgs at least once?"
> You can also see an example implementation of a user api which makes use of 
> this, to get proper at-least-once guarantees by user code, even for batches:
> https://github.com/quantifind/kafka-utils/pull/1
> I'm open to any suggestions on how to add unit tests for this.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (KAFKA-1144) commitOffsets can be passed the offsets to commit

2013-11-25 Thread Imran Rashid (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Imran Rashid updated KAFKA-1144:


Affects Version/s: 0.8
   Status: Patch Available  (was: Open)

> commitOffsets can be passed the offsets to commit
> -
>
> Key: KAFKA-1144
> URL: https://issues.apache.org/jira/browse/KAFKA-1144
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Affects Versions: 0.8
>Reporter: Imran Rashid
>Assignee: Neha Narkhede
> Attachments: 
> 0001-allow-committing-of-arbitrary-offsets-to-facilitate-.patch, 
> 0002-add-protection-against-backward-commits.patch
>
>
> This adds another version of commitOffsets that takes the offsets to commit 
> as a parameter.
> Without this change, getting correct user code is very hard. Despite kafka's 
> at-least-once guarantees, most user code doesn't actually have that 
> guarantee, and is almost certainly wrong if doing batch processing. Getting 
> it right requires some very careful synchronization between all consumer 
> threads, which is both:
> 1) painful to get right
> 2) slow b/c of the need to stop all workers during a commit.
> This small change simplifies a lot of this. This was discussed extensively on 
> the user mailing list, on the thread "are kafka consumer apps guaranteed to 
> see msgs at least once?"
> You can also see an example implementation of a user api which makes use of 
> this, to get proper at-least-once guarantees by user code, even for batches:
> https://github.com/quantifind/kafka-utils/pull/1
> I'm open to any suggestions on how to add unit tests for this.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


kafka pull request: commitOffsets can be passed the offsets to commit

2013-11-25 Thread squito
Github user squito closed the pull request at:

https://github.com/apache/kafka/pull/10



Re: librdkafka 0.8.0 released

2013-11-25 Thread Jun Rao
Thanks for sharing the results. Was the topic created with replication
factor of 2? Could you test acks=-1 as well?

Thanks,

Jun


On Mon, Nov 25, 2013 at 4:30 AM, Magnus Edenhill  wrote:

> The following tests were using a single producer application
> (examples/rdkafka_performance):
>
> * Test1: 2 brokers, 2 partitions, required.acks=2, 100 byte messages:
> 850000 messages/second, 85 MB/second
>
> * Test2: 1 broker, 1 partition, required.acks=0, 100 byte messages: 710000
> messages/second, 71 MB/second
>
> * Test3: 2 brokers, 2 partitions, required.acks=2, 100 byte messages,
> snappy compression: 300000 messages/second, 30 MB/second
>
> * Test4: 2 brokers, 2 partitions, required.acks=2, 100 byte messages, gzip
> compression: 230000 messages/second, 23 MB/second
>
>
> log.flush broker configuration was increased to avoid the disk being the
> bottleneck.
>
>
> /Magnus
>
>
>
> 2013/11/24 Neha Narkhede 
>
> > So, a single producer's throughput is 80 MB/s? That seems pretty high.
> What
> > was the number of acks setting? Thanks for sharing these numbers.
> >
> > On Sunday, November 24, 2013, Magnus Edenhill wrote:
> >
> > > Hi Neha,
> > >
> > > these tests were done using 100 byte messages. More information about
> the
> > > producer performance tests can be found here:
> > >
> > >
> >
> https://github.com/edenhill/librdkafka/blob/master/INTRODUCTION.md#performance-numbers
> > >
> > > The tests are indicative at best and in no way scientific, but I must
> say
> > > that the Kafka broker performance is impressive.
> > >
> > > Regards,
> > > Magnus
> > >
> > >
> > >
> > > 2013/11/22 Neha Narkhede >
> > >
> > > > Thanks for sharing this! What is the message size for the throughput
> > > > numbers stated below?
> > > >
> > > > Thanks,
> > > > Neha
> > > > On Nov 22, 2013 6:59 AM, "Magnus Edenhill"  > >
> > > wrote:
> > > >
> > > > > This announces the 0.8.0 release of librdkafka - The Apache Kafka
> > > client
> > > > C
> > > > > library - now with 0.8 protocol support.
> > > > >
> > > > > Features:
> > > > > * Producer (~800K msgs/s)
> > > > > * Consumer  (~3M msgs/s)
> > > > > * Compression (Snappy, gzip)
> > > > > * Proper failover and leader re-election support - no message is
> ever
> > > > lost.
> > > > > * Configuration properties compatible with official Apache Kafka.
> > > > > * Stabilized ABI-safe API
> > > > > * Mainline Debian package submitted
> > > > > * Production quality
> > > > >
> > > > >
> > > > > Home:
> > > > > https://github.com/edenhill/librdkafka
> > > > >
> > > > > Introduction and performance numbers:
> > > > > https://github.com/edenhill/librdkafka/blob/master/INTRODUCTION.md
> > > > >
> > > > > Have fun.
> > > > >
> > > > > Regards,
> > > > > Magnus
> > > > >
> > > > > P.S.
> > > > > Check out Wikimedia Foundation's varnishkafka daemon for a use
> case -
> > > > > varnish log forwarding over Kafka:
> > > > > https://github.com/wikimedia/varnishkafka
> > > > >
> > > >
> > >
> >
>


[jira] [Commented] (KAFKA-1144) commitOffsets can be passed the offsets to commit

2013-11-25 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13831608#comment-13831608
 ] 

Jun Rao commented on KAFKA-1144:


Thanks for the patch. A couple of things.

1. We have a jira that tries to move the storage of offsets off ZK 
(https://issues.apache.org/jira/browse/KAFKA-1000) since ZK is not really 
designed for that. So, we may not be able to do conditional updates for offsets 
in the future.

2. We will be rewriting the consumer client 
(https://cwiki.apache.org/confluence/display/KAFKA/Client+Rewrite). In the new 
api, we will add a callback during consumer rebalances. Do you think that 
addresses your issue as well?
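
For illustration, such a rebalance callback could let an application commit the 
offsets it has actually processed before partitions move away. A hypothetical 
shape (not the final client-rewrite API):

    // Hypothetical sketch for discussion -- not the actual client-rewrite API.
    trait RebalanceListener {
      // (topic, partition) pairs this consumer is about to give up; a handler can
      // finish in-flight work and commit its processed offsets here
      def onPartitionsRevoked(revoked: Set[(String, Int)]): Unit
    }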


> commitOffsets can be passed the offsets to commit
> -
>
> Key: KAFKA-1144
> URL: https://issues.apache.org/jira/browse/KAFKA-1144
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Affects Versions: 0.8
>Reporter: Imran Rashid
>Assignee: Neha Narkhede
> Attachments: 
> 0001-allow-committing-of-arbitrary-offsets-to-facilitate-.patch, 
> 0002-add-protection-against-backward-commits.patch
>
>
> This adds another version of commitOffsets that takes the offsets to commit 
> as a parameter.
> Without this change, getting correct user code is very hard. Despite kafka's 
> at-least-once guarantees, most user code doesn't actually have that 
> guarantee, and is almost certainly wrong if doing batch processing. Getting 
> it right requires some very careful synchronization between all consumer 
> threads, which is both:
> 1) painful to get right
> 2) slow b/c of the need to stop all workers during a commit.
> This small change simplifies a lot of this. This was discussed extensively on 
> the user mailing list, on the thread "are kafka consumer apps guaranteed to 
> see msgs at least once?"
> You can also see an example implementation of a user api which makes use of 
> this, to get proper at-least-once guarantees by user code, even for batches:
> https://github.com/quantifind/kafka-utils/pull/1
> I'm open to any suggestions on how to add unit tests for this.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (KAFKA-1144) commitOffsets can be passed the offsets to commit

2013-11-25 Thread Imran Rashid (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13831658#comment-13831658
 ] 

Imran Rashid commented on KAFKA-1144:
-

Thanks for quick response, Jun.

(1) is unfortunate, though it doesn't technically break this.  It just 
increases the number of messages that would get processed multiple times on a 
rebalance + crash (I'm pretty sure that's the only way you could end up with a 
lasting backwards commit without the conditional update).  I thought the move 
off of zk wasn't until 0.9, sorry -- I think this patch can go forward without 
the conditional update.  (Should I update the patches to remove it, or just 
leave it in and let it go away when there is no more zk?)

(2) Notification on rebalances does not eliminate the desire for this patch.  
(It would, however, eliminate the need for conditional updates!)  Even without 
rebalances, with the current api, you really need to stop all worker threads 
before doing a commit if you want to guarantee that your app has seen all the 
messages.  This is especially true w/ batch processing.
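
To illustrate why, a minimal sketch (plain Scala, not Kafka code) of the global 
pause the zero-argument commit forces:

    // With a no-argument commitOffsets(), every worker must be idle before the commit,
    // otherwise offsets for messages still in flight may be committed.
    object GlobalCommitSketch {
      def commitSafely(pauseWorkers: () => Unit,
                       resumeWorkers: () => Unit,
                       commitAll: () => Unit): Unit = {
        pauseWorkers()    // quiesce all worker threads first
        try commitAll()   // only now is the connector's consumed position safe to commit
        finally resumeWorkers()
      }
    }
    // With commitOffsets(offsets), each worker could instead commit just the offset of
    // the last message it has fully processed, with no global pause.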

Again, the patch isn't necessary, but it's a small change that makes it so 
much easier to get user code right, not to mention more efficient.

Maybe other changes in the 0.9 api will make this unnecessary, but I think 
this is useful for 0.8 in the meantime.  And I'd hope the client rewrite 
would also make it easy to write batch consumers, like the api I put together 
in the other repo.  (I'd happily submit that directly to kafka, if it was 
desired, though it's very scala-y, and I guess the user api is going to be 
java-only?)

> commitOffsets can be passed the offsets to commit
> -
>
> Key: KAFKA-1144
> URL: https://issues.apache.org/jira/browse/KAFKA-1144
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Affects Versions: 0.8
>Reporter: Imran Rashid
>Assignee: Neha Narkhede
> Attachments: 
> 0001-allow-committing-of-arbitrary-offsets-to-facilitate-.patch, 
> 0002-add-protection-against-backward-commits.patch
>
>
> This adds another version of commitOffsets that takes the offsets to commit 
> as a parameter.
> Without this change, getting correct user code is very hard. Despite kafka's 
> at-least-once guarantees, most user code doesn't actually have that 
> guarantee, and is almost certainly wrong if doing batch processing. Getting 
> it right requires some very careful synchronization between all consumer 
> threads, which is both:
> 1) painful to get right
> 2) slow b/c of the need to stop all workers during a commit.
> This small change simplifies a lot of this. This was discussed extensively on 
> the user mailing list, on the thread "are kafka consumer apps guaranteed to 
> see msgs at least once?"
> You can also see an example implementation of a user api which makes use of 
> this, to get proper at-least-once guarantees by user code, even for batches:
> https://github.com/quantifind/kafka-utils/pull/1
> I'm open to any suggestions on how to add unit tests for this.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (KAFKA-394) update site with steps and notes for doing a release under developer

2013-11-25 Thread Joe Stein (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe Stein resolved KAFKA-394.
-

Resolution: Fixed

https://cwiki.apache.org/confluence/display/KAFKA/Release+Process

> update site with steps and notes for doing a release under developer
> 
>
> Key: KAFKA-394
> URL: https://issues.apache.org/jira/browse/KAFKA-394
> Project: Kafka
>  Issue Type: Bug
>Reporter: Joe Stein
>Assignee: Joe Stein
>
> steps in release process including updating the dist directory



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] Subscription: outstanding kafka patches

2013-11-25 Thread jira
Issue Subscription
Filter: outstanding kafka patches (75 issues)
The list of outstanding kafka patches
Subscriber: kafka-mailing-list

Key Summary
KAFKA-1144  commitOffsets can be passed the offsets to commit
https://issues.apache.org/jira/browse/KAFKA-1144
KAFKA-1142  Patch review tool should take diff with origin from last divergent 
point
https://issues.apache.org/jira/browse/KAFKA-1142
KAFKA-1140  Move the decoding logic from ConsumerIterator.makeNext to next
https://issues.apache.org/jira/browse/KAFKA-1140
KAFKA-1130  "log.dirs" is a confusing property name
https://issues.apache.org/jira/browse/KAFKA-1130
KAFKA-1116  Need to upgrade sbt-assembly to compile on scala 2.10.2
https://issues.apache.org/jira/browse/KAFKA-1116
KAFKA-1110  Unable to produce messages with snappy/gzip compression
https://issues.apache.org/jira/browse/KAFKA-1110
KAFKA-1109  Need to fix GC log configuration code, not able to override 
KAFKA_GC_LOG_OPTS
https://issues.apache.org/jira/browse/KAFKA-1109
KAFKA-1106  HighwaterMarkCheckpoint failure puting broker into a bad state
https://issues.apache.org/jira/browse/KAFKA-1106
KAFKA-1093  Log.getOffsetsBefore(t, …) does not return the last confirmed 
offset before t
https://issues.apache.org/jira/browse/KAFKA-1093
KAFKA-1086  Improve GetOffsetShell to find metadata automatically
https://issues.apache.org/jira/browse/KAFKA-1086
KAFKA-1082  zkclient dies after UnknownHostException in zk reconnect
https://issues.apache.org/jira/browse/KAFKA-1082
KAFKA-1079  Liars in PrimitiveApiTest that promise to test api in compression 
mode, but don't do this actually
https://issues.apache.org/jira/browse/KAFKA-1079
KAFKA-1074  Reassign partitions should delete the old replicas from disk
https://issues.apache.org/jira/browse/KAFKA-1074
KAFKA-1049  Encoder implementations are required to provide an undocumented 
constructor.
https://issues.apache.org/jira/browse/KAFKA-1049
KAFKA-1032  Messages sent to the old leader will be lost on broker GC resulted 
failure
https://issues.apache.org/jira/browse/KAFKA-1032
KAFKA-1020  Remove getAllReplicasOnBroker from KafkaController
https://issues.apache.org/jira/browse/KAFKA-1020
KAFKA-1012  Implement an Offset Manager and hook offset requests to it
https://issues.apache.org/jira/browse/KAFKA-1012
KAFKA-1011  Decompression and re-compression on MirrorMaker could result in 
messages being dropped in the pipeline
https://issues.apache.org/jira/browse/KAFKA-1011
KAFKA-1005  kafka.perf.ConsumerPerformance not shutting down consumer
https://issues.apache.org/jira/browse/KAFKA-1005
KAFKA-1004  Handle topic event for trivial whitelist topic filters
https://issues.apache.org/jira/browse/KAFKA-1004
KAFKA-998   Producer should not retry on non-recoverable error codes
https://issues.apache.org/jira/browse/KAFKA-998
KAFKA-997   Provide a strict verification mode when reading configuration 
properties
https://issues.apache.org/jira/browse/KAFKA-997
KAFKA-996   Capitalize first letter for log entries
https://issues.apache.org/jira/browse/KAFKA-996
KAFKA-984   Avoid a full rebalance in cases when a new topic is discovered but 
container/broker set stay the same
https://issues.apache.org/jira/browse/KAFKA-984
KAFKA-976   Order-Preserving Mirror Maker Testcase
https://issues.apache.org/jira/browse/KAFKA-976
KAFKA-967   Use key range in ProducerPerformance
https://issues.apache.org/jira/browse/KAFKA-967
KAFKA-917   Expose zk.session.timeout.ms in console consumer
https://issues.apache.org/jira/browse/KAFKA-917
KAFKA-885   sbt package builds two kafka jars
https://issues.apache.org/jira/browse/KAFKA-885
KAFKA-881   Kafka broker not respecting log.roll.hours
https://issues.apache.org/jira/browse/KAFKA-881
KAFKA-873   Consider replacing zkclient with curator (with zkclient-bridge)
https://issues.apache.org/jira/browse/KAFKA-873
KAFKA-868   System Test - add test case for rolling controlled shutdown
https://issues.apache.org/jira/browse/KAFKA-868
KAFKA-863   System Test - update 0.7 version of kafka-run-class.sh for 
Migration Tool test cases
https://issues.apache.org/jira/browse/KAFKA-863
KAFKA-859   support basic auth protection of mx4j console
https://issues.apache.org/jira/browse/KAFKA-859
KAFKA-855   Ant+Ivy build for Kafka
https://issues.apache.org/jira/browse/KAFKA-855
KAFKA-854   Upgrade dependencies for 0.8
https://issues.apache.org/jira/browse/KAFKA-854
KAFKA-815   Improve SimpleConsumerShell to take in a max messages config option
https://issues.apache.org/jira/browse/KAFKA-815
KAFKA-745   Remove getShutdownReceive() and other kafka specific c

[VOTE] Apache Kafka Release 0.8.0 - Candidate 4

2013-11-25 Thread Joe Stein
This is the fourth candidate for release of Apache Kafka 0.8.0.   This
release resolves https://issues.apache.org/jira/browse/KAFKA-1131 and
https://issues.apache.org/jira/browse/KAFKA-1133

Release Notes for the 0.8.0 release
http://people.apache.org/~joestein/kafka-0.8.0-candidate4/RELEASE_NOTES.html

*** Please download, test and vote by Monday, December 2nd, 12pm PDT

Kafka's KEYS file containing PGP keys we use to sign the release:
http://svn.apache.org/repos/asf/kafka/KEYS in addition to the md5 and sha1
checksum

* Release artifacts to be voted upon (source and binary):
http://people.apache.org/~joestein/kafka-0.8.0-candidate4/

* Maven artifacts to be voted upon prior to release:
https://repository.apache.org/content/groups/staging/

(i.e. in sbt land this can be added to the build.sbt to use Kafka:

resolvers += "Apache Staging" at "https://repository.apache.org/content/groups/staging/"
libraryDependencies += "org.apache.kafka" % "kafka_2.10" % "0.8.0"
)

* The tag to be voted upon (off the 0.8 branch) is the 0.8.0 tag
https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=2c20a71a010659e25af075a024cbd692c87d4c89

/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop 
/


Re: Review Request 15805: KAFKA-1140.v2: addressed Jun's comments

2013-11-25 Thread Guozhang Wang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15805/
---

(Updated Nov. 25, 2013, 8:53 p.m.)


Review request for kafka.


Summary (updated)
-

KAFKA-1140.v2: addressed Jun's comments


Bugs: KAFKA-1140
https://issues.apache.org/jira/browse/KAFKA-1140


Repository: kafka


Description
---

KAFKA-1140.v1


Dummy


Diffs (updated)
-

  core/src/main/scala/kafka/consumer/ConsumerIterator.scala 
a4227a49684c7de08e07cb1f3a10d2f76ba28da7 

Diff: https://reviews.apache.org/r/15805/diff/


Testing
---


Thanks,

Guozhang Wang



[jira] [Commented] (KAFKA-1140) Move the decoding logic from ConsumerIterator.makeNext to next

2013-11-25 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13831892#comment-13831892
 ] 

Guozhang Wang commented on KAFKA-1140:
--

Updated reviewboard https://reviews.apache.org/r/15805/
 against branch origin/trunk

> Move the decoding logic from ConsumerIterator.makeNext to next
> --
>
> Key: KAFKA-1140
> URL: https://issues.apache.org/jira/browse/KAFKA-1140
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Fix For: 0.8.1
>
> Attachments: KAFKA-1140.patch, KAFKA-1140_2013-11-25_12:53:17.patch
>
>
> Usually people will write code around consumer like
> while (iter.hasNext()) {
>   try {
>     val msg = iter.next()
>     // do something
>   } catch {
>     case e: Exception => // handle the failure
>   }
> }
> 
> However, the iter.hasNext() call itself can throw exceptions due to decoding 
> failures. It would be better to move the decoding to the next function call.
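
For reference, once decoding happens inside next(), a loop like the following 
(a plain-Scala sketch, not the patch itself) can skip an undecodable message 
and keep iterating:

    // Plain-Scala sketch: with decoding deferred to next(), a corrupt record only
    // fails that single next() call, so the loop can log it and continue.
    def drain[A](iter: Iterator[A])(handle: A => Unit): Unit =
      while (iter.hasNext) {
        try handle(iter.next())   // decode + process; failures stay per-message
        catch { case e: Exception => println(s"skipping undecodable message: $e") }
      }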



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (KAFKA-1140) Move the decoding logic from ConsumerIterator.makeNext to next

2013-11-25 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-1140:
-

Attachment: KAFKA-1140_2013-11-25_12:53:17.patch

> Move the decoding logic from ConsumerIterator.makeNext to next
> --
>
> Key: KAFKA-1140
> URL: https://issues.apache.org/jira/browse/KAFKA-1140
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Fix For: 0.8.1
>
> Attachments: KAFKA-1140.patch, KAFKA-1140_2013-11-25_12:53:17.patch
>
>
> Usually people will write code around consumer like
> while (iter.hasNext()) {
>   try {
>     val msg = iter.next()
>     // do something
>   } catch {
>     case e: Exception => // handle the failure
>   }
> }
> 
> However, the iter.hasNext() call itself can throw exceptions due to decoding 
> failures. It would be better to move the decoding to the next function call.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Re: Review Request 15805: KAFKA-1140.v2: addressed Jun's comments

2013-11-25 Thread Guozhang Wang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15805/
---

(Updated Nov. 25, 2013, 8:55 p.m.)


Review request for kafka.


Bugs: KAFKA-1140
https://issues.apache.org/jira/browse/KAFKA-1140


Repository: kafka


Description (updated)
---

KAFKA-1140.v2


KAFKA-1140.v1


Dummy


Diffs (updated)
-

  core/src/main/scala/kafka/consumer/ConsumerIterator.scala 
a4227a49684c7de08e07cb1f3a10d2f76ba28da7 
  core/src/test/scala/unit/kafka/consumer/ConsumerIteratorTest.scala 
ef1de8321c713cd9d27ef937216f5b76a5d8c574 

Diff: https://reviews.apache.org/r/15805/diff/


Testing
---


Thanks,

Guozhang Wang



[jira] [Commented] (KAFKA-1140) Move the decoding logic from ConsumerIterator.makeNext to next

2013-11-25 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13831893#comment-13831893
 ] 

Guozhang Wang commented on KAFKA-1140:
--

Updated reviewboard https://reviews.apache.org/r/15805/
 against branch origin/trunk

> Move the decoding logic from ConsumerIterator.makeNext to next
> --
>
> Key: KAFKA-1140
> URL: https://issues.apache.org/jira/browse/KAFKA-1140
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Fix For: 0.8.1
>
> Attachments: KAFKA-1140.patch, KAFKA-1140_2013-11-25_12:53:17.patch, 
> KAFKA-1140_2013-11-25_12:55:34.patch
>
>
> Usually people will write code around consumer like
> while (iter.hasNext()) {
>   try {
>     val msg = iter.next()
>     // do something
>   } catch {
>     case e: Exception => // handle the failure
>   }
> }
> 
> However, the iter.hasNext() call itself can throw exceptions due to decoding 
> failures. It would be better to move the decoding to the next function call.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (KAFKA-1140) Move the decoding logic from ConsumerIterator.makeNext to next

2013-11-25 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-1140:
-

Attachment: KAFKA-1140_2013-11-25_12:55:34.patch

> Move the decoding logic from ConsumerIterator.makeNext to next
> --
>
> Key: KAFKA-1140
> URL: https://issues.apache.org/jira/browse/KAFKA-1140
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Fix For: 0.8.1
>
> Attachments: KAFKA-1140.patch, KAFKA-1140_2013-11-25_12:53:17.patch, 
> KAFKA-1140_2013-11-25_12:55:34.patch
>
>
> Usually people will write code around consumer like
> while (iter.hasNext()) {
>   try {
>     val msg = iter.next()
>     // do something
>   } catch {
>     case e: Exception => // handle the failure
>   }
> }
> 
> However, the iter.hasNext() call itself can throw exceptions due to decoding 
> failures. It would be better to move the decoding to the next function call.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Re: librdkafka 0.8.0 released

2013-11-25 Thread Magnus Edenhill
Producing to one partition, no replication, required.acks = 0:
% 1000000 messages and 100000000 bytes produced in 1215ms: 822383 msgs/s
and 82.24 Mb/s, 0 messages failed, no compression

Producing to one partition, no replication, required.acks = -1:
% 1000000 messages and 100000000 bytes produced in 1422ms: 703091 msgs/s
and 70.31 Mb/s, 0 messages failed, no compression

Producing to one partition, no replication, required.acks = 1:
% 1000000 messages and 100000000 bytes produced in 1295ms: 771881 msgs/s
and 77.19 Mb/s, 0 messages failed, no compression



Producing to one partition, replication factor 2, 2 brokers ISR,
required.acks = 0:
% 1000000 messages and 100000000 bytes produced in 1354ms: 738483 msgs/s
and 73.85 Mb/s, 0 messages failed, no compression

Producing to one partition, replication factor 2, 2 brokers ISR,
required.acks = -1:
% 1000000 messages and 100000000 bytes produced in 3698ms: 270396 msgs/s
and 27.04 Mb/s, 0 messages failed, no compression

Producing to one partition, replication factor 2, 2 brokers ISR,
required.acks = 1:
% 1000000 messages and 100000000 bytes produced in 1360ms: 735224 msgs/s
and 73.52 Mb/s, 0 messages failed, no compression

Producing to one partition, replication factor 2, 2 brokers ISR,
required.acks = 2:
% 1000000 messages and 100000000 bytes produced in 3568ms: 280241 msgs/s
and 28.02 Mb/s, 0 messages failed, no compression


These are the maximum values from a small number of naive tests.

It would be interesting to see some numbers from relevant environments with
proper hardware and networks.
(rdkafka_performance -P -t <topic> -p <partition> -s <msg_size> -a <acks> -c 1000000 -q)

Regards,
Magnus


2013/11/25 Jun Rao 

> Thanks for sharing the results. Was the topic created with replication
> factor of 2? Could you test acks=-1 as well?
>
> Thanks,
>
> Jun
>
>
> On Mon, Nov 25, 2013 at 4:30 AM, Magnus Edenhill 
> wrote:
>
> > The following tests were using a single producer application
> > (examples/rdkafka_performance):
> >
> > * Test1: 2 brokers, 2 partitions, required.acks=2, 100 byte messages:
> > 850000 messages/second, 85 MB/second
> >
> > * Test2: 1 broker, 1 partition, required.acks=0, 100 byte messages: 710000
> > messages/second, 71 MB/second
> >
> > * Test3: 2 brokers, 2 partitions, required.acks=2, 100 byte messages,
> > snappy compression: 300000 messages/second, 30 MB/second
> >
> > * Test4: 2 brokers, 2 partitions, required.acks=2, 100 byte messages, gzip
> > compression: 230000 messages/second, 23 MB/second
> >
> >
> > log.flush broker configuration was increased to avoid the disk being the
> > bottleneck.
> >
> >
> > /Magnus
> >
> >
> >
> > 2013/11/24 Neha Narkhede 
> >
> > > So, a single producer's throughput is 80 MB/s? That seems pretty high.
> > What
> > > was the number of acks setting? Thanks for sharing these numbers.
> > >
> > > On Sunday, November 24, 2013, Magnus Edenhill wrote:
> > >
> > > > Hi Neha,
> > > >
> > > > these tests were done using 100 byte messages. More information about
> > the
> > > > producer performance tests can be found here:
> > > >
> > > >
> > >
> >
> https://github.com/edenhill/librdkafka/blob/master/INTRODUCTION.md#performance-numbers
> > > >
> > > > The tests are indicative at best and in no way scientific, but I must
> > say
> > > > that the Kafka broker performance is impressive.
> > > >
> > > > Regards,
> > > > Magnus
> > > >
> > > >
> > > >
> > > > 2013/11/22 Neha Narkhede >
> > > >
> > > > > Thanks for sharing this! What is the message size for the
> throughput
> > > > > numbers stated below?
> > > > >
> > > > > Thanks,
> > > > > Neha
> > > > > On Nov 22, 2013 6:59 AM, "Magnus Edenhill"  > > >
> > > > wrote:
> > > > >
> > > > > > This announces the 0.8.0 release of librdkafka - The Apache Kafka
> > > > client
> > > > > C
> > > > > > library - now with 0.8 protocol support.
> > > > > >
> > > > > > Features:
> > > > > > * Producer (~800K msgs/s)
> > > > > > * Consumer  (~3M msgs/s)
> > > > > > * Compression (Snappy, gzip)
> > > > > > * Proper failover and leader re-election support - no message is
> > ever
> > > > > lost.
> > > > > > * Configuration properties compatible with official Apache Kafka.
> > > > > > * Stabilized ABI-safe API
> > > > > > * Mainline Debian package submitted
> > > > > > * Production quality
> > > > > >
> > > > > >
> > > > > > Home:
> > > > > > https://github.com/edenhill/librdkafka
> > > > > >
> > > > > > Introduction and performance numbers:
> > > > > >
> https://github.com/edenhill/librdkafka/blob/master/INTRODUCTION.md
> > > > > >
> > > > > > Have fun.
> > > > > >
> > > > > > Regards,
> > > > > > Magnus
> > > > > >
> > > > > > P.S.
> > > > > > Check out Wikimedia Foundation's varnishkafka daemon for a use
> > case -
> > > > > > varnish log forwarding over Kafka:
> > > > > > https://github.com/wikimedia/varnishkafka
> > > > > >
> > > > >
> > > >
> > >
> >
>


[jira] [Commented] (KAFKA-1144) commitOffsets can be passed the offsets to commit

2013-11-25 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13831993#comment-13831993
 ] 

Guozhang Wang commented on KAFKA-1144:
--

Imran, regarding the client rewrite, your proposal is also being discussed as 
part of that project. Basically, the consumer commitOffset function could be 
something like:

void commit(List[String, Int, Long]) // topic, partition-id, offset
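
In Scala terms the shape would be roughly the following (illustrative only; the 
final signature may differ):

    // Illustrative rendering of the proposed call, not the final API.
    trait ProposedCommitApi {
      def commit(offsets: Seq[(String, Int, Long)]): Unit   // (topic, partition-id, offset)
    }
    // e.g. consumer.commit(Seq(("events", 0, 12345L), ("events", 1, 67890L)))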

The wiki page will be updated soon so that you can take a look by then.

Guozhang



> commitOffsets can be passed the offsets to commit
> -
>
> Key: KAFKA-1144
> URL: https://issues.apache.org/jira/browse/KAFKA-1144
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Affects Versions: 0.8
>Reporter: Imran Rashid
>Assignee: Neha Narkhede
> Attachments: 
> 0001-allow-committing-of-arbitrary-offsets-to-facilitate-.patch, 
> 0002-add-protection-against-backward-commits.patch
>
>
> This adds another version of commitOffsets that takes the offsets to commit 
> as a parameter.
> Without this change, getting correct user code is very hard. Despite kafka's 
> at-least-once guarantees, most user code doesn't actually have that 
> guarantee, and is almost certainly wrong if doing batch processing. Getting 
> it right requires some very careful synchronization between all consumer 
> threads, which is both:
> 1) painful to get right
> 2) slow b/c of the need to stop all workers during a commit.
> This small change simplifies a lot of this. This was discussed extensively on 
> the user mailing list, on the thread "are kafka consumer apps guaranteed to 
> see msgs at least once?"
> You can also see an example implementation of a user api which makes use of 
> this, to get proper at-least-once guarantees by user code, even for batches:
> https://github.com/quantifind/kafka-utils/pull/1
> I'm open to any suggestions on how to add unit tests for this.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (KAFKA-1144) commitOffsets can be passed the offsets to commit

2013-11-25 Thread Imran Rashid (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832019#comment-13832019
 ] 

Imran Rashid commented on KAFKA-1144:
-

Yes, that addition to the commit api would do the trick.  (That is the main 
point of this patch; I just happened to choose a different signature.)  With 
rebalance notifications, that removes the need for conditional updates.

Glad to hear this will all be in 0.9 -- but does that mean this patch is off 
the table for 0.8.*?  If so, that's too bad, since this would be a big help now.  
I could change the signature to match what it will be in 0.9.  If not, I suppose 
I can always just make the zk updates myself directly, under the covers of the 
consumer api wrapper I'm writing.

> commitOffsets can be passed the offsets to commit
> -
>
> Key: KAFKA-1144
> URL: https://issues.apache.org/jira/browse/KAFKA-1144
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Affects Versions: 0.8
>Reporter: Imran Rashid
>Assignee: Neha Narkhede
> Attachments: 
> 0001-allow-committing-of-arbitrary-offsets-to-facilitate-.patch, 
> 0002-add-protection-against-backward-commits.patch
>
>
> This adds another version of commitOffsets that takes the offsets to commit 
> as a parameter.
> Without this change, getting correct user code is very hard. Despite kafka's 
> at-least-once guarantees, most user code doesn't actually have that 
> guarantee, and is almost certainly wrong if doing batch processing. Getting 
> it right requires some very careful synchronization between all consumer 
> threads, which is both:
> 1) painful to get right
> 2) slow b/c of the need to stop all workers during a commit.
> This small change simplifies a lot of this. This was discussed extensively on 
> the user mailing list, on the thread "are kafka consumer apps guaranteed to 
> see msgs at least once?"
> You can also see an example implementation of a user api which makes use of 
> this, to get proper at-least-once guarantees by user code, even for batches:
> https://github.com/quantifind/kafka-utils/pull/1
> I'm open to any suggestions on how to add unit tests for this.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (KAFKA-1135) Code cleanup - use Json.encode() to write json data to zk

2013-11-25 Thread David Lao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832026#comment-13832026
 ] 

David Lao commented on KAFKA-1135:
--

Hi. This patch seems to have undone all the KAFKA-1112 changes. Can you verify?

> Code cleanup - use Json.encode() to write json data to zk
> -
>
> Key: KAFKA-1135
> URL: https://issues.apache.org/jira/browse/KAFKA-1135
> Project: Kafka
>  Issue Type: Bug
>Reporter: Swapnil Ghike
>Assignee: Swapnil Ghike
> Fix For: 0.8.1
>
> Attachments: KAFKA-1135.patch, KAFKA-1135_2013-11-18_19:17:54.patch, 
> KAFKA-1135_2013-11-18_19:20:58.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (KAFKA-1135) Code cleanup - use Json.encode() to write json data to zk

2013-11-25 Thread Swapnil Ghike (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832037#comment-13832037
 ] 

Swapnil Ghike commented on KAFKA-1135:
--

Thanks for catching this, David! Jun, it seems that the diff on the reviewboard 
and what got attached to this JIRA are different. Can you please revert commit 
9b0776d157afd9eacddb84a99f2420fa9c0d505b, download the diff from the 
reviewboard and commit it?

> Code cleanup - use Json.encode() to write json data to zk
> -
>
> Key: KAFKA-1135
> URL: https://issues.apache.org/jira/browse/KAFKA-1135
> Project: Kafka
>  Issue Type: Bug
>Reporter: Swapnil Ghike
>Assignee: Swapnil Ghike
> Fix For: 0.8.1
>
> Attachments: KAFKA-1135.patch, KAFKA-1135_2013-11-18_19:17:54.patch, 
> KAFKA-1135_2013-11-18_19:20:58.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (KAFKA-1135) Code cleanup - use Json.encode() to write json data to zk

2013-11-25 Thread Swapnil Ghike (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832040#comment-13832040
 ] 

Swapnil Ghike commented on KAFKA-1135:
--

[~jjkoshy], does the above issue look similar to KAFKA-1142? 

> Code cleanup - use Json.encode() to write json data to zk
> -
>
> Key: KAFKA-1135
> URL: https://issues.apache.org/jira/browse/KAFKA-1135
> Project: Kafka
>  Issue Type: Bug
>Reporter: Swapnil Ghike
>Assignee: Swapnil Ghike
> Fix For: 0.8.1
>
> Attachments: KAFKA-1135.patch, KAFKA-1135_2013-11-18_19:17:54.patch, 
> KAFKA-1135_2013-11-18_19:20:58.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (KAFKA-1135) Code cleanup - use Json.encode() to write json data to zk

2013-11-25 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832067#comment-13832067
 ] 

Jun Rao commented on KAFKA-1135:


Thanks for pointing this out. Recommitted KAFKA-1112.

> Code cleanup - use Json.encode() to write json data to zk
> -
>
> Key: KAFKA-1135
> URL: https://issues.apache.org/jira/browse/KAFKA-1135
> Project: Kafka
>  Issue Type: Bug
>Reporter: Swapnil Ghike
>Assignee: Swapnil Ghike
> Fix For: 0.8.1
>
> Attachments: KAFKA-1135.patch, KAFKA-1135_2013-11-18_19:17:54.patch, 
> KAFKA-1135_2013-11-18_19:20:58.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (KAFKA-1144) commitOffsets can be passed the offsets to commit

2013-11-25 Thread Imran Rashid (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Imran Rashid updated KAFKA-1144:


Attachment: 0003-dont-do-conditional-update-check-if-the-path-doesnt-.patch

Just discovered a small bug -- the conditional update doesn't work if the path 
doesn't already exist.  Figured I'd update this just in case it's still under 
consideration ...
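
For context, the fix amounts to a create-if-missing-then-conditional-set 
pattern along these lines (a sketch against the raw ZooKeeper API, not the 
exact patch):

    import org.apache.zookeeper.{CreateMode, KeeperException, ZooDefs, ZooKeeper}

    // Sketch only: create the node when it does not exist yet, otherwise do a
    // versioned (conditional) setData so a stale committer cannot move the offset backwards.
    def conditionalWrite(zk: ZooKeeper, path: String, data: Array[Byte], expectedVersion: Int): Unit =
      try zk.setData(path, data, expectedVersion)
      catch {
        case _: KeeperException.NoNodeException =>
          zk.create(path, data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT)
      }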

> commitOffsets can be passed the offsets to commit
> -
>
> Key: KAFKA-1144
> URL: https://issues.apache.org/jira/browse/KAFKA-1144
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Affects Versions: 0.8
>Reporter: Imran Rashid
>Assignee: Neha Narkhede
> Attachments: 
> 0001-allow-committing-of-arbitrary-offsets-to-facilitate-.patch, 
> 0002-add-protection-against-backward-commits.patch, 
> 0003-dont-do-conditional-update-check-if-the-path-doesnt-.patch
>
>
> This adds another version of commitOffsets that takes the offsets to commit 
> as a parameter.
> Without this change, getting correct user code is very hard. Despite kafka's 
> at-least-once guarantees, most user code doesn't actually have that 
> guarantee, and is almost certainly wrong if doing batch processing. Getting 
> it right requires some very careful synchronization between all consumer 
> threads, which is both:
> 1) painful to get right
> 2) slow b/c of the need to stop all workers during a commit.
> This small change simplifies a lot of this. This was discussed extensively on 
> the user mailing list, on the thread "are kafka consumer apps guaranteed to 
> see msgs at least once?"
> You can also see an example implementation of a user api which makes use of 
> this, to get proper at-least-once guarantees by user code, even for batches:
> https://github.com/quantifind/kafka-utils/pull/1
> I'm open to any suggestions on how to add unit tests for this.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (KAFKA-1135) Code cleanup - use Json.encode() to write json data to zk

2013-11-25 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832221#comment-13832221
 ] 

Joel Koshy commented on KAFKA-1135:
---

[~swapnilghike] I'm pretty sure the extra diff is due to the issue in 
KAFKA-1142, but I don't know why that change does not show up in the RB itself.

> Code cleanup - use Json.encode() to write json data to zk
> -
>
> Key: KAFKA-1135
> URL: https://issues.apache.org/jira/browse/KAFKA-1135
> Project: Kafka
>  Issue Type: Bug
>Reporter: Swapnil Ghike
>Assignee: Swapnil Ghike
> Fix For: 0.8.1
>
> Attachments: KAFKA-1135.patch, KAFKA-1135_2013-11-18_19:17:54.patch, 
> KAFKA-1135_2013-11-18_19:20:58.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1#6144)


Re: Review Request 15805: KAFKA-1140.v2: addressed Jun's comments

2013-11-25 Thread Jun Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15805/#review29412
---



core/src/test/scala/unit/kafka/consumer/ConsumerIteratorTest.scala


We can just add messageSet to queue directly. In this test, topicInfo is 
irrelevant.




core/src/test/scala/unit/kafka/consumer/ConsumerIteratorTest.scala


We probably should just use ConsumerConfig.ConsumerTimeoutMs here, to make 
it clear that we don't want to timeout.



core/src/test/scala/unit/kafka/consumer/ConsumerIteratorTest.scala


To really test that we can iterate over messages with decoding errors, 
could we have the first half of messages with decoding errors and the second 
half without, and make sure that we get the correct offsets when iterating 
messages in the second half?



core/src/test/scala/unit/kafka/consumer/ConsumerIteratorTest.scala


Instead of throwing an exception, we should fail the test.



core/src/test/scala/unit/kafka/consumer/ConsumerIteratorTest.scala


Not sure why this is useful. ConsumedOffset is a val and will never change.


- Jun Rao


On Nov. 25, 2013, 8:55 p.m., Guozhang Wang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15805/
> ---
> 
> (Updated Nov. 25, 2013, 8:55 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1140
> https://issues.apache.org/jira/browse/KAFKA-1140
> 
> 
> Repository: kafka
> 
> 
> Description
> ---
> 
> KAFKA-1140.v2
> 
> 
> KAFKA-1140.v1
> 
> 
> Dummy
> 
> 
> Diffs
> -
> 
>   core/src/main/scala/kafka/consumer/ConsumerIterator.scala 
> a4227a49684c7de08e07cb1f3a10d2f76ba28da7 
>   core/src/test/scala/unit/kafka/consumer/ConsumerIteratorTest.scala 
> ef1de8321c713cd9d27ef937216f5b76a5d8c574 
> 
> Diff: https://reviews.apache.org/r/15805/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Guozhang Wang
> 
>



[jira] [Updated] (KAFKA-1004) Handle topic event for trivial whitelist topic filters

2013-11-25 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy updated KAFKA-1004:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

This was fixed in KAFKA-1103


> Handle topic event for trivial whitelist topic filters
> --
>
> Key: KAFKA-1004
> URL: https://issues.apache.org/jira/browse/KAFKA-1004
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Fix For: 0.8.1, 0.7
>
> Attachments: KAFKA-1004.v1.patch, KAFKA-1004.v2.patch
>
>
> Today the consumer's TopicEventWatcher is not subscribed with trivial whitelist 
> topic names. Hence if the topic is not registered in ZK when the consumer is 
> started, it will not trigger a rebalance of consumers later when it is 
> created, and hence the topic will not be consumed even if it is in the whitelist. 
> A proposed fix would be to always subscribe the TopicEventWatcher for all 
> whitelist consumers.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Closed] (KAFKA-1004) Handle topic event for trivial whitelist topic filters

2013-11-25 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy closed KAFKA-1004.
-


> Handle topic event for trivial whitelist topic filters
> --
>
> Key: KAFKA-1004
> URL: https://issues.apache.org/jira/browse/KAFKA-1004
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Fix For: 0.7, 0.8.1
>
> Attachments: KAFKA-1004.v1.patch, KAFKA-1004.v2.patch
>
>
> Today the consumer's TopicEventWatcher is not subscribed with trivial whitelist 
> topic names. Hence if the topic is not registered in ZK when the consumer is 
> started, it will not trigger a rebalance of consumers later when it is 
> created, and hence the topic will not be consumed even if it is in the whitelist. 
> A proposed fix would be to always subscribe the TopicEventWatcher for all 
> whitelist consumers.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Re: librdkafka 0.8.0 released

2013-11-25 Thread Jun Rao
Thanks. The results make sense. Higher consistency (ack=-1 and ack=2)
typically means longer latency.

Do those numbers match our Java producer's?

Thanks,

Jun


On Mon, Nov 25, 2013 at 1:16 PM, Magnus Edenhill  wrote:

> Producing to one partition, no replication, required.acks = 0:
> % 1000000 messages and 100000000 bytes produced in 1215ms: 822383 msgs/s
> and 82.24 Mb/s, 0 messages failed, no compression
>
> Producing to one partition, no replication, required.acks = -1:
> % 1000000 messages and 100000000 bytes produced in 1422ms: 703091 msgs/s
> and 70.31 Mb/s, 0 messages failed, no compression
>
> Producing to one partition, no replication, required.acks = 1:
> % 1000000 messages and 100000000 bytes produced in 1295ms: 771881 msgs/s
> and 77.19 Mb/s, 0 messages failed, no compression
>
>
>
> Producing to one partition, replication factor 2, 2 brokers ISR,
> required.acks = 0:
> % 1000000 messages and 100000000 bytes produced in 1354ms: 738483 msgs/s
> and 73.85 Mb/s, 0 messages failed, no compression
>
> Producing to one partition, replication factor 2, 2 brokers ISR,
> required.acks = -1:
> % 1000000 messages and 100000000 bytes produced in 3698ms: 270396 msgs/s
> and 27.04 Mb/s, 0 messages failed, no compression
>
> Producing to one partition, replication factor 2, 2 brokers ISR,
> required.acks = 1:
> % 1000000 messages and 100000000 bytes produced in 1360ms: 735224 msgs/s
> and 73.52 Mb/s, 0 messages failed, no compression
>
> Producing to one partition, replication factor 2, 2 brokers ISR,
> required.acks = 2:
> % 1000000 messages and 100000000 bytes produced in 3568ms: 280241 msgs/s
> and 28.02 Mb/s, 0 messages failed, no compression
>
>
> These are the maximum values from a small number of naive tests.
>
> It would be interesting to see some numbers from relevant environments with
> proper hardware and networks.
> (rdkafka_performance -P -t <topic> -p <partition> -s <msg_size> -a <acks> -c 1000000 -q)
>
> Regards,
> Magnus
>
>
> 2013/11/25 Jun Rao 
>
> > Thanks for sharing the results. Was the topic created with replication
> > factor of 2? Could you test acks=-1 as well?
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Mon, Nov 25, 2013 at 4:30 AM, Magnus Edenhill 
> > wrote:
> >
> > > The following tests were using a single producer application
> > > (examples/rdkafka_performance):
> > >
> > > * Test1: 2 brokers, 2 partitions, required.acks=2, 100 byte messages:
> > > 850000 messages/second, 85 MB/second
> > >
> > > * Test2: 1 broker, 1 partition, required.acks=0, 100 byte messages: 710000
> > > messages/second, 71 MB/second
> > >
> > > * Test3: 2 brokers, 2 partitions, required.acks=2, 100 byte messages,
> > > snappy compression: 300000 messages/second, 30 MB/second
> > >
> > > * Test4: 2 brokers, 2 partitions, required.acks=2, 100 byte messages, gzip
> > > compression: 230000 messages/second, 23 MB/second
> > >
> > >
> > > log.flush broker configuration was increased to avoid the disk being
> the
> > > bottleneck.
> > >
> > >
> > > /Magnus
> > >
> > >
> > >
> > > 2013/11/24 Neha Narkhede 
> > >
> > > > So, a single producer's throughput is 80 MB/s? That seems pretty
> high.
> > > What
> > > > was the number of acks setting? Thanks for sharing these numbers.
> > > >
> > > > On Sunday, November 24, 2013, Magnus Edenhill wrote:
> > > >
> > > > > Hi Neha,
> > > > >
> > > > > these tests were done using 100 byte messages. More information
> about
> > > the
> > > > > producer performance tests can be found here:
> > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/edenhill/librdkafka/blob/master/INTRODUCTION.md#performance-numbers
> > > > >
> > > > > The tests are indicative at best and in no way scientific, but I
> must
> > > say
> > > > > that the Kafka broker performance is impressive.
> > > > >
> > > > > Regards,
> > > > > Magnus
> > > > >
> > > > >
> > > > >
> > > > > 2013/11/22 Neha Narkhede >
> > > > >
> > > > > > Thanks for sharing this! What is the message size for the
> > throughput
> > > > > > numbers stated below?
> > > > > >
> > > > > > Thanks,
> > > > > > Neha
> > > > > > On Nov 22, 2013 6:59 AM, "Magnus Edenhill"  > > > >
> > > > > wrote:
> > > > > >
> > > > > > > This announces the 0.8.0 release of librdkafka - The Apache
> Kafka
> > > > > client
> > > > > > C
> > > > > > > library - now with 0.8 protocol support.
> > > > > > >
> > > > > > > Features:
> > > > > > > * Producer (~800K msgs/s)
> > > > > > > * Consumer  (~3M msgs/s)
> > > > > > > * Compression (Snappy, gzip)
> > > > > > > * Proper failover and leader re-election support - no message
> is
> > > ever
> > > > > > lost.
> > > > > > > * Configuration properties compatible with official Apache
> Kafka.
> > > > > > > * Stabilized ABI-safe API
> > > > > > > * Mainline Debian package submitted
> > > > > > > * Production quality
> > > > > > >
> > > > > > >
> > > > > > > Home:
> > > > > > > https://github.com/edenhill/librdkafka
> > > > > > >
> > > > > > > Introduction and performance numbers:
> > > > > > >
> > https://github.com/edenhill/librdkafka/