Synced up offline about this. Makes more sense now.

On Thu, Feb 13, 2014 at 10:15 AM, Jay Kreps <jay.kr...@gmail.com> wrote:

> Read my email again :-)
>
> -Jay
>
>
> On Thu, Feb 13, 2014 at 9:43 AM, Neha Narkhede <neha.narkh...@gmail.com>
> wrote:
>
> >  Did you try this on an otherwise empty topic?
> >
> > Yes, and it blocks until --messages are read. But again, my interpretation
> > of that config has been "read exactly --messages # of messages, not less,
> > not more".
> >
> > Thanks,
> > Neha
> >
> >
> > On Mon, Feb 10, 2014 at 9:31 AM, Jay Kreps <jay.kr...@gmail.com> wrote:
> >
> > > Hmm, I could be wrong... Did you try this on an otherwise empty topic?
> > > Try this:
> > > 1. Produce 100GB of messages
> > > 2. Run the consumer performance test with --messages 1
> > > 3. Go get some coffee
> > > 4. Observe the performance test results: the time taken reflects consuming
> > > all 100GB of messages, but only a single message is recorded as consumed
> > >
> > > :-)
> > >
> > > Here is what I mean:
> > > scala> val l = List(1,2,3,4,5,6,7,8,9,10)
> > > scala> val iter = l.iterator
> > > scala> var count = 0
> > > scala> for(i <- iter if count < 5) {
> > >      |   count += 1
> > >      | }
> > >
> > > scala> iter.next
> > > java.util.NoSuchElementException: next on empty iterator
> > > ...
> > >
> > > This is the exact equivalent of
> > > val l = List(1,2,3,4,5,6,7,8,9,10)
> > > val iter = l.iterator
> > > var count = 0
> > > while(iter.hasNext) {
> > >   val i = iter.next
> > >   if(count < 5)
> > >     count += 1
> > > }
> > >
> > > -Jay
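
As an aside for anyone reading the archive later: a version of that snippet
that actually stops after the first five elements looks roughly like the
sketch below. It bounds the loop itself instead of filtering inside it.

val l = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
val iter = l.iterator
var count = 0
// putting the count check in the loop condition stops the iteration itself,
// so the remaining elements are never pulled from the iterator
while (count < 5 && iter.hasNext) {
  iter.next()
  count += 1
}
// count == 5 here, and iter.next() would still return 6

The guard in the for comprehension only filters what the body sees; it never
stops pulling from the iterator, which is why the original loop drains
everything.
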
> > >
> > >
> > > On Sun, Feb 9, 2014 at 9:43 PM, Neha Narkhede <neha.narkh...@gmail.com>
> > > wrote:
> > >
> > > > I ran ConsumerPerformance with different values of the --messages config
> > > > for a topic that has 100s of messages in the log. It did not hang in any
> > > > of these tests -
> > > >
> > > > nnarkhed-mn1:kafka-git-idea nnarkhed$ ./bin/kafka-run-class.sh
> > > > kafka.perf.ConsumerPerformance --zookeeper localhost:2181 --topic test
> > > > *--messages 10* --threads 1
> > > > start.time, end.time, fetch.size, data.consumed.in.MB, MB.sec,
> > > > data.consumed.in.nMsg, nMsg.sec
> > > > 1 messages read
> > > > 2 messages read
> > > > 3 messages read
> > > > 4 messages read
> > > > 5 messages read
> > > > 6 messages read
> > > > 7 messages read
> > > > 8 messages read
> > > > 9 messages read
> > > > 10 messages read
> > > > *2014-02-09 21:39:47:385, 2014-02-09 21:39:52:641, 1048576, 0.0015,
> > > > 0.0060, 10, 39.0625*  (test ends with this)
> > > > nnarkhed-mn1:kafka-git-idea nnarkhed$ ./bin/kafka-run-class.sh
> > > > kafka.perf.ConsumerPerformance --zookeeper localhost:2181 --topic test
> > > > *--messages 2* --threads 1
> > > > 1 messages read
> > > > 2 messages read
> > > > *2014-02-09 21:39:59:853, 2014-02-09 21:40:05:100, 1048576, 0.0003,
> > > > 0.0011, 2, 8.0972*  (test ends with this)
> > > >
> > > > The way I read the --messages config is "read exactly --messages # of
> > > > messages, not less, not more". If the topic has fewer than --messages
> > > > records, it will hang until it gets more data; otherwise it exits
> > > > properly. Seems like expected behavior.
> > > >
> > > > Thanks,
> > > > Neha
> > > >
> > > >
> > > >
> > > > On Sat, Feb 8, 2014 at 2:11 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> > > >
> > > > > It loops on
> > > > >   for (messageAndMetadata <- stream if messagesRead < config.numMessages) {
> > > > >     // count message
> > > > >   }
> > > > >
> > > > > That means it loops over *all* messages in the stream (eventually
> > > > > blocking when all messages are exhausted) but only counts the first N
> > > > > as determined by the config.
> > > > >
> > > > > As a result this command seems to always hang forever. Seems like
> > > > > someone "improved" this without ever trying to run it :-(
> > > > >
> > > > > -Jay
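
For the archives, the shape of the fix would be something like the sketch
below. This is not the actual code, just a rough version that borrows the
names (stream, config.numMessages, messagesRead) from the loop quoted above
and assumes the stream exposes a plain iterator:

// sketch only: put the message count in the loop condition so the loop exits
// as soon as config.numMessages messages have been read, instead of filtering
// inside a loop that drains the whole stream
val iter = stream.iterator
var messagesRead = 0L
while (messagesRead < config.numMessages && iter.hasNext) {
  val messageAndMetadata = iter.next()
  messagesRead += 1
  // ... count bytes / update stats here as before
}
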
> > > > >
> > > >
> > >
> >
>
