Synced up offline about this. Makes more sense now.
On Thu, Feb 13, 2014 at 10:15 AM, Jay Kreps <jay.kr...@gmail.com> wrote:

> Read my email again :-)
>
> -Jay
>
>
> On Thu, Feb 13, 2014 at 9:43 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>
> > Did you try this on an otherwise empty topic?
> >
> > Yes, and it blocks until --messages are read. But again, my
> > interpretation of that config has been "read exactly --messages # of
> > messages, no less, no more".
> >
> > Thanks,
> > Neha
> >
> >
> > On Mon, Feb 10, 2014 at 9:31 AM, Jay Kreps <jay.kr...@gmail.com> wrote:
> >
> > > Hmm, I could be wrong... Did you try this on an otherwise empty topic?
> > > Try this:
> > > 1. Produce 100GB of messages
> > > 2. Run the consumer performance test with --messages 1
> > > 3. Go get some coffee
> > > 4. Observe the performance test results: the time taken reflects
> > > consuming 100GB of messages, but only a single message is recorded as
> > > consumed
> > >
> > > :-)
> > >
> > > Here is what I mean:
> > > scala> val l = List(1,2,3,4,5,6,7,8,9,10)
> > > scala> val iter = l.iterator
> > > scala> var count = 0
> > > scala> for(i <- iter if count < 5) {
> > > scala>   iter.next
> > > java.util.NoSuchElementException: next on empty iterator
> > > ...
> > >
> > > This is the exact equivalent of
> > > val l = List(1,2,3,4,5,6,7,8,9,10)
> > > val iter = l.iterator
> > > var count = 0
> > > while(iter.hasNext) {
> > >   if(iter.next < 5)
> > >     count += 1
> > > }
> > >
> > > -Jay
> > >
> > >
> > > On Sun, Feb 9, 2014 at 9:43 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
> > >
> > > > I ran ConsumerPerformance with different --messages settings for a
> > > > topic that has 100s of messages in the log. It did not hang in any
> > > > of these tests -
> > > >
> > > > nnarkhed-mn1:kafka-git-idea nnarkhed$ ./bin/kafka-run-class.sh
> > > > kafka.perf.ConsumerPerformance --zookeeper localhost:2181 --topic test
> > > > *--messages 10* --threads 1
> > > > start.time, end.time, fetch.size, data.consumed.in.MB, MB.sec,
> > > > data.consumed.in.nMsg, nMsg.sec
> > > > 1 messages read
> > > > 2 messages read
> > > > 3 messages read
> > > > 4 messages read
> > > > 5 messages read
> > > > 6 messages read
> > > > 7 messages read
> > > > 8 messages read
> > > > 9 messages read
> > > > 10 messages read
> > > > *2014-02-09 21:39:47:385, 2014-02-09 21:39:52:641, 1048576, 0.0015,
> > > > 0.0060, 10, 39.0625* (test ends with this)
> > > > nnarkhed-mn1:kafka-git-idea nnarkhed$ ./bin/kafka-run-class.sh
> > > > kafka.perf.ConsumerPerformance --zookeeper localhost:2181 --topic test
> > > > *--messages 2* --threads 1
> > > > 1 messages read
> > > > 2 messages read
> > > > *2014-02-09 21:39:59:853, 2014-02-09 21:40:05:100, 1048576, 0.0003,
> > > > 0.0011, 2, 8.0972* (test ends with this)
> > > >
> > > > The way I read the --messages config is "read exactly --messages # of
> > > > messages, no less, no more". If the topic has fewer than --messages
> > > > records, it will hang until it gets more data; otherwise it exits
> > > > properly. Seems like expected behavior.
> > > >
> > > > Thanks,
> > > > Neha
> > > >
> > > >
> > > >
> > > > On Sat, Feb 8, 2014 at 2:11 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> > > >
> > > > > It loops on
> > > > > for (messageAndMetadata <- stream if messagesRead < config.numMessages) {
> > > > >   // count message
> > > > > }
> > > > >
> > > > > That means it loops over *all* messages in the stream (eventually
> > > > > blocking when all messages are exhausted) but only counts the first
> > > > > N as determined by the config.
> > > > >
> > > > > As a result this command seems to always hang forever. Seems like
> > > > > someone "improved" this without ever trying to run it :-(
> > > > >
> > > > > -Jay
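For anyone skimming the thread, here is a quick standalone sketch of the iterator behavior discussed above. It is plain Scala only, no Kafka classes; the (1 to 100) iterator and the counter names are stand-ins for the consumer stream, not the actual ConsumerPerformance code:

object GuardedForDemo extends App {
  val numMessages = 3

  // Counts how many elements the stand-in "stream" actually hands out.
  var pulled = 0
  def newStream(): Iterator[Int] = (1 to 100).iterator.map { i => pulled += 1; i }

  // A guard in the for-comprehension desugars to withFilter(...).foreach(...):
  // once the guard turns false it keeps pulling elements looking for a match,
  // so the whole iterator is drained even though only the first N are counted.
  var messagesRead = 0
  for (m <- newStream() if messagesRead < numMessages)
    messagesRead += 1
  println("guarded for: counted " + messagesRead + ", pulled " + pulled)  // counted 3, pulled 100

  // Bounding the iterator itself stops after exactly N elements.
  pulled = 0
  var counted = 0
  for (m <- newStream().take(numMessages))
    counted += 1
  println("take(n):     counted " + counted + ", pulled " + pulled)       // counted 3, pulled 3
}

On a real consumer stream the guarded version would block in hasNext once the log is drained, which is the hang Jay describes; bounding the iteration itself (take(n), or an explicit break once the count is reached) stops after exactly N messages.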