I think we figured this out. It looks like the order in which partitions
are consumed is wildly unpredictable. We see a single partition being
consumed almost halfway before the consumer switches to another partition.
As a result, we read messages from a range of dates out of order.
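
To illustrate the effect, here is a minimal simulation in plain Python, not
real Kafka consumer code. The function names, the burst size and the
partition count are all made up for the example. Each partition holds day
numbers in ascending order, and the "consumer" drains a burst from one
partition before moving to the next:

```python
def simulate_consumption(num_partitions=3, days=7, burst=3):
    """Drain partitions in bursts and return (partition, day) in consumed order."""
    # Every partition holds the days 1..days in ascending order.
    partitions = {p: list(range(1, days + 1)) for p in range(num_partitions)}
    consumed = []
    while any(partitions.values()):
        for p in range(num_partitions):
            # Take up to `burst` messages from this partition, then move on.
            taken = partitions[p][:burst]
            partitions[p] = partitions[p][burst:]
            consumed.extend((p, day) for day in taken)
    return consumed


def is_sorted(values):
    """True if values never decrease."""
    return all(a <= b for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    consumed = simulate_consumption()
    days_seen = [day for _, day in consumed]
    print("global order sorted:", is_sorted(days_seen))
    for p in range(3):
        per_partition = [day for q, day in consumed if q == p]
        print("partition", p, "sorted:", is_sorted(per_partition))
```

Within each partition the days stay in order, but the combined stream jumps
back to day 1 every time the consumer switches partitions.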

Interesting at least. Thanks for your help.


-----------------------------------------------------------------------------


It may be hard to reason about ordering across 1400 partitions. Could you
use the SimpleConsumerShell to consume messages from 1 partition and see if
messages are ordered?
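
A rough sketch of such a check, assuming the consumed messages have already
been collected as (partition, timestamp) pairs. How those pairs are obtained
is left out here, and the function name is hypothetical:

```python
from collections import defaultdict


def out_of_order_by_partition(records):
    """Return {partition: [(previous_ts, ts), ...]} for every backwards step.

    `records` is an iterable of (partition, timestamp) pairs in the order
    they were consumed. An empty result means every partition was ordered.
    """
    last_seen = {}
    violations = defaultdict(list)
    for partition, ts in records:
        prev = last_seen.get(partition)
        if prev is not None and ts < prev:
            violations[partition].append((prev, ts))
        last_seen[partition] = ts
    return dict(violations)


if __name__ == "__main__":
    sample = [(0, 1), (1, 1), (0, 2), (1, 3), (1, 2)]
    print(out_of_order_by_partition(sample))
```

Timestamps are compared only against the previous message from the same
partition, so interleaving across partitions is not reported as a problem.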



Thanks,
Jun





On Fri, Mar 21, 2014 at 1:04 AM, Tom Amon <ta46...@gmail.com> wrote:



> Hi All,
>
> I have a question regarding ordering of consumed messages. We
> timestamp our messages and send them into Kafka in order. I wrote a
> simple consumer that consumes the messages and prints out the
> timestamp. I see messages for all seven days' worth of data being
> consumed at once.
>
> Our setup:
> Kafka 0.8
> 5 Kafka brokers
> 1400 partitions
>
> The consumer has 10 threads, simply connects, consumes and prints
> timestamps. It is set to the "smallest" offset so that it reads from
> the beginning. There are many millions of messages so I think I can
> rule out some partitions not having messages for certain days as the
> cause. I know that Kafka doesn't guarantee ordering across partitions
> but I would assume that with this volume of messages I would see the
> timestamps for the first day, followed by the second day, etc. Instead I
> see them all print at once.
>
> Any ideas what I might be doing wrong?
