I've attached the full output. The only other thing it produced was our old favorite:

Non-secutive offsets in :/home/steve/mytopic-9/00000000000000000000.log 1327 is followed by 1327

For the first time, earlier today, we've seen this happen from one of our other producers; offhand I'm thinking that there's a race of some sort somewhere and the other producers aren't immune, they're just much much less likely to run into the issue. The other possibility is that since those are all much higher-volume producers, maybe this has been happening with them before, but given the size of the log segments relative to the size of the data stream, the bad segment is rotated out in a few minutes -- so there's less of a window for us to notice.

I changed the one producer who was consistently having issues so that it's now not publishing lots of small messages, each in its own single-message message set. Instead it's batching, which seems like it might help if it's message-arrival-rate related or message-size related. It hasn't failed since then but then again sometimes this runs OK for, well, just long enough to make me think I have it figured out. Then it breaks again. (-:

Given that other kafka users don't seem to be having this sort of issue, and given that I'm out of ideas that aren't either "race condition in kafka that no one but us sees" or "Java versionitis", I'm thinking we should try to eliminate Java versionitis as a cause. We were already planning on moving from Java 6 to Java 7 so we're dragging that forward, and hope to get that taken care of over the next few days.

If you have another idea, that's awesome, but if it is versionitis I'd hate to have wasted anyone's time but my own on it. So we'll let you know if we see a change over the next few days, particularly once we get the new Java setup, uh, setup.

Thanks!

-Steve

On Thu, Aug 14, 2014 at 11:22:08AM -0700, Jun Rao wrote:
> What's the output of the following command?
>
> /opt/kafka/bin/kafka-run-class.sh kafka.tools.DumpLogSegments
> --files 00000000000000000000.log