Hi John,

Glad to help :) I ran into similar issues recently, being confused by what the offsets mean, so I understand your pain haha.
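Since you mention wiring the "rmr /kafka/consumers/myGroupName" step into your microservice with the ZooKeeper API, here is a rough, untested sketch of the recursive delete with the plain ZooKeeper client. The class name is made up, and the connect string and group name are just the placeholders from your mails - swap in your real values:

    import java.util.List;
    import java.util.concurrent.CountDownLatch;

    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooKeeper;

    public class ConsumerGroupReset {

        // Placeholder connect string; note the /kafka chroot, same as in your zkCli.sh commands.
        private static final String ZK_CONNECT = "192.168.56.5:2181/kafka";

        public static void main(String[] args) throws Exception {
            CountDownLatch connected = new CountDownLatch(1);
            ZooKeeper zk = new ZooKeeper(ZK_CONNECT, 30000, event -> {
                if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                    connected.countDown();
                }
            });
            connected.await();
            try {
                // Because of the /kafka chroot above, this removes the same node as
                // "rmr /kafka/consumers/myGroupName" does in zkCli.sh.
                deleteRecursive(zk, "/consumers/myGroupName");
            } finally {
                zk.close();
            }
        }

        // Depth-first delete: children have to go before the parent can be removed.
        private static void deleteRecursive(ZooKeeper zk, String path)
                throws KeeperException, InterruptedException {
            if (zk.exists(path, false) == null) {
                return; // nothing to do
            }
            for (String child : zk.getChildren(path, false)) {
                deleteRecursive(zk, path + "/" + child);
            }
            zk.delete(path, -1); // -1 = any version
        }
    }

Run it before (re)starting the consumer microservice and the group has no committed offsets left, which is exactly the state your zkCli.sh rmr produces.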
Best of luck,
Leo

On Tue, Feb 23, 2016 at 1:53 PM, John Bickerstaff <j...@johnbickerstaff.com> wrote:
> Thanks Leo!
>
> =========
> TL;DR summary:
>
> You're correct - I didn't absolutely need the offset.
> I had to provide Disaster Recovery advice and couldn't explain the offset numbers, which wouldn't fly.
> An explanation of how I got myself confused is in the text below -- in case it helps someone else later.
> Thanks for your reply!
> =========
>
> You're right. Strictly speaking, I don't need the offset. In my testing I've been issuing the rmr /kafka/consumers command from the Zookeeper zkCli.sh.
> I'm adding it to my microservice using the Zookeeper API this week - since that seems a lot easier than figuring out the low-level Kafka API code, and it works just as well.
>
> Being a developer, I just couldn't help trying to change the least significant thing required to get the job done - and the Zookeeper API does allow me to change that offset number... which led me to try to understand why that number wasn't matching my expectations...
>
> In addition, I'm building a SOLR / Kafka / Zookeeper infrastructure from scratch, and part of my mandate is to provide a handoff to our (very capable and very careful) IT manager. The handoff is to include plans and documentation for disaster recovery as well as for how to build and manage the cluster.
>
> For both of those reasons, my curiosity was piqued and I wanted to find out exactly what was going on. I could just imagine the look on our IT manager's face when I said "Trust me, the numbers don't line up, but it won't affect disaster recovery."
>
> In hindsight, I understand what I did that confused me. Since I'm still in development "mode" I sent messages to the same topic repeatedly for weeks. Then, instead of deleting the topic, I reset the retention of the messages like this:
>
> bin/kafka-topics.sh --zookeeper 192.168.56.5:2181/kafka --alter --topic topicName --config retention.ms=1000
>
> Then I restored it once the messages were deleted, thus:
>
> bin/kafka-topics.sh --zookeeper 192.168.56.5:2181/kafka --alter --topic topicName --delete-config retention.ms
>
> What I didn't realize is that (not unreasonably) the offset count isn't reset by changing the retention config setting. As you said, it won't necessarily be 0.
>
> Sending the same set of messages repeatedly resulted in a very large count in the offset - a count that bore no relation to the number of messages in the topic - which worried me because I couldn't explain it -- and things I can't explain make me nervous in the context of disaster recovery...
>
> I appreciate your confirmation of my theory about what is going on.
>
> --JohnB (aka solrJohn)
>
> On Thu, Feb 18, 2016 at 12:19 PM, Leo Lin <leo....@brigade.com> wrote:
>
> > Hi John,
> >
> > Kafka offsets are sequential id numbers that identify messages within each partition. They might not be sequential within a topic (which can have multiple partitions).
> >
> > Offsets don't necessarily start at 0, since messages get deleted.
> >
> > bin/kafka-run-class.sh kafka.tools.GetOffsetShell is pretty neat for looking at the offsets in your topic.
> >
> > I'm not sure why resetting the offset is needed in your case. If you need to read from the beginning using the high-level consumer, you just need to delete that consumer group in zookeeper and set "auto.offset.reset" to "smallest" (this will direct the consumer to look for the smallest offset if it doesn't find one in zookeeper).
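To put some code around that suggestion from my earlier mail: with the high-level consumer it really is just configuration. A minimal, uncompiled sketch - the group id, topic name and zookeeper address are the ones from your mails, swap in your real values:

    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;

    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;
    import kafka.message.MessageAndMetadata;

    public class ReplayFromBeginning {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("zookeeper.connect", "192.168.56.5:2181/kafka"); // same chroot as zkCli.sh
            props.put("group.id", "myGroupName");
            // If the group has no committed offset in zookeeper (e.g. right after
            // deleting /kafka/consumers/myGroupName), start from the smallest
            // available offset instead of the largest.
            props.put("auto.offset.reset", "smallest");

            ConsumerConnector connector =
                    Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

            Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                    connector.createMessageStreams(Collections.singletonMap("myTopicName", 1));

            // Iterating a stream blocks waiting for new messages; stop it with
            // connector.shutdown() (or Ctrl-C in a quick test).
            for (MessageAndMetadata<byte[], byte[]> msg : streams.get("myTopicName").get(0)) {
                System.out.println(new String(msg.message()));
            }
        }
    }

Keep in mind "auto.offset.reset" only applies when the group has no committed offset, so deleting /kafka/consumers/myGroupName (or switching to a fresh group.id) is still what actually forces the replay.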
> >
> > On Wed, Feb 17, 2016 at 1:06 PM, John Bickerstaff <j...@johnbickerstaff.com> wrote:
> >
> > > Hmmm... more info.
> > >
> > > So, inside /var/log/kafka-logs/myTopicName-0 I find two files:
> > >
> > > 00000000000000026524.index  00000000000000026524.log
> > >
> > > Interestingly, they both bear the number of the "lowest" offset returned by the command I mention above.
> > >
> > > If I "cat" the 000.....26524.log file, I get all my messages on the command line as if I'd issued the --from-beginning command.
> > >
> > > I'm not sure what the index holds; it's unreadable by the simple tools I've tried....
> > >
> > > I'm still scratching my head a bit - as the link you sent for the Kafka introduction says this:
> > >
> > > The messages in the partitions are each assigned a sequential id number called the *offset* that uniquely identifies each message within the partition.
> > >
> > > I see how that could be exactly what you said (the previous message(s) byte count) -- but the picture implies that it's a linear progression - 1, 2, 3, etc... (and that could be an oversimplification for purposes of the introduction - I get that...)
> > >
> > > Feel free to comment or not - I'm going to keep digging into it as best I can - any clarifications will be gratefully accepted...
> > >
> > > On Wed, Feb 17, 2016 at 1:50 PM, John Bickerstaff <j...@johnbickerstaff.com> wrote:
> > >
> > > > Thank you Christian -- I appreciate your taking the time to help me out on this.
> > > >
> > > > Here's what I found while continuing to dig into this.
> > > >
> > > > If I take 30024 and subtract the number of messages I know I have in Kafka (3500), I get 26524.
> > > >
> > > > If I reset thus: set /kafka/consumers/myGroupName/offsets/myTopicName/0 26524
> > > >
> > > > ... and then re-run my consumer - I get all 3500 messages again.
> > > >
> > > > If I do this: set /kafka/consumers/myGroupName/offsets/myTopicName/0 26624
> > > >
> > > > In other words, I increase the offset number by 100 -- then I get exactly 3400 messages on my consumer -- exactly 100 less than before, which I think makes sense, since I started the offset 100 higher...
> > > >
> > > > This seems to suggest that each number between 26624 and 30024 in the log represents one of my 3500 messages on this topic, but what you say suggests that they represent the byte count of the actual messages and not "one number per message"...
> > > >
> > > > I also find that if I issue this command:
> > > >
> > > > bin/kafka-run-class.sh kafka.tools.GetOffsetShell --topic=myTopicName --broker-list=192.168.56.3:9092 --time=-2
> > > >
> > > > I get back that same number -- 26524...
> > > >
> > > > Hmmmm.... A little confused still... These messages are literally stored in the Kafka logs, yes? I think I'll go digging in there and see...
> > > >
> > > > Thanks again!
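(On the "what's in the .index file" question in the two mails above: if I remember right, the tool for peeking inside the segment files is kafka.tools.DumpLogSegments, along the lines of

bin/kafka-run-class.sh kafka.tools.DumpLogSegments --files /var/log/kafka-logs/myTopicName-0/00000000000000026524.log --print-data-log

and it also understands the .index file. The index is a mapping from offsets to byte positions in the .log file, which is why cat can't make sense of it. The paths and file names here are just the ones from your mail.)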
> > > >
> > > > On Wed, Feb 17, 2016 at 12:38 PM, Christian Posta <christian.po...@gmail.com> wrote:
> > > >
> > > > > The number is the log-ordered number of bytes. So really, the offset is kinda like the "number of bytes" to begin reading from. 0 means read the log from the beginning. The second message is at 0 + the size of the first message. So the message "ids" are really just the running total of the preceding message sizes.
> > > > >
> > > > > For example, if I have three messages of 10 bytes each and set the consumer offset to 0, I'll read everything. If you set the offset to 10, I'll read the second and third messages, and so on.
> > > > >
> > > > > See more here:
> > > > > http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf
> > > > > and here: http://kafka.apache.org/documentation.html#introduction
> > > > >
> > > > > HTH!
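(One hedge I'd add here, since it's the root of the confusion: as far as I know the byte-based description matches the original Kafka paper linked above and the pre-0.8 behaviour. From 0.8 on, the offset is a logical, per-partition sequential message number, which is why your arithmetic works out: 30024 - 3500 = 26524, and bumping the committed offset by 100 skipped exactly 100 messages. You can see both ends of the range with GetOffsetShell - --time=-2 prints the earliest offset still on disk and --time=-1 the latest:

bin/kafka-run-class.sh kafka.tools.GetOffsetShell --topic=myTopicName --broker-list=192.168.56.3:9092 --time=-1
bin/kafka-run-class.sh kafka.tools.GetOffsetShell --topic=myTopicName --broker-list=192.168.56.3:9092 --time=-2

For a plain, non-compacted topic, latest minus earliest is the message count the console consumer reports.)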
> > > > >
> > > > > On Wed, Feb 17, 2016 at 12:16 PM, John Bickerstaff <j...@johnbickerstaff.com> wrote:
> > > > >
> > > > > > *Use Case: Disaster Recovery & Re-indexing SOLR*
> > > > > >
> > > > > > I'm using Kafka to hold messages from a service that prepares "documents" for SOLR.
> > > > > >
> > > > > > A second microservice (a consumer) requests these messages, does any final processing, and fires them into SOLR.
> > > > > >
> > > > > > The whole thing is (in part) designed to be used for disaster recovery - allowing the rebuild of the SOLR index in the shortest possible time.
> > > > > >
> > > > > > To do this (and to be able to use it for re-indexing SOLR while testing relevancy) I need to be able to "play all messages from the beginning" at will.
> > > > > >
> > > > > > I find I can use the zkCli.sh tool to delete the Consumer Group Name like this:
> > > > > > rmr /kafka/consumers/myGroupName
> > > > > >
> > > > > > After which my microservice will get all the messages again when it runs.
> > > > > >
> > > > > > I was trying to find a way to do this programmatically without actually using the "low level" consumer API, since the high-level one is so simple and my code already works. So I started playing with the Zookeeper API for duplicating "rmr /kafka/consumers/myGroupName".
> > > > > >
> > > > > > *The Question: What does that offset actually represent?*
> > > > > >
> > > > > > It was at this point that I discovered the offset must represent something other than what I thought it would. Things obviously work, but I'm wondering what, exactly, the offsets represent.
> > > > > >
> > > > > > To clarify - if I run this command on a zookeeper node, after the microservice has run:
> > > > > > get /kafka/consumers/myGroupName/offsets/myTopicName/0
> > > > > >
> > > > > > I get the following:
> > > > > >
> > > > > > 30024
> > > > > > cZxid = 0x3600000355
> > > > > > ctime = Fri Feb 12 07:27:50 MST 2016
> > > > > > mZxid = 0x3600000357
> > > > > > mtime = Fri Feb 12 07:29:50 MST 2016
> > > > > > pZxid = 0x3600000355
> > > > > > cversion = 0
> > > > > > dataVersion = 2
> > > > > > aclVersion = 0
> > > > > > ephemeralOwner = 0x0
> > > > > > dataLength = 5
> > > > > > numChildren = 0
> > > > > >
> > > > > > Now - I have exactly 3500 messages in this Kafka topic. I verify that by running this command:
> > > > > > bin/kafka-console-consumer.sh --zookeeper 192.168.56.5:2181/kafka --topic myTopicName --from-beginning
> > > > > >
> > > > > > When I hit Ctrl-C, it tells me it consumed 3500 messages.
> > > > > >
> > > > > > So - what does that 30024 actually represent? If I reset that number to 1 or 0 and re-run my consumer microservice, I get all the messages again - and the number again goes to 30024. However, I'm not comfortable trusting that, because my assumption that the number represents a simple count of messages sent to this consumer is obviously wrong.
> > > > > >
> > > > > > (I reset the number like this -- to 1 -- and assume there's an API command that will do it too.)
> > > > > > set /kafka/consumers/myGroupName/offsets/myTopicName/0 1
> > > > > >
> > > > > > Can someone help me clarify, or point me at a doc that explains, what is getting counted here? You can shoot me if you like for attempting the hack-ish solution of resetting the offset through the Zookeeper API, but I would still like to understand what, exactly, is represented by that number 30024.
> > > > > >
> > > > > > I need to hand off to IT for the Disaster Recovery portion, and saying "trust me, it just works" isn't going to fly very far...
> > > > > >
> > > > > > Thanks.
> > > > >
> > > > > --
> > > > > *Christian Posta*
> > > > > twitter: @christianposta
> > > > > http://www.christianposta.com/blog
> > > > > http://fabric8.io
> >
> > --
> > "Dream no small dreams for they have no power to move the hearts of men."
> >
> > Johann Wolfgang von Goethe

--
"Dream no small dreams for they have no power to move the hearts of men."

Johann Wolfgang von Goethe