Thanks Leo!

=========
TL;DR summary:

- You're correct: I didn't absolutely need the offset.
- I had to provide disaster recovery advice and couldn't explain the offset
  numbers, which wouldn't fly.
- An explanation of how I got myself confused is in the text below, in case
  it helps someone else later.
- Thanks for your reply!
=========

You're right.  Strictly speaking, I don't need the offset.  In my testing
I've been issuing the rmr /kafka/consumers command from the Zookeeper
zkCli.sh.
I'm adding it to my microservice using the Zookeeper API this week, since
that seems a lot easier than figuring out the low-level Kafka API code and
it works just as well.
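For reference, here's a minimal sketch of what that microservice change might look like. This assumes the kazoo Python client library (the actual service might use a different Zookeeper client); the function names and the example host are my own illustration, while the /kafka chroot and group path match the zkCli.sh commands in this thread:

```python
# Sketch only: deleting a consumer group's znode via the Zookeeper API,
# equivalent to "rmr /kafka/consumers/myGroupName" in zkCli.sh.
# Assumes the third-party kazoo client (pip install kazoo).

def consumer_group_path(group, chroot="/kafka"):
    # The high-level consumer keeps its state under <chroot>/consumers/<group>
    return "%s/consumers/%s" % (chroot, group)

def reset_consumer_group(hosts, group):
    from kazoo.client import KazooClient  # third-party; imported lazily
    zk = KazooClient(hosts=hosts)
    zk.start()
    try:
        # recursive=True mirrors zkCli's rmr: delete the node and its children
        zk.delete(consumer_group_path(group), recursive=True)
    finally:
        zk.stop()

# Example (hypothetical host, group name from this thread):
# reset_consumer_group("192.168.56.5:2181", "myGroupName")
```

The next time the consumer runs with "auto.offset.reset" set to "smallest", it will find no stored offset and read from the beginning.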

Being a developer, I just couldn't help trying to change the least
significant thing required to get the job done - and the Zookeeper API does
allow me to change that offset number... which led me to try to understand
why that number wasn't matching my expectations.

In addition, I'm building a SOLR / Kafka / Zookeeper infrastructure from
scratch and part of my mandate is to provide a handoff to our (very capable
and very careful) IT manager.  The handoff is to include plans and
documentation for disaster recovery as well as how to build and manage the
cluster.

For both of those reasons, my curiosity was piqued and I wanted to find out
exactly what was going on.  I could just imagine the look on our IT
manager's face when I said "Trust me, the numbers don't line up, but it
won't affect disaster recovery."

In hindsight, I understand what I did that confused me.  Since I'm still in
development "mode", I sent messages to the same topic repeatedly for weeks.
Then, instead of deleting the topic, I issued the following command to
shorten the retention of the messages:

bin/kafka-topics.sh --zookeeper 192.168.56.5:2181/kafka --alter --topic topicName --config retention.ms=1000

Then, once the messages were deleted, I removed the retention override:

bin/kafka-topics.sh --zookeeper 192.168.56.5:2181/kafka --alter --topic topicName --delete-config retention.ms

What I didn't realize is that (not unreasonably) the offsets aren't reset
by changing the retention config setting.  As you said, the earliest offset
won't necessarily be 0.

Sending the same set of messages repeatedly resulted in a very large
offset value - one that bore no relation to the number of messages in the
topic - which worried me because I couldn't explain it, and things I can't
explain make me nervous in the context of disaster recovery.
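For the handoff documentation, the arithmetic can be written out as a quick sanity check (numbers taken from this thread):

```python
# Offsets are sequential ids that only grow; retention-based deletion moves
# the earliest offset forward instead of resetting it to 0.  The number of
# messages currently in a partition is latest offset minus earliest offset.
earliest_offset = 26524   # oldest retained message (GetOffsetShell --time=-2)
latest_offset = 30024     # consumer group position after reading everything
message_count = latest_offset - earliest_offset
print(message_count)      # 3500, matching the console-consumer count
```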

I appreciate your confirmation of my theory about what is going on.

--JohnB (aka solrJohn)

On Thu, Feb 18, 2016 at 12:19 PM, Leo Lin <leo....@brigade.com> wrote:

> Hi John,
>
> Kafka offsets are sequential id numbers that identify messages in each
> partition. They might not be sequential within a topic (which can have
> multiple partitions).
>
> Offsets don't necessarily start at 0 since messages are deleted.
>
> bin/kafka-run-class.sh kafka.tools.GetOffsetShell is pretty neat to look
> at offsets in your topic
>
> I'm not sure why resetting the offset is needed in your case. If you need
> to read from the beginning using the high level consumer,
> you just need to delete that consumer group in zookeeper and set
> "auto.offset.reset" to "smallest". (This will direct the consumer to look
> for the smallest offset if it doesn't find one in zookeeper.)
>
> On Wed, Feb 17, 2016 at 1:06 PM, John Bickerstaff <
> j...@johnbickerstaff.com>
> wrote:
>
> > Hmmm...  more info.
> >
> > So, inside /var/log/kafka-logs/myTopicName-0 I find two files
> >
> > 00000000000000026524.index  00000000000000026524.log
> >
> > Interestingly, they both bear the number of the "lowest" offset returned
> > by the command I mention above.
> >
> > If I "cat" the 000.....26524.log file, I get all my messages on the
> > commandline as if I'd issued the --from-beginning command
> >
> > I'm not sure what the index has, it's unreadable by the simple tools I've
> > tried....
> >
> > I'm still scratching my head a bit - as the link you sent for Kafka
> > introduction says this:
> >
> > The messages in the partitions are each assigned a sequential id number
> > called the *offset* that uniquely identifies each message within the
> > partition.
> > I see how that could be exactly what you said (the previous message(s)
> > byte count) -- but the picture implies that it's a linear progression -
> > 1, 2, 3, etc...  (and that could be an oversimplification for purposes
> > of the introduction - I get that...)
> >
> > Feel free to comment or not - I'm going to keep digging into it as best I
> > can - any clarifications will be gratefully accepted...
> >
> >
> >
> > On Wed, Feb 17, 2016 at 1:50 PM, John Bickerstaff <
> > j...@johnbickerstaff.com>
> > wrote:
> >
> > > Thank you Christian -- I appreciate your taking the time to help me
> > > out on this.
> > >
> > > Here's what I found while continuing to dig into this.
> > >
> > > If I take 30024 and subtract the number of messages I know I have in
> > > Kafka (3500), I get 26524.
> > >
> > > If I reset thus:
> > > set /kafka/consumers/myGroupName/offsets/myTopicName/0 26524
> > >
> > > ... and then re-run my consumer - I get all 3500 messages again.
> > >
> > > If I do this:
> > > set /kafka/consumers/myGroupName/offsets/myTopicName/0 26624
> > >
> > > In other words, I increase the offset number by 100 -- then I get
> > > exactly 3400 messages on my consumer -- exactly 100 fewer than before,
> > > which I think makes sense, since I started the offset 100 higher...
> > >
> > > This seems to suggest that each number between 26624 and 30024 in the
> > > log represents one of my 3500 messages on this topic, but what you say
> > > suggests that they represent byte count of the actual messages and not
> > > "one number per message"...
> > >
> > > I also find that if I issue this command:
> > >
> > > bin/kafka-run-class.sh kafka.tools.GetOffsetShell --topic=myTopicName
> > > --broker-list=192.168.56.3:9092  --time=-2
> > >
> > > I get back that same number -- 26524...
> > >
> > > Hmmmm....  A little confused still...  These messages are literally
> > > stored in the Kafka logs, yes?  I think I'll go digging in there and
> > > see...
> > >
> > > Thanks again!
> > >
> > >
> > >
> > >
> > >
> > > On Wed, Feb 17, 2016 at 12:38 PM, Christian Posta <
> > > christian.po...@gmail.com> wrote:
> > >
> > >> The number is the log-ordered number of bytes. So really, the offset
> > >> is kinda like the "number of bytes" to begin reading from. 0 means
> > >> read the log from the beginning. The second message is 0 + size of
> > >> message. So the message "ids" are really just the offset of the
> > >> previous message sizes.
> > >>
> > >> For example, if I have three messages of 10 bytes each, and set the
> > >> consumer offset to 0, I'll read everything. If you set the offset to
> > >> 10, I'll read the second and third messages, and so on.
> > >>
> > >> see more here:
> > >>
> > >> http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf
> > >> and here: http://kafka.apache.org/documentation.html#introduction
> > >>
> > >> HTH!
> > >>
> > >> On Wed, Feb 17, 2016 at 12:16 PM, John Bickerstaff <
> > >> j...@johnbickerstaff.com
> > >> > wrote:
> > >>
> > >> > *Use Case: Disaster Recovery & Re-indexing SOLR*
> > >> >
> > >> > I'm using Kafka to hold messages from a service that prepares
> > >> > "documents" for SOLR.
> > >> >
> > >> > A second microservice (a consumer) requests these messages, does
> > >> > any final processing, and fires them into SOLR.
> > >> >
> > >> > The whole thing is (in part) designed to be used for disaster
> > >> > recovery - allowing the rebuild of the SOLR index in the shortest
> > >> > possible time.
> > >> >
> > >> > To do this (and to be able to use it for re-indexing SOLR while
> > >> > testing relevancy) I need to be able to "play all messages from the
> > >> > beginning" at will.
> > >> >
> > >> > I find I can use the zkCli.sh tool to delete the Consumer Group
> > >> > Name like this:
> > >> >      rmr /kafka/consumers/myGroupName
> > >> >
> > >> > After which my microservice will get all the messages again when it
> > >> > runs.
> > >> >
> > >> > I was trying to find a way to do this programmatically without
> > >> > actually using the "low level" consumer API, since the high level
> > >> > one is so simple and my code already works.  So I started playing
> > >> > with the Zookeeper API for duplicating "rmr
> > >> > /kafka/consumers/myGroupName"
> > >> >
> > >> > *The Question: What does that offset actually represent?*
> > >> >
> > >> > It was at this point that I discovered the offset must represent
> > >> > something other than what I thought it would.  Things obviously
> > >> > work, but I'm wondering: what, exactly, do the offsets represent?
> > >> >
> > >> > To clarify - if I run this command on a zookeeper node, after the
> > >> > microservice has run:
> > >> >      get /kafka/consumers/myGroupName/offsets/myTopicName/0
> > >> >
> > >> > I get the following:
> > >> >
> > >> > 30024
> > >> > cZxid = 0x3600000355
> > >> > ctime = Fri Feb 12 07:27:50 MST 2016
> > >> > mZxid = 0x3600000357
> > >> > mtime = Fri Feb 12 07:29:50 MST 2016
> > >> > pZxid = 0x3600000355
> > >> > cversion = 0
> > >> > dataVersion = 2
> > >> > aclVersion = 0
> > >> > ephemeralOwner = 0x0
> > >> > dataLength = 5
> > >> > numChildren = 0
> > >> >
> > >> > Now - I have exactly 3500 messages in this Kafka topic.  I verify
> > >> > that by running this command:
> > >> >      bin/kafka-console-consumer.sh --zookeeper
> > >> >      192.168.56.5:2181/kafka --topic myTopicName --from-beginning
> > >> >
> > >> > When I hit Ctrl-C, it tells me it consumed 3500 messages.
> > >> >
> > >> > So - what does that 30024 actually represent?  If I reset that
> > >> > number to 1 or 0 and re-run my consumer microservice, I get all the
> > >> > messages again - and the number again goes to 30024.  However, I'm
> > >> > not comfortable trusting that, because my assumption that the
> > >> > number represents a simple count of messages that have been sent to
> > >> > this consumer is obviously wrong.
> > >> >
> > >> > (I reset the number like this -- to 1 -- and assume there's an API
> > >> > command that will do it too.)
> > >> >      set /kafka/consumers/myGroupName/offsets/myTopicName/0 1
> > >> >
> > >> > Can someone help me clarify, or point me at a doc that explains
> > >> > what is getting counted here?  You can shoot me if you like for
> > >> > attempting the hack-ish solution of resetting the offset through
> > >> > the Zookeeper API, but I would still like to understand what,
> > >> > exactly, is represented by that number 30024.
> > >> >
> > >> > I need to hand off to IT for the Disaster Recovery portion, and
> > >> > saying "trust me, it just works" isn't going to fly very far...
> > >> >
> > >> > Thanks.
> > >> >
> > >>
> > >>
> > >>
> > >> --
> > >> *Christian Posta*
> > >> twitter: @christianposta
> > >> http://www.christianposta.com/blog
> > >> http://fabric8.io
> > >>
> > >
> > >
> >
>
>
>
> --
> "Dream no small dreams for they have no power to move the hearts of men."
>
> Johann Wolfgang von Goethe
>
