Hi Jun,

11. Yes, that sounds a reasonable solution. In the latest patch I am doing
the following in order:
a. Create an empty time index for a log segment if there isn't one.
b. For all non-active log segments, append an entry of
(last_modification_time -> next_base_offset) into the index. The time index
of the active segment is left empty. All the in-memory maxTimestamp is set
to -1.
c. If there is a hard failure, the time index and offset index will both be
rebuilt.

So we do not rebuild the time index during upgrade unless there was a hard
failure. I have updated the wiki to reflect this.

BTW, it seems that the current code will never hit the case where an index
is missing. I commented on PR.

Thanks,

Jiangjie (Becket) Qin



On Thu, Apr 14, 2016 at 10:00 AM, Jun Rao <j...@confluent.io> wrote:

> Hi, Jiangjie,
>
> 11. Rebuilding all missing time indexes will make the upgrade process
> longer since the log segments pre 0.10.0 don't have the time index. Could
> we only rebuild the missing indexes after the last flush offset? For other
> segments missing the time index, we just assume lastTimestamp to be -1?
>
> Thanks,
>
> Jun
>
> On Wed, Apr 13, 2016 at 9:55 AM, Becket Qin <becket....@gmail.com> wrote:
>
> > Hi Jun and Guozhang,
> >
> > I have updated the KIP wiki to incorporate your comments. Please let me
> > know if you prefer starting another discussion thread for further
> > discussion.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Mon, Apr 11, 2016 at 12:21 AM, Becket Qin <becket....@gmail.com>
> wrote:
> >
> > > Hi Guozhang and Jun,
> > >
> > > Thanks for the comments. Please see the responses below.
> > >
> > > Regarding to Guozhang's question #1 and Jun's question #12. I was
> > > inserting the time index and offset index entry together mostly for
> > > simplicity as Guozhang mentioned. The purpose of using index interval
> > bytes
> > > for time index was to control the density of the time index, which is
> the
> > > same purpose as offset index. It seems reasonable to make them aligned.
> > We
> > > can track separately the physical position when we insert the last time
> > > index entry(my original code did that), but when I wrote the code I
> feel
> > it
> > > seems unnecessary. Another minor benefit is that searching by timestamp
> > > could be potentially faster if we align the time index and offset
> index.
> > > It is possible that we only have either a corrupted time index or an
> > > offset index, but not both. Although we can choose to only rebuild the
> > one
> > > which is corrupted, given that we have to scan the entire log segment
> > > anyway, rebuilding both of them seems not much overhead. So the current
> > > patch I have is rebuilding both of them together.
> > >
> > > 10. Yes, it should only happen after a hard failure. The last time
> index
> > > entry of a normally closed segment has already points to the LEO, so
> > there
> > > is no scan during start up.
> > >
> > > 11. On broker startup, if a time index does not exist, an empty one
> will
> > > be created first. If message format version is 0.9.0, we will append a
> > time
> > > index entry of (last modification time -> base offset of next segment)
> to
> > > the time index of each inactive segment. So no actual rebuild will
> happen
> > > during upgrade. However, if message format version is 0.10.0, we will
> > > rebuild the time index if it does not exist. (I actually had a question
> > > about the how we are loading the log segments, we can discuss it in the
> > PR)
> > >
> > > I will update the wiki to clarify the question raised in the comments
> and
> > > submit a PR by tomorrow. I am currently cleaning up the documentation.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > >
> > >
> > > On Sun, Apr 10, 2016 at 9:25 PM, Jun Rao <j...@confluent.io> wrote:
> > >
> > >> Hi, Jiangjie,
> > >>
> > >> Thanks for the update. Looks good to me overall. Just a few minor
> > comments
> > >> below.
> > >>
> > >> 10. On broker startup, it's not clear to me why we need to scan the
> log
> > >> segment to retrieve the largest timestamp since the time index always
> > has
> > >> an entry for the largest timestamp. Is that only for restarting after
> a
> > >> hard failure?
> > >>
> > >> 11. On broker startup, if a log segment misses the time index, do we
> > >> always
> > >> rebuild it? This can happen when the broker is upgraded.
> > >>
> > >> 12. Related to Guozhang's question #1. It seems it's simpler to add
> time
> > >> index entries independent of the offset index since at index entry may
> > not
> > >> be added to the offset and the time index at the same time. Also, this
> > >> allows time index to be rebuilt independently if needed.
> > >>
> > >> Thanks,
> > >>
> > >> Jun
> > >>
> > >>
> > >> On Wed, Apr 6, 2016 at 5:44 PM, Becket Qin <becket....@gmail.com>
> > wrote:
> > >>
> > >> > Hi all,
> > >> >
> > >> > I updated KIP-33 based on the initial implementation. Per discussion
> > on
> > >> > yesterday's KIP hangout, I would like to initiate the new vote
> thread
> > >> for
> > >> > KIP-33.
> > >> >
> > >> > The KIP wiki:
> > >> >
> > >> >
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index
> > >> >
> > >> > Here is a brief summary of the KIP:
> > >> > 1. We propose to add a time index for each log segment.
> > >> > 2. The time indices are going to be used of log retention, log
> rolling
> > >> and
> > >> > message search by timestamp.
> > >> >
> > >> > There was an old voting thread which has some discussions on this
> KIP.
> > >> The
> > >> > mail thread link is following:
> > >> >
> > >> >
> > >>
> >
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201602.mbox/%3ccabtagwgoebukyapfpchmycjk2tepq3ngtuwnhtr2tjvsnc8...@mail.gmail.com%3E
> > >> >
> > >> > I have the following WIP patch for reference. It needs a few more
> unit
> > >> > tests and documentation. Other than that it should run fine.
> > >> >
> > >> >
> > >>
> >
> https://github.com/becketqin/kafka/commit/712357a3fbf1423e05f9eed7d2fed5b6fe6c37b7
> > >> >
> > >> > Thanks,
> > >> >
> > >> > Jiangjie (Becket) Qin
> > >> >
> > >>
> > >
> > >
> >
>

Reply via email to