Re: [VOTE] KIP-33 - Add a time based log index

Ismael Juma Tue, 19 Apr 2016 17:26:04 -0700

Thanks Becket. I think it would be nice to update the KIP with regards to
point 3 and 4.


In any case, +1 (non-binding)

Ismael

On Tue, Apr 19, 2016 at 2:03 AM, Becket Qin <[email protected]> wrote:

> Thanks for the comments Ismael. Please see the replies inline.
>
> On Mon, Apr 18, 2016 at 6:50 AM, Ismael Juma <[email protected]> wrote:
>
> > Hi Jiangjie,
> >
> > Thanks for the KIP, it's a nice improvement. Since it seems like we have
> > been using the voting thread for discussion, I'll do the same.
> >
> > A few minor comments/questions:
> >
> > 1. The proposed name for the time index file
> `SegmentBaseOffset.timeindex`.
> > Would `SegmentBaseOffset.time-index` be a little better? It would clearly
> > separate the type of index in case we add additional index types in the
> > future.
> >
> I have no strong opinion on this, I am not adding any thing separator
> because it is more regex friendly.
> I am not sure about the other indexes, time and space seems to be two most
> common dimensions.
>
> 2. When describing the time index entry, we say "Offset - the next offset
> > when the time index entry is inserted". I found the mention of `next` a
> bit
> > confusing as it looks to me like the time index entry has the first
> offset
> > in the message set.
>
> This semantic meaning is a little different from the offset index. The
> offset index information is self-contained by nature. i.e. all the offsets
> before is smaller than the offset of this message set. So we only need to
> say "the offset of this message set is OFFSET". This does not quite apply
> to the time index because the max timestamp may or may not be in the
> message set being appended. So we have to either say, "the max timestamp
> before I append this message set is T", or "the max timestamp after I
> appended this message set is T". The former case means that we can skip all
> the previous messages if we are looking for a timestamp > T and start from
> this offset. The latter one means if we are searching for timestamp > T, we
> should start after this message set, which is essentially the same as the
> former case but require an additional interpretation.
>
> 3. We say "The default initial / max size of the time index files is the
> > same as the offset index files. (time index entry is 1.5x of the size of
> > offset index entry, user should set the configuration accordingly)". It
> may
> > be worth elaborating a little on what a user should do with regards to
> this
> > configuration when upgrading (ie maybe under "Compatibility, Deprecation,
> > and Migration Plan").
> >
> Makes sense.
>
>
> > 4. In a previous vote thread, Jun said "The simplest thing is probably
> > to change
> > the default index size to 2MB to match the default log segment size" and
> > you seemed to agree. I couldn't find anything about this in the KIP. Are
> we
> > still doing it?
> >
> Yes, we can still make the change for default settings. User might want to
> set the index size a little larger if they have a customized size but in
> reality it should not cause problems other than rolling out a little more
> log segments.
>
> 5. We say "Instead, it is only monotonically increasing within each time
> > index file. i.e. It is possible that the time index file for a later log
> > segment contains smaller timestamp than some timestamp in the time index
> > file of an earlier segment.". I think it would be good to explain under
> > which scenario a time index file for a later log segment contains a
> smaller
> > timestamp (is this only when CreateTime is used?).
> >
> Yes, it only happens when CreateTime is used.
>
>
> > 6. We say "When searching by timestamp, broker will start from the
> earliest
> > log segment and check the last time index entry.". The existing logic
> > searches from newest segment backwards. Is there a reason why we are
> > changing it?
> >
> Suppose segment 0 has max timestamp 100, segment 1 has max timestamp 50 and
> segment 3 has max timestamp 90, now user want to search for timestamp 80.
> If we search backwards, we have to take a look at all the segments. If we
> search forward, we will stop at the first segment whose max timestamp is
> greater than 80 (i.e all the previous segments has smaller timestamps) and
> start the finer search on that segment.
>
>
> > 7. Do you mind if I fix typos and minor grammar issues directly in the
> > wiki? It seems easier than doing that via email.
> >
> Not at all, thanks for help.
>
>
> > Thanks,
> > Ismael
> >
> > On Thu, Apr 7, 2016 at 1:44 AM, Becket Qin <[email protected]> wrote:
> >
> > > Hi all,
> > >
> > > I updated KIP-33 based on the initial implementation. Per discussion on
> > > yesterday's KIP hangout, I would like to initiate the new vote thread
> for
> > > KIP-33.
> > >
> > > The KIP wiki:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index
> > >
> > > Here is a brief summary of the KIP:
> > > 1. We propose to add a time index for each log segment.
> > > 2. The time indices are going to be used of log retention, log rolling
> > and
> > > message search by timestamp.
> > >
> > > There was an old voting thread which has some discussions on this KIP.
> > The
> > > mail thread link is following:
> > >
> > >
> >
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201602.mbox/%3ccabtagwgoebukyapfpchmycjk2tepq3ngtuwnhtr2tjvsnc8...@mail.gmail.com%3E
> > >
> > > I have the following WIP patch for reference. It needs a few more unit
> > > tests and documentation. Other than that it should run fine.
> > >
> > >
> >
> https://github.com/becketqin/kafka/commit/712357a3fbf1423e05f9eed7d2fed5b6fe6c37b7
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> >
>

Re: [VOTE] KIP-33 - Add a time based log index

Reply via email to