Thanks for the comments Ismael. Please see the replies inline. On Mon, Apr 18, 2016 at 6:50 AM, Ismael Juma <ism...@juma.me.uk> wrote:
> Hi Jiangjie, > > Thanks for the KIP, it's a nice improvement. Since it seems like we have > been using the voting thread for discussion, I'll do the same. > > A few minor comments/questions: > > 1. The proposed name for the time index file `SegmentBaseOffset.timeindex`. > Would `SegmentBaseOffset.time-index` be a little better? It would clearly > separate the type of index in case we add additional index types in the > future. > I have no strong opinion on this, I am not adding any thing separator because it is more regex friendly. I am not sure about the other indexes, time and space seems to be two most common dimensions. 2. When describing the time index entry, we say "Offset - the next offset > when the time index entry is inserted". I found the mention of `next` a bit > confusing as it looks to me like the time index entry has the first offset > in the message set. This semantic meaning is a little different from the offset index. The offset index information is self-contained by nature. i.e. all the offsets before is smaller than the offset of this message set. So we only need to say "the offset of this message set is OFFSET". This does not quite apply to the time index because the max timestamp may or may not be in the message set being appended. So we have to either say, "the max timestamp before I append this message set is T", or "the max timestamp after I appended this message set is T". The former case means that we can skip all the previous messages if we are looking for a timestamp > T and start from this offset. The latter one means if we are searching for timestamp > T, we should start after this message set, which is essentially the same as the former case but require an additional interpretation. 3. We say "The default initial / max size of the time index files is the > same as the offset index files. (time index entry is 1.5x of the size of > offset index entry, user should set the configuration accordingly)". It may > be worth elaborating a little on what a user should do with regards to this > configuration when upgrading (ie maybe under "Compatibility, Deprecation, > and Migration Plan"). > Makes sense. > 4. In a previous vote thread, Jun said "The simplest thing is probably > to change > the default index size to 2MB to match the default log segment size" and > you seemed to agree. I couldn't find anything about this in the KIP. Are we > still doing it? > Yes, we can still make the change for default settings. User might want to set the index size a little larger if they have a customized size but in reality it should not cause problems other than rolling out a little more log segments. 5. We say "Instead, it is only monotonically increasing within each time > index file. i.e. It is possible that the time index file for a later log > segment contains smaller timestamp than some timestamp in the time index > file of an earlier segment.". I think it would be good to explain under > which scenario a time index file for a later log segment contains a smaller > timestamp (is this only when CreateTime is used?). > Yes, it only happens when CreateTime is used. > 6. We say "When searching by timestamp, broker will start from the earliest > log segment and check the last time index entry.". The existing logic > searches from newest segment backwards. Is there a reason why we are > changing it? > Suppose segment 0 has max timestamp 100, segment 1 has max timestamp 50 and segment 3 has max timestamp 90, now user want to search for timestamp 80. If we search backwards, we have to take a look at all the segments. If we search forward, we will stop at the first segment whose max timestamp is greater than 80 (i.e all the previous segments has smaller timestamps) and start the finer search on that segment. > 7. Do you mind if I fix typos and minor grammar issues directly in the > wiki? It seems easier than doing that via email. > Not at all, thanks for help. > Thanks, > Ismael > > On Thu, Apr 7, 2016 at 1:44 AM, Becket Qin <becket....@gmail.com> wrote: > > > Hi all, > > > > I updated KIP-33 based on the initial implementation. Per discussion on > > yesterday's KIP hangout, I would like to initiate the new vote thread for > > KIP-33. > > > > The KIP wiki: > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index > > > > Here is a brief summary of the KIP: > > 1. We propose to add a time index for each log segment. > > 2. The time indices are going to be used of log retention, log rolling > and > > message search by timestamp. > > > > There was an old voting thread which has some discussions on this KIP. > The > > mail thread link is following: > > > > > http://mail-archives.apache.org/mod_mbox/kafka-dev/201602.mbox/%3ccabtagwgoebukyapfpchmycjk2tepq3ngtuwnhtr2tjvsnc8...@mail.gmail.com%3E > > > > I have the following WIP patch for reference. It needs a few more unit > > tests and documentation. Other than that it should run fine. > > > > > https://github.com/becketqin/kafka/commit/712357a3fbf1423e05f9eed7d2fed5b6fe6c37b7 > > > > Thanks, > > > > Jiangjie (Becket) Qin > > >