Hi, Jiangjie,

Thanks for the update. Looks good to me overall. Just a few minor comments below.

10. On broker startup, it's not clear to me why we need to scan the log segment to retrieve the largest timestamp, since the time index always has an entry for the largest timestamp. Is that only for restarting after a hard failure?

11. On broker startup, if a log segment is missing its time index, do we always rebuild it? This can happen when the broker is upgraded.

12. Related to Guozhang's question #1: it seems simpler to add time index entries independently of the offset index, since an index entry may not be added to the offset index and the time index at the same time. Also, this allows the time index to be rebuilt independently if needed.

Thanks,

Jun

On Wed, Apr 6, 2016 at 5:44 PM, Becket Qin <becket....@gmail.com> wrote:

> Hi all,
>
> I updated KIP-33 based on the initial implementation. Per discussion on
> yesterday's KIP hangout, I would like to initiate the new vote thread for
> KIP-33.
>
> The KIP wiki:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index
>
> Here is a brief summary of the KIP:
> 1. We propose to add a time index for each log segment.
> 2. The time indices are going to be used for log retention, log rolling and
> message search by timestamp.
>
> There was an old voting thread which has some discussions on this KIP. The
> mail thread link is as follows:
>
> http://mail-archives.apache.org/mod_mbox/kafka-dev/201602.mbox/%3ccabtagwgoebukyapfpchmycjk2tepq3ngtuwnhtr2tjvsnc8...@mail.gmail.com%3E
>
> I have the following WIP patch for reference. It needs a few more unit
> tests and documentation. Other than that it should run fine.
>
> https://github.com/becketqin/kafka/commit/712357a3fbf1423e05f9eed7d2fed5b6fe6c37b7
>
> Thanks,
>
> Jiangjie (Becket) Qin
>