+1
On Fri, Apr 15, 2016 at 9:37 AM, Guozhang Wang <wangg...@gmail.com> wrote: > +1 from me. Thanks. > > On Fri, Apr 15, 2016 at 9:16 AM, Jun Rao <j...@confluent.io> wrote: > >> Hi, Jiangjie, >> >> Thanks for the latest update. +1 on the KIP. >> >> Jun >> >> On Thu, Apr 14, 2016 at 2:36 PM, Becket Qin <becket....@gmail.com> wrote: >> >> > Hi Jun, >> > >> > 11. Yes, that sounds a reasonable solution. In the latest patch I am >> doing >> > the following in order: >> > a. Create an empty time index for a log segment if there isn't one. >> > b. For all non-active log segments, append an entry of >> > (last_modification_time -> next_base_offset) into the index. The time >> index >> > of the active segment is left empty. All the in-memory maxTimestamp is >> set >> > to -1. >> > c. If there is a hard failure, the time index and offset index will both >> be >> > rebuilt. >> > >> > So we do not rebuild the time index during upgrade unless there was a >> hard >> > failure. I have updated the wiki to reflect this. >> > >> > BTW, it seems that the current code will never hit the case where an >> index >> > is missing. I commented on PR. >> > >> > Thanks, >> > >> > Jiangjie (Becket) Qin >> > >> > >> > >> > On Thu, Apr 14, 2016 at 10:00 AM, Jun Rao <j...@confluent.io> wrote: >> > >> > > Hi, Jiangjie, >> > > >> > > 11. Rebuilding all missing time indexes will make the upgrade process >> > > longer since the log segments pre 0.10.0 don't have the time index. >> Could >> > > we only rebuild the missing indexes after the last flush offset? For >> > other >> > > segments missing the time index, we just assume lastTimestamp to be -1? >> > > >> > > Thanks, >> > > >> > > Jun >> > > >> > > On Wed, Apr 13, 2016 at 9:55 AM, Becket Qin <becket....@gmail.com> >> > wrote: >> > > >> > > > Hi Jun and Guozhang, >> > > > >> > > > I have updated the KIP wiki to incorporate your comments. Please let >> me >> > > > know if you prefer starting another discussion thread for further >> > > > discussion. >> > > > >> > > > Thanks, >> > > > >> > > > Jiangjie (Becket) Qin >> > > > >> > > > On Mon, Apr 11, 2016 at 12:21 AM, Becket Qin <becket....@gmail.com> >> > > wrote: >> > > > >> > > > > Hi Guozhang and Jun, >> > > > > >> > > > > Thanks for the comments. Please see the responses below. >> > > > > >> > > > > Regarding to Guozhang's question #1 and Jun's question #12. I was >> > > > > inserting the time index and offset index entry together mostly for >> > > > > simplicity as Guozhang mentioned. The purpose of using index >> interval >> > > > bytes >> > > > > for time index was to control the density of the time index, which >> is >> > > the >> > > > > same purpose as offset index. It seems reasonable to make them >> > aligned. >> > > > We >> > > > > can track separately the physical position when we insert the last >> > time >> > > > > index entry(my original code did that), but when I wrote the code I >> > > feel >> > > > it >> > > > > seems unnecessary. Another minor benefit is that searching by >> > timestamp >> > > > > could be potentially faster if we align the time index and offset >> > > index. >> > > > > It is possible that we only have either a corrupted time index or >> an >> > > > > offset index, but not both. Although we can choose to only rebuild >> > the >> > > > one >> > > > > which is corrupted, given that we have to scan the entire log >> segment >> > > > > anyway, rebuilding both of them seems not much overhead. So the >> > current >> > > > > patch I have is rebuilding both of them together. >> > > > > >> > > > > 10. Yes, it should only happen after a hard failure. The last time >> > > index >> > > > > entry of a normally closed segment has already points to the LEO, >> so >> > > > there >> > > > > is no scan during start up. >> > > > > >> > > > > 11. On broker startup, if a time index does not exist, an empty one >> > > will >> > > > > be created first. If message format version is 0.9.0, we will >> append >> > a >> > > > time >> > > > > index entry of (last modification time -> base offset of next >> > segment) >> > > to >> > > > > the time index of each inactive segment. So no actual rebuild will >> > > happen >> > > > > during upgrade. However, if message format version is 0.10.0, we >> will >> > > > > rebuild the time index if it does not exist. (I actually had a >> > question >> > > > > about the how we are loading the log segments, we can discuss it in >> > the >> > > > PR) >> > > > > >> > > > > I will update the wiki to clarify the question raised in the >> comments >> > > and >> > > > > submit a PR by tomorrow. I am currently cleaning up the >> > documentation. >> > > > > >> > > > > Thanks, >> > > > > >> > > > > Jiangjie (Becket) Qin >> > > > > >> > > > > >> > > > > >> > > > > On Sun, Apr 10, 2016 at 9:25 PM, Jun Rao <j...@confluent.io> wrote: >> > > > > >> > > > >> Hi, Jiangjie, >> > > > >> >> > > > >> Thanks for the update. Looks good to me overall. Just a few minor >> > > > comments >> > > > >> below. >> > > > >> >> > > > >> 10. On broker startup, it's not clear to me why we need to scan >> the >> > > log >> > > > >> segment to retrieve the largest timestamp since the time index >> > always >> > > > has >> > > > >> an entry for the largest timestamp. Is that only for restarting >> > after >> > > a >> > > > >> hard failure? >> > > > >> >> > > > >> 11. On broker startup, if a log segment misses the time index, do >> we >> > > > >> always >> > > > >> rebuild it? This can happen when the broker is upgraded. >> > > > >> >> > > > >> 12. Related to Guozhang's question #1. It seems it's simpler to >> add >> > > time >> > > > >> index entries independent of the offset index since at index entry >> > may >> > > > not >> > > > >> be added to the offset and the time index at the same time. Also, >> > this >> > > > >> allows time index to be rebuilt independently if needed. >> > > > >> >> > > > >> Thanks, >> > > > >> >> > > > >> Jun >> > > > >> >> > > > >> >> > > > >> On Wed, Apr 6, 2016 at 5:44 PM, Becket Qin <becket....@gmail.com> >> > > > wrote: >> > > > >> >> > > > >> > Hi all, >> > > > >> > >> > > > >> > I updated KIP-33 based on the initial implementation. Per >> > discussion >> > > > on >> > > > >> > yesterday's KIP hangout, I would like to initiate the new vote >> > > thread >> > > > >> for >> > > > >> > KIP-33. >> > > > >> > >> > > > >> > The KIP wiki: >> > > > >> > >> > > > >> > >> > > > >> >> > > > >> > > >> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index >> > > > >> > >> > > > >> > Here is a brief summary of the KIP: >> > > > >> > 1. We propose to add a time index for each log segment. >> > > > >> > 2. The time indices are going to be used of log retention, log >> > > rolling >> > > > >> and >> > > > >> > message search by timestamp. >> > > > >> > >> > > > >> > There was an old voting thread which has some discussions on >> this >> > > KIP. >> > > > >> The >> > > > >> > mail thread link is following: >> > > > >> > >> > > > >> > >> > > > >> >> > > > >> > > >> > >> http://mail-archives.apache.org/mod_mbox/kafka-dev/201602.mbox/%3ccabtagwgoebukyapfpchmycjk2tepq3ngtuwnhtr2tjvsnc8...@mail.gmail.com%3E >> > > > >> > >> > > > >> > I have the following WIP patch for reference. It needs a few >> more >> > > unit >> > > > >> > tests and documentation. Other than that it should run fine. >> > > > >> > >> > > > >> > >> > > > >> >> > > > >> > > >> > >> https://github.com/becketqin/kafka/commit/712357a3fbf1423e05f9eed7d2fed5b6fe6c37b7 >> > > > >> > >> > > > >> > Thanks, >> > > > >> > >> > > > >> > Jiangjie (Becket) Qin >> > > > >> > >> > > > >> >> > > > > >> > > > > >> > > > >> > > >> > >> > > > > -- > -- Guozhang