+1 On Sat, Apr 16, 2016 at 10:25 PM, Gwen Shapira <g...@confluent.io> wrote:
> +1 > > On Fri, Apr 15, 2016 at 9:37 AM, Guozhang Wang <wangg...@gmail.com> wrote: > > +1 from me. Thanks. > > > > On Fri, Apr 15, 2016 at 9:16 AM, Jun Rao <j...@confluent.io> wrote: > > > >> Hi, Jiangjie, > >> > >> Thanks for the latest update. +1 on the KIP. > >> > >> Jun > >> > >> On Thu, Apr 14, 2016 at 2:36 PM, Becket Qin <becket....@gmail.com> > wrote: > >> > >> > Hi Jun, > >> > > >> > 11. Yes, that sounds a reasonable solution. In the latest patch I am > >> doing > >> > the following in order: > >> > a. Create an empty time index for a log segment if there isn't one. > >> > b. For all non-active log segments, append an entry of > >> > (last_modification_time -> next_base_offset) into the index. The time > >> index > >> > of the active segment is left empty. All the in-memory maxTimestamp is > >> set > >> > to -1. > >> > c. If there is a hard failure, the time index and offset index will > both > >> be > >> > rebuilt. > >> > > >> > So we do not rebuild the time index during upgrade unless there was a > >> hard > >> > failure. I have updated the wiki to reflect this. > >> > > >> > BTW, it seems that the current code will never hit the case where an > >> index > >> > is missing. I commented on PR. > >> > > >> > Thanks, > >> > > >> > Jiangjie (Becket) Qin > >> > > >> > > >> > > >> > On Thu, Apr 14, 2016 at 10:00 AM, Jun Rao <j...@confluent.io> wrote: > >> > > >> > > Hi, Jiangjie, > >> > > > >> > > 11. Rebuilding all missing time indexes will make the upgrade > process > >> > > longer since the log segments pre 0.10.0 don't have the time index. > >> Could > >> > > we only rebuild the missing indexes after the last flush offset? For > >> > other > >> > > segments missing the time index, we just assume lastTimestamp to be > -1? > >> > > > >> > > Thanks, > >> > > > >> > > Jun > >> > > > >> > > On Wed, Apr 13, 2016 at 9:55 AM, Becket Qin <becket....@gmail.com> > >> > wrote: > >> > > > >> > > > Hi Jun and Guozhang, > >> > > > > >> > > > I have updated the KIP wiki to incorporate your comments. Please > let > >> me > >> > > > know if you prefer starting another discussion thread for further > >> > > > discussion. > >> > > > > >> > > > Thanks, > >> > > > > >> > > > Jiangjie (Becket) Qin > >> > > > > >> > > > On Mon, Apr 11, 2016 at 12:21 AM, Becket Qin < > becket....@gmail.com> > >> > > wrote: > >> > > > > >> > > > > Hi Guozhang and Jun, > >> > > > > > >> > > > > Thanks for the comments. Please see the responses below. > >> > > > > > >> > > > > Regarding to Guozhang's question #1 and Jun's question #12. I > was > >> > > > > inserting the time index and offset index entry together mostly > for > >> > > > > simplicity as Guozhang mentioned. The purpose of using index > >> interval > >> > > > bytes > >> > > > > for time index was to control the density of the time index, > which > >> is > >> > > the > >> > > > > same purpose as offset index. It seems reasonable to make them > >> > aligned. > >> > > > We > >> > > > > can track separately the physical position when we insert the > last > >> > time > >> > > > > index entry(my original code did that), but when I wrote the > code I > >> > > feel > >> > > > it > >> > > > > seems unnecessary. Another minor benefit is that searching by > >> > timestamp > >> > > > > could be potentially faster if we align the time index and > offset > >> > > index. > >> > > > > It is possible that we only have either a corrupted time index > or > >> an > >> > > > > offset index, but not both. Although we can choose to only > rebuild > >> > the > >> > > > one > >> > > > > which is corrupted, given that we have to scan the entire log > >> segment > >> > > > > anyway, rebuilding both of them seems not much overhead. So the > >> > current > >> > > > > patch I have is rebuilding both of them together. > >> > > > > > >> > > > > 10. Yes, it should only happen after a hard failure. The last > time > >> > > index > >> > > > > entry of a normally closed segment has already points to the > LEO, > >> so > >> > > > there > >> > > > > is no scan during start up. > >> > > > > > >> > > > > 11. On broker startup, if a time index does not exist, an empty > one > >> > > will > >> > > > > be created first. If message format version is 0.9.0, we will > >> append > >> > a > >> > > > time > >> > > > > index entry of (last modification time -> base offset of next > >> > segment) > >> > > to > >> > > > > the time index of each inactive segment. So no actual rebuild > will > >> > > happen > >> > > > > during upgrade. However, if message format version is 0.10.0, we > >> will > >> > > > > rebuild the time index if it does not exist. (I actually had a > >> > question > >> > > > > about the how we are loading the log segments, we can discuss > it in > >> > the > >> > > > PR) > >> > > > > > >> > > > > I will update the wiki to clarify the question raised in the > >> comments > >> > > and > >> > > > > submit a PR by tomorrow. I am currently cleaning up the > >> > documentation. > >> > > > > > >> > > > > Thanks, > >> > > > > > >> > > > > Jiangjie (Becket) Qin > >> > > > > > >> > > > > > >> > > > > > >> > > > > On Sun, Apr 10, 2016 at 9:25 PM, Jun Rao <j...@confluent.io> > wrote: > >> > > > > > >> > > > >> Hi, Jiangjie, > >> > > > >> > >> > > > >> Thanks for the update. Looks good to me overall. Just a few > minor > >> > > > comments > >> > > > >> below. > >> > > > >> > >> > > > >> 10. On broker startup, it's not clear to me why we need to scan > >> the > >> > > log > >> > > > >> segment to retrieve the largest timestamp since the time index > >> > always > >> > > > has > >> > > > >> an entry for the largest timestamp. Is that only for restarting > >> > after > >> > > a > >> > > > >> hard failure? > >> > > > >> > >> > > > >> 11. On broker startup, if a log segment misses the time index, > do > >> we > >> > > > >> always > >> > > > >> rebuild it? This can happen when the broker is upgraded. > >> > > > >> > >> > > > >> 12. Related to Guozhang's question #1. It seems it's simpler to > >> add > >> > > time > >> > > > >> index entries independent of the offset index since at index > entry > >> > may > >> > > > not > >> > > > >> be added to the offset and the time index at the same time. > Also, > >> > this > >> > > > >> allows time index to be rebuilt independently if needed. > >> > > > >> > >> > > > >> Thanks, > >> > > > >> > >> > > > >> Jun > >> > > > >> > >> > > > >> > >> > > > >> On Wed, Apr 6, 2016 at 5:44 PM, Becket Qin < > becket....@gmail.com> > >> > > > wrote: > >> > > > >> > >> > > > >> > Hi all, > >> > > > >> > > >> > > > >> > I updated KIP-33 based on the initial implementation. Per > >> > discussion > >> > > > on > >> > > > >> > yesterday's KIP hangout, I would like to initiate the new > vote > >> > > thread > >> > > > >> for > >> > > > >> > KIP-33. > >> > > > >> > > >> > > > >> > The KIP wiki: > >> > > > >> > > >> > > > >> > > >> > > > >> > >> > > > > >> > > > >> > > >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-33+-+Add+a+time+based+log+index > >> > > > >> > > >> > > > >> > Here is a brief summary of the KIP: > >> > > > >> > 1. We propose to add a time index for each log segment. > >> > > > >> > 2. The time indices are going to be used of log retention, > log > >> > > rolling > >> > > > >> and > >> > > > >> > message search by timestamp. > >> > > > >> > > >> > > > >> > There was an old voting thread which has some discussions on > >> this > >> > > KIP. > >> > > > >> The > >> > > > >> > mail thread link is following: > >> > > > >> > > >> > > > >> > > >> > > > >> > >> > > > > >> > > > >> > > >> > http://mail-archives.apache.org/mod_mbox/kafka-dev/201602.mbox/%3ccabtagwgoebukyapfpchmycjk2tepq3ngtuwnhtr2tjvsnc8...@mail.gmail.com%3E > >> > > > >> > > >> > > > >> > I have the following WIP patch for reference. It needs a few > >> more > >> > > unit > >> > > > >> > tests and documentation. Other than that it should run fine. > >> > > > >> > > >> > > > >> > > >> > > > >> > >> > > > > >> > > > >> > > >> > https://github.com/becketqin/kafka/commit/712357a3fbf1423e05f9eed7d2fed5b6fe6c37b7 > >> > > > >> > > >> > > > >> > Thanks, > >> > > > >> > > >> > > > >> > Jiangjie (Becket) Qin > >> > > > >> > > >> > > > >> > >> > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > > > > > > > > -- > > -- Guozhang > -- Liquan Pei Software Engineer, Confluent Inc