Hi all,

To clarify my previous note: the two checkpoints per partition have to be stored in different files. Otherwise, a crash in the middle of a write could corrupt both checkpoints at once.
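
To make the proposal concrete, here is a minimal sketch of the idea, written in Python rather than Kafka's Scala. It is not Kafka's actual code or file format: the function names, the `checkpoint.0` / `checkpoint.1` file names, and the record layout (generation, offset, CRC32) are all assumptions made for illustration. It also assumes a POSIX system, where a directory can be opened and fsynced.

```python
import os
import struct
import zlib


def _fsync_dir(path):
    # Flush the directory entry so the rename survives a crash (POSIX only).
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def write_checkpoint(directory, generation, offset):
    """Durably write a checkpoint, alternating between two slot files."""
    slot = generation % 2  # hypothetical policy: even/odd generations alternate
    payload = struct.pack(">QQ", generation, offset)
    record = payload + struct.pack(">I", zlib.crc32(payload))
    final_path = os.path.join(directory, f"checkpoint.{slot}")
    tmp_path = final_path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(record)
        f.flush()
        os.fsync(f.fileno())          # data is on disk before the rename
    os.replace(tmp_path, final_path)  # atomic rename over the old slot
    _fsync_dir(directory)             # fsync after the rename as well


def read_checkpoint(directory, slot):
    """Return (generation, offset), or None if the slot is missing or corrupt."""
    try:
        with open(os.path.join(directory, f"checkpoint.{slot}"), "rb") as f:
            record = f.read()
    except FileNotFoundError:
        return None
    if len(record) != 20:             # 8 + 8 bytes payload, 4 bytes CRC32
        return None
    payload = record[:16]
    (crc,) = struct.unpack(">I", record[16:])
    if zlib.crc32(payload) != crc:    # checksum verified during recovery
        return None
    return struct.unpack(">QQ", payload)


def recover(directory):
    """Pick the valid checkpoint with the highest generation, if any."""
    candidates = (read_checkpoint(directory, slot) for slot in (0, 1))
    valid = [c for c in candidates if c is not None]
    return max(valid) if valid else None
```

Because each write touches only one of the two files, a torn or corrupted write can damage at most the newest checkpoint, and `recover()` falls back to the older one.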

Thanks,
Xiao Li

On Mar 2, 2015, at 10:25 PM, Xiao <lixiao1...@gmail.com> wrote:

> Hi, all,
>
> I just started reading the source code of Kafka. The current
> OffsetCheckpoint.write() does not look good to me. After the file rename, it
> still needs to do an fsync.
>
> In addition, it should maintain a checksum for each checkpoint. The checksum
> needs to be verified during recovery to detect corruption.
>
> Ideally, it should maintain two checkpoints for each partition, so that at
> least one valid checkpoint always exists.
>
> Let me know if my concerns are valid.
>
> I think this talk might help in understanding the issue:
> https://www.usenix.org/conference/osdi14/technical-sessions/presentation/pillai
>
> Thanks,
>
> Xiao Li