We found a case where the bookie may lose data even though fsync is turned on for the journal.

Conditions:
- One journal disk, with fsync turned on for the journal
- Two ledger disks configured, ledger1 and ledger2
Assume we write 100MB of data into one bookie: 70MB goes into ledger1's write cache and 30MB goes into ledger2's write cache. Ledger1's write cache becomes full and triggers a flush. While flushing the write cache, the bookie triggers a checkpoint that marks the journal's lastMark position (the 100MB offset) and writes that lastMark position into the lastMark files of both ledger1 and ledger2.

At this point the bookie shuts down without flushing the write caches, for example because it was killed with `kill -9`, so ledger2's write cache (30MB) is never flushed to the ledger disk. However, the lastMark position persisted in ledger2's lastMark file has already been updated to the 100MB offset.

When the bookie starts up, the journal replay position will be `min(ledger1's lastMark, ledger2's lastMark)`, which is the 100MB offset. Ledger2's 30MB of data won't be replayed, and that data will be lost.

Please help take a look. I'm not sure whether I missed some logic.

Thanks,
Hang
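
P.S. In case it helps to reason about the sequence, here is a small toy model of the scenario in Java. It is only a sketch and not BookKeeper code: `LastMarkSketch`, `LedgerDir`, `JournalEntry`, and `lostMbAfterCrash` are hypothetical names, the 100MB is split into ten 10MB entries purely for illustration, and the `markAllDirsOnFlush` flag is just a contrast case (lastMark written only to the directory that flushed), not a proposed fix.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the reported scenario. Not BookKeeper code; all names are hypothetical. */
public class LastMarkSketch {

    /** One ledger directory: in-memory write cache, flushed entries, persisted lastMark. */
    static final class LedgerDir {
        final Set<Integer> writeCache = new HashSet<>(); // entries held in memory only
        final Set<Integer> onDisk = new HashSet<>();     // entries flushed to the ledger disk
        int lastMark = 0;                                // journal offset (MB) in the lastMark file

        void flush() {
            onDisk.addAll(writeCache);
            writeCache.clear();
        }
    }

    /** A journal record: target ledger directory and the journal offset (MB) where it ends. */
    static final class JournalEntry {
        final int dir;
        final int endOffset;

        JournalEntry(int dir, int endOffset) {
            this.dir = dir;
            this.endOffset = endOffset;
        }
    }

    static final int ENTRY_MB = 10; // assumed entry size, for illustration only

    /** Returns how many MB are missing from the ledger disks after crash and journal replay. */
    public static int lostMbAfterCrash(boolean markAllDirsOnFlush) {
        LedgerDir[] dirs = { new LedgerDir(), new LedgerDir() };
        List<JournalEntry> journal = new ArrayList<>();

        // 1. Write 100MB: 70MB to ledger1 (index 0), 30MB to ledger2 (index 1).
        //    The journal is fsynced, so every journal entry survives the crash.
        int[] target = {0, 0, 1, 0, 0, 1, 0, 0, 1, 0};
        int offset = 0;
        for (int d : target) {
            offset += ENTRY_MB;
            journal.add(new JournalEntry(d, offset));
            dirs[d].writeCache.add(offset);
        }

        // 2. ledger1's write cache is full and flushes; the checkpoint marks
        //    the journal position (100MB) and persists it as lastMark.
        dirs[0].flush();
        int journalLastMark = offset;
        dirs[0].lastMark = journalLastMark;
        if (markAllDirsOnFlush) {
            dirs[1].lastMark = journalLastMark; // the behaviour described in this mail
        }

        // 3. kill -9: everything still in a write cache is gone.
        for (LedgerDir d : dirs) {
            d.writeCache.clear();
        }

        // 4. Restart: replay the journal from min(lastMark) over all ledger dirs.
        int replayFrom = Math.min(dirs[0].lastMark, dirs[1].lastMark);
        for (JournalEntry e : journal) {
            if (e.endOffset > replayFrom) {
                dirs[e.dir].onDisk.add(e.endOffset);
            }
        }

        // 5. Anything in the journal that is not on a ledger disk is lost.
        int recovered = dirs[0].onDisk.size() + dirs[1].onDisk.size();
        return (journal.size() - recovered) * ENTRY_MB;
    }

    public static void main(String[] args) {
        System.out.println("lastMark written to all dirs: lost MB = " + lostMbAfterCrash(true));
        System.out.println("lastMark written to flushed dir only: lost MB = " + lostMbAfterCrash(false));
    }
}
```

In this model, `lostMbAfterCrash(true)` reproduces the 30MB loss described above, while the contrast case replays from offset 0 and loses nothing. Whether the real code path matches step 2 is exactly the part I would like someone to confirm.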