[ https://issues.apache.org/jira/browse/KAFKA-615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729010#comment-13729010 ]
Jun Rao commented on KAFKA-615:
-------------------------------

Thanks for patch v5. Some comments:

50. Log:
50.1 recoveryLog(): It seems that recoveryPoint can be > lastOffset due to truncation on unclean shutdown. See the comment in 52.2.
50.2 The comment in the following code is no longer correct, since it is not just the active segment that gets recovered. Also, if we hit the exception, it seems we should delete the rest of the segments after resetting the current segment to startOffset (a sketch of this is appended at the bottom of this comment).

    } catch {
      case e: InvalidOffsetException =>
        val startOffset = curr.baseOffset
        warn("Found invalid offset during recovery of the active segment for topic partition " + dir.getName + ". Deleting the segment and " +
             "creating an empty one with starting offset " + startOffset)
        // truncate the active segment to its starting offset
        curr.truncateTo(startOffset)
    }

50.3 The log flusher scheduler is multi-threaded. I am wondering whether that guarantees that flushes on the same log complete in recovery-point order, which is important.
51. LogSegment.recover(): the comment on the return value is incorrect. We return truncated bytes, not messages.
52. ReplicaManager:
52.1 The checkpointing of the recovery point can be done once per LeaderAndIsr request, not once per partition.
52.2 There is a corner case that I am not sure how to handle. Suppose that we truncate a log and crash immediately, before flushing the recovery points. During recovery, a recovery point may then be larger than logEndOffset. However, the log may still need recovery, since we don't know whether the flush of the truncated data succeeded or not. So perhaps, in recoveryLog(), if (lastOffset <= this.recoveryPoint), we should force-recover the last segment (a second sketch is appended at the bottom of this comment)?
53. Could you verify that the basic system test still passes?

> Avoid fsync on log segment roll
> -------------------------------
>
>            Key: KAFKA-615
>            URL: https://issues.apache.org/jira/browse/KAFKA-615
>        Project: Kafka
>     Issue Type: Bug
>       Reporter: Jay Kreps
>       Assignee: Neha Narkhede
>    Attachments: KAFKA-615-v1.patch, KAFKA-615-v2.patch, KAFKA-615-v3.patch, KAFKA-615-v4.patch, KAFKA-615-v5.patch, KAFKA-615-v6.patch
>
>
> It still isn't feasible to run without an application-level fsync policy. This is a problem because fsync locks the file, and tuning such a policy so that flushes are not so frequent that seeks reduce throughput, yet not so infrequent that each fsync writes out so much data that there is a noticeable jump in latency, is very challenging.
> The remaining problem is the way that log recovery works. Our current policy is that if a clean shutdown occurs we do no recovery; if an unclean shutdown occurs we recover the last segment of all logs. To make this correct we need to ensure that each segment is fsync'd before we create a new segment, hence the fsync during roll.
> Obviously, if the fsync during roll is the only time fsync occurs, then it can potentially write out an entire segment, which for a 1 GB segment at 50 MB/sec might take many seconds. The goal of this JIRA is to eliminate this and make it possible to run with no application-level fsyncs at all, depending entirely on replication and background writeback for durability.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
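
To make the suggestion in 50.2 concrete, here is a minimal sketch of what the revised recovery loop might look like. The LogSegment stub, the exception class, and the collection type below are simplified placeholders rather than the actual classes in the patch; the point is only that, after the offending segment is reset to its base offset, every later segment is deleted as well.

    import scala.collection.mutable.ArrayBuffer

    object RecoveryLoopSketch {
      class InvalidOffsetException(msg: String) extends RuntimeException(msg)

      // Stub standing in for kafka.log.LogSegment; only what the sketch needs.
      class LogSegment(val baseOffset: Long, corrupt: Boolean = false) {
        def recover(maxMessageSize: Int): Int =              // returns truncated bytes (comment 51)
          if (corrupt) throw new InvalidOffsetException(s"invalid offset in segment $baseOffset") else 0
        def truncateTo(offset: Long): Unit = println(s"truncate segment $baseOffset to offset $offset")
        def delete(): Unit = println(s"delete segment $baseOffset")
      }

      // Recover unflushed segments in offset order. On an InvalidOffsetException, reset the
      // offending segment to its base offset and drop every later segment, since nothing
      // after the corruption point can be trusted.
      def recoverSegments(segments: ArrayBuffer[LogSegment], maxMessageSize: Int): Unit = {
        var i = 0
        while (i < segments.length) {
          val curr = segments(i)
          try {
            curr.recover(maxMessageSize)
            i += 1
          } catch {
            case _: InvalidOffsetException =>
              curr.truncateTo(curr.baseOffset)                       // reset the corrupt segment
              segments.drop(i + 1).foreach(_.delete())               // delete everything after it on disk
              segments.remove(i + 1, segments.length - (i + 1))      // and drop it from the in-memory view
              return
          }
        }
      }

      def main(args: Array[String]): Unit = {
        val segments = ArrayBuffer(new LogSegment(0L), new LogSegment(100L, corrupt = true), new LogSegment(200L))
        recoverSegments(segments, maxMessageSize = 1 << 20)
        // prints: truncate segment 100 to offset 100, then delete segment 200
      }
    }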
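
And for the corner case in 52.2, a rough sketch of the proposed guard. The names used here (logEndOffset, recoveryPoint, lastSegmentBaseOffset) are placeholders for whatever the patch actually exposes, not its real API.

    object RecoveryPointSketch {
      // Decide where recovery should start after an unclean shutdown.
      def recoveryStartOffset(logEndOffset: Long, recoveryPoint: Long, lastSegmentBaseOffset: Long): Long = {
        if (logEndOffset <= recoveryPoint) {
          // The checkpointed recovery point is at or past the end of the log, e.g. because the
          // log was truncated and the broker crashed before the recovery-point file was rewritten.
          // The checkpoint cannot be trusted, so force-recover the last segment.
          lastSegmentBaseOffset
        } else {
          // Normal case: only recover the data past the checkpointed recovery point.
          recoveryPoint
        }
      }

      def main(args: Array[String]): Unit = {
        // Log truncated to offset 100 just before a crash, with a stale recovery point of 150:
        println(recoveryStartOffset(logEndOffset = 100L, recoveryPoint = 150L, lastSegmentBaseOffset = 80L)) // 80
      }
    }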