[jira] [Commented] (KAFKA-615) Avoid fsync on log segment roll

Sriram Subramanian (JIRA) Mon, 05 Aug 2013 10:07:36 -0700

    [ 
https://issues.apache.org/jira/browse/KAFKA-615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729651#comment-13729651
 ]


Sriram Subramanian commented on KAFKA-615:
------------------------------------------

1. Log.scala
1.1 Doc fix. This line appears twice: * @param recoveryPoint The offset at which to 
begin recovery--i.e. the first offset which has not been flushed to disk

2. ReplicaManager.scala
2.1 How does this help? We could always crash before calling this and we should 
still do the right thing on recovery.
  
3. TestLogPerformance
3.1 Should we make the tool's argument handling consistent with the other tools?
3.2 I am assuming we are testing with the default pdflush interval. Should we 
be able to control the flush interval to test performance? 
3.3 Could you add a comment on what this tool does and how it differs from 
the LinearWriteTest tool?
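
To make point 2.1 concrete, here is a minimal sketch of how a recovery point can be used after an unclean shutdown. It is a simplified model, not Kafka's implementation: the function name, the representation of segments as a sorted list of base offsets, and the sample offsets are all assumptions made for illustration. The idea it shows is that a stale recovery point (for example, because the broker crashed before the latest value was recorded) only causes more segments to be re-checked; it does not cause any unflushed segment to be skipped.

```python
from bisect import bisect_right


def segments_to_recover(base_offsets, recovery_point):
    """Return the base offsets of segments that may hold unflushed data.

    base_offsets: sorted base offsets of the log's segments (hypothetical model).
    recovery_point: first offset not known to be flushed to disk.
    """
    # Locate the segment containing the recovery point; that segment and
    # every later one must be re-validated.
    idx = max(bisect_right(base_offsets, recovery_point) - 1, 0)
    return base_offsets[idx:]


segments = [0, 1000, 2000, 3000]

# Up-to-date recovery point: only the tail of the log is re-checked.
print(segments_to_recover(segments, 2500))  # [2000, 3000]

# Stale recovery point: more work, but nothing unflushed is skipped.
print(segments_to_recover(segments, 1200))  # [1000, 2000, 3000]
```

Under this model a crash at any moment is safe as long as the recorded recovery point never runs ahead of what has actually been flushed.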

                
> Avoid fsync on log segment roll
> -------------------------------
>
>                 Key: KAFKA-615
>                 URL: https://issues.apache.org/jira/browse/KAFKA-615
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jay Kreps
>            Assignee: Neha Narkhede
>         Attachments: KAFKA-615-v1.patch, KAFKA-615-v2.patch, 
> KAFKA-615-v3.patch, KAFKA-615-v4.patch, KAFKA-615-v5.patch, KAFKA-615-v6.patch
>
>
> It still isn't feasible to run without an application-level fsync policy. 
> This is a problem because fsync locks the file, and tuning such a policy is 
> very challenging: the flushes must not be so frequent that seeks reduce 
> throughput, yet not so infrequent that each fsync writes enough data to 
> cause a noticeable jump in latency.
> The remaining problem is the way that log recovery works. Our current policy 
> is that if a clean shutdown occurs we do no recovery. If an unclean shutdown 
> occurs we recover the last segment of all logs. To make this correct we need 
> to ensure that each segment is fsync'd before we create a new segment. Hence 
> the fsync during roll.
> Obviously if the fsync during roll is the only time fsync occurs then it will 
> potentially write out the entire segment which for a 1GB segment at 50 MB/sec 
> might take many seconds. The goal of this JIRA is to eliminate this and make 
> it possible to run with no application-level fsyncs at all, depending 
> entirely on replication and background writeback for durability.
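
For a rough sense of the "many seconds" figure in the description, the worst case can be worked out directly. This is a back-of-the-envelope sketch: it assumes the whole segment is still dirty when the roll happens, that the disk sustains the quoted rate, and that 1GB and 50 MB/sec are binary units. The function name is illustrative only.

```python
def worst_case_flush_seconds(segment_bytes, disk_bytes_per_sec):
    """Time to write out a fully dirty segment at a sustained disk rate."""
    return segment_bytes / disk_bytes_per_sec


segment_bytes = 1024 ** 3      # 1GB segment, as in the description
disk_rate = 50 * 1024 ** 2     # 50 MB/sec, as in the description

seconds = worst_case_flush_seconds(segment_bytes, disk_rate)
print(f"{seconds:.1f} s")  # 20.5 s
```

So a single fsync at roll time could stall writes to that log for roughly 20 seconds under these assumptions.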

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
