+1.  This is an important performance fix for Windows-based clusters.

-Jakob

On 22 April 2015 at 03:25, Honghai Chen <honghai.c...@microsoft.com> wrote:
> Fixed the issue Sriram mentioned. Code review and JIRA/KIP updated.
>
> Below is a detailed description of the scenarios:
> 1. On a clean shutdown, the last log file is truncated to its real size, 
>    since the close() function of FileMessageSet calls trim().
> 2. On a crash, the restart goes through recover(), and the last log file is 
>    truncated to its real size (and the position is moved to the end of the 
>    file).
> 3. When the service starts and opens an existing file:
>    a. The LogSegment constructor that has NO "preallocate" parameter runs.
>    b. In FileMessageSet, "end" is Int.MaxValue, so 
>       "channel.position(math.min(channel.size().toInt, end))" sets the 
>       position to the end of the file.
>    c. If recovery is needed, the recover function truncates the file to the 
>       end of the valid data and also moves the position there.
> 4. When the service is running and needs to create a new log segment and a 
>    new FileMessageSet:
>    a. If preallocate = true:
>       i. "end" in FileMessageSet is 0 and the file size is "initFileSize", 
>          so "channel.position(math.min(channel.size().toInt, end))" sets 
>          the position to 0.
>    b. Else if preallocate = false:
>       i. Backward compatible: "end" in FileMessageSet is Int.MaxValue and 
>          the file size is "0", so 
>          "channel.position(math.min(channel.size().toInt, end))" sets the 
>          position to 0.
>
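The position rule in scenarios 3 and 4 can be tried in isolation. The following is a minimal Java sketch, not the patch itself: it applies the quoted expression to plain temp files through java.nio's FileChannel. The class name, the helper methods, and the sizes (a 1000-byte existing file, a 4096-byte stand-in for "initFileSize") are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; it is not part of Kafka.
final class PositionOnOpen {
    // Java equivalent of the quoted Scala expression:
    // channel.position(math.min(channel.size().toInt, end))
    public static long openAndPosition(Path file, int end) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            channel.position(Math.min((int) channel.size(), end));
            return channel.position();
        }
    }

    // Hypothetical helper: a zero-filled temp file of the given length.
    public static Path fileOfSize(int bytes) throws IOException {
        Path file = Files.createTempFile("segment", ".log");
        Files.write(file, new byte[bytes]);
        file.toFile().deleteOnExit();
        return file;
    }

    public static void main(String[] args) throws IOException {
        int initFileSize = 4096; // assumed stand-in for the preallocated size

        // Scenario 3: existing file, end = Int.MaxValue -> position at end of file
        System.out.println(openAndPosition(fileOfSize(1000), Integer.MAX_VALUE)); // 1000
        // Scenario 4a: new preallocated file, end = 0 -> position 0
        System.out.println(openAndPosition(fileOfSize(initFileSize), 0));         // 0
        // Scenario 4b: new empty file, end = Int.MaxValue -> position 0
        System.out.println(openAndPosition(fileOfSize(0), Integer.MAX_VALUE));    // 0
    }
}
```

In the preallocated case the file size is nonzero, so only the smaller "end" value keeps appends starting at offset 0. That appears to be why the new-file and existing-file paths need different "end" values.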
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-20+-+Enable+log+preallocate+to+improve+consume+performance+under+windows+and+some+old+Linux+file+system
> https://issues.apache.org/jira/browse/KAFKA-1646
> https://reviews.apache.org/r/33204/diff/2/
>
> Thanks, Honghai Chen
> http://aka.ms/kafka
> http://aka.ms/manifold
>
> -----Original Message-----
> From: Honghai Chen
> Sent: Wednesday, April 22, 2015 11:12 AM
> To: dev@kafka.apache.org
> Subject: RE: [DISCUSS] KIP 20 Enable log preallocate to improve consume 
> performance under windows and some old Linux file system
>
> Hi Sriram,
>         One line of code was missing; I will update the code review board 
> and KIP soon.
>         LogSegment and FileMessageSet must use different constructors for 
> existing files and new files; then the code " 
> channel.position(math.min(channel.size().toInt, end)) " ensures the 
> position is at the end of an existing file.
>
> Thanks, Honghai Chen
>
> -----Original Message-----
> From: Jay Kreps [mailto:jay.kr...@gmail.com]
> Sent: Wednesday, April 22, 2015 5:22 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP 20 Enable log preallocate to improve consume 
> performance under windows and some old Linux file system
>
> My understanding of the patch is that clean shutdown truncates the file back 
> to its true size (and reallocates it on startup). Hard crash is handled by 
> the normal recovery which should truncate off the empty portion of the file.
>
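The clean-shutdown path described above can be sketched with plain java.nio calls. This is a rough illustration under stated assumptions, not the patch: preallocation is done here with RandomAccessFile.setLength (the thread does not show how the patch preallocates), the class and method names are hypothetical, and the 250 MB / 50 MB example from the question below is scaled down to 250 KB / 50 KB.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; it is not part of Kafka.
final class PreallocateAndTrim {
    // Preallocate the file, append `written` bytes from offset 0, then
    // truncate to the bytes actually written (the "trim" on clean shutdown).
    public static long writeThenTrim(Path file, int initFileSize, int written)
            throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.setLength(initFileSize);           // assumed preallocation step
            FileChannel channel = raf.getChannel();
            channel.position(0);                   // new-file case: end = 0
            ByteBuffer data = ByteBuffer.allocate(written);
            while (data.hasRemaining()) {
                channel.write(data);
            }
            channel.truncate(channel.position());  // drop the unused tail
            return channel.size();
        }
    }

    // Reopen as an existing file: end = Int.MaxValue, so the position is the
    // file size, which after the trim equals the bytes actually written.
    public static long reopenPosition(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            channel.position(Math.min((int) channel.size(), Integer.MAX_VALUE));
            return channel.position();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("segment", ".log");
        file.toFile().deleteOnExit();
        System.out.println(writeThenTrim(file, 250 * 1024, 50 * 1024)); // 51200
        System.out.println(reopenPosition(file));                       // 51200
    }
}
```

A hard crash skips the truncate call, so the file keeps its preallocated length. In that case the thread relies on the normal recovery pass to cut the file back to the end of the valid data.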
> On Tue, Apr 21, 2015 at 10:52 AM, Sriram Subramanian < 
> srsubraman...@linkedin.com.invalid> wrote:
>
>> Could you describe how recovery works in this mode? Say, we had a 250
>> MB preallocated segment and we wrote till 50MB and crashed. Till what
>> point do we recover? Also, on startup, how is the append end pointer
>> set even on a clean shutdown? How does the FileChannel end position
>> get set to 50 MB instead of 250 MB? The existing code might just work
>> for it but explaining that would be useful.
>>
>> On 4/21/15 9:40 AM, "Neha Narkhede" <n...@confluent.io> wrote:
>>
>> >+1. I've tried this on Linux and it helps reduce the spikes in append (and
>> >hence producer) latency for high throughput writes. I am not entirely
>> >sure why but my suspicion is that in the absence of preallocation,
>> >you see spikes when writes need to happen faster than the time it takes
>> >Linux to allocate the next block to the file.
>> >
>> >It will be great to see some performance test results too.
>> >
>> >On Tue, Apr 21, 2015 at 9:23 AM, Jay Kreps <jay.kr...@gmail.com> wrote:
>> >
>> >> I'm also +1 on this. The change is quite small and may actually help
>> >> perf on Linux as well (we've never tried this).
>> >>
>> >> I have a lot of concerns on testing the various failure conditions
>> >> but I think since it will be off by default the risk is not too high.
>> >>
>> >> -Jay
>> >>
>> >> On Mon, Apr 20, 2015 at 6:58 PM, Honghai Chen <honghai.c...@microsoft.com> wrote:
>> >>
>> >> > I wrote a KIP for this after some discussion on KAFKA-1646.
>> >> > https://issues.apache.org/jira/browse/KAFKA-1646
>> >> >
>> >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-20+-+Enable+log+preallocate+to+improve+consume+performance+under+windows+and+some+old+Linux+file+system
>> >> > The RB is here: https://reviews.apache.org/r/33204/diff/
>> >> >
>> >> > Thanks, Honghai
>> >> >
>> >> >
>> >>
>> >
>> >
>> >
>> >--
>> >Thanks,
>> >Neha
>>
>>
