[ 
https://issues.apache.org/jira/browse/KAFKA-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15320162#comment-15320162
 ] 

Moritz Siuts commented on KAFKA-3802:
-------------------------------------

I have been able to reproduce the problem with a small test program and to track 
it down to a specific change. The problem occurs when Kafka is shut down, that 
is, when the log files are closed.

The problem seems to have been introduced by KAFKA-1646, which added {{trim()}} 
to the {{close()}} method of {{FileMessageSet}}.
The trim method calls {{channel.truncate()}}, which on some systems (I can 
reproduce it on Ubuntu 12.04 with Java 7 but not on Mac OS X with Java 8) 
modifies the mtime. If I delete the truncate call in my PoC below, the problem 
does not occur.

I think one could fix this by checking in {{truncateTo()}} that the targetSize 
is different from {{channel.size()}} before calling truncate on the channel, but 
I have not found the time to test this.
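
The check proposed above could look roughly like the following. This is a minimal, untested-against-Kafka sketch: {{truncateIfNeeded}} and the class name are hypothetical and are not Kafka's actual {{truncateTo()}}. It skips the call whenever the target is not smaller than the current size, which covers the equal-size case described above as well as larger targets, for which {{FileChannel.truncate()}} is documented not to modify the file anyway.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class GuardedTruncate {

    // Hypothetical helper (not Kafka code): truncate only when the file would
    // actually shrink. Returns true if truncate() was called.
    static boolean truncateIfNeeded(FileChannel channel, long targetSize) throws IOException {
        if (targetSize < 0) {
            throw new IllegalArgumentException("targetSize must not be negative");
        }
        if (targetSize >= channel.size()) {
            // Nothing to cut off, so do not touch the file at all.
            return false;
        }
        channel.truncate(targetSize);
        return true;
    }

    public static void main(String[] args) throws Exception {
        Path path = Files.createTempFile("kafka-3802", ".txt");
        try (FileChannel channel = new RandomAccessFile(path.toFile(), "rw").getChannel()) {
            channel.write(ByteBuffer.wrap("Kafka".getBytes(StandardCharsets.UTF_8)));
            // Same size as the file: the guard skips the truncate call.
            System.out.println("same size truncated: " + truncateIfNeeded(channel, channel.size()));
            // Smaller than the file: the truncate call is made.
            System.out.println("smaller size truncated: " + truncateIfNeeded(channel, 2));
            System.out.println("final size: " + channel.size());
        } finally {
            Files.deleteIfExists(path);
        }
    }
}
```

Whether skipping the call is enough to keep the mtime stable on the affected systems would still need to be verified there.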

Because this code was not changed in Kafka 0.10.0, that version should have the 
same problem.

Code for reproducing (watch the mtime of {{/tmp/kafka.txt}} while it is 
sleeping for 2 minutes):

{noformat}
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.concurrent.TimeUnit;

public class Main {

    public static void main(String[] args) throws Exception {
        File file = new File("/tmp/kafka.txt");

        FileChannel channel = new RandomAccessFile(file, "rw").getChannel();


        channel.write(ByteBuffer.wrap("Kafka".getBytes("UTF-8")));

        System.out.println("Going to sleep.");
        Thread.sleep(TimeUnit.MINUTES.toMillis(2));

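        System.out.println("Going to close the channel.");
        channel.force(true);
        // Truncating to the current size should be a no-op, but on some
        // systems it still updates the file's mtime. This is the problem.
        channel.truncate(channel.size());

        channel.close();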
    }
}

{noformat}
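
Instead of watching the mtime by hand, the same experiment can read it programmatically before and after the same-size truncate. The following is a sketch under assumptions: the class and method names are hypothetical, a temporary file is used instead of {{/tmp/kafka.txt}}, and the pause only needs to exceed the filesystem's mtime granularity. Whether the two values differ depends on the OS and JVM, so no particular output is claimed.

```java
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class MtimeCheck {

    // Hypothetical helper: returns {mtimeBefore, mtimeAfter} in milliseconds,
    // measured around a truncate to the file's current size.
    static long[] mtimeAroundTruncate(Path path, long pauseMillis) throws Exception {
        try (FileChannel channel = new RandomAccessFile(path.toFile(), "rw").getChannel()) {
            channel.write(ByteBuffer.wrap("Kafka".getBytes(StandardCharsets.UTF_8)));
            channel.force(true);
            long before = Files.getLastModifiedTime(path).toMillis();

            // Wait so that a changed mtime is distinguishable from the write above.
            Thread.sleep(pauseMillis);

            channel.truncate(channel.size()); // same-size truncate, as in the PoC
            channel.force(true);
            long after = Files.getLastModifiedTime(path).toMillis();
            return new long[] {before, after};
        }
    }

    public static void main(String[] args) throws Exception {
        Path path = Files.createTempFile("kafka-3802-mtime", ".txt");
        try {
            long[] mtimes = mtimeAroundTruncate(path, 1500);
            System.out.println("mtime before truncate: " + mtimes[0]);
            System.out.println("mtime after truncate:  " + mtimes[1]);
            System.out.println("mtime changed: " + (mtimes[0] != mtimes[1]));
        } finally {
            Files.deleteIfExists(path);
        }
    }
}
```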

> log mtimes reset on broker restart
> ----------------------------------
>
>                 Key: KAFKA-3802
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3802
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.9.0.1
>            Reporter: Andrew Otto
>
> Folks over in 
> http://mail-archives.apache.org/mod_mbox/kafka-users/201605.mbox/%3CCAO8=cz0ragjad1acx4geqcwj+rkd1gmdavkjwytwthkszfg...@mail.gmail.com%3E
>  are commenting about this issue.
> In 0.9, any data log file that was on
> disk before the broker restart has its mtime modified to the time of the
> restart.
> This causes problems with log retention, as all the files then look like
> they contain recent data to Kafka.  We use the default log retention of 7
> days, but if all the files are touched at the same time, this can cause us
> to retain up to 2 weeks of log data, which can fill up our disks.
> This happens *most* of the time, but seemingly not all.  We have seen broker 
> restarts where mtimes were not changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
