Hi,

I see, this could be the problem. I'm still working on a file system
implementation that reorders writes (I was on vacation for two weeks). I
think I can test a fix soon.

Regards,
Thomas


On Monday, July 6, 2015, Nicolas Barbier <[email protected]> wrote:

> Hello,
>
> I suspect that fsyncs are not issued at the right times when using
> PageLog, causing a risk of corruption when the OS reorders writes and
> crashes (e.g., power failure) in the middle of writing.
>
> To investigate this I have read the code a bit, which seemed to
> confirm my suspicion. As I am not very well acquainted with the H2
> code, I have also performed a test using strace, to check whether
> fsyncs are really performed in the wrong order w.r.t. writes (which
> indeed seems to be the case). I describe the results here:
>
> I put some primitive logging at the following points:
>
> At the beginning of FileDisk.force (FilePathDisk.java:409):
>
> System.out.println("nicolas: FileDisk.force");
> new Exception().printStackTrace();
>
> At the beginning of PageLog.flushOut (PageLog.java:855):
>
> System.out.println("nicolas: flushing log");
>
> In PageStore.writePage (PageStore.java:1334) right before file.write is
> called:
>
> System.out.println("nicolas: writing page");
> new Exception().printStackTrace();
>
> I created a test program (attached) that just opens a database,
> inserts one row, waits two seconds, and then closes the database.
>
> Then, I rebuilt H2 and ran the test program under strace as follows:
>
> strace -f java -cp
> /home/itsme/bla/h2database-read-only/h2/bin/h2-1.4.187.jar:.
> PageLogTest > strace.log 2>&1
>
> I removed everything from this file that comes before “nicolas:
> connected” (the result is attached).
>
> Then, I searched for “nicolas: flushing log” (found at line 67). The
> flushing is performed by the WriterThread (as expected) during the
> two-second sleep. As part of the flushing, H2 performs a write (“write(16,
> ” at line 107; the stack trace that confirms that this is part of the
> log is right before it, at line 74). This write is therefore in the
> “log” part of the database file.
>
> The first fsync is located in the strace file at line 437. It is part
> of the code that performs a checkpoint at shutdown.
>
> The problem is the following: a write is performed to the non-log part of
> the database file *before* the fsync mentioned above. At line 394 in
> the strace file, a write is performed that is part of a call to
> PageDataLeaf.write (see the logged stack trace right before it, which
> starts at line 358).
>
> If the OS reorders these two writes so that the non-log (“data”)
> write is issued to the disk before the write to the log, and the OS
> then crashes after the data write but before the log write reaches the
> disk, the database becomes corrupt.
>
> What do you think? Am I missing something here?
>
> Nicolas
>
> --
> A. Because it breaks the logical sequence of discussion.
> Q. Why is top posting bad?
>
> --
> You received this message because you are subscribed to the Google Groups
> "H2 Database" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/h2-database.
> For more options, visit https://groups.google.com/d/optout.
>
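
The ordering concern in the quoted message can be illustrated with a small, self-contained Java sketch. It is not H2 code: the class name, the method names, and the file layout (log record at offset 0, data page at offset 4096) are made up for illustration. It only shows the general idea of forcing the log write to the storage device before issuing the data-page write, so that the OS cannot reorder the two. Whether this matches the eventual fix in PageStore/PageLog is not something this thread establishes.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not H2's PageStore/PageLog code.
class LogBeforeDataSketch {
    // Assumed layout for this sketch: log record at offset 0, data page at offset 4096.
    static final long LOG_POS = 0;
    static final long DATA_POS = 4096;

    // Writes the log record, forces it to the device, and only then writes the data page.
    // Returns the steps in the order they were performed.
    static List<String> writeLogThenData(Path file, byte[] logRecord, byte[] dataPage)
            throws IOException {
        List<String> steps = new ArrayList<>();
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            writeFully(ch, logRecord, LOG_POS);
            steps.add("write log");
            // Barrier: the data write below is not issued until the log write
            // has been forced out, so the two cannot be reordered.
            ch.force(true);
            steps.add("force");
            writeFully(ch, dataPage, DATA_POS);
            steps.add("write data");
            ch.force(true);
            steps.add("force");
        }
        return steps;
    }

    // FileChannel.write may write fewer bytes than requested, so loop until done.
    static void writeFully(FileChannel ch, byte[] bytes, long pos) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        while (buf.hasRemaining()) {
            pos += ch.write(buf, pos);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("log-before-data", ".db");
        System.out.println(writeLogThenData(file, new byte[] {1, 2, 3}, new byte[] {4, 5, 6}));
        Files.delete(file);
    }
}
```

Without the first `force` call, both writes would sit in the OS cache at the same time and could reach the disk in either order, which is the situation the strace output above appears to show.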

