Hi Thomas, Any news on a new version for PageLog that includes a fix for this?
Regards,
Rob.

On Tuesday, 7 July 2015 17:59:46 UTC+2, Thomas Mueller wrote:
>
> Hi,
>
> I see, this could be the problem. I'm still working on a file system
> implementation that reorders writes (I was on vacation for two weeks); I
> think I can test a fix soon.
>
> Regards,
> Thomas
>
>
> On Monday, July 6, 2015, Nicolas Barbier <[email protected]> wrote:
>
>> Hello,
>>
>> I suspect that fsyncs are not issued at the right times when using
>> PageLog, causing a risk of corruption when the OS reorders writes and
>> crashes (e.g., power failure) in the middle of writing.
>>
>> To investigate this I have read the code a bit, which seemed to
>> confirm my suspicion. As I am not very well acquainted with the H2
>> code, I have also performed a test using strace, to check whether
>> fsyncs are really performed in the wrong order w.r.t. writes (which
>> indeed seems to be the case). I describe the results here:
>>
>> I put some primitive logging at the following points:
>>
>> At the beginning of FileDisk.force (FilePathDisk.java:409):
>>
>>     System.out.println("nicolas: FileDisk.force");
>>     new Exception().printStackTrace();
>>
>> At the beginning of PageLog.flushOut (PageLog.java:855):
>>
>>     System.out.println("nicolas: flushing log");
>>
>> In PageStore.writePage (PageStore.java:1334) right before file.write is
>> called:
>>
>>     System.out.println("nicolas: writing page");
>>     new Exception().printStackTrace();
>>
>> I created a test program (attached) that just opens a database,
>> inserts one row, waits two seconds, and then closes the database.
>>
>> Then, I rebuilt H2 and ran the test program under strace as follows:
>>
>>     strace -f java -cp
>>     /home/itsme/bla/h2database-read-only/h2/bin/h2-1.4.187.jar:.
>>     PageLogTest > strace.log 2>&1
>>
>> I removed everything from this file that comes before “nicolas:
>> connected” (the result is attached).
>>
>> Then, I searched for “nicolas: flushing log” (found at line 67). The
>> flushing is performed by the WriterThread (as expected) during the 2
>> second sleep. As part of the flushing, H2 performs a write (“write(16,
>> ” at line 107; the stack trace that confirms that this is part of the
>> log is right before it at line 74). This write is therefore in the
>> “log” part of the database file.
>>
>> The first fsync is located in the strace file at line 437. It is part
>> of the code that performs a checkpoint at shutdown.
>>
>> The problem now follows: A write is performed to the non-log part of
>> the database file, *before* the fsync mentioned above. At line 394 in
>> the strace file, a write is performed that is part of a call to
>> PageDataLeaf.write (see the logged stack trace right before, which
>> starts at line 358).
>>
>> If the OS reorders these two writes so that the non-log (“data”)
>> write is issued to the disk before the write to the log, and the OS
>> crashes after that first write, the database becomes corrupt.
>>
>> What do you think? Am I missing something here?
>>
>> Nicolas
>>
>> --
>> A. Because it breaks the logical sequence of discussion.
>> Q. Why is top posting bad?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "H2 Database" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at http://groups.google.com/group/h2-database.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
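
For readers following the thread, the ordering Nicolas describes (log write, then fsync, then data write) can be sketched in plain Java. This is only an illustration under assumptions, not H2 code: the class name WriteOrderingSketch, the method logThenData, and the LOG_OFFSET/DATA_OFFSET layout are all made up for the example. The one real API relied on is java.nio's FileChannel.force, which asks the OS to flush the file's pending writes to the storage device.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class WriteOrderingSketch {
    // Hypothetical layout for this example only: a log region at the start
    // of the file and a data region further in. This is NOT H2's file format.
    static final long LOG_OFFSET = 0;
    static final long DATA_OFFSET = 4096;

    // Write the whole array at the given position; a single write() call
    // is allowed to write fewer bytes than requested, hence the loop.
    static void writeFully(FileChannel ch, byte[] bytes, long pos) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        while (buf.hasRemaining()) {
            pos += ch.write(buf, pos);
        }
    }

    // Write-ahead ordering: the log record is forced to the device before
    // the data page is written, so the OS cannot reorder the data write
    // ahead of the log write.
    static void logThenData(FileChannel ch, byte[] logRecord, byte[] dataPage) throws IOException {
        writeFully(ch, logRecord, LOG_OFFSET);
        ch.force(false); // acts as the barrier between log and data writes
        writeFully(ch, dataPage, DATA_OFFSET);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("ordering", ".db");
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            logThenData(ch,
                    "log: insert row 1".getBytes(StandardCharsets.UTF_8),
                    "data: row 1".getBytes(StandardCharsets.UTF_8));
            ch.force(false); // flush the data write as well before closing
            System.out.println("file size: " + ch.size());
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Without the force call between the two writes, both sit in the OS cache and may reach the disk in either order, which is the window described in the quoted message. Whether force really reaches stable storage still depends on the OS, file system, and drive write cache.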
