Tom, this discussion brings up something that's been bugging me about the recommendations for getting more performance out of PG, in particular the one that suggests you put your WAL files on a different physical drive from the database.

Consider the following scenario:
Database on drive1
WAL on drive2

1. PG write of some sort occurs.
2. PG writes out the WAL.
3. PG writes out the data.
4. PG updates the WAL to reflect data actually written.
5. System crashes/reboots/whatever.

With the DB and the WAL on different drives, it seems possible to me that drive2 could've fsync()'d or otherwise properly written all of the data out, but drive1 could have failed somewhere along the way and not actually written the data to the DB.

The next time PG is brought up, the WAL would indicate the transaction, as it were, was a success, but the data wouldn't actually be there.

In the case of using only one drive, the rollback (from a FS perspective) couldn't possibly leave step 4 as a success but step 3 as a failure. Worst case, the data would be written out but the WAL wouldn't have been updated (rolled back, say, by the FS), and thus PG would roll back the data itself, or use whatever mechanism it uses to ensure the data is consistent with the WAL.
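
To make the scenario concrete, here's a rough Python sketch of steps 2 through 5 as I've described them. It's a toy model, not how PG actually works internally: the file names, the record formats, and the data_write_lost flag are all made up for illustration, with two ordinary files standing in for drive1 and drive2.

```python
import os
import tempfile


def durable_append(path, record):
    """Append a record and fsync it before returning."""
    with open(path, "ab") as f:
        f.write(record)
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS to push it to the device


def run_steps(wal_path, data_path, data_write_lost=False):
    # Step 2: write the WAL entry first.
    durable_append(wal_path, b"BEGIN tx1 x=42\n")
    # Step 3: write the data. The hypothetical data_write_lost flag
    # simulates drive1 silently failing to store it.
    if not data_write_lost:
        durable_append(data_path, b"x=42\n")
    # Step 4: update the WAL to say the data was written.
    durable_append(wal_path, b"DONE tx1\n")


def state_after_crash(wal_path, data_path):
    """Return (wal_says_done, data_present) as seen on restart."""
    with open(wal_path, "rb") as f:
        wal_says_done = b"DONE tx1\n" in f.read()
    data_present = False
    if os.path.exists(data_path):
        with open(data_path, "rb") as f:
            data_present = b"x=42\n" in f.read()
    return wal_says_done, data_present


def demo(data_write_lost):
    # Two files in a temp directory stand in for the two drives.
    with tempfile.TemporaryDirectory() as d:
        wal = os.path.join(d, "drive2_wal.log")
        data = os.path.join(d, "drive1_data.db")
        run_steps(wal, data, data_write_lost)
        return state_after_crash(wal, data)
```

In this model, demo(True) comes back as (True, False), which is the mismatch I'm worried about: the WAL claims success but the data isn't there.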

Am I smoking something here or is this a real, if rare in practice, risk that occurs when you have the WAL on a different drive than the data is on?


At 17:39 10/27/2003, Tom Lane wrote:
"Rick Gigger" <[EMAIL PROTECTED]> writes:
> It seems to me file system journaling should fix the whole problem by giving
> you a record of what was actually committed to disk and what was not.


Nope, a journaling FS has exactly the same problem Postgres does
(because the underlying "WAL" concept is the same: write the log entries
before you change the files they describe).  If the drive lies about
write order, the FS can be screwed just as badly.  Now the FS code might
have a low-level way to force write order that Postgres doesn't have
access to ... but simply uttering the magic incantation "journaling file
system" will not make this problem disappear.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html


