J.R. needs comments on this. PITR has problems because local relations aren't logged to WAL. Suggestions?
---------------------------------------------------------------------------

J. R. Nield wrote:
> As per earlier discussion, I'm working on the hot backup issues as part
> of the PITR support. While I was looking at the buffer manager and the
> relcache/MyDb issues to figure out the best way to work this, it
> occurred to me that PITR will introduce a big problem with the way we
> handle local relations.
>
> The basic problem is that local relations (rd_myxactonly == true) are
> not part of a checkpoint, so there is no way to get a lower bound on the
> starting LSN needed to recover a local relation. In the past this did
> not matter, because either the local file would be (effectively)
> discarded during recovery because it had not yet become visible, or the
> file would be flushed before the transaction creating it made it
> visible. Now this is a problem.
>
> So I need a decision from the core team on what to do about the local
> buffer manager. My preference would be to forget about the local buffer
> manager entirely, or if not that then to allow it only for _true_
> temporary data. The only alternative I can devise is to create some way
> for all other backends to participate in a checkpoint, perhaps using a
> signal. I'm not sure this can be done safely.
>
> Anyway, I'm glad the tuplesort stuff doesn't try to use relation files
> :-)
>
> Can the core team let me know if this is acceptable, and whether I
> should move ahead with changes to the buffer manager (and some other
> stuff) needed to avoid special treatment of rd_myxactonly relations?
>
> Also to Richard: have you guys at multera dealt with this issue already?
> Is there some way around this that I'm missing?
>
>
> Regards,
>
> John Nield
>
>
> Just as an example of this problem, imagine the following sequence:
>
> 1) Transaction TX1 creates a local relation LR1 which will eventually
> become a globally visible table. Tuples are inserted into the local
> relation, and logged to the WAL file. Some tuples remain in the local
> buffer cache and are not yet written out, although they are logged. TX1
> is still in progress.
>
> 2) Backup starts, and checkpoint is called to get a minimum starting LSN
> (MINLSN) for the backed-up files. Only the global buffers are flushed.
>
> 3) Backup process copies LR1 into the backup directory. (postulate some
> way of coordinating with the local buffer manager, a problem I have not
> solved).
>
> 4) TX1 commits and flushes its local buffers. A dirty buffer exists
> whose LSN is before MINLSN. LR1 becomes globally visible.
>
> 5) Backup finishes copying all the files, including the local relations,
> and then flushes the log. The log files between MINLSN and the current
> LSN are copied to the backup directory, and backup is done.
>
> 6) Sometime later, a system administrator restores the backup and plays
> the logs forward starting at MINLSN. LR1 will be corrupt, because some
> of the log entries required for its restoration will be before MINLSN.
> This corruption will not be detected until something goes wrong.
>
> BTW: The problem doesn't only happen with backup! It occurs at every
> checkpoint as well, I just missed it until I started working on the hot
> backup issue.
>
> --
> J. R. Nield
> [EMAIL PROTECTED]
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
>

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
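
For anyone who wants to see the failure in J. R.'s six-step sequence concretely, here is a rough toy model of it in Python. It is only a sketch under loud assumptions: nothing here is PostgreSQL code, the WAL is a plain list, LR1's file is a set of tuple ids, and every name (simulate, log_insert, lr1_file, local_dirty) is made up for illustration. The extra tuple "t3" inserted after the commit is also my own addition, there only to show that records at or after MINLSN do replay correctly.

```python
def simulate():
    """Replay steps 1-6 of the quoted example on a toy WAL and a toy LR1 file."""
    wal = []               # WAL records for LR1 as (lsn, tuple_id)
    lr1_file = set()       # tuples already written to LR1's file on disk
    local_dirty = set()    # tuples held only in TX1's local buffers

    def log_insert(tuple_id):
        wal.append((len(wal) + 1, tuple_id))   # LSNs are 1, 2, 3, ...

    # 1) TX1 inserts into LR1. Both tuples are logged; only t1 reaches disk.
    log_insert("t1")
    lr1_file.add("t1")
    log_insert("t2")
    local_dirty.add("t2")

    # 2) Checkpoint flushes global buffers only, so it ignores local_dirty.
    #    MINLSN is taken as the next LSN to be assigned (an assumption).
    min_lsn = len(wal) + 1

    # 3) Backup copies LR1's file as it currently exists on disk.
    backup_file = set(lr1_file)

    # 4) TX1 commits and flushes its local buffers (too late for the backup).
    lr1_file |= local_dirty
    local_dirty.clear()
    # Hypothetical later insert, showing that post-MINLSN records replay fine.
    log_insert("t3")
    lr1_file.add("t3")

    # 5) Backup keeps only the log from MINLSN onward.
    backup_wal = [rec for rec in wal if rec[0] >= min_lsn]

    # 6) Restore: start from the copied file and replay the saved log.
    restored = set(backup_file)
    for _lsn, tuple_id in backup_wal:
        restored.add(tuple_id)

    return lr1_file, restored, min_lsn


live, restored, min_lsn = simulate()
print("MINLSN:", min_lsn)
print("lost after restore:", sorted(live - restored))
```

In this model the tuple that was logged before MINLSN but still sat in a local buffer at checkpoint time is the one missing after restore, which is the silent corruption described in step 6. Flushing local_dirty as part of step 2 would make the lost set empty, and that is roughly what dropping the special treatment of local relations would buy.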