Just following up on Tom Lane's email - A couple of things that I hadn't mentioned: After bringing up the machine, the first thing I did before mucking about with PostGreSQL was to tarball $PGDATA so that I'd have a second chance if I messed up. I then ran pg_resetlog -f the first time, as Tom surmised, with the unwanted results.
That done, I sent out the email, and followed Tom's instructions (yay backups!) and did it properly. On Sat, Sep 21, 2002 at 11:13:44AM -0400, Tom Lane wrote: > "Pete St. Onge" <[EMAIL PROTECTED]> writes: > > That should not have been a catastrophic mistake in any version >= 7.1. > I suspect you had disk problems or other problems. We did, but these were on a different disk according to the logs, AFAIK. > These numbers are suspiciously small for an installation that's been > in production awhile. I suspect you have not told us the whole story; > in particular I suspect you already tried "pg_resetxlog -f", which was > probably not a good idea. *raises hand* Yep. Here's the contents of the pg_xlog directory. PGSQL has only been used here for approximately 4 months of fairly light use, so perhaps the numbers aren't as strange as they could be (this is from the backup). -rw------- 1 postgres postgres 16777216 Sep 19 22:09 000000020000007E > Yeah, your xlog positions should be a great deal higher than they are, > if segment 2/7E was previously in use. > > It is likely that you can recover (with some uncertainty about integrity > of recent transactions) if you proceed as follows: > > 1. Get contrib/pg_resetxlog/pg_resetxlog.c from the 7.2.2 release ... [Chomp] The compile worked without a hitch after doing ./configure in the top-level directory. I just downloaded the src for both trees, made the changes manually, copied the file into the 7.1.3 tree and compiled it there. > 2. Run the hacked-up pg_resetxlog like this: > > pg_resetxlog -l 2 127 -x 1000000000 $PGDATA > > (the -l position is next beyond what we see in pg_xlog, the 1-billion > XID is just a guess at something past where you were. Actually, can > you give us the size of pg_log, ie, $PGDATA/global/1269? That would > allow computing a correct next-XID to use. Figure 4 XIDs per byte, > thus if pg_log is 1 million bytes you need -x at least 4 million.) -rw------- 1 postgres postgres 11870208 Sep 19 17:00 1269 This gives a min WAL starting location of 47480832. I used 47500000. > 3. The postmaster should start now. I had to use pg_resetxlog's force option, but yeah, it worked like you said it would. > 4. *Immediately* attempt to do a pg_dumpall. Do not pass GO, do not > collect $200, do not let in any interactive clients until you've done > it. (I'd suggest tweaking pg_hba.conf to disable all logins but your > own.) I did not pass go, I did not collect $200. I *did* do a pg_dumpall right there and then, and was able to dump everything I needed. One of the projects uses large objects - image files and html files (don't ask, I've already tried to dissuade the Powers-That-Be) - and these didn't come out. However, since this stuff is entered via script, the project leader was fine with re-running the scripts tomorrow. > 5. If pg_dumpall succeeds and produces sane-looking output, then you've > survived. initdb, reload the dump file, re-open for business, go have > a beer. (Recommended: install 7.2.2 and reload into that, not 7.1.*.) > You will probably still need to check for partially-applied recent > transactions, but for the most part you should be OK. rpm -Uvh'ed the 7.2.2 RPMs, initdb'd and reloaded data into the new installation. Pretty painless. I've just sent out an email to folks here to let them know the situation, and we should know in the next day or so what is up. > 6. If pg_dumpall fails then let us know what the symptoms are, and we'll > see if we can figure out a workaround for whatever the corruption is. I've kept the tarball with the corrupted data. I'll hold onto it for a bit, in case, but will likely expunge it in the next week or so. If this can have a use for the project (whatever it may be), let me know and I can burn it to DVD. Of course, without your help, Tom, there would be a lot of Very Unhappy People here, me only being one of them. Many thanks for your help and advice! Cheers, Pete -- Pete St. Onge Research Associate, Computational Biologist, UNIX Admin Banting and Best Institute of Medical Research Program in Bioinformatics and Proteomics University of Toronto http://www.utoronto.ca/emililab/ ---------------------------(end of broadcast)--------------------------- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])