On 22 January 2013 16:43, Kevin Grittner <kgri...@mail.com> wrote:

> [Please keep the list copied, and put your reply in-line instead
> of at the top.]
>
> Cliff de Carteret wrote:
> > On 22 January 2013 16:07, Kevin Grittner <kgri...@mail.com> wrote:
> >
> >> Cliff de Carteret wrote:
> >>
> >>> The current setup has been working successfully for several years
> >>> until the recent database crash
> >>
> >> What file does the server log say it is trying to archive? What
> >> error are you getting? Does that filename already exist on the
> >> archive (or some intermediate location used by the archive command
> >> or script)?
> >
> > The server log is (repeated constantly):
> >
> > LOG: archive command failed with exit code 1
> > DETAIL: The failed archive command was: test ! -f
> > /opt/postgres/remote_pgsql/wal_archive/00000001000000A800000078 && cp
> > pg_xlog/00000001000000A800000078
> > /opt/postgres/remote_pgsql/wal_archive/00000001000000A800000078
> > WARNING: transaction log file "00000001000000A800000078" could not be
> > archived: too many failures
> >
> > The file 00000001000000A800000078 exists in the remote archive's
> > wal_archive directory. I read a post saying to copy the file over to the
> > archive and then delete the .ready file to get postgres to move onto the
> > next file but this ended up logging out saying that a log file was missing.
> > There are more recent files in this directory but they end at the point
> > where I reverted all of the changes I made last night when time was running
> > out and the database had to be put back to a known state.
>
> I would have deleted (or renamed) the copy in the archive
> directory. Archiving should have then resumed and cleaned up the
> pg_xlog directory.
>

I have now deleted the copy in the remote wal_archive folder. Archiving is
now functioning and sending the logs from the local to the remote folder.
The remote database does not start up, and the following is in the log:

LOG: database system was shut down in recovery at 2013-01-22 10:54:48 GMT
LOG: entering standby mode
LOG: restored log file "00000001000000AB00000051" from archive
LOG: invalid resource manager ID in primary checkpoint record
PANIC: could not locate a valid checkpoint record
LOG: startup process (PID 22350) was terminated by signal 6: Aborted
LOG: aborting startup due to startup process failure

00000001000000AB00000051 is in my remote database's pg_xlog folder

Thanks for your help already!

>
> -Kevin
>
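
P.S. For anyone else stuck in the same "exit code 1" loop, here is a minimal sketch of why the archive command kept failing and why removing the stale copy unblocked it. It runs entirely in a scratch directory: the `work` directory, the `archive` function, and the dummy segment contents are stand-ins I made up for illustration, not the real pg_xlog or wal_archive. Only the `test ! -f ... && cp ...` shape is taken from the log above.

```shell
#!/bin/sh
# Reproduce the archive_command behaviour in a throwaway directory.
# Assumption: $work stands in for both the local pg_xlog and the remote archive.
work=$(mktemp -d)
mkdir "$work/pg_xlog" "$work/wal_archive"

seg=00000001000000A800000078
echo "dummy wal data" > "$work/pg_xlog/$seg"

archive() {
    # Same shape as the failing command in the log: refuse to overwrite
    # a file that already exists in the archive, otherwise copy it.
    test ! -f "$work/wal_archive/$1" && cp "$work/pg_xlog/$1" "$work/wal_archive/$1"
}

archive "$seg"; first=$?     # archive is empty, so the copy succeeds
archive "$seg"; second=$?    # the file is already there, so "test" fails
rm "$work/wal_archive/$seg"  # what Kevin suggested: remove (or rename) the stale copy
archive "$seg"; third=$?     # the same command now succeeds again

echo "first=$first second=$second third=$third"
rm -rf "$work"
```

The second call is the situation from the log: the segment already exists in the archive, `test ! -f` returns non-zero, `cp` never runs, and the server retries the same file indefinitely. This says nothing about the checkpoint PANIC on the standby, which is a separate problem.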