On Tue, Mar 25, 2014 at 6:33 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> On Tuesday, March 25, 2014, Steven Schlansker <ste...@likeness.com> wrote:
>
>> Hi everyone,
>>
>> I have a Postgres 9.3.3 database machine. Due to some intelligent work
>> on the part of someone who shall remain nameless, the WAL archive command
>> included a '> /dev/null 2>&1' which masked archive failures until the disk
>> entirely filled with 400GB of pg_xlog entries.
>>
>
> PostgreSQL itself should be logging failures to the server log, regardless
> of whether those failures log themselves.
>
>
>> I have fixed the archive command and can see WAL segments being shipped
>> off of the server, however the xlog remains at a stable size and is not
>> shrinking. In fact, it's still growing at a (much slower) rate.
>>
>
> The leading edge of the log files should be archived as soon as they fill
> up, and recycled/deleted two checkpoints later. The trailing edge should
> be archived upon checkpoints and then recycled or deleted. I think there
> is a throttle on how many off the trailing edge are archived each
> checkpoint. So issue a bunch of "CHECKPOINT;" commands for a while and
> see if that clears it up.
>

Actually my description is rather garbled, mixing up what I saw when
wal_keep_segments was lowered, not when recovering from a long-lasting
archive failure. Nevertheless, checkpoints are what provoke the removal of
excessive WAL files.

Are you logging checkpoints? What do they say?

Also, what is in pg_xlog/archive_status?

Cheers,

Jeff
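
To make the archive_status question concrete, here is a minimal sketch
(not an official tool) that tallies the marker files in that directory.
It assumes the usual convention that a segment waiting to be archived has
a ".ready" file and an archived one has a ".done" file; the function name
summarize_archive_status is made up for illustration.

```python
import os
from collections import Counter


def summarize_archive_status(status_dir):
    """Count marker files in an archive_status directory by suffix.

    Hypothetical helper: "ready" = segments still waiting for the
    archive command, "done" = segments already archived, "other" =
    anything else found in the directory.
    """
    counts = Counter()
    for name in os.listdir(status_dir):
        if name.endswith(".ready"):
            counts["ready"] += 1
        elif name.endswith(".done"):
            counts["done"] += 1
        else:
            counts["other"] += 1
    return counts
```

Call it with the path to pg_xlog/archive_status under your data directory.
A large and shrinking "ready" count would suggest the archiver is still
working through the backlog; a large "done" count would suggest the files
are archived and waiting for checkpoints to remove or recycle them.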