G'day all,

PG version: 7.4.7 (debian 7.4.7-6sarge2)
OS: Linux (debian sarge)

I started a 'vacuum full analyze' using psql, then, after about 10 minutes, I issued a control-c in psql. My prompt still hadn't returned after another 10 minutes or so. A glance at the backend showed other processes running and some apparently hanging around, possibly blocked by the vacuum.

Eventually I did a shutdown using the debian /etc/init.d/postgresql script. This tried shutting the postmaster down cleanly, but eventually 'kill -9'ed the postmaster (much to my surprise, given the ever present "TIP 2: Don't 'kill -9' the postmaster" - just goes to show, you shouldn't trust *anyone*!). Anyhow, given the 'kill -9' I guess you could stop reading right here...

Trying to start the postmaster again failed, with what looks like the relevant message being:

PANIC: btree_delete_page_redo: lost target page

See the attached syslog.txt for the system logs.

Googling 'postgres "lost target page"' only came back with extracts from nbtxlog.c rather than any previously reported problems like this.

At this point I restored the database from our backups, however I still have a copy of the corrupted data directory if there's interest in trying to diagnose the problem. The entire data directory is around 5 GB, so let me know if you'd like to see something specific in there.

Cheers,
Chris.
Jan 25 13:49:10 wren postgres[6106]: [58-2] STATEMENT: VACUUM full ANALYZE;
Jan 25 13:59:58 wren postgres[6903]: [1-1] LOG: recycled transaction log file "000000230000009F"
Jan 25 14:17:35 wren postgres[1834]: [1-1] LOG: received fast shutdown request
Jan 25 14:17:35 wren postgres[1834]: [2-1] LOG: aborting any active transactions
Jan 25 14:17:35 wren postgres[6117]: [3-2] STATEMENT:
Jan 25 14:17:35 wren postgres[6117]: [3-3] ^Iselect
Jan 25 14:17:35 wren postgres[6117]: [3-4] ^I customer.custid,
Jan 25 14:17:35 wren postgres[6117]: [3-5] ^I login.login,
Jan 25 14:17:35 wren postgres[6117]: [3-6] ^I radacct.acctsessionid,
Jan 25 14:17:35 wren postgres[6117]: [3-7] ^I radacct.acctstoptime,
Jan 25 14:17:35 wren postgres[8708]: [4-1] FATAL: the database system is shutting down
Jan 25 14:17:35 wren postgres[6117]: [3-8] ^I billtype.datacost,
Jan 25 14:17:35 wren postgres[6117]: [3-9] ^I radacct.acctoutputoctets
Jan 25 14:17:35 wren postgres[6117]: [3-10] ^Ifrom
Jan 25 14:17:35 wren postgres[6117]: [3-11] ^I login,radacct,customer,billtype
Jan 25 14:17:35 wren postgres[6117]: [3-12] ^Iwhere
Jan 25 14:17:35 wren postgres[6117]: [3-13] ^I login.login = radacct.username and
Jan 25 14:17:35 wren postgres[6117]: [3-14] ^I login.custid = customer.custid and
Jan 25 14:17:35 wren postgres[6117]: [3-15] ^I date_trunc ('second', radacct.acctstoptime) > date_trunc ('second', customer.lastbilled) and
Jan 25 14:17:35 wren postgres[6117]: [3-16] ^I customer.billtype = billtype.code
Jan 25 14:17:35 wren postgres[6117]: [3-17] ^Iorder by radacct.acctstoptime
Jan 25 14:17:35 wren postgres[6117]: [3-18] ^I
Jan 25 14:17:35 wren postgres[6117]: [3-19] ^I
Jan 25 14:17:58 wren postgres[8762]: [4-1] FATAL: the database system is shutting down
Jan 25 14:18:45 wren postgres[1834]: [3-1] LOG: received immediate shutdown request
Jan 25 14:18:48 wren postgres[6111]: [3-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
Jan 25 14:18:48 wren postgres[6111]: [3-3] process exited abnormally and possibly corrupted shared memory.
Jan 25 14:18:48 wren postgres[6111]: [3-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
Jan 25 14:18:48 wren postgres[6111]: [3-5] STATEMENT: VACUUM full ANALYZE;
Jan 25 14:18:48 wren postgres[8623]: [1-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
Jan 25 14:18:48 wren postgres[8623]: [1-3] process exited abnormally and possibly corrupted shared memory.
Jan 25 14:18:48 wren postgres[8623]: [1-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
Jan 25 14:18:49 wren postgres[9120]: [1-1] LOG: database system was interrupted at 2007-01-25 14:10:27 EST
Jan 25 14:18:49 wren postgres[9120]: [2-1] LOG: checkpoint record is at 23/A13FD9FC
Jan 25 14:18:49 wren postgres[9120]: [3-1] LOG: redo record is at 23/A13D3318; undo record is at 0/0; shutdown FALSE
Jan 25 14:18:49 wren postgres[9120]: [4-1] LOG: next transaction ID: 105232106; next OID: 20947715
Jan 25 14:18:49 wren postgres[9120]: [5-1] LOG: database system was not properly shut down; automatic recovery in progress
Jan 25 14:18:49 wren postgres[9120]: [6-1] LOG: redo starts at 23/A13D3318
Jan 25 14:18:49 wren postgres[9120]: [7-1] PANIC: btree_delete_page_redo: lost target page
Jan 25 14:18:49 wren postgres[9114]: [1-1] LOG: startup process (PID 9120) was terminated by signal 6
Jan 25 14:18:49 wren postgres[9114]: [2-1] LOG: aborting startup due to startup process failure
Jan 25 14:20:35 wren postgres[9232]: [1-1] LOG: database system was interrupted while in recovery at 2007-01-25 14:18:49 EST
Jan 25 14:20:35 wren postgres[9232]: [1-2] HINT: This probably means that some data is corrupted and you will have to use the last backup for recovery.
Jan 25 14:20:35 wren postgres[9232]: [2-1] LOG: checkpoint record is at 23/A13FD9FC
Jan 25 14:20:35 wren postgres[9232]: [3-1] LOG: redo record is at 23/A13D3318; undo record is at 0/0; shutdown FALSE
Jan 25 14:20:35 wren postgres[9232]: [4-1] LOG: next transaction ID: 105232106; next OID: 20947715
Jan 25 14:20:35 wren postgres[9232]: [5-1] LOG: database system was not properly shut down; automatic recovery in progress
Jan 25 14:20:35 wren postgres[9232]: [6-1] LOG: redo starts at 23/A13D3318
Jan 25 14:20:35 wren postgres[9232]: [7-1] PANIC: btree_delete_page_redo: lost target page
Jan 25 14:20:35 wren postgres[9226]: [1-1] LOG: startup process (PID 9232) was terminated by signal 6
Jan 25 14:20:35 wren postgres[9226]: [2-1] LOG: aborting startup due to startup process failure
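
In case it helps anyone skimming the log, here is a rough Python sketch that pulls just the PANIC and FATAL entries out of syslog lines shaped like the ones above. It is only an illustration: the function name severe_entries and both regular expressions are my own assumptions about the line layout (timestamp, host, postgres[pid], [seq-part], severity), not anything that ships with PostgreSQL or syslog.

```python
import re

# Assumed layout, taken from the excerpt above:
#   "Jan 25 14:18:49 wren postgres[9120]: [7-1] PANIC: message"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ postgres\[(?P<pid>\d+)\]: "
    r"\[(?P<seq>\d+)-(?P<part>\d+)\] (?P<rest>.*)$"
)
# Severity prefixes that appear at the start of the first line of an entry.
SEVERITY_RE = re.compile(r"^(PANIC|FATAL|ERROR|WARNING|LOG):\s+(.*)$")


def severe_entries(lines, levels=("PANIC", "FATAL")):
    """Return (timestamp, pid, level, message) tuples for the chosen levels.

    Hypothetical helper: lines that do not match the assumed layout,
    including continuation lines such as DETAIL or HINT, are skipped.
    """
    found = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        sev = SEVERITY_RE.match(m.group("rest"))
        if sev and sev.group(1) in levels:
            found.append(
                (m.group("stamp"), int(m.group("pid")), sev.group(1), sev.group(2))
            )
    return found


if __name__ == "__main__":
    # Two lines copied from the log above, used as a tiny demonstration.
    demo = [
        "Jan 25 14:18:49 wren postgres[9120]: [6-1] LOG: redo starts at 23/A13D3318",
        "Jan 25 14:18:49 wren postgres[9120]: [7-1] PANIC: btree_delete_page_redo: lost target page",
    ]
    for entry in severe_entries(demo):
        print(entry)
```

Passing an open file object for syslog.txt instead of the demo list should work the same way, since the function only iterates over lines.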