G'day all,

PG version: 7.4.7    (debian 7.4.7-6sarge2)
OS: Linux (debian sarge)

I started a 'vacuum full analyze' using psql, then, after
about 10 minutes, I issued a control-c in psql.  The prompt still
hadn't returned after another 10 minutes or so.  A glance at
the backends showed other processes running and some apparently
hanging around, possibly blocked by the vacuum.

Eventually I did a shutdown using the debian
/etc/init.d/postgresql script.  This tried shutting the
postmaster down cleanly, but eventually 'kill -9'ed the
postmaster (much to my surprise, given the ever-present "TIP 2:
Don't 'kill -9' the postmaster" - just goes to show, you
shouldn't trust *anyone*!).

Anyhow, given the 'kill -9' I guess you could stop reading right
here...

Trying to start the postmaster again failed, with what looks
like the relevant message being:

  PANIC:  btree_delete_page_redo: lost target page

See the attached syslog.txt for the system logs.
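
In case it helps anyone scanning a similar log, here is a minimal sketch
for pulling out the PANIC and FATAL entries.  It assumes the syslog line
layout seen in the attachment and that each entry sits on a single line
(entries wrapped by a mail client would need joining first).  The names
LINE_RE and find_severe are hypothetical, not part of any PostgreSQL tool.

```python
import re

# Assumed layout (taken from the attached log, not from any spec):
#   "<Mon DD HH:MM:SS> <host> postgres[<pid>]: [<n>-<m>] <SEVERITY>:  <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ postgres\[(?P<pid>\d+)\]: "
    r"\[\d+-\d+\] (?P<sev>[A-Z]+):\s+(?P<msg>.*)$"
)


def find_severe(lines, severities=("PANIC", "FATAL")):
    """Return (timestamp, pid, severity, message) for each matching entry."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        # Continuation lines (e.g. the body of a logged statement) carry
        # no severity tag, so they simply fail to match and are skipped.
        if m and m.group("sev") in severities:
            hits.append(
                (m.group("ts"), int(m.group("pid")), m.group("sev"), m.group("msg"))
            )
    return hits
```

Usage would be along the lines of find_severe(open("syslog.txt")), which
returns a list of tuples in the order the entries appear in the file.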

Googling 'postgres "lost target page"' only came back with
extracts from nbtxlog.c rather than any previously reported
problems like this.

At this point I restored the database from our backups, however
I still have a copy of the corrupted data directory if there's
interest in trying to diagnose the problem.

The entire data directory is around 5 GB so let me know if you'd
like to see something specific in there.

Cheers,

Chris.
Jan 25 13:49:10 wren postgres[6106]: [58-2] STATEMENT:  VACUUM full ANALYZE;
Jan 25 13:59:58 wren postgres[6903]: [1-1] LOG:  recycled transaction log file "000000230000009F"
Jan 25 14:17:35 wren postgres[1834]: [1-1] LOG:  received fast shutdown request
Jan 25 14:17:35 wren postgres[1834]: [2-1] LOG:  aborting any active transactions
Jan 25 14:17:35 wren postgres[6117]: [3-2] STATEMENT:  
Jan 25 14:17:35 wren postgres[6117]: [3-3] ^Iselect
Jan 25 14:17:35 wren postgres[6117]: [3-4] ^I   customer.custid,
Jan 25 14:17:35 wren postgres[6117]: [3-5] ^I   login.login,
Jan 25 14:17:35 wren postgres[6117]: [3-6] ^I   radacct.acctsessionid,
Jan 25 14:17:35 wren postgres[6117]: [3-7] ^I   radacct.acctstoptime,
Jan 25 14:17:35 wren postgres[8708]: [4-1] FATAL:  the database system is shutting down
Jan 25 14:17:35 wren postgres[6117]: [3-8] ^I   billtype.datacost,
Jan 25 14:17:35 wren postgres[6117]: [3-9] ^I   radacct.acctoutputoctets
Jan 25 14:17:35 wren postgres[6117]: [3-10] ^Ifrom
Jan 25 14:17:35 wren postgres[6117]: [3-11] ^I   login,radacct,customer,billtype
Jan 25 14:17:35 wren postgres[6117]: [3-12] ^Iwhere
Jan 25 14:17:35 wren postgres[6117]: [3-13] ^I   login.login = radacct.username and
Jan 25 14:17:35 wren postgres[6117]: [3-14] ^I   login.custid = customer.custid and
Jan 25 14:17:35 wren postgres[6117]: [3-15] ^I   date_trunc ('second', radacct.acctstoptime) > date_trunc ('second', customer.lastbilled) and
Jan 25 14:17:35 wren postgres[6117]: [3-16] ^I   customer.billtype = billtype.code
Jan 25 14:17:35 wren postgres[6117]: [3-17] ^Iorder by radacct.acctstoptime
Jan 25 14:17:35 wren postgres[6117]: [3-18] ^I
Jan 25 14:17:35 wren postgres[6117]: [3-19] ^I
Jan 25 14:17:58 wren postgres[8762]: [4-1] FATAL:  the database system is shutting down
Jan 25 14:18:45 wren postgres[1834]: [3-1] LOG:  received immediate shutdown request
Jan 25 14:18:48 wren postgres[6111]: [3-2] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server
Jan 25 14:18:48 wren postgres[6111]: [3-3]  process exited abnormally and possibly corrupted shared memory.
Jan 25 14:18:48 wren postgres[6111]: [3-4] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
Jan 25 14:18:48 wren postgres[6111]: [3-5] STATEMENT:  VACUUM full ANALYZE;
Jan 25 14:18:48 wren postgres[8623]: [1-2] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server
Jan 25 14:18:48 wren postgres[8623]: [1-3]  process exited abnormally and possibly corrupted shared memory.
Jan 25 14:18:48 wren postgres[8623]: [1-4] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
Jan 25 14:18:49 wren postgres[9120]: [1-1] LOG:  database system was interrupted at 2007-01-25 14:10:27 EST
Jan 25 14:18:49 wren postgres[9120]: [2-1] LOG:  checkpoint record is at 23/A13FD9FC
Jan 25 14:18:49 wren postgres[9120]: [3-1] LOG:  redo record is at 23/A13D3318; undo record is at 0/0; shutdown FALSE
Jan 25 14:18:49 wren postgres[9120]: [4-1] LOG:  next transaction ID: 105232106; next OID: 20947715
Jan 25 14:18:49 wren postgres[9120]: [5-1] LOG:  database system was not properly shut down; automatic recovery in progress
Jan 25 14:18:49 wren postgres[9120]: [6-1] LOG:  redo starts at 23/A13D3318
Jan 25 14:18:49 wren postgres[9120]: [7-1] PANIC:  btree_delete_page_redo: lost target page
Jan 25 14:18:49 wren postgres[9114]: [1-1] LOG:  startup process (PID 9120) was terminated by signal 6
Jan 25 14:18:49 wren postgres[9114]: [2-1] LOG:  aborting startup due to startup process failure

Jan 25 14:20:35 wren postgres[9232]: [1-1] LOG:  database system was interrupted while in recovery at 2007-01-25 14:18:49 EST
Jan 25 14:20:35 wren postgres[9232]: [1-2] HINT:  This probably means that some data is corrupted and you will have to use the last backup for recovery.
Jan 25 14:20:35 wren postgres[9232]: [2-1] LOG:  checkpoint record is at 23/A13FD9FC
Jan 25 14:20:35 wren postgres[9232]: [3-1] LOG:  redo record is at 23/A13D3318; undo record is at 0/0; shutdown FALSE
Jan 25 14:20:35 wren postgres[9232]: [4-1] LOG:  next transaction ID: 105232106; next OID: 20947715
Jan 25 14:20:35 wren postgres[9232]: [5-1] LOG:  database system was not properly shut down; automatic recovery in progress
Jan 25 14:20:35 wren postgres[9232]: [6-1] LOG:  redo starts at 23/A13D3318
Jan 25 14:20:35 wren postgres[9232]: [7-1] PANIC:  btree_delete_page_redo: lost target page
Jan 25 14:20:35 wren postgres[9226]: [1-1] LOG:  startup process (PID 9232) was terminated by signal 6
Jan 25 14:20:35 wren postgres[9226]: [2-1] LOG:  aborting startup due to startup process failure