[BUGS] tracking down a crash bug

Orion Henry Sun, 13 Apr 2008 14:45:59 -0700

Hello.  I need help tracking down a crash bug.  I'm running 8.2.7.
My database has gone into recovery mode three times so far today under
user load.  Running /etc/init.d/postgresql-8.2 stop stops the backend but
leaves a few processes behind, like this:


postgres 22318  0.0  0.0  45724  1272 ?        Ss   Apr11   0:13 postgres:
app1101 app1101 10.255.7.159(44567) idle
postgres 24365  0.0  0.0  45724  1224 ?        Ss   Apr11   0:02 postgres:
app2280 app2280 10.255.7.159(51010) idle
postgres  5649  0.0  0.0  45368  1180 ?        Ss   Apr11   0:00 postgres:
app9452 app9452 10.255.7.159(43141) idle

I then have to kill -9 these processes.  Looking at the postgres log, I
find only this:

2008-04-13 12:20:10 PDT STATEMENT:  SELECT version FROM schema_info
2008-04-13 12:21:14 PDT ERROR:  relation "schema_info" does not exist
2008-04-13 12:21:14 PDT STATEMENT:  SELECT version FROM schema_info
2008-04-13 12:26:48 PDT LOG:  background writer process (PID 965) was
terminated by signal 9
2008-04-13 12:26:48 PDT LOG:  terminating any other active server processes
2008-04-13 12:26:48 PDT WARNING:  terminating connection because of crash of
another server process
2008-04-13 12:26:48 PDT DETAIL:  The postmaster has commanded this server
process to roll back the current transaction and exit, because another
server process exited abnormally and possibly corrupted shared memory.
2008-04-13 12:26:48 PDT HINT:  In a moment you should be able to reconnect
to the database and repeat your command.
[ repeat several hundred times ]
2008-04-13 12:28:11 PDT FATAL:  the database system is in recovery mode
[ repeat several hundred times ]
2008-04-13 12:33:00 PDT LOG:  incomplete startup packet
2008-04-13 12:33:00 PDT LOG:  received fast shutdown request
2008-04-13 12:33:12 PDT FATAL:  the database system is shutting down
[ repeat a dozen times ]
2008-04-13 12:34:00 PDT LOG:  received immediate shutdown request
2008-04-13 12:34:02 PDT LOG:  could not load root certificate file
"root.crt": No such file or directory
2008-04-13 12:34:02 PDT DETAIL:  Will not verify client certificates.
2008-04-13 12:34:20 PDT LOG:  could not create IPv6 socket: Address family
not supported by protocol
2008-04-13 12:34:20 PDT LOG:  could not resolve "localhost": Name or service
not known
2008-04-13 12:34:20 PDT LOG:  disabling statistics collector for lack of
working socket
2008-04-13 12:34:20 PDT WARNING:  autovacuum not started because of
misconfiguration
2008-04-13 12:34:20 PDT HINT:  Enable options "stats_start_collector" and
"stats_row_level".
2008-04-13 12:34:20 PDT LOG:  database system was interrupted at 2008-04-13
12:22:44 PDT
2008-04-13 12:34:20 PDT LOG:  checkpoint record is at 0/594FDF58
2008-04-13 12:34:20 PDT LOG:  redo record is at 0/5946B830; undo record is
at 0/0; shutdown FALSE
2008-04-13 12:34:20 PDT LOG:  next transaction ID: 0/2979312; next OID:
106497
2008-04-13 12:34:20 PDT LOG:  next MultiXactId: 1; next MultiXactOffset: 0
2008-04-13 12:34:20 PDT LOG:  database system was not properly shut down;
automatic recovery in progress
2008-04-13 12:34:20 PDT LOG:  redo starts at 0/5946B830
2008-04-13 12:34:21 PDT LOG:  incomplete startup packet
2008-04-13 12:34:21 PDT LOG:  record with zero length at 0/5957A3EC
2008-04-13 12:34:21 PDT LOG:  redo done at 0/5957A3C4
2008-04-13 12:34:21 PDT LOG:  database system is ready
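
Since the interesting line is buried among several hundred repeats, a small
script can pull out just the "terminated by signal" events.  This is a
minimal sketch, assuming the timestamp-prefixed log format shown above; the
function name find_signal_deaths is made up for illustration, and the
pattern has only been checked against this excerpt.

```python
import re

# Pattern assumed from the log excerpt above, e.g.:
#   2008-04-13 12:26:48 PDT LOG:  background writer process (PID 965) was
#   terminated by signal 9
# The \s+ after "was" tolerates the line wrap seen in this email.
SIGNAL_DEATH_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \w+) LOG:\s+"
    r"(.+?) \(PID (\d+)\) was\s+terminated by signal (\d+)",
    re.MULTILINE,
)


def find_signal_deaths(log_text):
    """Return (timestamp, process name, pid, signal) for each matching line."""
    return [
        (m.group(1), m.group(2), int(m.group(3)), int(m.group(4)))
        for m in SIGNAL_DEATH_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    import sys

    # Usage: python find_signal_deaths.py /path/to/postgresql.log
    with open(sys.argv[1]) as f:
        for ts, proc, pid, sig in find_signal_deaths(f.read()):
            print(f"{ts}  {proc} (PID {pid}) killed by signal {sig}")
```

Signal 9 is SIGKILL, which a process cannot catch or handle, so each hit
may be worth comparing against whatever else was running on the machine at
that timestamp.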

Any advice on how I can get this bug identified and squashed?  I suspect
it's in the create/drop [database,role,schema] code.  I've used postgres
for 7 years without issues until now.  The only thing different now is my
usage pattern: since I'm offering a database as a service to my users, I'm
adding and dropping databases, roles, and schemas constantly.

Thanks
