Hello. I need help tracking down a crash bug. I'm running 8.2.7, and my database has gone into recovery mode three times so far today under user load. Running /etc/init.d/postgresql-8.2 stop would stop the backend but leave a few processes behind, like this:
postgres 22318 0.0 0.0 45724 1272 ? Ss Apr11 0:13 postgres: app1101 app1101 10.255.7.159(44567) idle
postgres 24365 0.0 0.0 45724 1224 ? Ss Apr11 0:02 postgres: app2280 app2280 10.255.7.159(51010) idle
postgres 5649 0.0 0.0 45368 1180 ? Ss Apr11 0:00 postgres: app9452 app9452 10.255.7.159(43141) idle

I would then have to kill -9 these processes. Looking at the postgres log, I find only this:

2008-04-13 12:20:10 PDT STATEMENT: SELECT version FROM schema_info
2008-04-13 12:21:14 PDT ERROR: relation "schema_info" does not exist
2008-04-13 12:21:14 PDT STATEMENT: SELECT version FROM schema_info
2008-04-13 12:26:48 PDT LOG: background writer process (PID 965) was terminated by signal 9
2008-04-13 12:26:48 PDT LOG: terminating any other active server processes
2008-04-13 12:26:48 PDT WARNING: terminating connection because of crash of another server process
2008-04-13 12:26:48 PDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2008-04-13 12:26:48 PDT HINT: In a moment you should be able to reconnect to the database and repeat your command.
[ repeat several hundred times ]
2008-04-13 12:28:11 PDT FATAL: the database system is in recovery mode
[ repeat several hundred times ]
2008-04-13 12:33:00 PDT LOG: incomplete startup packet
2008-04-13 12:33:00 PDT LOG: received fast shutdown request
2008-04-13 12:33:12 PDT FATAL: the database system is shutting down
[ repeat a dozen times ]
2008-04-13 12:34:00 PDT LOG: received immediate shutdown request
2008-04-13 12:34:02 PDT LOG: could not load root certificate file "root.crt": No such file or directory
2008-04-13 12:34:02 PDT DETAIL: Will not verify client certificates.
2008-04-13 12:34:20 PDT LOG: could not create IPv6 socket: Address family not supported by protocol
2008-04-13 12:34:20 PDT LOG: could not resolve "localhost": Name or service not known
2008-04-13 12:34:20 PDT LOG: disabling statistics collector for lack of working socket
2008-04-13 12:34:20 PDT WARNING: autovacuum not started because of misconfiguration
2008-04-13 12:34:20 PDT HINT: Enable options "stats_start_collector" and "stats_row_level".
2008-04-13 12:34:20 PDT LOG: database system was interrupted at 2008-04-13 12:22:44 PDT
2008-04-13 12:34:20 PDT LOG: checkpoint record is at 0/594FDF58
2008-04-13 12:34:20 PDT LOG: redo record is at 0/5946B830; undo record is at 0/0; shutdown FALSE
2008-04-13 12:34:20 PDT LOG: next transaction ID: 0/2979312; next OID: 106497
2008-04-13 12:34:20 PDT LOG: next MultiXactId: 1; next MultiXactOffset: 0
2008-04-13 12:34:20 PDT LOG: database system was not properly shut down; automatic recovery in progress
2008-04-13 12:34:20 PDT LOG: redo starts at 0/5946B830
2008-04-13 12:34:21 PDT LOG: incomplete startup packet
2008-04-13 12:34:21 PDT LOG: record with zero length at 0/5957A3EC
2008-04-13 12:34:21 PDT LOG: redo done at 0/5957A3C4
2008-04-13 12:34:21 PDT LOG: database system is ready

Any advice on how I can get this bug identified and squashed? I suspect it's in the create/drop [database,role,schema] code. I've used postgres for 7 years without issues up to this point. The only thing different now is my usage pattern: since I'm offering a database as a service to my users, I'm adding and dropping databases, roles, and schemas constantly.

Thanks