On Jan 25, 2008 4:31 PM, Tom Lane <[EMAIL PROTECTED]> wrote:
> Can you get us a stack trace from that crash?

Here's the trace of the server process post-crash.

[EMAIL PROTECTED] ~]# gdb -p 24078
(gdb) bt
#0  0x006ac402 in __kernel_vsyscall ()
#1  0x0060801d in ___newselect_nocancel () from /lib/libc.so.6
#2  0x0820db22 in ServerLoop ()
#3  0x0820d631 in PostmasterMain ()
#4  0x081b2ee7 in main ()

Looking into it more, it looks like the server is restarting every time it encounters this. I was wrong in thinking that it stayed crashed; I guess I was just looking at a stale connection. However, I did find a way to trigger this behavior, and it still only happens with GSSAPI auth. Basically, run a statement with a syntax error while logged in under GSSAPI (I think anything that generates an ERROR-level message will work).

Here's the transcript of a GSSAPI connection:

postgres=> select current_user;
 current_user
--------------
 nobody
(1 row)

postgres=> select * from pg_database 1;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!> select * from pg_database 1;
You are currently not connected to a database.
!>

And of an md5-based connection (for user nobody; note that native krb5 connections exhibit this same behavior):

postgres=> select * from pg_database 1;
ERROR: syntax error at or near "1"
LINE 1: select * from pg_database 1;
                                  ^
postgres=> select current_user;
 current_user
--------------
 nobody
(1 row)

postgres=> select * from pg_database 1;
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
postgres=>

And the syslog entries show this:

Jan 26 19:37:38 mitchell postgres[24586]: [12-1] LOG: connection received: host=ator.cs.wisc.edu port=59442
Jan 26 19:37:38 mitchell postgres[24586]: [13-1] LOG: connection authorized: user=nobody database=postgres
Jan 26 19:37:47 mitchell postgres[24587]: [12-1] LOG: connection received: host=ator.cs.wisc.edu port=59443
Jan 26 19:37:47 mitchell postgres[24587]: [13-1] LOG: connection authorized: user=koczan database=postgres
Jan 26 19:38:04 mitchell postgres[24586]: [14-1] ERROR: syntax error at or near "1" at character 27
Jan 26 19:38:04 mitchell postgres[24586]: [14-2] STATEMENT: select * from pg_database 1;
Jan 26 19:38:11 mitchell postgres[24078]: [12-1] LOG: server process (PID 24587) was terminated by signal 11: Segmentation fault
Jan 26 19:38:11 mitchell postgres[24078]: [13-1] LOG: terminating any other active server processes
Jan 26 19:38:11 mitchell postgres[24586]: [15-1] WARNING: terminating connection because of crash of another server process
Jan 26 19:38:11 mitchell postgres[24586]: [15-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
Jan 26 19:38:11 mitchell postgres[24586]: [15-3] process exited abnormally and possibly corrupted shared memory.
Jan 26 19:38:11 mitchell postgres[24586]: [15-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
Jan 26 19:38:11 mitchell postgres[24078]: [14-1] LOG: all server processes terminated; reinitializing
Jan 26 19:38:11 mitchell postgres[24607]: [15-1] LOG: database system was interrupted; last known up at 2008-01-26 19:25:52 CST
Jan 26 19:38:11 mitchell postgres[24607]: [16-1] LOG: database system was not properly shut down; automatic recovery in progress
Jan 26 19:38:11 mitchell postgres[24608]: [15-1] LOG: connection received: host=ator.cs.wisc.edu port=59446
Jan 26 19:38:11 mitchell postgres[24607]: [17-1] LOG: record with zero length at 0/87A5DC
Jan 26 19:38:11 mitchell postgres[24607]: [18-1] LOG: redo is not required
Jan 26 19:38:11 mitchell postgres[24607]: [19-1] LOG: checkpoint starting: shutdown immediate
Jan 26 19:38:11 mitchell postgres[24607]: [20-1] LOG: checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=0.002 s,
Jan 26 19:38:11 mitchell postgres[24607]: [20-2] sync=0.000 s, total=0.009 s
Jan 26 19:38:11 mitchell postgres[24611]: [15-1] LOG: autovacuum launcher started
Jan 26 19:38:11 mitchell postgres[24078]: [15-1] LOG: database system is ready to accept connections
Jan 26 19:38:11 mitchell postgres[24608]: [16-1] FATAL: the database system is in recovery mode
Jan 26 19:38:11 mitchell postgres[24613]: [16-1] LOG: connection received: host=ator.cs.wisc.edu port=59447
Jan 26 19:38:11 mitchell postgres[24613]: [17-1] FATAL: no pg_hba.conf entry for host "128.105.162.36", user "koczan", database "postgres", SSL off
Jan 26 19:38:22 mitchell postgres[24615]: [16-1] LOG: connection received: host=ator.cs.wisc.edu port=59448
Jan 26 19:38:22 mitchell postgres[24615]: [17-1] LOG: connection authorized: user=nobody database=postgres

Since the server restarts and any connections either go away forever or just reset, post-crash stack traces won't do much good. If there's a way to get a stack trace at the point of the crash, please let me know how and I'll get you that information.

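To make the syslog sequence easier to follow: the backend that segfaults (PID 24587) is the session authorized as koczan, while the session authorized as nobody (PID 24586) only logged the ordinary syntax error and was then terminated along with everything else. Below is a minimal Python sketch that pulls that correlation out of the log lines. The function name `crashed_sessions` and both regular expressions are my own illustration, written against the line format in this particular log; they are not part of PostgreSQL and may need adjusting for a different log_line_prefix or syslog setup.

```python
import re

# Lines copied from the syslog excerpt in this message (only the relevant ones).
LOG = """\
Jan 26 19:37:38 mitchell postgres[24586]: [13-1] LOG: connection authorized: user=nobody database=postgres
Jan 26 19:37:47 mitchell postgres[24587]: [13-1] LOG: connection authorized: user=koczan database=postgres
Jan 26 19:38:04 mitchell postgres[24586]: [14-1] ERROR: syntax error at or near "1" at character 27
Jan 26 19:38:11 mitchell postgres[24078]: [12-1] LOG: server process (PID 24587) was terminated by signal 11: Segmentation fault
"""

# Assumed patterns: they match the syslog format shown above, nothing more general.
AUTH_RE = re.compile(
    r"postgres\[(\d+)\]: \[\d+-\d+\] LOG: connection authorized: "
    r"user=(\S+) database=(\S+)"
)
CRASH_RE = re.compile(r"server process \(PID (\d+)\) was terminated by signal (\d+)")


def crashed_sessions(log_text):
    """Return (pid, user, database, signal) for each backend reported as killed."""
    sessions = {}  # backend PID -> (user, database) from "connection authorized"
    crashes = []
    for line in log_text.splitlines():
        auth = AUTH_RE.search(line)
        if auth:
            sessions[int(auth.group(1))] = (auth.group(2), auth.group(3))
            continue
        crash = CRASH_RE.search(line)
        if crash:
            pid = int(crash.group(1))
            # Unknown PIDs (no matching authorization line) are reported as None.
            user, database = sessions.get(pid, (None, None))
            crashes.append((pid, user, database, int(crash.group(2))))
    return crashes


if __name__ == "__main__":
    for pid, user, database, sig in crashed_sessions(LOG):
        print(f"PID {pid} (user={user}, database={database}) died with signal {sig}")
    # prints: PID 24587 (user=koczan, database=postgres) died with signal 11
```

The sketch only looks at "connection authorized" and "terminated by signal" lines, so it says nothing about which authentication method a session used; that still has to come from pg_hba.conf.
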
Also, why the connection is trying to reconnect without SSL intrigues and concerns me.

Peter