Title: Identifying cause of "database system shutdown was interrupted" at failed startup

    Hi,
    We recently encountered a serious database crash that resulted in a significant loss of data…

    We took down the database server, and when we restarted the backend we got the errors 'database system shutdown was interrupted' and 'invalid checkpoint', along with missing xlog files (I've appended the log to the end of this post)…

    I've been trawling the list archives for a few days. This issue has cropped up a number of times, but I've found it hard to identify a single post, or set of posts, that explains the cause of such a crash…

    Hopefully I'll be able to bring together the results of this trawl through the archives in this post, but I'd really appreciate any help or suggestions people have. We currently have a slightly uneasy feeling because we've not quite got to the bottom of the issues, and it would be nice to set our minds at rest! :-)

    So far I've identified two possible causes of the crash - I've listed them below, and wonder whether people have any comments on them:

    1) We were running PostgreSQL version 7.3.6-1 (the version shipped with Red Hat EL AS3; kernel-smp-2.4.21-9.0.1EL)

    The following post suggests that this is a known issue in 7.3.3, but that 7.3.4 is safe. I assume, therefore, that 7.3.6-1 is also safe. Is that right?

    http://archives.postgresql.org/pgsql-general/2003-09/msg01086.php
     
    2) We are running the database in conjunction with JBoss, connecting to the database server from a different machine via JDBC. The database was taken down *without* stopping JBoss first.

    Any thoughts would be much appreciated!

    Below are the relevant bits of the shutdown and startup logs,

    Best wishes,
    Crispin

    ----------------------
    shutdown log (/var/log/messages):
    May 28 15:43:35  shutdown: shutting down for system halt
    May 28 15:43:35  init: Switching to runlevel: 0
    May 28 15:43:36 server rhnsd[1694]: Exiting
    May 28 15:43:36 server rhnsd: rhnsd shutdown succeeded
    May 28 15:43:36 server atd: atd shutdown succeeded
    May 28 15:43:36 server cups: cupsd shutdown succeeded
    May 28 15:43:36 server xfs[1643]: terminating
    May 28 15:43:36 server xfs: xfs shutdown succeeded
    May 28 15:43:36 server mysqld: Stopping MySQL: succeeded
    May 28 15:43:36 server gpm: gpm shutdown succeeded
    May 28 15:43:37 server rhdb: Stopping PostgreSQL - Red Hat Edition service:
    May 28 15:43:37 server su(pam_unix)[12400]: session opened for user postgres by (uid=0)
    May 28 15:43:40 server su(pam_unix)[12400]: session closed for user postgres
    May 28 15:43:40 server rhdb: ^[[60G[
    May 28 15:43:40 server rhdb:
    May 28 15:43:40 server rc: Stopping rhdb: succeeded
    ...
    May 28 15:43:44 server kernel: Kernel logging (proc) stopped.
    May 28 15:43:44 server kernel: Kernel log daemon terminating.
    May 28 15:43:45 server syslog: klogd shutdown succeeded
    May 28 15:43:45 server exiting on signal 15
    May 28 16:13:35 server syslogd 1.4.1: restart.


    -----
    starting messages

    Jun  1 10:43:55 server postgres[5537]: [30] LOG:  database system shutdown was interrupted at 2004-05-28 16:32:08 BST
    Jun  1 10:43:55 server postgres[5537]: [31] LOG:  open of /var/lib/pgsql/data/pg_xlog/0000000000000000 (log file 0, segment 0) failed: No such file or directory

    Jun  1 10:43:55 server postgres[5537]: [32] LOG:  invalid primary checkpoint record
    Jun  1 10:43:55 server postgres[5537]: [33] LOG:  open of /var/lib/pgsql/data/pg_xlog/0000000000000000 (log file 0, segment 0) failed: No such file or directory

    Jun  1 10:43:55 server postgres[5537]: [34] LOG:  invalid secondary checkpoint record
    Jun  1 10:43:55 server postgres[5537]: [35] PANIC:  unable to locate a valid checkpoint record
    Jun  1 10:43:55 server postgres[5534]: [31] LOG:  startup process (pid 5537) was terminated by signal 6
    Jun  1 10:43:55 server postgres[5534]: [32] LOG:  aborting startup due to startup process failure
    Jun  1 10:43:56 server rhdb: Starting PostgreSQL - Red Hat Edition service:  failed
    Jun  1 10:44:00 server su(pam_unix)[5554]: session opened for user postgres by (uid=0)
    Jun  1 10:44:00 server su(pam_unix)[5554]: session closed for user postgres
    Jun  1 10:44:00 server postgres[5595]: [30] LOG:  database system shutdown was interrupted at 2004-05-28 16:32:08 BST
    Jun  1 10:44:00 server postgres[5595]: [31] LOG:  open of /var/lib/pgsql/data/pg_xlog/0000000000000000 (log file 0, segment 0) failed: No such file or directory

    Jun  1 10:44:00 server postgres[5595]: [32] LOG:  invalid primary checkpoint record
    Jun  1 10:44:00 server postgres[5595]: [33] LOG:  open of /var/lib/pgsql/data/pg_xlog/0000000000000000 (log file 0, segment 0) failed: No such file or directory

    Jun  1 10:44:00 server postgres[5595]: [34] LOG:  invalid secondary checkpoint record
    Jun  1 10:44:00 server postgres[5595]: [35] PANIC:  unable to locate a valid checkpoint record
    Jun  1 10:44:00 server postgres[5592]: [31] LOG:  startup process (pid 5595) was terminated by signal 6
    Jun  1 10:44:00 server postgres[5592]: [32] LOG:  aborting startup due to startup process failure
    Jun  1 10:44:01 server rhdb: Starting PostgreSQL - Red Hat Edition service:  failed
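
One thing the timestamps in the logs above allow us to check is how the 'shutdown was interrupted' time relates to the halt itself. The following is only a minimal Python sketch for doing that comparison, not something that was run on the server. The function names (`syslog_time`, `first_time`, `interrupted_at`) are made up for illustration, and it assumes that all timestamps are in the same local time zone (BST) and that the syslog lines date from 2004, since syslog does not record the year.

```python
import re
from datetime import datetime

# A few lines copied verbatim from the logs in this post.
LOG_LINES = [
    "May 28 15:43:40 server rc: Stopping rhdb: succeeded",
    "May 28 16:13:35 server syslogd 1.4.1: restart.",
    "Jun  1 10:43:55 server postgres[5537]: [30] LOG:  database system shutdown "
    "was interrupted at 2004-05-28 16:32:08 BST",
    "Jun  1 10:43:55 server postgres[5537]: [35] PANIC:  unable to locate a valid "
    "checkpoint record",
]

MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
          "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}

SYSLOG_RE = re.compile(r"^(\w{3})\s+(\d+) (\d\d):(\d\d):(\d\d) ")
INTERRUPTED_RE = re.compile(
    r"shutdown was interrupted at (\d{4})-(\d\d)-(\d\d) (\d\d):(\d\d):(\d\d)"
)


def syslog_time(line, year=2004):
    """Parse the leading syslog timestamp; the year is an assumption."""
    m = SYSLOG_RE.match(line)
    if m is None or m.group(1) not in MONTHS:
        return None
    day, hh, mm, ss = (int(g) for g in m.groups()[1:])
    return datetime(year, MONTHS[m.group(1)], day, hh, mm, ss)


def first_time(lines, needle):
    """Syslog time of the first line containing `needle`, or None."""
    for line in lines:
        if needle in line:
            return syslog_time(line)
    return None


def interrupted_at(lines):
    """Time PostgreSQL reports in 'shutdown was interrupted at ...', or None."""
    for line in lines:
        m = INTERRUPTED_RE.search(line)
        if m:
            return datetime(*(int(g) for g in m.groups()))
    return None


if __name__ == "__main__":
    stopped = first_time(LOG_LINES, "Stopping rhdb: succeeded")
    restarted = first_time(LOG_LINES, "syslogd 1.4.1: restart")
    interrupted = interrupted_at(LOG_LINES)
    print("interrupted - init script stop:", interrupted - stopped)
    print("interrupted - syslogd restart: ", interrupted - restarted)
```

With the lines above, the recorded interruption time (16:32:08) falls about 48 minutes after the init script reported 'Stopping rhdb: succeeded' and about 18 minutes after syslogd restarted. If the clocks agree, that would suggest the backend was running again after the reboot and was interrupted then, rather than during the 15:43 halt, but this is only an inference from the timestamps.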






