Looks like we got another set of errors overnight. Here's the log file from the errors. (Log file scrubbed slightly to remove private data, but still representative of the problem I believe.)
Nov 13 05:34:34 dev postgres[6084]: [4-1] user=dev,db=dev ERROR: could not access status of transaction 6337381 Nov 13 05:34:34 dev postgres[6084]: [4-2] user=dev,db=dev DETAIL: Could not open file "pg_subtrans/0060": No such file or directory. Nov 13 05:34:34 dev postgres[6084]: [4-3] user=dev,db=dev CONTEXT: SQL statement "SELECT 1 FROM ONLY "typhon"."collection_batches" x WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x" Nov 13 05:34:34 dev postgres[6084]: [4-4] user=dev,db=dev STATEMENT: update listings set deactivated_at=$1 where id=$2 and lock_version=$3 Nov 13 05:34:34 dev postgres[6076]: [4-1] user=dev,db=dev ERROR: could not access status of transaction 6337381 Nov 13 05:34:34 dev postgres[6076]: [4-2] user=dev,db=dev DETAIL: Could not open file "pg_subtrans/0060": No such file or directory. Nov 13 05:34:34 dev postgres[6076]: [4-3] user=dev,db=dev CONTEXT: SQL statement "SELECT 1 FROM ONLY "typhon"."collection_batches" x WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x" Nov 13 05:34:34 dev postgres[6076]: [4-4] user=dev,db=dev STATEMENT: update listings set deactivated_at=$1 where id=$2 and lock_version=$3 Nov 13 05:34:34 dev postgres[6087]: [4-1] user=dev,db=dev ERROR: could not access status of transaction 6337381 Nov 13 05:34:34 dev postgres[6087]: [4-2] user=dev,db=dev DETAIL: Could not open file "pg_subtrans/0060": No such file or directory. Nov 13 05:34:34 dev postgres[6087]: [4-3] user=dev,db=dev CONTEXT: SQL statement "SELECT 1 FROM ONLY "typhon"."collection_batches" x WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x" Nov 13 05:34:34 dev postgres[6087]: [4-4] user=dev,db=dev STATEMENT: update listings set deactivated_at=$1 where id=$2 and lock_version=$3 Nov 13 05:34:34 dev postgres[6086]: [4-1] user=dev,db=dev ERROR: could not access status of transaction 6337381 Nov 13 05:34:34 dev postgres[6086]: [4-2] user=dev,db=dev DETAIL: Could not open file "pg_subtrans/0060": No such file or directory. Nov 13 05:34:34 dev postgres[6086]: [4-3] user=dev,db=dev CONTEXT: SQL statement "SELECT 1 FROM ONLY "typhon"."collection_batches" x WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x" Nov 13 05:34:34 dev postgres[6086]: [4-4] user=dev,db=dev STATEMENT: update listings set deactivated_at=$1 where id=$2 and lock_version=$3 Nov 13 05:34:34 dev postgres[6088]: [4-1] user=dev,db=dev ERROR: could not access status of transaction 6337381 Nov 13 05:34:34 dev postgres[6088]: [4-2] user=dev,db=dev DETAIL: Could not open file "pg_subtrans/0060": No such file or directory. Nov 13 05:34:34 dev postgres[6088]: [4-3] user=dev,db=dev CONTEXT: SQL statement "SELECT 1 FROM ONLY "typhon"."collection_batches" x WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x" Nov 13 05:34:34 dev postgres[6088]: [4-4] user=dev,db=dev STATEMENT: update listings set deactivated_at=$1 where id=$2 and lock_version=$3 Nov 13 05:34:34 dev postgres[6085]: [4-1] user=dev,db=dev ERROR: could not access status of transaction 6337381 Nov 13 05:34:34 dev postgres[6085]: [4-2] user=dev,db=dev DETAIL: Could not open file "pg_subtrans/0060": No such file or directory. Nov 13 05:34:34 dev postgres[6085]: [4-3] user=dev,db=dev CONTEXT: SQL statement "SELECT 1 FROM ONLY "typhon"."collection_batches" x WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x" Nov 13 05:34:34 dev postgres[6085]: [4-4] user=dev,db=dev STATEMENT: update listings set deactivated_at=$1 where id=$2 and lock_version=$3 Several processes all seemed to hit the problem at the same moment, and all of them refer to the same transaction ID. Again, the file pg_subtrans/0060 doesn't exist, and the only file that does exist is pg_subtrans/005A which appears to be a zeroed-out file 245760 bytes in length. Still don't have a clue as to how I can reproduce the problem. It seems that in all cases the error occurred during either an UPDATE to a table_X or an INSERT to table_Y. In all cases, the error occurred in a manner identical to those shown in the log above, the only difference being either an UPDATE on table_X or an INSERT on table_Y. Not sure what direction I should head to now. Perhaps some aggressive logging would help, so we can see the queries surrounding the problems? I could reconfigure things to capture all statements and set up monit or something to send an alert when the problem resurfaces, for instance. Cheers all. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers