In our query logs I saw: postgres=# SELECT log_time, session_id, session_line, left(message,99), left(query,99) FROM postgres_log WHERE error_severity='ERROR' AND message NOT LIKE 'cancel%'; -[ RECORD 1 ]+---------------------------------------------------------------------------------------------------- log_time | 2018-12-31 15:39:11.917-05 session_id | 5c2a7e6f.1fa4 session_line | 1 left | dsa_area could not attach to segment left | SELECT colcld.child c, parent p, array_agg(colpar.attname::text ORDER BY colpar.attnum) cols, array -[ RECORD 2 ]+---------------------------------------------------------------------------------------------------- log_time | 2018-12-31 15:39:11.917-05 session_id | 5c2a7e6f.1fa3 session_line | 4 left | dsa_area could not attach to segment left | SELECT colcld.child c, parent p, array_agg(colpar.attname::text ORDER BY colpar.attnum) cols, array
The full query + plan is: |ts=# explain SELECT colcld.child c, parent p, array_agg(colpar.attname::text ORDER BY colpar.attnum) cols, array_agg(format_type(colpar.atttypid, colpar.atttypmod) ORDER BY colpar.attnum) AS types |FROM queued_alters qa |JOIN pg_attribute colpar ON to_regclass(qa.parent)=colpar.attrelid AND colpar.attnum>0 AND NOT colpar.attisdropped |JOIN (SELECT *, attrelid::regclass::text AS child FROM pg_attribute) colcld ON to_regclass(qa.child) =colcld.attrelid AND colcld.attnum>0 AND NOT colcld.attisdropped |WHERE colcld.attname=colpar.attname AND colpar.atttypid!=colcld.atttypid |GROUP BY 1,2 |ORDER BY |parent LIKE 'unused%', -- Do them last |regexp_replace(colcld.child, '.*_((([0-9]{4}_[0-9]{2})_[0-9]{2})|(([0-9]{6})([0-9]{2})?))$', '\3\5') DESC, | regexp_replace(colcld.child, '.*_', '') DESC |LIMIT 1; |QUERY PLAN |Limit (cost=67128.06..67128.06 rows=1 width=307) | -> Sort (cost=67128.06..67137.84 rows=3912 width=307) | Sort Key: (((qa_1.parent)::text ~~ 'unused%'::text)), (regexp_replace((((pg_attribute.attrelid)::regclass)::text), '.*_((([0-9]{4}_[0-9]{2})_[0-9]{2})|(([0-9]{6})([0-9]{2})?))$'::text, '\3\5'::text)) DESC, (regexp_replace((((pg_attribute.attrelid)::regclass)::text), '.*_'::text, ''::text)) DESC | -> GroupAggregate (cost=66893.34..67108.50 rows=3912 width=307) | Group Key: (((pg_attribute.attrelid)::regclass)::text), qa_1.parent | -> Sort (cost=66893.34..66903.12 rows=3912 width=256) | Sort Key: (((pg_attribute.attrelid)::regclass)::text), qa_1.parent | -> Gather (cost=40582.28..66659.91 rows=3912 width=256) | Workers Planned: 2 | -> Parallel Hash Join (cost=39582.28..65268.71 rows=1630 width=256) | Hash Cond: (((to_regclass((qa_1.child)::text))::oid = pg_attribute.attrelid) AND (colpar.attname = pg_attribute.attname)) | Join Filter: (colpar.atttypid <> pg_attribute.atttypid) | -> Nested Loop (cost=0.43..25614.89 rows=11873 width=366) | -> Parallel Append (cost=0.00..12.00 rows=105 width=292) | -> Parallel Seq Scan on queued_alters_child qa_1 (cost=0.00..11.47 rows=147 width=292) | -> Parallel Seq Scan on queued_alters qa (cost=0.00..0.00 rows=1 width=292) | -> Index Scan using pg_attribute_relid_attnum_index on pg_attribute colpar (cost=0.43..242.70 rows=114 width=78) | Index Cond: ((attrelid = (to_regclass((qa_1.parent)::text))::oid) AND (attnum > 0)) | Filter: (NOT attisdropped) | -> Parallel Hash (cost=33651.38..33651.38 rows=395365 width=72) | -> Parallel Seq Scan on pg_attribute (cost=0.00..33651.38 rows=395365 width=72) | Filter: ((NOT attisdropped) AND (attnum > 0)) | queued_alters is usually empty, and looks like it would've last been nonempty on 2018-12-10. ts=# \d queued_alters Table "public.queued_alters" Column | Type | Collation | Nullable | Default --------+-----------------------+-----------+----------+--------- child | character varying(64) | | | parent | character varying(64) | | | Indexes: "queued_alters_child_parent_key" UNIQUE CONSTRAINT, btree (child, parent) Number of child tables: 1 (Use \d+ to list them.) I found this other log at that time: 2018-12-31 15:39:11.918-05 | 30831 | 5bf38e71.786f | 5 | background worker "parallel worker" (PID 8100) exited with exit code 1 Which is the postmaster, or its PID in any case.. [pryzbyj@telsasoft-db ~]$ ps -wwwf 30831 UID PID PPID C STIME TTY STAT TIME CMD postgres 30831 1 0 Nov19 ? S 62:44 /usr/pgsql-11/bin/postmaster -D /var/lib/pgsql/11/data postgres=# SELECT log_time, pid, session_line, left(message,99) FROM postgres_log WHERE session_id='5bf38e71.786f' ; log_time | pid | session_line | left ----------------------------+-------+--------------+------------------------------------------------------------------------- 2018-12-31 15:39:11.918-05 | 30831 | 5 | background worker "parallel worker" (PID 8100) exited with exit code 1 2018-12-31 15:39:11.935-05 | 30831 | 6 | background worker "parallel worker" (PID 8101) exited with exit code 1 2018-12-31 16:40:47.42-05 | 30831 | 7 | background worker "parallel worker" (PID 7239) exited with exit code 1 2018-12-31 16:40:47.42-05 | 30831 | 8 | background worker "parallel worker" (PID 7240) exited with exit code 1 2018-12-31 16:41:00.151-05 | 30831 | 9 | background worker "parallel worker" (PID 7371) exited with exit code 1 2018-12-31 16:41:00.151-05 | 30831 | 10 | background worker "parallel worker" (PID 7372) exited with exit code 1 2018-12-31 16:41:04.024-05 | 30831 | 11 | background worker "parallel worker" (PID 7451) exited with exit code 1 2018-12-31 16:41:04.024-05 | 30831 | 12 | background worker "parallel worker" (PID 7450) exited with exit code 1 2018-12-31 16:41:23.845-05 | 30831 | 13 | background worker "parallel worker" (PID 7658) exited with exit code 1 2018-12-31 16:41:23.845-05 | 30831 | 14 | background worker "parallel worker" (PID 7659) exited with exit code 1 2018-12-31 16:43:58.854-05 | 30831 | 15 | background worker "parallel worker" (PID 10825) exited with exit code 1 2018-12-31 16:43:58.854-05 | 30831 | 16 | background worker "parallel worker" (PID 10824) exited with exit code 1 I seem to be missing logs for session_lines 1-4 , which would've been rotated to oblivion if older than 24h. I found these: https://www.postgresql.org/message-id/flat/CAPa4P2YzgRTBHHRx2KAPUO1_xkmqQgT2xP0tS_e%3DphWvNzWdkA%40mail.gmail.com#4a1a94bc71d4c99c01babc759fe5b6ec https://www.postgresql.org/message-id/CAEepm=0mv9bigjppribgqhnhqvgyo2+kmzekguvjjc9y_zv...@mail.gmail.com Which appears to have been commited so I think this error is not unexpected. |commit fd7c0fa732d97a4b4ebb58730e6244ea30d0a618 |Author: Robert Haas <rh...@postgresql.org> |Date: Mon Dec 18 12:17:37 2017 -0500 | |Fix crashes on plans with multiple Gather (Merge) nodes. I will try to reproduce and provide bt...but all I can think to do is run a tight loop around that query and hope something else comes by and tickles it a few more times. Happy Grecian newyear. Justin