Justin Pryzby <pry...@telsasoft.com> writes:
> This commit seems to trigger elog(), not reproducible in the
> parent commit.
> 6e086fa2e77 Allow parallel workers to cope with a newly-created session user
> ID.

> postgres=# SET min_parallel_table_scan_size=0; CLUSTER pg_attribute USING
> pg_attribute_relid_attnum_index;
> ERROR: pg_attribute catalog is missing 26 attribute(s) for relation OID 70321

I've been poking at this all day, and I still have little idea what's
going on.  I've added a bunch of throwaway instrumentation, and have
managed to convince myself that the problem is that parallel heap
scan is broken.  The scans done to rebuild pg_attribute's indexes
seem to sometimes miss heap pages or visit pages twice (in different
workers).  I have no idea why this is, and even less idea how
6e086fa2e is provoking it.  As you say, the behavior isn't entirely
reproducible, but I couldn't make it happen at all after reverting
6e086fa2e's changes in transam/parallel.c, so apparently there is
some connection.

Another possibly useful data point is that for me it reproduces
fairly well (more than one time in two) on x86_64 Linux, but
I could not make it happen on macOS ARM64.  If it's a race
condition, which smells plausible, that's perhaps not hugely
surprising.

			regards, tom lane