On Tue, Jun 08, 2021 at 12:34:04PM -0500, Justin Pryzby wrote: > On Tue, Jun 08, 2021 at 12:06:02PM -0400, Alvaro Herrera wrote: > > On 2021-Jun-06, Justin Pryzby wrote: > > > > > However, I also found an autovacuum chewing 100% CPU, and it appears the > > > problem is actually because autovacuum has locked a page of pg-statistic, > > > and > > > every other process then gets stuck waiting in the planner. I checked a > > > few > > > and found these: > > > > Hmm ... I wonder if this could be related to commits d9d076222f5b, > > c98763bf51bf, etc. I don't have any connecting thoughts other than the > > tuple visibility code being involved. Do you see any procs with the > > PROC_IN_SAFE_IC flag set? > > I do have a reindex script which runs CIC, and it does looks suspicious. > > <<Mon Jun 7 22:00:54 CDT 2021: starting db=ts > ... > Mon Jun 7 22:02:57 CDT 2021: ts: pg_statio_all_tables_snap: > pg_statio_all_tables_snap_t_idx(reindex non-partitioned)... > ERROR: canceling statement due to statement timeout > reindex: warning, dropping invalid/unswapped index: > pg_statio_all_tables_snap_t_idx_ccnew > Mon Jun 7 23:02:57 CDT 2021: ts: pg_statio_all_tables_snap: > pg_toast.pg_toast_33011_index(reindex system)... > Mon Jun 7 23:02:57 CDT 2021: ts: pg_statio_all_indexes_snap: > pg_statio_all_indexes_snap_t_idx(reindex non-partitioned)... > ERROR: canceling statement due to statement timeout > reindex: warning, dropping invalid/unswapped index: > pg_statio_all_indexes_snap_t_idx_ccnew > Tue Jun 8 00:02:57 CDT 2021: ts: pg_statio_all_indexes_snap: > pg_toast.pg_toast_33020_index(reindex system)... > Tue Jun 8 01:21:20 CDT 2021: ts: permissions: perm_group_idx(reindex > non-partitioned)... > ERROR: canceling statement due to statement timeout > reindex: warning, dropping invalid/unswapped index: perm_group_idx_ccnew > Tue Jun 8 02:21:20 CDT 2021: ts: permissions: perm_user_idx(reindex > non-partitioned)... > ERROR: canceling statement due to statement timeout > reindex: warning, dropping invalid/unswapped index: perm_user_idx_ccnew > Tue Jun 8 03:21:20 CDT 2021: ts: permissions: > pg_toast.pg_toast_33577_index(reindex system)... > Tue Jun 8 03:21:20 CDT 2021: ts: patchfiles: patchfiles_filename_idx(reindex > non-partitioned)... > ERROR: canceling statement due to statement timeout > reindex: warning, dropping invalid/unswapped index: > patchfiles_filename_idx_ccnew > Tue Jun 8 04:21:21 CDT 2021: ts: patchfiles: patchfiles_pkey(reindex > non-partitioned)... > ERROR: canceling statement due to statement timeout > reindex: warning, dropping invalid/unswapped index: patchfiles_pkey_ccnew > > => It's strange that these timed out, since the statio tables are less than > 15MB, and permissions and patchfiles are both under 100kB. > > And it seems to say that it time out after less than 1sec.
Oops, no: it timed out after 3600sec, as requested. > They're running this: > | PGOPTIONS="--deadlock_timeout=333ms -cstatement-timeout=3600s" psql -c > "REINDEX INDEX CONCURRENTLY $i" > And if it times out, it then runs: $PSQL "DROP INDEX CONCURRENTLY $bad" ... > $ date -d @1623121264 > Mon Jun 7 22:01:04 CDT 2021 Which is probably because the reindex was waiting for the vacuum process to finish (or maybe waiting on the page that vacuum had locked?). -- Justin