> On Jun 9, 2017, at 11:23 AM, Adrian Klaver <adrian.kla...@aklaver.com> wrote:
>
> On 06/09/2017 09:13 AM, armand pirvu wrote:
>>> On Jun 9, 2017, at 11:01 AM, Adrian Klaver <adrian.kla...@aklaver.com> wrote:
>>>
>>> On 06/09/2017 08:45 AM, armand pirvu wrote:
>>>> Hi
>>>> Had a couple of processes blocking the vacuum so I terminated them using
>>>> select pg_terminate_backend(pid);
>>>> Running the following
>>>> select distinct pid, backend_start, query_start, state_change, state, query from pg_catalog.pg_stat_activity order by 1;
>>>>
>>>>   pid  |         backend_start         |          query_start          |         state_change          | state  | query
>>>> -------+-------------------------------+-------------------------------+-------------------------------+--------+--------------------------------------------------------------
>>>>  10677 | 2017-06-09 10:25:49.189848-05 | 2017-06-09 10:33:43.598805-05 | 2017-06-09 10:33:43.599091-05 | idle   | SELECT 1
>>>>  11096 | 2017-06-09 10:27:03.686588-05 | 2017-06-09 10:33:56.28736-05  | 2017-06-09 10:33:56.287364-05 | active | select distinct pid, backend_start, query_start, state_change, state, query from pg_catalog.pg_stat_activity order by 1;
>>>>  13277 | 2017-06-09 07:48:49.506686-05 | 2017-06-09 07:48:52.887185-05 | 2017-06-09 07:48:52.887188-05 | active | autovacuum: VACUUM csischema.tf_purchased_badge
>>>>  13484 | 2017-06-09 10:31:54.127672-05 | 2017-06-09 10:33:47.137938-05 | 2017-06-09 10:33:47.138226-05 | idle   | SELECT 1
>>>>  16886 | 2017-06-09 07:56:49.033893-05 | 2017-06-09 07:56:49.078369-05 | 2017-06-09 07:56:49.078371-05 | active | autovacuum: VACUUM csischema.tf_purchases_person
>>>>  25387 | 2017-06-09 05:32:08.079397-05 | 2017-06-09 05:32:08.385728-05 | 2017-06-09 05:32:08.385731-05 | active | autovacuum: VACUUM csischema.tf_demographic_response_person
>>>>  37465 | 2017-06-09 08:50:58.992002-05 | 2017-06-09 08:51:21.506829-05 | 2017-06-09 08:51:21.506831-05 | active | autovacuum: VACUUM csischema.tf_transaction_item_person
>>>>
>>>> I did notice that state_change did not change one bit
>>>
>>> Did the state change?
>>>
>> No and that was what got me worried
>
> Are these large tables?
>
I would say yes:

select count(*) from csischema.tf_purchased_badge;
  9380749
select count(*) from csischema.tf_purchases_person;
 19902172
select count(*) from csischema.tf_demographic_response_person;
 80868561
select count(*) from csischema.tf_transaction_item_person;
  3281084

Interestingly enough, two of them completed:

          relname           | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_live_tup | n_dead_tup | n_mod_since_analyze | last_vacuum |        last_autovacuum        | autovacuum_count
----------------------------+----------+--------------+----------+---------------+-----------+-----------+------------+------------+---------------------+-------------+-------------------------------+------------------
 tf_transaction_item_person |      160 |            0 |   476810 |       1946119 |      2526 |    473678 |    3226110 |          0 |              116097 |             | 2017-06-09 11:15:24.701997-05 |                2
 tf_purchased_badge         |      358 |   1551142438 |  2108331 |       7020502 |      5498 |   1243746 |    9747336 |     107560 |              115888 |             | 2017-06-09 15:09:16.624363-05 |                1

I did notice, though, that checkpoints seem to come a bit too often, i.e. less than 5 minutes apart (see the sketch at the end of this message):

2017-06-09 14:18:38.552 CDT,,,888,,593a1810.378,271,,2017-06-08 22:37:52 CDT,,0,LOG,00000,"checkpoint starting: time",,,,,,,,,""
2017-06-09 14:21:12.210 CDT,,,888,,593a1810.378,272,,2017-06-08 22:37:52 CDT,,0,LOG,00000,"checkpoint complete: wrote 12070 buffers (9.2%); 0 transaction log file(s) added, 0 removed, 4 recycled; write=148.714 s, sync=4.834 s, total=153.657 s; sync files=17, longest=1.841 s, average=0.284 s; distance=89452 kB, estimate=89452 kB",,,,,,,,,""
2017-06-09 14:23:38.278 CDT,,,888,,593a1810.378,273,,2017-06-08 22:37:52 CDT,,0,LOG,00000,"checkpoint starting: time",,,,,,,,,""
2017-06-09 14:24:38.629 CDT,,,888,,593a1810.378,274,,2017-06-08 22:37:52 CDT,,0,LOG,00000,"checkpoint complete: wrote 593 buffers (0.5%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=59.825 s, sync=0.474 s, total=60.350 s; sync files=8, longest=0.355 s, average=0.059 s; distance=26952 kB, estimate=83202 kB",,,,,,,,,""

And also:

SELECT total_checkpoints,
       seconds_since_start / total_checkpoints / 60 AS minutes_between_checkpoints
FROM (SELECT EXTRACT(EPOCH FROM (now() - pg_postmaster_start_time())) AS seconds_since_start,
             (checkpoints_timed + checkpoints_req) AS total_checkpoints
        FROM pg_stat_bgwriter
     ) AS sub;

 total_checkpoints | minutes_between_checkpoints
-------------------+-----------------------------
               285 |            3.33731205871345

These tables do see quite a lot of data changes, but IIRC the changes come in via temp tables that live in a temp schema. Some earlier messages in the log suggest it might have run into ladder locking in the early stages, i.e. a temp table locked by vacuum, further processing waiting on it, and that in turn causing waits on those largish tables.

Considering the temp tables are only used for loading (though some processing does go on in them), I am thinking of disabling autovacuum for just the temp tables (see the sketch below). Or should I disable autovacuum altogether and run it as a batch job on, say, a weekend night?

> If you are on Postgres 9.6:
>
> https://www.postgresql.org/docs/9.6/static/progress-reporting.html
>

Aside from the vacuum improvements and such, is there any other strong, compelling reason to upgrade to 9.6?
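For the per-table route, something along these lines is what I had in mind; a sketch only, with tmp_load_purchases and the staging schema being made-up stand-ins for the real staging table names:

-- Turn autovacuum off for one staging table only (heap and its TOAST table);
-- the rest of the database keeps its normal autovacuum behaviour.
ALTER TABLE staging.tmp_load_purchases
    SET (autovacuum_enabled = false, toast.autovacuum_enabled = false);

-- Confirm the setting took:
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'tmp_load_purchases';

-- If the loads leave a lot of dead rows behind, run a manual vacuum
-- at the end of the load job instead of relying on autovacuum:
VACUUM (ANALYZE) staging.tmp_load_purchases;

That way autovacuum stays on globally, so the big tables like tf_purchased_badge still get cleaned up on their own.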
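And for the checkpoint spacing, would stretching the intervals out like this be a reasonable step? A sketch, assuming ALTER SYSTEM is available (9.4+) and max_wal_size (9.5+; older releases would raise checkpoint_segments in postgresql.conf instead), with the values to be sized for the actual workload:

-- Space timed checkpoints further apart:
ALTER SYSTEM SET checkpoint_timeout = '15min';

-- Allow more WAL between requested checkpoints (9.5+ only):
ALTER SYSTEM SET max_wal_size = '4GB';

-- Spread the checkpoint writes over more of the interval:
ALTER SYSTEM SET checkpoint_completion_target = 0.9;

-- All three settings take effect on reload, no restart needed:
SELECT pg_reload_conf();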
>>>> Does that mean that something is not quite right with the vacuums ?
>>>
>>> Might want to take a look at:
>>>
>>> https://www.postgresql.org/docs/9.6/static/monitoring-stats.html#PG-STAT-ALL-TABLES-VIEW
>>>
>>>> Thank you
>>>> Armand
>>>
>>> --
>>> Adrian Klaver
>>> adrian.kla...@aklaver.com
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com