> On Jun 9, 2017, at 11:23 AM, Adrian Klaver <[email protected]> wrote:
>
> On 06/09/2017 09:13 AM, armand pirvu wrote:
>>> On Jun 9, 2017, at 11:01 AM, Adrian Klaver <[email protected]
>>> <mailto:[email protected]>> wrote:
>>>
>>> On 06/09/2017 08:45 AM, armand pirvu wrote:
>>>> Hi
>>>> Had a couple of processes blocking the vacuum so I terminated them using
>>>> select pg_terminate_backend(pid);
>>>> Running the following
>>>> select distinct pid, backend_start, query_start, state_change, state,
>>>> query from pg_catalog.pg_stat_activity order by 1;
>>>>   pid  |         backend_start         |          query_start          |         state_change          | state  | query
>>>> -------+-------------------------------+-------------------------------+-------------------------------+--------+-------------------------------------------------------------
>>>>  10677 | 2017-06-09 10:25:49.189848-05 | 2017-06-09 10:33:43.598805-05 | 2017-06-09 10:33:43.599091-05 | idle   | SELECT 1
>>>>  11096 | 2017-06-09 10:27:03.686588-05 | 2017-06-09 10:33:56.28736-05  | 2017-06-09 10:33:56.287364-05 | active | select distinct pid, backend_start, query_start, state_change, state, query from pg_catalog.pg_stat_activity order by 1;
>>>>  13277 | 2017-06-09 07:48:49.506686-05 | 2017-06-09 07:48:52.887185-05 | 2017-06-09 07:48:52.887188-05 | active | autovacuum: VACUUM csischema.tf_purchased_badge
>>>>  13484 | 2017-06-09 10:31:54.127672-05 | 2017-06-09 10:33:47.137938-05 | 2017-06-09 10:33:47.138226-05 | idle   | SELECT 1
>>>>  16886 | 2017-06-09 07:56:49.033893-05 | 2017-06-09 07:56:49.078369-05 | 2017-06-09 07:56:49.078371-05 | active | autovacuum: VACUUM csischema.tf_purchases_person
>>>>  25387 | 2017-06-09 05:32:08.079397-05 | 2017-06-09 05:32:08.385728-05 | 2017-06-09 05:32:08.385731-05 | active | autovacuum: VACUUM csischema.tf_demographic_response_person
>>>>  37465 | 2017-06-09 08:50:58.992002-05 | 2017-06-09 08:51:21.506829-05 | 2017-06-09 08:51:21.506831-05 | active | autovacuum: VACUUM csischema.tf_transaction_item_person
>>>> I did notice that state_change did not change one bit
>>>
>>> Did the state change?
>>>
>> No, and that was what got me worried.
>
> Are these large tables?
>
>
I would say yes:
select count(*) from csischema.tf_purchased_badge;
9380749
select count(*) from csischema.tf_purchases_person;
19902172
select count(*) from csischema.tf_demographic_response_person;
80868561
select count(*) from csischema.tf_transaction_item_person;
3281084
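As a side note, counting this way scans the whole table at that size; assuming an
estimate is good enough, the stats collector's figure comes back instantly:

select relname, n_live_tup
from pg_stat_user_tables
where relname like 'tf_%';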
Interestingly enough, two have completed:
          relname           | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_live_tup | n_dead_tup | n_mod_since_analyze | last_vacuum |        last_autovacuum        | autovacuum_count
----------------------------+----------+--------------+----------+---------------+-----------+-----------+------------+------------+---------------------+-------------+-------------------------------+------------------
 tf_transaction_item_person |      160 |            0 |   476810 |       1946119 |      2526 |    473678 |    3226110 |          0 |              116097 |             | 2017-06-09 11:15:24.701997-05 |                2
 tf_purchased_badge         |      358 |   1551142438 |  2108331 |       7020502 |      5498 |   1243746 |    9747336 |     107560 |              115888 |             | 2017-06-09 15:09:16.624363-05 |                1
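For reference, output like the above can be pulled with something along these lines:

select relname, seq_scan, seq_tup_read, idx_scan, idx_tup_fetch,
       n_tup_ins, n_tup_upd, n_live_tup, n_dead_tup,
       n_mod_since_analyze, last_vacuum, last_autovacuum, autovacuum_count
from pg_catalog.pg_stat_all_tables
where relname in ('tf_transaction_item_person', 'tf_purchased_badge');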
I did notice, though, that checkpoints seem to come a bit too often, i.e. less
than 5 minutes apart from start to end:
2017-06-09 14:18:38.552 CDT,,,888,,593a1810.378,271,,2017-06-08 22:37:52
CDT,,0,LOG,00000,"checkpoint starting: time",,,,,,,,,""
2017-06-09 14:21:12.210 CDT,,,888,,593a1810.378,272,,2017-06-08 22:37:52
CDT,,0,LOG,00000,"checkpoint complete: wrote 12070 buffers (9.2%); 0
transaction log file(s) added, 0 removed, 4 recycled; write=148.714 s,
sync=4.834 s, total=153.657 s; sync files=17, longest=1.841 s, average=0.284 s;
distance=89452 kB, estimate=89452 kB",,,,,,,,,""
2017-06-09 14:23:38.278 CDT,,,888,,593a1810.378,273,,2017-06-08 22:37:52
CDT,,0,LOG,00000,"checkpoint starting: time",,,,,,,,,""
2017-06-09 14:24:38.629 CDT,,,888,,593a1810.378,274,,2017-06-08 22:37:52
CDT,,0,LOG,00000,"checkpoint complete: wrote 593 buffers (0.5%); 0 transaction
log file(s) added, 0 removed, 1 recycled; write=59.825 s, sync=0.474 s,
total=60.350 s; sync files=8, longest=0.355 s, average=0.059 s; distance=26952
kB, estimate=83202 kB",,,,,,,,,""
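The settings that drive checkpoint frequency can be checked with the query below
(max_wal_size assumes 9.5 or later; older releases use checkpoint_segments instead):

select name, setting, unit
from pg_settings
where name in ('checkpoint_timeout', 'max_wal_size',
               'checkpoint_completion_target');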
And also:
SELECT
total_checkpoints,
seconds_since_start / total_checkpoints / 60 AS minutes_between_checkpoints
FROM
(SELECT
EXTRACT(EPOCH FROM (now() - pg_postmaster_start_time())) AS seconds_since_start,
(checkpoints_timed+checkpoints_req) AS total_checkpoints
FROM pg_stat_bgwriter
) AS sub;
total_checkpoints | minutes_between_checkpoints
-------------------+-----------------------------
285 | 3.33731205871345
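If checkpoints should be spaced closer to checkpoint_timeout, one option would be
to give the WAL more headroom so time-based checkpoints dominate. A sketch,
assuming 9.5+ for max_wal_size and 9.4+ for ALTER SYSTEM, with the size picked
to suit the workload:

alter system set max_wal_size = '4GB';  -- example value, not a recommendation
select pg_reload_conf();                -- max_wal_size takes effect on reload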
These tables see quite a lot of data changes IIRC, but those come in via some temp
tables that reside in a temp schema, and some earlier messages in the log suggest
it might have run into ladder locking in the early stages, i.e. a temp table
locked by vacuum, further processing waiting on it, and that in turn causing
other waits on those largish tables.
Considering the temp ones are only used for loading (and yes, some processing
goes on in there), I am thinking of disabling autovacuum for the temp tables.
Or should I disable autovacuum altogether and run it as, say, a batch job on a
weekend night?
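If those "temp" tables are regular tables living in a staging schema (true TEMP
tables are never visited by autovacuum in the first place), autovacuum can be
switched off per table instead of globally; for example, with a hypothetical
table name:

alter table temp.load_stage set (autovacuum_enabled = false);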
> If you are on Postgres 9.6:
>
> https://www.postgresql.org/docs/9.6/static/progress-reporting.html
>
Aside from the vacuum improvements and such, is there any other compelling
reason to upgrade to 9.6?
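On 9.6 the page Adrian linked maps to the pg_stat_progress_vacuum view, which
would answer the "is it stuck?" question directly, e.g.:

select pid, relid::regclass as table_name, phase,
       heap_blks_scanned, heap_blks_total
from pg_stat_progress_vacuum;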
>>>> Does that mean that something is not quite right with the vacuums ?
>>>
>>> Might want to take a look at:
>>>
>>> https://www.postgresql.org/docs/9.6/static/monitoring-stats.html#PG-STAT-ALL-TABLES-VIEW
>>>
>>>
>>>> Thank you
>>>> Armand
>>>
>>>
>>> --
>>> Adrian Klaver
>>> [email protected] <mailto:[email protected]>
>
>
> --
> Adrian Klaver
> [email protected]