There is a slight difference in the two instances' data: instance 1 contains the latest data, while instance 2's data is about 3 weeks older.
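(For anyone reproducing the comparison, the per-table counts below can be gathered with something like the following on each instance. This is only a minimal sketch; exact count(*) can take a while on tables of this size.)

-- run on each instance
SELECT count(*) FROM alf_node;
SELECT count(*) FROM alf_node_properties;
SELECT count(*) FROM alf_node_aspects;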
Following are the number of rows in each table in both instances:

Instance 1:
  alf_node            :  96493129 rows
  alf_node_properties : 455599288 rows
  alf_node_aspects    : 150153006 rows

Instance 2:
  alf_node            :  90831396 rows
  alf_node_properties : 440792247 rows
  alf_node_aspects    : 146648241 rows

I hope the above data difference can account for such a drastic difference in performance; please correct me if I am wrong.

I checked "pg_stat_user_tables" and autovacuum has never run against those tables.
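(For reference, a query along the following lines shows the last (auto)vacuum and analyze times, plus live/dead tuple estimates, for those tables. This is a minimal sketch against the standard pg_stat_user_tables view, not necessarily the exact query used.)

SELECT relname,
       last_vacuum, last_autovacuum,
       last_analyze, last_autoanalyze,
       n_live_tup, n_dead_tup
FROM   pg_stat_user_tables
WHERE  relname IN ('alf_node', 'alf_node_properties', 'alf_node_aspects');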
I executed VACUUM manually and noticed the following in the output:

INFO:  vacuuming "public.alf_node_aspects"
INFO:  "alf_node_aspects": found 0 removable, 150264654 nonremovable row versions in 812242 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU 13.53s/33.35u sec elapsed 77.88 sec.
INFO:  analyzing "public.alf_node_aspects"
INFO:  "alf_node_aspects": scanned 30000 of 812242 pages, containing 5550000 live rows and 0 dead rows; 30000 rows in sample, 150264770 estimated total rows

Did the vacuum work correctly?

I ran EXPLAIN (ANALYZE, BUFFERS); below are the results:

Instance 1:

Nested Loop Semi Join  (cost=441133.07..1352496.76 rows=1 width=8) (actual time=9054.324..10029.748 rows=22 loops=1)
  Buffers: shared hit=1617812
  ->  Merge Semi Join  (cost=441132.50..1352359.19 rows=41 width=16) (actual time=9054.296..10029.547 rows=22 loops=1)
        Merge Cond: (node.id = prop.node_id)
        Buffers: shared hit=1617701
        ->  Index Only Scan using idx_alf_node_tqn on alf_node node  (cost=441041.93..1340831.48 rows=4593371 width=8) (actual time=4.418..7594.637 rows=40998148 loops=1)
              Index Cond: ((type_qname_id = 240) AND (store_id = 6))
              Filter: (NOT (hashed SubPlan 1))
              Rows Removed by Filter: 130
              Heap Fetches: 0
              Buffers: shared hit=1617681
              SubPlan 1
                ->  Bitmap Heap Scan on alf_node_aspects aspect_1  (cost=3418.63..440585.22 rows=182459 width=8) (actual time=0.583..2.992 rows=4697 loops=1)
                      Recheck Cond: (qname_id = 251)
                      Heap Blocks: exact=1774
                      Buffers: shared hit=1791
                      ->  Bitmap Index Scan on fk_alf_nasp_qn  (cost=0.00..3373.01 rows=182459 width=0) (actual time=0.384..0.384 rows=4697 loops=1)
                            Index Cond: (qname_id = 251)
                            Buffers: shared hit=17
        ->  Index Only Scan using idx_alf_nprop_s on alf_node_properties prop  (cost=0.70..41.74 rows=852 width=8) (actual time=0.022..0.037 rows=23 loops=1)
              Index Cond: ((qname_id = '242'::bigint) AND (string_value = 'E292432'::text))
              Heap Fetches: 0
              Buffers: shared hit=20
  ->  Index Only Scan using alf_node_aspects_pkey on alf_node_aspects aspect  (cost=0.57..38.80 rows=14 width=8) (actual time=0.007..0.007 rows=1 loops=22)
        Index Cond: ((node_id = node.id) AND (qname_id = 245))
        Heap Fetches: 22
        Buffers: shared hit=111
Planning time: 0.621 ms
Execution time: 10029.799 ms
(29 rows)

Instance 2:

Nested Loop Semi Join  (cost=6471.94..173560891.23 rows=2 width=8) (actual time=0.162..0.470 rows=17 loops=1)
  Buffers: shared hit=257
  ->  Nested Loop  (cost=6471.37..173560734.51 rows=45 width=16) (actual time=0.154..0.392 rows=17 loops=1)
        Buffers: shared hit=172
        ->  HashAggregate  (cost=3508.15..3516.80 rows=865 width=8) (actual time=0.039..0.043 rows=18 loops=1)
              Group Key: prop.node_id
              Buffers: shared hit=23
              ->  Index Only Scan using idx_alf_nprop_s on alf_node_properties prop  (cost=0.70..3505.99 rows=866 width=8) (actual time=0.019..0.033 rows=18 loops=1)
                    Index Cond: ((qname_id = '242'::bigint) AND (string_value = 'E292432'::text))
                    Heap Fetches: 18
                    Buffers: shared hit=23
        ->  Index Scan using alf_node_pkey on alf_node node  (cost=2963.22..200644.17 rows=1 width=8) (actual time=0.019..0.019 rows=1 loops=18)
              Index Cond: (id = prop.node_id)
              Filter: ((type_qname_id <> 145) AND (store_id = 6) AND (type_qname_id = 240) AND (NOT (SubPlan 1)))
              Rows Removed by Filter: 0
              Buffers: shared hit=149
              SubPlan 1
                ->  Materialize  (cost=2962.65..397913.00 rows=158204 width=8) (actual time=0.002..0.009 rows=85 loops=17)
                      Buffers: shared hit=59
                      ->  Bitmap Heap Scan on alf_node_aspects aspect_1  (cost=2962.65..396503.98 rows=158204 width=8) (actual time=0.025..0.085 rows=85 loops=1)
                            Recheck Cond: (qname_id = 251)
                            Heap Blocks: exact=55
                            Buffers: shared hit=59
                            ->  Bitmap Index Scan on fk_alf_nasp_qn  (cost=0.00..2923.10 rows=158204 width=0) (actual time=0.019..0.019 rows=87 loops=1)
                                  Index Cond: (qname_id = 251)
                                  Buffers: shared hit=4
  ->  Index Only Scan using alf_node_aspects_pkey on alf_node_aspects aspect  (cost=0.57..34.32 rows=12 width=8) (actual time=0.004..0.004 rows=1 loops=17)
        Index Cond: ((node_id = node.id) AND (qname_id = 245))
        Heap Fetches: 17
        Buffers: shared hit=85
Planning time: 0.642 ms
Execution time: 0.546 ms
(32 rows)

The VACUUM and REINDEX on those tables reduced the time from 30s to 10s, but I still don't get the same performance as instance 2.

work_mem was previously set to 4MB on both instances; on instance 1 it is now set to 160MB, while instance 2 is still using 4MB. shared_buffers is set to 8GB on both. Both instances are on PostgreSQL 9.6.9. (See the query sketch at the end of this message for comparing the relevant settings on both instances.)

Thanks in advance,
Fahiz

On 10 Dec 2019, 20:15 +0000, Michael Lewis <mle...@entrata.com>, wrote:
> On Mon, Dec 9, 2019 at 3:39 PM Fahiz Mohamed <fa...@netwidz.com> wrote:
> > > I ran "explain analyse" on this query and I got following result. (We
> > > have 2 identical DB instances and they consist of same data. Instane 1
> > > took 20+ second to process and instance 2 took less than a second)
> > >
> > > Instance 1: (This is used by regular User - More than 600,000 request a
> > > day) - The result is same even when there is no user in the server.
> > > -> Bitmap Heap Scan on alf_node node (cost=995009.97..3303978.85
> > > rows=4565737 width=8) (actual time=3304.419..20465.551 rows=41109751
> > > loops=1)
> > > Recheck Cond: ((store_id = 6) AND (type_qname_id = 240))
> > > Rows Removed by Index Recheck: 54239131
> > > Filter: (NOT (hashed SubPlan 1))
> > > Rows Removed by Filter: 2816
> > > Heap Blocks: exact=24301 lossy=1875383
> > >
> > > Planning time: 0.639 ms
> > > Execution time: 22946.036 ms
> >
> > https://paquier.xyz/postgresql-2/postgres-9-4-feature-highlight-lossyexact-pages-for-bitmap-heap-scan/
> >
> > That seems like a lot of lossy blocks. As I understand it, that means the
> > system didn't have enough work_mem to fit all the references to the
> > individual rows which perhaps isn't surprising when it estimates it needs
> > 4.5 million rows and ends up with 41 million.
> >
> > Do both DB instances have the same data? I ask because the two plans are
> > rather different which makes me think that statistics about the data are
> > not very similar. Are both configured the same, particularly for
> > shared_buffers and work_mem, as well as the various planning cost
> > parameters like random_page cost? If you can provide these plans again with
> > explain( analyze, buffers ) this time? Did you check on the last time
> > autovacuum ran in pg_stat_user_tables?
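(Query sketch referred to above: one way to compare the configuration Michael asked about on both instances. The parameter list here is only a suggestion; run it on each instance and diff the output.)

SELECT name, setting, unit, source
FROM   pg_settings
WHERE  name IN ('work_mem', 'shared_buffers', 'effective_cache_size',
                'random_page_cost', 'seq_page_cost', 'cpu_tuple_cost',
                'default_statistics_target');

-- to try a larger work_mem for a single session before changing it globally:
SET work_mem = '160MB';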