Thank you very much for your prompt responses.

I have analysed this further and found the long-running query.

I ran "explain analyse" on this query and I got following result. (We have 2 
identical DB instances and they consist of same data. Instane 1 took 20+ second 
to process and instance 2 took less than a second)

Instance 1: (used by regular users, more than 600,000 requests a day) - 
The result is the same even when there are no users on the server.
EXPLAIN ANALYZE
Nested Loop Semi Join (cost=998547.53..3319573.36 rows=1 width=8) (actual 
time=10568.217..22945.971 rows=22 loops=1)
 -> Hash Semi Join (cost=998546.96..3319545.95 rows=41 width=16) (actual 
time=10568.198..22945.663 rows=22 loops=1)
    Hash Cond: (node.id = prop.node_id)
    -> Bitmap Heap Scan on alf_node node (cost=995009.97..3303978.85 
rows=4565737 width=8) (actual time=3304.419..20465.551 rows=41109751 loops=1)
       Recheck Cond: ((store_id = 6) AND (type_qname_id = 240))
       Rows Removed by Index Recheck: 54239131
       Filter: (NOT (hashed SubPlan 1))
       Rows Removed by Filter: 2816
       Heap Blocks: exact=24301 lossy=1875383
       -> Bitmap Index Scan on idx_alf_node_mdq (cost=0.00..646144.01 
rows=20047144 width=0) (actual time=3232.067..3232.067 rows=44246360 loops=1)
          Index Cond: ((store_id = 6) AND (type_qname_id = 240))
       SubPlan 1
        -> Bitmap Heap Scan on alf_node_aspects aspect_1 
(cost=2503.51..347403.58 rows=128379 width=8) (actual time=25.447..65.392 
rows=5635 loops=1)
           Recheck Cond: (qname_id = 251)
           Heap Blocks: exact=40765
           -> Bitmap Index Scan on fk_alf_nasp_qn (cost=0.00..2471.41 
rows=128379 width=0) (actual time=18.835..18.835 rows=239610 loops=1)
              Index Cond: (qname_id = 251)
    -> Hash (cost=3526.11..3526.11 rows=871 width=8) (actual time=0.045..0.045 
rows=23 loops=1)
       Buckets: 1024 Batches: 1 Memory Usage: 9kB
       -> Index Only Scan using idx_alf_nprop_s on alf_node_properties prop 
(cost=0.70..3526.11 rows=871 width=8) (actual time=0.021..0.042 rows=23 loops=1)
          Index Cond: ((qname_id = '242'::bigint) AND (string_value = 
'E292432'::text))
          Heap Fetches: 23
 -> Index Only Scan using alf_node_aspects_pkey on alf_node_aspects aspect 
(cost=0.57..2.01 rows=15 width=8) (actual time=0.011..0.011 rows=1 loops=22)
    Index Cond: ((node_id = node.id) AND (qname_id = 245))
    Heap Fetches: 22
Planning time: 0.639 ms
Execution time: 22946.036 ms
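
One thing I noticed in this plan is that the bitmap heap scan on alf_node goes 
lossy (Heap Blocks: exact=24301 lossy=1875383), which I understand means the 
bitmap did not fit in work_mem, and the planner's row estimate (4,565,737) is 
far below the actual row count (41,109,751). As a quick re-test I plan to raise 
work_mem for a single session and re-run the query, roughly like this (the 
256MB value is only a test assumption, and <search query> stands for the query 
above):

SET work_mem = '256MB';                    -- session-level only, value is just for the test
EXPLAIN (ANALYZE, BUFFERS) <search query>; -- BUFFERS shows how much of the work hits disk vs cache
RESET work_mem;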

Instance 2: (used only by testers, 250 requests a day)

Nested Loop Semi Join  (cost=6471.94..173560841.08 rows=2 width=8) (actual 
time=0.162..0.464 rows=17 loops=1)
  ->  Nested Loop  (cost=6471.37..173560684.36 rows=45 width=16) (actual 
time=0.154..0.387 rows=17 loops=1)
        ->  HashAggregate  (cost=3508.15..3516.80 rows=865 width=8) (actual 
time=0.041..0.047 rows=18 loops=1)
              Group Key: prop.node_id
              ->  Index Only Scan using idx_alf_nprop_s on alf_node_properties prop  (cost=0.70..3505.99 rows=866 width=8) (actual time=0.020..0.035 rows=18 loops=1)
                    Index Cond: ((qname_id = '242'::bigint) AND (string_value = 
'E292432'::text))
                    Heap Fetches: 18
        ->  Index Scan using alf_node_pkey on alf_node node  
(cost=2963.22..200644.11 rows=1 width=8) (actual time=0.019..0.019 rows=1 
loops=18)
              Index Cond: (id = prop.node_id)
              Filter: ((type_qname_id <> 145) AND (store_id = 6) AND 
(type_qname_id = 240) AND (NOT (SubPlan 1)))
              Rows Removed by Filter: 0
              SubPlan 1
                ->  Materialize  (cost=2962.65..397912.89 rows=158204 width=8) 
(actual time=0.001..0.009 rows=85 loops=17)
                      ->  Bitmap Heap Scan on alf_node_aspects aspect_1  (cost=2962.65..396503.87 rows=158204 width=8) (actual time=0.021..0.082 rows=85 loops=1)
                            Recheck Cond: (qname_id = 251)
                            Heap Blocks: exact=55
                            ->  Bitmap Index Scan on fk_alf_nasp_qn  (cost=0.00..2923.10 rows=158204 width=0) (actual time=0.015..0.015 rows=87 loops=1)
                                  Index Cond: (qname_id = 251)
  ->  Index Only Scan using alf_node_aspects_pkey on alf_node_aspects aspect  (cost=0.57..34.32 rows=12 width=8) (actual time=0.004..0.004 rows=1 loops=17)
        Index Cond: ((node_id = node.id) AND (qname_id = 245))
        Heap Fetches: 17
Planning time: 0.623 ms
Execution time: 0.540 ms

The configuration is the same on both servers.
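
Following the earlier suggestion to check pg_stat_user_tables, I am also 
comparing the vacuum/analyse state of the tables involved on both instances 
with something like this (a rough sketch, using the tables that appear in the 
plans above):

SELECT relname, n_live_tup, n_dead_tup,
       last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM   pg_stat_user_tables
WHERE  relname IN ('alf_node', 'alf_node_properties', 'alf_node_aspects');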

Please advise me on this. Is there any configuration I should specifically 
look at, such as “work_mem”, “shared_buffers”, “checkpoint_segments”, 
“effective_cache_size”, “enable_seqscan” or “checkpoint_completion_target”?
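
For reference, this is roughly how I am comparing those settings on the two 
servers (just a sketch; note that checkpoint_segments no longer exists in 9.6, 
it was replaced by max_wal_size):

SELECT name, setting, unit, source
FROM   pg_settings
WHERE  name IN ('work_mem', 'shared_buffers', 'effective_cache_size',
                'enable_seqscan', 'checkpoint_completion_target', 'max_wal_size');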

Thanks in advance.

Fahiz

On 9 Dec 2019, 19:03 +0000, Michael Lewis <mle...@entrata.com>, wrote:
> > > There is a specific search query I am running to get a list of Documents 
> > > and their metadata from several tables in the DB.
> > > We are running Postgres 9.6.9 on Amazon RDS (db.m5.4xlarge instance)
> > >
> > > Our current DB consists of 500GB of data and indexes. Most of the rows 
> > > are in one table, which contains 454,078,915 rows.
> > >
> > > With a fresh DB restored from the data, without any indexes, the search 
> > > query performs relatively quickly and most of the time takes less than a 
> > > second.
> > >
> > > But after 3 weeks of use the DB suddenly started to slow down, only for 
> > > this particular query, and it takes 20+ seconds to respond. If I restore 
> > > the DB again then it works fine again, and the symptom reappears after 
> > > about 3 weeks.
> >
> >
> > You haven't been quite clear on the situation and your use case, but 
> > assuming this table has 454 million rows and experiences updates/deletes 
> > then this sounds like you may be having problems with autovacuum. Have you 
> > customized parameters to ensure it is running more frequently than default? 
> > How are you doing those data restores? Perhaps that process is cleaning up 
> > the accumulated bloat and you can run fine again for a while. Check 
> > pg_stat_user_tables for the last (auto)vacuum that ran, assuming you didn't 
> > just restore again and are expecting the issue to occur again soon.
