We faced a similar issue, adding RDS proxy in front of RDS Postgres can
help.
In our situation, there were a lot of connects/disconnects from Lambda
functions although concurrency of Lambda was 100 only.
And adding connection pooler(RDS proxy) helped us to reduce the CPU load
from 100% to 30%

Happy to help :)
Prince Pathria Systems Engineer | Certified Kubernetes Administrator | AWS
Certified Solutions Architect Evive +91 9478670472 goevive.com


On Mon, Sep 28, 2020 at 9:21 PM aditya desai <admad...@gmail.com> wrote:

>
>> Hi,
>> We have an application where one of the APIs calling queries(attached) is
>> spiking the CPU to 100% during load testing.
>> However, queries are making use of indexes(Bitmap Index and Bitmap Heap
>> scan though). When run separately on DB queries hardly take less than 200
>> ms. Is CPU spiking due to Bitmap Heap Scan?
>> These queries are being called thousands of times. Application team says
>> they have handled connection pooling from the Application side. So there is
>> no connection pooling here from DB side. Current db instance size is 
>> "db.m4.4xlarge"
>> 64 GB RAM 16 vCPU".
>> The Application dev team has primary keys and foreign keys on tables so
>> they are unable to partition the tables as well due to limitations of
>> postgres partitioning. Columns in WHERE clauses are not constant in all
>> queries to decide partition keys.
>>
>> 1. Does DB need more CPU considering this kind of load?
>> 2. Can the query be tuned further? It is already using indexes(Bitmap
>> though).
>> 3. Will connection pooling resolve the CPU Spike issues?
>>
>> Also pasting Query and plans below.
>>
>> ----------------------exampleCount 1. Without
>> internalexamplecode-----------------------
>>
>> lmp_examples=> explain analyze with exampleCount as ( select
>> examplestatuscode from example j where 1=1 and j.countrycode = 'AD'   and
>> j.facilitycode in ('ABCD') and j.internalexamplecode in
>> ('005','006','007','005') and ((j.examplestartdatetime  between '2020-05-18
>> 00:00:00' and '2020-08-19 00:00:00' ) or j.examplestartdatetime IS NULL )
>> group by j.examplestatuscode)
>> lmp_examples-> select js.examplestatuscode,COALESCE(count(*),0)
>> stat_count from exampleCount jc right outer join examplestatus js on
>> jc.examplestatuscode=js.examplestatuscode group by js.examplestatuscode ;
>>
>>
>>                            QUERY PLAN
>>
>>
>>
>> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>  HashAggregate  (cost=79353.80..79353.89 rows=9 width=12) (actual
>> time=88.847..88.850 rows=9 loops=1)
>>    Group Key: js.examplestatuscode
>>    CTE examplecount
>>      ->  HashAggregate  (cost=79352.42..79352.46 rows=4 width=4) (actual
>> time=88.803..88.805 rows=5 loops=1)
>>            Group Key: j.examplestatuscode
>>            ->  Bitmap Heap Scan on example j  (cost=1547.81..79251.08
>> rows=40538 width=4) (actual time=18.424..69.658 rows=62851 loops=1)
>>                  Recheck Cond: ((((countrycode)::text = 'AD'::text) AND
>> ((facilitycode)::text = 'ABCD'::text) AND ((internalexamplecode)::text =
>> ANY ('{005,006,007,005}'::text[])) AND (examplestartdatetime >= '2020-05-18
>> 00:00:00'::timestamp without time zone) AND (examplestartdatetime <=
>> '2020-08-19 00:00:00'::timestamp without time zone)) OR
>> (examplestartdatetime IS NULL))
>>                  Filter: (((countrycode)::text = 'AD'::text) AND
>> ((facilitycode)::text = 'ABCD'::text) AND ((internalexamplecode)::text =
>> ANY ('{005,006,007,005}'::text[])))
>>                  Rows Removed by Filter: 3
>>                  Heap Blocks: exact=18307
>>                  ->  BitmapOr  (cost=1547.81..1547.81 rows=40538 width=0)
>> (actual time=15.707..15.707 rows=0 loops=1)
>>                        ->  Bitmap Index Scan on example_list9_idx
>> (cost=0.00..1523.10 rows=40538 width=0) (actual time=15.702..15.702
>> rows=62851 loops=1)
>>                              Index Cond: (((countrycode)::text =
>> 'AD'::text) AND ((facilitycode)::text = 'ABCD'::text) AND
>> ((internalexamplecode)::text = ANY ('{005,006,007,005}'::text[])) AND
>> (examplestartdatetime >= '2020-05-18 00:00:00'::timestamp without time
>> zone) AND (examplestartdatetime <= '2020-08-19 00:00:00'::timestamp without
>> time zone))
>>                        ->  Bitmap Index Scan on example_list10_idx
>> (cost=0.00..4.44 rows=1 width=0) (actual time=0.004..0.004 rows=3 loops=1)
>>                              Index Cond: (examplestartdatetime IS NULL)
>>    ->  Hash Left Join  (cost=0.13..1.29 rows=9 width=4) (actual
>> time=88.831..88.840 rows=9 loops=1)
>>          Hash Cond: ((js.examplestatuscode)::text =
>> (jc.examplestatuscode)::text)
>>          ->  Seq Scan on examplestatus js  (cost=0.00..1.09 rows=9
>> width=4) (actual time=0.004..0.007 rows=9 loops=1)
>>          ->  Hash  (cost=0.08..0.08 rows=4 width=16) (actual
>> time=88.817..88.817 rows=5 loops=1)
>>                Buckets: 1024  Batches: 1  Memory Usage: 9kB
>>                ->  CTE Scan on examplecount jc  (cost=0.00..0.08 rows=4
>> width=16) (actual time=88.807..88.812 rows=5 loops=1)
>>  Planning Time: 0.979 ms
>>  Execution Time: 89.036 ms
>> (23 rows)
>>
>>
>> ----------------exampleCount 2. With
>> internalexamplecode---------------------------------
>>
>>
>> lmp_examples=> explain analyze with exampleCount as ( select
>> examplestatuscode,count(1) stat_count from example j where 1=1 and
>> j.countrycode = 'AD'   and j.facilitycode in ('ABCD') and
>> j.internalexamplecode in ('005','006','007','005') and
>> ((j.examplestartdatetime  between '2020-05-18 00:00:00' and '2020-08-19
>> 00:00:00' ) or j.examplestartdatetime IS NULL )  group by
>> j.examplestatuscode)
>> lmp_examples-> select js.examplestatuscode,COALESCE(stat_count,0)
>> stat_count from exampleCount jc right outer join examplestatus js on
>> jc.examplestatuscode=js.examplestatuscode;
>>
>>
>>                            QUERY PLAN
>>
>>
>>
>> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>  Hash Left Join  (cost=79453.94..79455.10 rows=9 width=12) (actual
>> time=89.660..89.669 rows=9 loops=1)
>>    Hash Cond: ((js.examplestatuscode)::text =
>> (jc.examplestatuscode)::text)
>>    CTE examplecount
>>      ->  HashAggregate  (cost=79453.77..79453.81 rows=4 width=12) (actual
>> time=89.638..89.640 rows=5 loops=1)
>>            Group Key: j.examplestatuscode
>>            ->  Bitmap Heap Scan on example j  (cost=1547.81..79251.08
>> rows=40538 width=4) (actual time=18.193..69.710 rows=62851 loops=1)
>>                  Recheck Cond: ((((countrycode)::text = 'AD'::text) AND
>> ((facilitycode)::text = 'ABCD'::text) AND ((internalexamplecode)::text =
>> ANY ('{005,006,007,005}'::text[])) AND (examplestartdatetime >= '2020-05-18
>> 00:00:00'::timestamp without time zone) AND (examplestartdatetime <=
>> '2020-08-19 00:00:00'::timestamp without time zone)) OR
>> (examplestartdatetime IS NULL))
>>                  Filter: (((countrycode)::text = 'AD'::text) AND
>> ((facilitycode)::text = 'ABCD'::text) AND ((internalexamplecode)::text =
>> ANY ('{005,006,007,005}'::text[])))
>>                  Rows Removed by Filter: 3
>>                  Heap Blocks: exact=18307
>>                  ->  BitmapOr  (cost=1547.81..1547.81 rows=40538 width=0)
>> (actual time=15.483..15.483 rows=0 loops=1)
>>                        ->  Bitmap Index Scan on example_list9_idx
>> (cost=0.00..1523.10 rows=40538 width=0) (actual time=15.477..15.478
>> rows=62851 loops=1)
>>                              Index Cond: (((countrycode)::text =
>> 'AD'::text) AND ((facilitycode)::text = 'ABCD'::text) AND
>> ((internalexamplecode)::text = ANY ('{005,006,007,005}'::text[])) AND
>> (examplestartdatetime >= '2020-05-18 00:00:00'::timestamp without time
>> zone) AND (examplestartdatetime <= '2020-08-19 00:00:00'::timestamp without
>> time zone))
>>                        ->  Bitmap Index Scan on example_list10_idx
>> (cost=0.00..4.44 rows=1 width=0) (actual time=0.004..0.004 rows=3 loops=1)
>>                              Index Cond: (examplestartdatetime IS NULL)
>>    ->  Seq Scan on examplestatus js  (cost=0.00..1.09 rows=9 width=4)
>> (actual time=0.003..0.005 rows=9 loops=1)
>>    ->  Hash  (cost=0.08..0.08 rows=4 width=24) (actual
>> time=89.650..89.651 rows=5 loops=1)
>>          Buckets: 1024  Batches: 1  Memory Usage: 9kB
>>          ->  CTE Scan on examplecount jc  (cost=0.00..0.08 rows=4
>> width=24) (actual time=89.641..89.647 rows=5 loops=1)
>>  Planning Time: 0.470 ms
>>  Execution Time: 89.737 ms
>>
>> ------------------------exampleSelect-----------------------------------
>>
>>
>> lmp_examples=> explain analyze select j.id from example j where 1=1  and
>> j.countrycode = 'AD'  and j.facilitycode in ('ABCD') and
>> j.examplestatuscode in ('101') and j.internalexamplecode in
>> ('005','006','007','005')  and ((j.examplestartdatetime between '2020-05-18
>> 00:00:00' and '2020-08-19 00:00:00') or j.examplestartdatetime IS NULL)
>> ORDER BY createddate DESC limit 10;
>>
>>                                 QUERY PLAN
>>
>>
>> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>  Limit  (cost=71286.65..71286.68 rows=10 width=12) (actual
>> time=47.351..47.359 rows=10 loops=1)
>>    ->  Sort  (cost=71286.65..71335.31 rows=19462 width=12) (actual
>> time=47.349..47.352 rows=10 loops=1)
>>          Sort Key: createddate DESC
>>          Sort Method: top-N heapsort  Memory: 25kB
>>          ->  Bitmap Heap Scan on example j  (cost=1176.77..70866.09
>> rows=19462 width=12) (actual time=15.133..46.555 rows=2530 loops=1)
>>                Recheck Cond: (((countrycode)::text = 'AD'::text) AND
>> ((facilitycode)::text = 'ABCD'::text) AND ((examplestatuscode)::text =
>> '101'::text) AND ((internalexamplecode)::text = ANY
>> ('{005,006,007,005}'::text[])))
>>                Filter: (((examplestartdatetime >= '2020-05-18
>> 00:00:00'::timestamp without time zone) AND (examplestartdatetime <=
>> '2020-08-19 00:00:00'::timestamp without time zone)) OR
>> (examplestartdatetime IS NULL))
>>                Rows Removed by Filter: 38724
>>                Heap Blocks: exact=20923
>>                ->  Bitmap Index Scan on example_list1_idx
>> (cost=0.00..1171.90 rows=33211 width=0) (actual time=9.938..9.939
>> rows=41254 loops=1)
>>                      Index Cond: (((countrycode)::text = 'AD'::text) AND
>> ((facilitycode)::text = 'ABCD'::text) AND ((examplestatuscode)::text =
>> '101'::text) AND ((internalexamplecode)::text = ANY
>> ('{005,006,007,005}'::text[])))
>>  Planning Time: 0.398 ms
>>  Execution Time: 47.416 ms
>>
>> Regards,
>> Aditya.
>>
>

Reply via email to