Re: CPU Consuming query. Sequential scan despite indexing.

aditya desai Wed, 21 Oct 2020 22:27:58 -0700

Hi,
Kindly requesting for help on this. Thanks.

-Aditya.




On Tue, Oct 20, 2020 at 6:00 PM aditya desai <[email protected]> wrote:

> Hi Laurenz,
> I created
>
> On Fri, Oct 16, 2020 at 2:06 PM Laurenz Albe <[email protected]>
> wrote:
>
>> On Thu, 2020-10-15 at 20:34 +0530, aditya desai wrote:
>> > Below query always shows up on top in the CPU matrix. Also despite
>> having indexes it does sequential scans
>> > (probably because WHERE condition satisfies almost all of the data from
>> table). This query
>> > runs on the default landing page in application and needs to fetch
>> records in less that 100 ms
>> >  without consuming too much CPU.
>> >
>> >  Any opinions? Table is very huge and due to referential identity and
>> business requirements we could not
>> >  implement partitioning as well.
>> >
>> > There is index on (countrycode,facilitycode,jobstartdatetime)
>> >
>> > explain (analyze,buffers) with JobCount as ( select
>> jobstatuscode,count(1) stat_count from job j
>> >  where 1=1 and j.countrycode = 'TH'
>> > and j.facilitycode in
>> ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
>> >  and ((j.jobstartdatetime  between '2020-08-01 00:00:00' and
>> '2020-09-30 00:00:00' ) or j.jobstartdatetime IS NULL )  group by
>> j.jobstatuscode)
>> >  select js.jobstatuscode,COALESCE(stat_count,0) stat_count from
>> JobCount jc right outer join jobstatus js on
>> jc.jobstatuscode=js.jobstatuscode;
>> >
>> >                           QUERY PLAN
>> >
>> >  Hash Right Join  (cost=98845.93..98846.10 rows=10 width=12) (actual
>> time=1314.809..1314.849 rows=10 loops=1)
>> >                              ->  Parallel Seq Scan on job j
>> (cost=0.00..96837.93 rows=200963 width=4) (actual time=13.010..1144.434
>> rows=163200 loops=3)
>> >                                    Filter: (((countrycode)::text =
>> 'TH'::text) AND (((jobstartdatetime >= '2020-08-01 00:00:00'::timestamp
>> without time zone) AND (jobst
>> > artdatetime <= '2020-09-30 00:00:00'::timestamp without time zone)) OR
>> (jobstartdatetime IS NULL)) AND ((facilitycode)::text = ANY
>> ('{THNPM1,THPRK1,THCNT1,THSPN1,THKRI1
>> > ,THPKN1,THSBI1,THUTG1,THLRI1,THSRI1,THSUR1,THSKM1}'::text[])))
>> >                                    Rows Removed by Filter: 449035
>> >                                    Buffers: shared hit=60086 read=11834
>> >                                    I/O Timings: read=59.194
>> >
>>
>> You should rewrite the subquery as a UNION to avoid the OR:
>>
>>   ... WHERE j.countrycode = 'TH'
>>         and j.facilitycode in
>> ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
>>         and j.jobstartdatetime between '2020-08-01 00:00:00' and
>> '2020-09-30 00:00:00'
>>
>> and
>>
>>   ... WHERE j.countrycode = 'TH'
>>         and j.facilitycode in
>> ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
>>         and j.jobstartdatetime IS NULL
>>
>> These indexes could speed up the resulting query:
>>
>>   CREATE INDEX ON job (countrycode, facilitycode);
>>   CREATE INDEX ON job (countrycode, jobstartdatetime);
>>   CREATE INDEX ON job (countrycode, facilitycode) WHERE jobstartdaytime
>> IS NULL;
>>
>
> I created the indexes you suggested and changed the query with the UNION
> operator. Please see explain plan below. Performance of the query(execution
> time has improved mostly because I ran vacuum full). Cost of the query is
> still high.This is Dev envrionment and has 2 vCPU and 8 GB RAM.
>
> explain (analyze,buffers) with JobCount as ( (select
> jobstatuscode,count(1) stat_count from job j where 1=1 and j.countrycode =
> 'TH'   and j.facilitycode in
> ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
> and j.jobstartdatetime  between '2020-08-01 00:00:00' and '2020-09-30
> 00:00:00'    group by j.jobstatuscode) UNION (select jobstatuscode,count(1)
> stat_count from job j where 1=1 and j.countrycode = 'TH'   and
> j.facilitycode in
> ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
> and j.jobstartdatetime is null  group by j.jobstatuscode))
> lmp_delivery_jobs->  select js.jobstatuscode,COALESCE(stat_count,0)
> stat_count from JobCount jc right outer join jobstatus js on
> jc.jobstatuscode=js.jobstatuscode;
>
>                                      QUERY PLAN
>
>
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> -----------------------------------------------------------------
>  Hash Right Join  (cost=79010.89..79011.19 rows=10 width=12) (actual
> time=444.241..444.256 rows=10 loops=1)
>    Hash Cond: ((jc.jobstatuscode)::text = (js.jobstatuscode)::text)
>    Buffers: shared hit=8560
>    CTE jobcount
>      ->  HashAggregate  (cost=79002.35..79002.48 rows=13 width=24) (actual
> time=444.211..444.213 rows=6 loops=1)
>            Group Key: j.jobstatuscode, (count(1))
>            Buffers: shared hit=8558
>            ->  Append  (cost=78959.64..79002.28 rows=13 width=24) (actual
> time=444.081..444.202 rows=6 loops=1)
>                  Buffers: shared hit=8558
>                  ->  Finalize GroupAggregate  (cost=78959.64..78961.41
> rows=7 width=12) (actual time=444.079..444.101 rows=6 loops=1)
>                        Group Key: j.jobstatuscode
>                        Buffers: shared hit=8546
>                        ->  Gather Merge  (cost=78959.64..78961.27 rows=14
> width=12) (actual time=444.063..444.526 rows=18 loops=1)
>                              Workers Planned: 2
>                              Workers Launched: 2
>                              Buffers: shared hit=17636
>                              ->  Sort  (cost=77959.61..77959.63 rows=7
> width=12) (actual time=435.748..435.750 rows=6 loops=3)
>                                    Sort Key: j.jobstatuscode
>                                    Sort Method: quicksort  Memory: 25kB
>                                    Worker 0:  Sort Method: quicksort
> Memory: 25kB
>                                    Worker 1:  Sort Method: quicksort
> Memory: 25kB
>                                    Buffers: shared hit=17636
>                                    ->  Partial HashAggregate
> (cost=77959.44..77959.51 rows=7 width=12) (actual time=435.703..435.706
> rows=6 loops=3)
>                                          Group Key: j.jobstatuscode
>                                          Buffers: shared hit=17620
>                                          ->  Parallel Bitmap Heap Scan on
> job j  (cost=11528.22..76957.69 rows=200351 width=4) (actual
> time=47.682..281.928 rows=163200
> loops=3)
>                                                Recheck Cond:
> (((countrycode)::text = 'TH'::text) AND ((facilitycode)::text = ANY
> ('{THNPM1,THPRK1,THCNT1,THSPN1,THKRI1,T
> HPKN1,THSBI1,THUTG1,THLRI1,THSRI1,THSUR1,THSKM1}'::text[])))
>                                                Filter: ((jobstartdatetime
> >= '2020-08-01 00:00:00'::timestamp without time zone) AND
> (jobstartdatetime <= '2020-09-30 00
> :00:00'::timestamp without time zone))
>                                                Heap Blocks: exact=6633
>                                                Buffers: shared hit=17620
>                                                ->  Bitmap Index Scan on
> job_list_test1  (cost=0.00..11408.01 rows=482693 width=0) (actual
> time=49.825..49.826 rows=48960
> 0 loops=1)
>                                                      Index Cond:
> (((countrycode)::text = 'TH'::text) AND ((facilitycode)::text = ANY
> ('{THNPM1,THPRK1,THCNT1,THSPN1,THKR
> I1,THPKN1,THSBI1,THUTG1,THLRI1,THSRI1,THSUR1,THSKM1}'::text[])))
>                                                      Buffers: shared
> hit=1913
>                  ->  GroupAggregate  (cost=40.50..40.68 rows=6 width=12)
> (actual time=0.093..0.094 rows=0 loops=1)
>                        Group Key: j_1.jobstatuscode
>                        Buffers: shared hit=12
>                        ->  Sort  (cost=40.50..40.54 rows=16 width=4)
> (actual time=0.092..0.092 rows=0 loops=1)
>                              Sort Key: j_1.jobstatuscode
>                              Sort Method: quicksort  Memory: 25kB
>                              Buffers: shared hit=12
>                              ->  Index Scan using job_list_test3 on job
> j_1  (cost=0.14..40.18 rows=16 width=4) (actual time=0.081..0.082 rows=0
> loops=1)
>                                    Index Cond: (((countrycode)::text =
> 'TH'::text) AND ((facilitycode)::text = ANY
> ('{THNPM1,THPRK1,THCNT1,THSPN1,THKRI1,THPKN1,THSBI1,T
> HUTG1,THLRI1,THSRI1,THSUR1,THSKM1}'::text[])))
>                                    Buffers: shared hit=12
>    ->  CTE Scan on jobcount jc  (cost=0.00..0.26 rows=13 width=24) (actual
> time=444.215..444.221 rows=6 loops=1)
>          Buffers: shared hit=8558
>    ->  Hash  (cost=8.29..8.29 rows=10 width=4) (actual time=0.016..0.016
> rows=10 loops=1)
>          Buckets: 1024  Batches: 1  Memory Usage: 9kB
>          Buffers: shared hit=2
>          ->  Index Only Scan using jobstatus_jobstatuscode_unq on
> jobstatus js  (cost=0.14..8.29 rows=10 width=4) (actual time=0.006..0.010
> rows=10 loops=1)
>                Heap Fetches: 0
>                Buffers: shared hit=2
>  Planning Time: 0.808 ms
>  Execution Time: 444.819 ms
> (53 rows)
>
>
>
>
>>
>> Yours,
>> Laurenz Albe
>> --
>> Cybertec | https://www.cybertec-postgresql.com
>>
>>

Re: CPU Consuming query. Sequential scan despite indexing.

Reply via email to