Hi Matteo,
Thanks for your suggestions, I just run some test with ILIKE and LIKE, and
ILIKE is consistently slower so I think I will keep the Lower functions.
As per your suggestion, I have switched indexes to use GIN type index,
they seem to build/read a bit faster, still the Recheck task continues to
happen in the query plan though. I have removed the Gender column from the
query since is not relevant in my tests. With all this playing around it
looks like the stats are now a bit more accurate.
The query went down to 9 seconds, ideally I would like to get to execute in
2 seconds..., any thoughts on what else I could try?
Thanks again,
Eric
=# explain (analyse on,buffers on)select T.form_id from
TAR_MVW_targeting_record AS T where T.status NOT IN ('ANULLED') AND
LOWER(T.household_member_last_name) LIKE LOWER('%tu%') group by T.form_id;
QUERY PLAN
---
---
HashAggregate (cost=557677.27..561360.83 rows=368356 width=8) (actual
time=10172.672..10410.068 rows=786669 loops=1)
Buffers: shared hit=304998
-> Bitmap Heap Scan on tar_mvw_targeting_record t
(cost=80048.06..552677.27 rows=202 width=8) (actual
time=2481.418..9564.280 rows
=33 loops=1)
Recheck Cond: ((status)::text <> 'ANULLED'::text)
Filter: (lower((household_member_last_name)::text) ~~ '%tu%'::text)
Rows Removed by Filter: 979
Buffers: shared hit=304998
-> Bitmap Index Scan on tar_mvw_targeting_record_lower_idx4
(cost=0.00..79548.06 rows=1012 width=0) (actual time=2375.399..2
375.399 rows=1012 loops=1)
Buffers: shared hit=7369
Total runtime: 10475.240 ms
On Mon, Nov 10, 2014 at 1:57 PM, desmodemone wrote:
>
>
> 2014-11-10 18:43 GMT+01:00 Eric Ramirez :
>
>>
>> Hi,
>> I have created a sample database with test data to help benchmark our
>> application. The database has ten million records, and is running on a
>> dedicated server(postgres 9.3) with 8GB of RAM. Our queries are pretty
>> slow with this amount of data and is my job to get them to run to at
>> acceptable speed. First thing that I notice was that the planner's row
>> estimates are off by a large number or records (millions) I have updated
>> the statistics target but didn't seem to make a difference. The relevant
>> output follows.
>> Am I looking in the wrong place, something else I should be trying?
>> Thanks in advance for your comments/suggestions,
>> Eric.
>>
>>
>> =# show work_mem;
>> work_mem
>> --
>> 1GB
>> (1 row)
>> =# show effective_cache_size;
>> effective_cache_size
>> --
>> 5GB
>> (1 row)
>>
>> =#ALTER TABLE TAR_MVW_TARGETING_RECORD ALTER COLUMN
>> household_member_first_name SET STATISTICS 5000;
>> =# vacuum analyse TAR_MVW_TARGETING_RECORD;
>>
>> =# \d tar_mvw_targeting_record;
>> Table "public.tar_mvw_targeting_record"
>>Column| Type | Modifiers
>> -+---+---
>> household_member_id | bigint|
>> form_id | bigint|
>> status | character varying(64) |
>> gender | character varying(64) |
>> household_member_first_name | character varying(64) |
>> household_member_last_name | character varying(64) |
>>
>> Indexes:
>>"tar_mvw_targeting_record_form_id_household_member_id_idx" UNIQUE,
>> btree (form_id, household_member_id)
>> "tar_mvw_targeting_record_lower_idx" gist
>> (lower(household_member_first_name::text) extensions.gist_trgm_ops)
>> WHERE status::text <> 'ANULLED'::text
>> "tar_mvw_targeting_record_lower_idx1" gist
>> (lower(household_member_last_name::text) extensions.gist_trgm_ops)
>> WHERE status::text <> 'ANULLED'::text
>>
>>
>> =# explain (analyse on,buffers on)select T.form_id from
>> TAR_MVW_targeting_record AS T where T.status NOT IN ('ANULLED') AND
>> LOWER(T.household_member_last_name) LIKE LOWER('%tu%') AND
>> T.gender='FEMALE' group by T.form_id;
>>
>> QUERY PLAN
>>
>>
>> ---
>> ---
>> HashAggregate (cost=450994.35..452834.96 rows=1840