[PERFORM] Some tuning suggestions on a Red Hat 6.7 - PG 9.5.3 production environment

2016-11-14 Thread Pietro Pugni
Dear list,
I’m looking for some guidelines on how to optimize the configuration of a 
production database dedicated to a DWH application.
I run the application on different machines and have solved several issues so 
far, but am struggling with a production environment running Red Hat 6.7 and 
PostgreSQL 9.5.3.
My application does a lot of reads and many writes (plain “SELECT … INTO” and 
“INSERT”, no “UPDATE”), but the writes are an order of magnitude fewer than the reads.
The work flow consists of two big blocks: an ETL phase and the workloads on the 
data imported during the ETL phase.

The biggest schema has about 1.2 billion rows distributed over about ten 
tables; many of those tables are partitioned and have indexes. At the moment 
the database stores two schemas but I plan to add another three schemas of 
similar size.

The machine is virtualized and has 8 CPUs at about 3GHz, 64GB of RAM and 5TB of 
storage. It runs on Red Hat 6.7, kernel 2.6.x

The configuration changes I made so far are:
max_connections = 30
shared_buffers = 32GB
work_mem = 256MB
maintenance_work_mem = 4GB
effective_io_concurrency = 30
checkpoint_completion_target = 0.9
random_page_cost = 2.0
effective_cache_size = 48GB
default_statistics_target = 1000

autovacuum is on and the collation is ‘C’.


The first issue I faced was with maintenance_work_mem: I had set it to 16GB 
and the server silently crashed during a VACUUM because I didn’t consider that 
autovacuum can take up to autovacuum_max_workers * maintenance_work_mem 
(roughly 48GB). So I lowered maintenance_work_mem to 4GB and it worked. Should 
I set maintenance_work_mem to a smaller value (1GB) after the ETL terminates, 
or can I leave it at 4GB without degrading the overall performance?
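
My current idea, just a sketch of what I have in mind and not something I have 
benchmarked, is to keep the global value low and raise it only in the ETL 
session:

    # postgresql.conf: conservative global value, so the autovacuum workers
    # (autovacuum_max_workers of them) stay small
    maintenance_work_mem = 1GB

    -- raised only inside the ETL session, before the index builds:
    SET maintenance_work_mem = '4GB';
    -- CREATE INDEX / VACUUM statements run in this session now get 4GB
    RESET maintenance_work_mem;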

The second issue emerged during an intensive parallel workload. I implemented 
a splitter that parallelizes certain kinds of queries. There were 8 similar 
queries running, working on 8 disjoint subsets of the same table; this table 
has roughly 4.5 million rows. These queries use SELECT DISTINCT, ORDER BY, 
OVER (PARTITION BY … ORDER BY) and COALESCE(). At a certain point the server 
crashed and I found the following error in the logs: 

server process was terminated by signal 9: Killed

After some research, I found that it was probably the OOM killer, and running 
“dmesg” confirmed it. Reading the documentation and this answer on SO ( 
http://stackoverflow.com/questions/16418173/psql-seems-to-timeout-with-long-queries
 ), I realized that the issue is probably due to a misconfiguration. The values 
I set for this pg instance don’t seem to be so wrong, except maybe 
maintenance_work_mem. I will certainly disable memory overcommit as suggested 
by the official docs ( 
https://www.postgresql.org/docs/current/static/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT
 ) but was wondering if I could tune the configuration a little better. Can 
someone give me some more advice?
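
Concretely, what I plan to apply is the change described in those docs, 
assuming nothing else on this VM depends on overcommit:

    # /etc/sysctl.conf: disable memory overcommit, so a runaway backend gets
    # an out-of-memory error instead of a random process being killed
    vm.overcommit_memory = 2
    # apply without a reboot:
    #   sysctl -p

The docs also mention protecting the postmaster itself through its 
oom_score_adj; I haven’t looked into that part yet.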

I run the same application with different data (and workload) on other 
machines, but they have different configurations (Ubuntu 16.04). On one of 
them I previously disabled virtual memory overcommit and never experienced 
this issue, but that machine has 128GB of RAM.

I hope to have been clear enough.
Thank you everyone
 Pietro

[PERFORM] Query planner chooses index scan backward instead of better index option

2016-11-14 Thread Seckin Pulatkan
Hi,

On our production environment (PostgreSQL 9.4.5 on
x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat
4.8.5-4), 64-bit), one of our queries runs very slowly, taking about 5 minutes. We
noticed that it does not use an index that we anticipated it would.

The query is

select booking0_.*
from booking booking0_
where booking0_.customer_id in (
  select customer1_.id
  from customer customer1_
  where lower((customer1_.first_name || ' ' || customer1_.last_name)) like '%gatef%'
)
order by booking0_.id desc
limit 30;

We have just over 3.2 million records in the booking and customer tables.


   QUERY PLAN
   ------------------------------------------------------------------------------------------------------------------------------
   Limit  (cost=0.86..11549.23 rows=30 width=241) (actual time=9459.997..279283.497 rows=10 loops=1)
     ->  Nested Loop Semi Join  (cost=0.86..1979391.88 rows=5142 width=241) (actual time=9459.995..279283.482 rows=10 loops=1)
           ->  Index Scan Backward using pk_booking_id on booking booking0_  (cost=0.43..522902.65 rows=2964333 width=241) (actual time=0.043..226812.994 rows=3212711 loops=1)
           ->  Index Scan using pk_customer_id on customer customer1_  (cost=0.43..0.49 rows=1 width=4) (actual time=0.016..0.016 rows=0 loops=3212711)
                 Index Cond: (id = booking0_.customer_id)
                 Filter: (lower((((first_name)::text || ' '::text) || (last_name)::text)) ~~ '%gatef%'::text)
                 Rows Removed by Filter: 1
   Planning time: 2.901 ms
   Execution time: 279283.646 ms




The index that we expect it to use is

CREATE INDEX idx_customer_name_lower
  ON customer
  USING gin
  (lower((first_name::text || ' '::text) || last_name::text) COLLATE
pg_catalog."default" gin_trgm_ops);

explain (analyze, buffers)
select customer1_.id
   from customer customer1_
 where lower((customer1_.first_name||' '||customer1_.last_name)) like
'%gatef%';

 QUERY PLAN
 ------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on customer customer1_  (cost=2875.87..11087.13 rows=5144 width=4) (actual time=768.692..1571.241 rows=11 loops=1)
   Recheck Cond: (lower((((first_name)::text || ' '::text) || (last_name)::text)) ~~ '%gatef%'::text)
   Heap Blocks: exact=11
   Buffers: shared hit=1420 read=23
   ->  Bitmap Index Scan on idx_customer_name_lower  (cost=0.00..2874.59 rows=5144 width=0) (actual time=763.327..763.327 rows=11 loops=1)
         Index Cond: (lower((((first_name)::text || ' '::text) || (last_name)::text)) ~~ '%gatef%'::text)
         Buffers: shared hit=1418 read=14
 Planning time: 240.111 ms
 Execution time: 1571.403 ms

And then filter with the customer_id index on the booking table:

CREATE INDEX idx_booking_customer_id
  ON booking
  USING btree
  (customer_id);

We have also created an index on the booking table on (id desc, customer_id):

create index concurrently idx_booking_id_desc_customer_id on booking
using btree(id desc, customer_id);

But the result was the same:


   QUERY PLAN
   ------------------------------------------------------------------------------------------------------------------------------
   Limit  (cost=0.86..12223.57 rows=30 width=241) (actual time=1282.724..197879.302 rows=10 loops=1)
     ->  Nested Loop Semi Join  (cost=0.86..2094972.51 rows=5142 width=241) (actual time=1282.724..197879.292 rows=10 loops=1)
           ->  Index Scan Backward using pk_booking_id on booking booking0_  (cost=0.43..525390.04 rows=3212872 width=241) (actual time=0.012..131563.721 rows=3212879 loops=1)
           ->  Index Scan using pk_customer_id on customer customer1_  (cost=0.43..0.49 rows=1 width=4) (actual time=0.020..0.020 rows=0 loops=3212879)
                 Index Cond: (id = booking0_.customer_id)
                 Filter: (lower((((first_name)::text || ' '::text) || (last_name)::text)) ~~ '%gatef%'::text)
                 Rows Removed by Filter: 1
   Planning time: 0.424 ms
   Execution time: 197879.348 ms



If we remove "order by id desc" then it uses the index that we expect it to
use. But we need that order by clause: with the same query we use pagination
(offset) when there are more than 30 records.


   QUERY PLAN
   ------------------------------------------------------------------------------------------------------------------------------
   Limit  (cost=2790.29..2968.29 rows=30 width=241) (actual time=27.932..38.643 rows=10 loops=1)
     ->  Nested Loop  (cost=2790.29..33299.63 rows=5142 width=241) (actual time=27.931..38.640 rows=10 loops=1)
           ->  Bitmap Heap Scan on customer customer1_  (cost=2789.86..10997.73 rows=5142 width=4) (actual time=27.046..27.159 rows=11 loops=1)
                 Recheck Cond: (lower((((first_name)::text || ' '::text) || (last_name)::text)) ~~ '%gatef%'::text)
                 Heap Blocks: exact=11
                 ->  Bitmap Index Scan on idx_customer_name_lower  (cost=0.00..2788.57 rows=5142 width=0) (actual time=27.013..27.013 rows=11 loops=1)
                       Index Cond: (lower((((first_name)::text ||

Re: [PERFORM] Query planner chooses index scan backward instead of better index option

2016-11-14 Thread Jeff Janes
On Mon, Nov 14, 2016 at 4:01 AM, Seckin Pulatkan wrote:

> Hi,
>
> On our production environment (PostgreSQL 9.4.5 on
> x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat
> 4.8.5-4), 64-bit), one of our queries runs very slowly, taking about 5 minutes. We
> noticed that it does not use an index that we anticipated it would.
>
> The query is
>
> select booking0_.*
> from booking booking0_
> where booking0_.customer_id in (
>   select customer1_.id
>  from customer customer1_
>where lower((customer1_.first_name||'
> '||customer1_.last_name)) like '%gatef%'
>   )
> order by booking0_.id desc
> limit 30;
>


It thinks it is going to find 30 rows which meet your condition very
quickly, so by walking the index backwards it can avoid needing to do a
sort.  But the rows which meet your sub-select condition are biased
towards the front of the index, so in fact it has to walk backwards through
most of your index before finding 30 eligible rows.

Your best bet is probably to force it into the plan you want by using a CTE:

with t as (
  select booking0_.*
  from booking booking0_
  where booking0_.customer_id in (
    select customer1_.id
    from customer customer1_
    where lower((customer1_.first_name || ' ' || customer1_.last_name)) like '%gatef%'
  )
)
select * from t order by t.id desc limit 30;
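
The CTE helps because in your version a WITH query is an optimization fence:
the name search runs first, and only then is the small result sorted and
limited. Another trick sometimes used for the same purpose (untested against
your data, so treat it as a sketch) is to keep the planner from satisfying the
ORDER BY with the pk index by ordering on an expression:

select booking0_.*
from booking booking0_
where booking0_.customer_id in (
  select customer1_.id
  from customer customer1_
  where lower((customer1_.first_name || ' ' || customer1_.last_name)) like '%gatef%'
)
order by booking0_.id + 0 desc   -- the expression hides the index ordering from the planner
limit 30;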

Cheers,

Jeff


Re: [PERFORM] Some tuning suggestions on a Red Hat 6.7 - PG 9.5.3 production environment

2016-11-14 Thread domenico febbo
dear Pietro,
are you sure about

effective_io_concurrency = 30

could you please explain the type of disk storage?

On 14/Nov/2016 12:46, "Pietro Pugni" wrote:

> Dear list,
> I’m looking for some guidelines on how to optimize the configuration of a
> production database dedicated to a DWH application.
> [rest of the original message quoted in full; snipped]


[PERFORM] Why is the optimiser choosing a sub-optimal plan?

2016-11-14 Thread Stephen Cresswell
I have a table with two indexes...

CREATE TABLE mobile_summary_usage
(
   import   text,
   msisdn   text,
   type     text,
   total    integer,
   day      date,
   cycle    text
);

CREATE INDEX mobile_summary_usage_msisdn_cycle ON mobile_summary_usage
USING btree (msisdn, cycle);

CREATE INDEX mobile_summary_usage_cycle ON mobile_summary_usage USING
btree (cycle);


We insert approximately 2M records into this table each day. Whenever
someone wants to see the total amount of voice calls, text messages or
data they've used, we query the table with the following:
SELECT msisdn, type, sum (total), units
FROM mobile_summary_usage msu, mobile_summary_type mst
WHERE type = id AND msisdn = ? AND cycle = ?
GROUP BY msisdn, type, units;

Where:
msisdn is a mobile number
cycle is a billing cycle, e.g. 2016-10
mobile_summary_type contains 3 rows, one for each usage type.
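
For example, the concrete form of the query behind the plans below (using the
same msisdn and cycle literals that appear in those plans) is:

SELECT msisdn, type, sum(total), units
FROM mobile_summary_usage msu, mobile_summary_type mst
WHERE type = id AND msisdn = '07700900331' AND cycle = '2016-10'
GROUP BY msisdn, type, units;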

Everything was working fine until we flipped over from 2016-10 to
2016-11. Then instead of averaging well below 0.5 seconds to respond,
Postgres started taking over a second.

Running EXPLAIN ANALYZE on the above query shows that in 2016-10, when
there are approximately 100M rows, Postgres uses the compound (msisdn,
cycle) index. This has a cost of 3218.98 and takes 0.071 ms.

HashAggregate  (cost=3213.12..3218.98 rows=586 width=52) (actual time=0.071..0.071 rows=0 loops=1)
  Group Key: msu.msisdn, msu.type, mst.units
  ->  Hash Join  (cost=62.54..3205.15 rows=797 width=52) (actual time=0.069..0.069 rows=0 loops=1)
        Hash Cond: (msu.type = mst.id)
        ->  Bitmap Heap Scan on mobile_summary_usage msu  (cost=32.74..3164.39 rows=797 width=20) (actual time=0.037..0.037 rows=0 loops=1)
              Recheck Cond: ((msisdn = '07700900331'::text) AND (cycle = '2016-10'::text))
              ->  Bitmap Index Scan on mobile_summary_usage_msisdn_cycle  (cost=0.00..32.54 rows=797 width=0) (actual time=0.036..0.036 rows=0 loops=1)
                    Index Cond: ((msisdn = '07700900331'::text) AND (cycle = '2016-10'::text))
        ->  Hash  (cost=18.80..18.80 rows=880 width=64) (actual time=0.026..0.026 rows=4 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 9kB
              ->  Seq Scan on mobile_summary_type mst  (cost=0.00..18.80 rows=880 width=64) (actual time=0.014..0.016 rows=4 loops=1)
Planning time: 0.197 ms
Execution time: 0.125 ms


When I re-run the query for 2016-11 (currently 4M rows), Postgres uses
the simpler "cycle" index. The cost is 12.79 but the actual time taken
is 1412.609 ms.

HashAggregate  (cost=12.78..12.79 rows=1 width=52) (actual
time=1412.609..1412.609 rows=0 loops=1)
Execution time: 1412.674 ms
  Group Key: msu.msisdn, msu.type, mst.units
  ->  Nested Loop  (cost=0.72..12.77 rows=1 width=52) (actual
time=1412.606..1412.606 rows=0 loops=1)
->  Index Scan using mobile_summary_usage_cycle on
mobile_summary_usage msu  (cost=0.57..4.59 rows=1 width=20) (actual
time=1412.604..1412.604 rows=0 loops=1)
->  Index Scan using mobile_summary_type_pkey on
mobile_summary_type mst  (cost=0.15..8.17 rows=1 width=64) (never
executed)
  Rows Removed by Filter: 3932875
  Index Cond: (id = msu.type)
  Index Cond: (cycle = '2016-11'::text)
  Filter: (msisdn = '07700900331'::text)



I understand there is a whole host of reasons why Postgres may choose
different plans based on data volumes, but in this case, despite the
lower cost, the performance is significantly worse. Is there any
explanation for why it's making such a poor decision, and any
recommendations for how to fix it?

Any help appreciated.


Re: [PERFORM] Why is the optimiser choosing a sub-optimal plan?

2016-11-14 Thread Tom Lane
Stephen Cresswell  writes:
> I have a table with two indexes...

(1) Tell us about the other table, mobile_summary_type.

(2) Did you transcribe the second query plan correctly?  I have a hard
time believing that EXPLAIN printed two Index Cond lines for the same
indexscan.

(3) What PG version is this, exactly?

(4) Are you doing anything funny like disabling autovacuum/autoanalyze?
The rowcount estimates in the "good" plan seem rather far away from
reality, and it's not obvious why, particularly here:

>   ->  Seq Scan on mobile_summary_type mst
> (cost=0.00..18.80 rows=880 width=64) (actual time=0.014..0.016 rows=4
> loops=1)

regards, tom lane




Re: [PERFORM] Some tuning suggestions on a Red Hat 6.7 - PG 9.5.3 production environment

2016-11-14 Thread Pietro Pugni
Dear Domenico,
I pushed a little hard on that because the virtualizer runs on a distributed 
system composed of 7 clusters with more than 100 cores and an enterprise 
storage array. I know that effective_io_concurrency is usually set based on the 
number of disks available in a RAID configuration (minus the stripe disks), so 
I decided to push it high given the nature of the host machine. I don’t have 
much information about that machine, but I can investigate.
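
Since effective_io_concurrency can be changed per session, my plan (just an
idea, not benchmarked yet) is to compare a few values directly on one of the
heavy queries; it only affects bitmap heap scan prefetching, so the test query
has to produce one:

SET effective_io_concurrency = 2;
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;   -- repeat with 8, 16 and 30
RESET effective_io_concurrency;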

Anyway, the OOM configuration described in the official documentation prevents 
the postmaster from being killed.

I’m still wondering whether the other configuration parameters are reasonable 
or can be tweaked to get more performance.

Thank you for your support
 Pietro

PS I made a typo in the configuration. max_connections is 20, not 30.


> On 14 Nov 2016, at 18:36, domenico febbo wrote:
> 
> dear Pietro, 
> are you sure about
> 
> effective_io_concurrency = 30
> 
> could you please explain the type of disk storage?
> 
> 
> On 14/Nov/2016 12:46, "Pietro Pugni" wrote:
> [original message quoted in full; snipped]



Re: [PERFORM] Some tuning suggestions on a Red Hat 6.7 - PG 9.5.3 production environment

2016-11-14 Thread Merlin Moncure
On Mon, Nov 14, 2016 at 11:36 AM, domenico febbo
 wrote:
> dear Pietro,
> are you sure about
>
> effective_io_concurrency = 30
>
> could you please explain the type of disk storage?

fast storage can certainly utilize high settings of
effective_io_concurrency, at least in some cases. For example, see:
https://www.postgresql.org/message-id/CAHyXU0yiVvfQAnR9cyH%3DHWh1WbLRsioe%3DmzRJTHwtr%3D2azsTdQ%40mail.gmail.com

merlin

