Re: [PERFORM] Slow query after 9.3 to 9.6 migration

2017-01-05 Thread Merlin Moncure
On Tue, Dec 27, 2016 at 5:50 PM, Flávio Henrique  wrote:
> Hi there, fellow experts!
>
> I need an advice with query that became slower after 9.3 to 9.6 migration.
>
> First of all, I'm from the dev team.
>
> Before migration, we (programmers) made some modifications on query bring
> it's average time from 8s to 2-3s.
>
> As this query is the most executed on our system (it builds the user panel
> to work), every bit that we can squeeze from it will be nice.
>
> Now, after server migration to 9.6 we're experiencing bad times with this
> query again.
>
> Unfortunately, I don't have the old query plain (9.3 version) to show you,
> but in the actual version (9.6) I can see some buffers written that tells me
> that something is wrong.
>
> Our server has 250GB of memory available, but the database team says that
> they can't do nothing to make this query better. I'm not sure, as some
> buffers are written on disk.
>
> Any tip/help will be much appreciated (even from the query side).
>
> Thank you!
>
> The query plan: https://explain.depesz.com/s/5KMn
>
> Note: I tried to add index on kilo_victor table already, but Postgresql
> still thinks that is better to do a seq scan.

Hard to provide more without the query or the 'old' plan.   Here are
some things you can try:
*) Set effective_io_concurrency high.You have some heap scanning
going on and this can sometimes help (but it should be marginal).
*) See if you can get any juice out of parallel query
*) try playing with enable_nestloop and enable_seqscan.   these are
hail mary passes but worth a shot.

Run the query back to back with same arguments in the same database
session. Does performance improve?

Big gains (if any) are likely due to indexing strategy.
I do see some suspicious casting, for example:

Join Filter: ((four_charlie.delta_tango)::integer =
(six_quebec.golf_bravo)::integer)

Are you casting in the query or joining through dissimilar data types?
 I suspect your database team might be incorrect.

merlin


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow query after 9.3 to 9.6 migration

2017-01-05 Thread Flávio Henrique
Hi all!
Sorry the delay (holidays).

Well, the most expensive sequencial scan was solved.
I asked the db team to drop the index and recreate it and guess what: now
postgresql is using it and the time dropped.
(thank you, @Gerardo Herzig!)

I think there's still room for improvement, but the problem is not so
crucial right now.
I'll try to investigate every help mentioned here. Thank you all.

@Daniel Blanch
I'll make some tests with a materialized view. Thank you.

> On systems side: ask them if they have not changed anything in
> effective_cache_size and shared_buffers parameters, I presume they haven’t
> change anything related to costs.

Replying your comment, I think they tunned the server:
effective_cache_size = 196GB
shared_buffers = 24GB (this shouldn't be higher?)

@Kevin Grittner
sorry, but I'm not sure when the autovacuum is aggressive enough, but here
my settings related:
autovacuum  |on
autovacuum_analyze_scale_factor |0.05
autovacuum_analyze_threshold|10
autovacuum_freeze_max_age   |2
autovacuum_max_workers  |3
autovacuum_multixact_freeze_max_age |4
autovacuum_naptime  |15s
autovacuum_vacuum_cost_delay|10ms
autovacuum_vacuum_cost_limit|-1
autovacuum_vacuum_scale_factor  |0.1
autovacuum_vacuum_threshold |10
autovacuum_work_mem |-1

@Merlin Moncure

> Big gains (if any) are likely due to indexing strategy.
> I do see some suspicious casting, for example:
> Join Filter: ((four_charlie.delta_tango)::integer =
> (six_quebec.golf_bravo)::integer)
> Are you casting in the query or joining through dissimilar data types?

No casts in query. The joins are on same data types.

Thank you all for the answers. Happy 2017!

Flávio Henrique

"There are only 10 types of people in the world: Those who understand
binary, and those who don't"


On Thu, Jan 5, 2017 at 12:40 PM, Merlin Moncure  wrote:

> On Tue, Dec 27, 2016 at 5:50 PM, Flávio Henrique 
> wrote:
> > Hi there, fellow experts!
> >
> > I need an advice with query that became slower after 9.3 to 9.6
> migration.
> >
> > First of all, I'm from the dev team.
> >
> > Before migration, we (programmers) made some modifications on query bring
> > it's average time from 8s to 2-3s.
> >
> > As this query is the most executed on our system (it builds the user
> panel
> > to work), every bit that we can squeeze from it will be nice.
> >
> > Now, after server migration to 9.6 we're experiencing bad times with this
> > query again.
> >
> > Unfortunately, I don't have the old query plain (9.3 version) to show
> you,
> > but in the actual version (9.6) I can see some buffers written that
> tells me
> > that something is wrong.
> >
> > Our server has 250GB of memory available, but the database team says that
> > they can't do nothing to make this query better. I'm not sure, as some
> > buffers are written on disk.
> >
> > Any tip/help will be much appreciated (even from the query side).
> >
> > Thank you!
> >
> > The query plan: https://explain.depesz.com/s/5KMn
> >
> > Note: I tried to add index on kilo_victor table already, but Postgresql
> > still thinks that is better to do a seq scan.
>
> Hard to provide more without the query or the 'old' plan.   Here are
> some things you can try:
> *) Set effective_io_concurrency high.You have some heap scanning
> going on and this can sometimes help (but it should be marginal).
> *) See if you can get any juice out of parallel query
> *) try playing with enable_nestloop and enable_seqscan.   these are
> hail mary passes but worth a shot.
>
> Run the query back to back with same arguments in the same database
> session. Does performance improve?
>
> Big gains (if any) are likely due to indexing strategy.
> I do see some suspicious casting, for example:
>
> Join Filter: ((four_charlie.delta_tango)::integer =
> (six_quebec.golf_bravo)::integer)
>
> Are you casting in the query or joining through dissimilar data types?
>  I suspect your database team might be incorrect.
>
> merlin
>


Re: [PERFORM] Slow query after 9.3 to 9.6 migration

2017-01-05 Thread Kevin Grittner
On Thu, Jan 5, 2017 at 10:51 AM, Flávio Henrique  wrote:

> Replying your comment, I think they tunned the server:
> effective_cache_size = 196GB
> shared_buffers = 24GB (this shouldn't be higher?)

Probably not, although it may be a good idea to try settings either
side of that (say, 16GB and 32GB) and monitor performance compared
to the current setting.

> autovacuum_max_workers  |3

If you ever see all workers busy at the same time for 30 minutes or
more, you should probably consider raising that so that small,
frequently updated tables are not neglected for too long.

> autovacuum_vacuum_cost_limit|-1

That is going to default to vacuum_cost_limit, which is usually
200.  If the server is actually big enough to merit
"effective_cache_size = 196GB" then you should probably bump this
setting to something like 2000.

> autovacuum_work_mem |-1

That is going to default to maintenance_work_mem.  On a big
machine, you probably want that set to somewhere between 1GB and
2GB.

Some other tuning to the cost parameters might be helpful, but
there's not enough data on the thread to know what else to suggest.
If you hit some other slow query, you might want to report it in
the manner suggested here:

https://wiki.postgresql.org/wiki/SlowQueryQuestions

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow query after 9.3 to 9.6 migration

2017-01-05 Thread Daniel Blanch Bataller
Hi,

If just recreating the index now it uses it, it might mean that the index was 
bloated, that is, it grew so big that it was cheaper a seq scan.

I’ve seen another case recently where postgres 9.6 wasn’t using the right index 
in a query, I was able to reproduce the issue crafting index bigger, much 
bigger than it should be. 

Can you record index size as it is now? Keep this info, and If problem happens 
again check indexes size, and see if they have grow too much.

i.e. SELECT relname, relpages, reltuples FROM pg_class WHERE relname = 
‘index_name'

This might help to see if this is the problem, that indexes are growing too 
much for some reason.

Regards.

P.S the other parameters don't seem to be the cause of the problem to me.

> El 5 ene 2017, a las 17:51, Flávio Henrique  escribió:
> 
> Hi all!
> Sorry the delay (holidays).
> 
> Well, the most expensive sequencial scan was solved.
> I asked the db team to drop the index and recreate it and guess what: now 
> postgresql is using it and the time dropped.
> (thank you, @Gerardo Herzig!)
> 
> I think there's still room for improvement, but the problem is not so crucial 
> right now.
> I'll try to investigate every help mentioned here. Thank you all.
> 
> @Daniel Blanch
> I'll make some tests with a materialized view. Thank you.
> On systems side: ask them if they have not changed anything in 
> effective_cache_size and shared_buffers parameters, I presume they haven’t 
> change anything related to costs.
> Replying your comment, I think they tunned the server:
> effective_cache_size = 196GB
> shared_buffers = 24GB (this shouldn't be higher?)
> 
> @Kevin Grittner
> sorry, but I'm not sure when the autovacuum is aggressive enough, but here my 
> settings related:
> autovacuum  |on
> autovacuum_analyze_scale_factor |0.05  
> autovacuum_analyze_threshold|10
> autovacuum_freeze_max_age   |2 
> autovacuum_max_workers  |3 
> autovacuum_multixact_freeze_max_age |4 
> autovacuum_naptime  |15s   
> autovacuum_vacuum_cost_delay|10ms  
> autovacuum_vacuum_cost_limit|-1
> autovacuum_vacuum_scale_factor  |0.1   
> autovacuum_vacuum_threshold |10
> autovacuum_work_mem |-1
> 
> @Merlin Moncure
> Big gains (if any) are likely due to indexing strategy.
> I do see some suspicious casting, for example:
> Join Filter: ((four_charlie.delta_tango)::integer =
> (six_quebec.golf_bravo)::integer)
> Are you casting in the query or joining through dissimilar data types?
> No casts in query. The joins are on same data types. 
> 
> Thank you all for the answers. Happy 2017!
> 
> Flávio Henrique
> 
> "There are only 10 types of people in the world: Those who understand binary, 
> and those who don't"
> 
> 
> On Thu, Jan 5, 2017 at 12:40 PM, Merlin Moncure  > wrote:
> On Tue, Dec 27, 2016 at 5:50 PM, Flávio Henrique  > wrote:
> > Hi there, fellow experts!
> >
> > I need an advice with query that became slower after 9.3 to 9.6 migration.
> >
> > First of all, I'm from the dev team.
> >
> > Before migration, we (programmers) made some modifications on query bring
> > it's average time from 8s to 2-3s.
> >
> > As this query is the most executed on our system (it builds the user panel
> > to work), every bit that we can squeeze from it will be nice.
> >
> > Now, after server migration to 9.6 we're experiencing bad times with this
> > query again.
> >
> > Unfortunately, I don't have the old query plain (9.3 version) to show you,
> > but in the actual version (9.6) I can see some buffers written that tells me
> > that something is wrong.
> >
> > Our server has 250GB of memory available, but the database team says that
> > they can't do nothing to make this query better. I'm not sure, as some
> > buffers are written on disk.
> >
> > Any tip/help will be much appreciated (even from the query side).
> >
> > Thank you!
> >
> > The query plan: https://explain.depesz.com/s/5KMn 
> > 
> >
> > Note: I tried to add index on kilo_victor table already, but Postgresql
> > still thinks that is better to do a seq scan.
> 
> Hard to provide more without the query or the 'old' plan.   Here are
> some things you can try:
> *) Set effective_io_concurrency high.You have some heap scanning
> going on and this can sometimes help (but it should be marginal).
> *) See if you can get any juice out of parallel query
> *) try playing with enable_nestloop and enable_seqscan.   these are
> hail mary passes but worth a shot.
> 
> Run the query back to back with same arguments in the same database
> session. Does performance improve?
> 
> Big gains (if any) are likely due to indexing strategy.
> I do see some suspicious c

Re: [PERFORM] Slow query after 9.3 to 9.6 migration

2017-01-05 Thread Merlin Moncure
On Thu, Jan 5, 2017 at 10:51 AM, Flávio Henrique  wrote:
> @Merlin Moncure
>>
>> Big gains (if any) are likely due to indexing strategy.
>> I do see some suspicious casting, for example:
>> Join Filter: ((four_charlie.delta_tango)::integer =
>> (six_quebec.golf_bravo)::integer)
>> Are you casting in the query or joining through dissimilar data types?
>
> No casts in query. The joins are on same data types.

well, something is going on.

create table t(i int);
create table t2(i int);
set enable_hashjoin to false;
set enable_mergejoin to false;

yields:

postgres=# explain select * from t join t2 on t.i = t2.i;
QUERY PLAN
──
 Nested Loop  (cost=0.00..97614.88 rows=32512 width=8)
   Join Filter: (t.i = t2.i)
   ->  Seq Scan on t  (cost=0.00..35.50 rows=2550 width=4)
   ->  Materialize  (cost=0.00..48.25 rows=2550 width=4)
 ->  Seq Scan on t2  (cost=0.00..35.50 rows=2550 width=4)

please note the non-casted join filter.

however,

postgres=# explain select * from t join t2 on t.i::bigint = t2.i::bigint;
QUERY PLAN
──
 Nested Loop  (cost=0.00..130127.38 rows=32512 width=8)
   Join Filter: ((t.i)::bigint = (t2.i)::bigint)
   ->  Seq Scan on t  (cost=0.00..35.50 rows=2550 width=4)
   ->  Materialize  (cost=0.00..48.25 rows=2550 width=4)
 ->  Seq Scan on t2  (cost=0.00..35.50 rows=2550 width=4)

notice the casts in the join filter.   Furthermore, please note the
higher  query cost due to the server accounting for the casting
involved in the join.  Any kind of non-equality based operation in a
join or the predicate side of a where condition can get very expensive
very quickly.   (it remains difficult to see if there's any way to
improve the join operation due to lack of visibility on the query
string).

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow query after 9.3 to 9.6 migration

2017-01-05 Thread Filipe Oliveira
Can you remove me from your mailing list?

Thanks.