Re: Why could different data in a table be processed with different performance?

2018-09-22 Thread Vladimir Ryabtsev
> is the length of the text equally distributed over the 2 partitions?
Not 100% equally, but to me it does not seem to be a big deal...
Considering the ranges independently:
First range: ~70% < 10 KB, ~25% for 10-20 KB, ~3% for 20-30 KB, everything
else is less than 1% (with 10 KB steps).
Second range: ~80% < 10 KB, ~18% for 10-20 KB, ~2% for 20-30 KB, everything
else is less than 1% (with 10 KB steps).

> From what you posted, the first query retrieves 5005 rows, but the second
2416. It might be helpful if we are able to compare 5000 vs 5000
Yes, it was just an example; here are the plans for approximately the same
number of rows:

Aggregate  (cost=9210.12..9210.13 rows=1 width=16) (actual time=4265.478..4265.479 rows=1 loops=1)
  Buffers: shared hit=27027 read=4311
  I/O Timings: read=2738.728
  ->  Index Scan using articles_pkey on articles  (cost=0.57..9143.40 rows=5338 width=107) (actual time=12.254..873.081 rows=5001 loops=1)
        Index Cond: ((article_id >= 43803) AND (article_id <= 438035000))
        Buffers: shared hit=4282 read=710
        I/O Timings: read=852.547
Planning time: 0.235 ms
Execution time: 4265.554 ms

Aggregate  (cost=11794.59..11794.60 rows=1 width=16) (actual time=62298.559..62298.559 rows=1 loops=1)
  Buffers: shared hit=15071 read=14847
  I/O Timings: read=60703.859
  ->  Index Scan using articles_pkey on articles  (cost=0.57..11709.13 rows=6837 width=107) (actual time=24.686..24582.221 rows=5417 loops=1)
        Index Cond: ((article_id >= '10002104'::bigint) AND (article_id <= '10002106'::bigint))
        Buffers: shared hit=195 read=5244
        I/O Timings: read=24507.621
Planning time: 0.494 ms
Execution time: 62298.630 ms

If we subtract I/O from total time, we get 1527 ms vs 1595 ms — very close
timings for everything other than I/O (considering the slightly higher
number of rows in the second case). But I/O time differs dramatically.
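As a quick check, that subtraction can be reproduced directly from the plan
numbers above (pure arithmetic):

```sql
-- Execution time minus 'I/O Timings: read' for each plan.
SELECT 4265.554 - 2738.728   AS fast_non_io_ms,  -- ~1527
       62298.630 - 60703.859 AS slow_non_io_ms;  -- ~1595
```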

> Also is worth noticing that the 'estimated' differs from 'actual' on the
second query. I think that happens because data is differently distributed
over the ranges. Probably the analyzer does not have enough samples to
understand the real distribution.
I think we should not worry about it unless the planner chooses a poor
plan, should we? Statistics affect picking a proper plan, but not the
execution of the plan, don't they?

> You might try to increase the number of samples (and run analyze)
To be honest, I don't understand it... As far as I know, in Postgres we
have two options: set the per-column statistics target and set n_distinct.
We can't increase the fraction of rows analyzed directly (in some other
DBMSs you can set the ANALYZE percentage explicitly). Moreover, in our case
the problem column is a PRIMARY KEY with all values distinct. Could you
point me to what exactly I should do?
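For what it's worth, the sample size can be raised per column via the
statistics target; a sketch, assuming the table and column names from the
plans above (ANALYZE samples roughly 300 x target rows):

```sql
-- Raise the per-column statistics target (default 100, max 10000)
-- and re-analyze so the planner sees the larger sample.
ALTER TABLE articles ALTER COLUMN article_id SET STATISTICS 1000;
ANALYZE articles;
```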

> or to create partial indexes on the 2 ranges.
Sure, I will try it with partial indexes. Should I drop the existing PK
index, or is it enough to ensure that the planner picks the range index?
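A partial index per range might look like the following; the boundary value
here is a placeholder, not the real split point between the two ranges:

```sql
-- Hypothetical partial indexes, one per article_id range.
-- The planner uses a partial index only when the query's WHERE clause
-- provably implies the index predicate.
CREATE INDEX articles_id_lo_idx ON articles (article_id)
  WHERE article_id < 1000000000;
CREATE INDEX articles_id_hi_idx ON articles (article_id)
  WHERE article_id >= 1000000000;
```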

> i would do a sync at the end, after dropping caches.
A bit off-topic, but why? Doing sync afterwards may put something into the
cache again.
https://linux-mm.org/Drop_Caches
https://unix.stackexchange.com/a/82164/309344
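For reference, the order recommended by the linux-mm page above is the
opposite: sync first, then drop (a sketch; requires root):

```shell
# Write out dirty pages first, so that dropping the cache can actually
# free them; then drop page cache, dentries and inodes (3 = all).
sync
echo 3 > /proc/sys/vm/drop_caches
```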

> - does the raid controller have a cache?
> - how big is the cache? (when you measure disk speed, that will influence
the result very much, if  you do not run the test on big-enough data chunk)
best if is disabled during your tests
I am pretty sure there is some; usually it's several tens of megabytes.
But I ran disk read tests several times with chunks that could not fit in
the cache and with random offsets, so I am fairly confident that the figure
of around 500 MB/s is reasonably accurate (but it is only for sequential
reads).

> - is the OS caching disk blocks too? maybe you want to drop everything
from there too.
How can I find that out? And how do I drop it? Or do you mean the
hypervisor OS? Anyway, don't you think that caching specifics alone cannot
really explain these issues?

> I think that you should be pragmatic and try to run the tests on a
physical machine.
I wish I could do it, but it is hardly possible. At some point we may
migrate the DB to physical hosts, but for now we need to make it work in a
virtual one.

> on the VM or on the physical host?
On the VM. The physical host is Windows (no iotop) and I have no access to
it.

Vlad


Re: Why could different data in a table be processed with different performance?

2018-09-22 Thread Vladimir Ryabtsev
> I think reindex will improve the heap access..and maybe the index access
too. I don't see why it would be bloated without UPDATE/DELETE, but you
could check to see if its size changes significantly after reindex.
I tried REINDEX, and the size of the PK index changed from 2579 to 1548 MB.
But tests don't show any significant improvement over what it was. Maybe
read speed for the "big" range became just slightly faster on average.
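The before/after size check can be done like this (index name taken from
the plans earlier in the thread):

```sql
-- Compare index size before and after the rebuild; note that plain
-- REINDEX locks out writes to the table while it runs.
SELECT pg_size_pretty(pg_relation_size('articles_pkey'));
REINDEX INDEX articles_pkey;
SELECT pg_size_pretty(pg_relation_size('articles_pkey'));
```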

Vlad


Re: Multi-second pauses blocking even trivial activity

2018-09-22 Thread Andreas Kretschmer

On 21.09.2018 at 21:07, Patrick Molgaard wrote:
> Andreas -- just following up to say that this was indeed the root
> cause. Thanks again.

Glad I could help you.


Regards, Andreas

--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com




Re: Why could different data in a table be processed with different performance?

2018-09-22 Thread didier
Hi,
Assuming the DB is quiescent.

What if you run

select count(*) from articles where article_id between %s and %s

i.e. without reading the json: is your buffers hit count increasing?
20 000 8K blocks * 2 is ~500 MB, which should be in RAM after the first run.
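didier's probe could be run with buffer accounting made explicit (:lo and
:hi stand for the %s placeholders above):

```sql
-- Count via the PK index only, without detoasting the wide json column;
-- 'Buffers: shared hit=...' on a second run shows whether the index
-- pages are being served from shared_buffers.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM articles
WHERE article_id BETWEEN :lo AND :hi;
```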

Fast:
read=710, I/O Timings: read=852.547 ==> ~1.2 ms/IO
~800 IO/s: some cached blocks, sequential reads, or a good RAID layout.

Slow:
read=5244, I/O Timings: read=24507.621 ==> ~4.7 ms/IO
~200 IO/s: more HD reads? more seeks? slower HD zones?
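The per-read figures above fall straight out of the plan numbers (pure
arithmetic, runnable anywhere):

```sql
-- I/O time divided by blocks read gives ms per read; inverting gives
-- the implied reads per second for each range.
SELECT 852.547 / 710            AS fast_ms_per_read,   -- ~1.2
       24507.621 / 5244         AS slow_ms_per_read,   -- ~4.7
       710 * 1000 / 852.547     AS fast_reads_per_sec, -- ~833
       5244 * 1000 / 24507.621  AS slow_reads_per_sec; -- ~214
```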

Maybe you can play with the PG cache size (shared_buffers).


On Sat, Sep 22, 2018 at 12:32 PM Vladimir Ryabtsev 
wrote:

> > I think reindex will improve the heap access..and maybe the index access
> too. I don't see why it would be bloated without UPDATE/DELETE, but you
> could check to see if its size changes significantly after reindex.
> I tried REINDEX, and size of PK index changed from 2579 to 1548 MB.
> But test don't show any significant improvement from what it was. May be
> read speed for the "big" range became just slightly faster in average.
>
> Vlad
>
>