Hi,

I have found a strange case where a query (which reads a large number of
shared buffers) runs 2x faster on the second and subsequent runs in a new
connection to the database.
For example:

postgres@base1:~$ psql ***
psql (9.3.10)

***=# explain (analyze, costs, buffers, timing) select *         from
transactions where "timestamp" >= '2016-02-20' and "timestamp"  <
'2016-02-23'::date;

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using i_transactions_timestamp on transactions
(cost=0.57..138824.52 rows=1955459 width=790) (actual time=0.146..3416.477
rows=1950630 loops=1)
   Index Cond: (("timestamp" >= '2016-02-20 00:00:00'::timestamp without
time zone) AND ("timestamp" < '2016-02-23'::date))
   Buffers: shared hit=1965635
 Total runtime: 3481.322 ms
(4 rows)

***=# explain (analyze, costs, buffers, timing) select *         from
transactions where "timestamp" >= '2016-02-20' and "timestamp"  <
'2016-02-23'::date;

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using i_transactions_timestamp on transactions
(cost=0.57..138824.52 rows=1955459 width=790) (actual time=0.030..1812.361
rows=1950630 loops=1)
   Index Cond: (("timestamp" >= '2016-02-20 00:00:00'::timestamp without
time zone) AND ("timestamp" < '2016-02-23'::date))
   Buffers: shared hit=1965635
 Total runtime: 1878.503 ms

Every run after that takes 1.8-1.9s, but if I establish a new connection
to the database, the first query takes 3.5s again.

The time difference and the timing of each run are quite repeatable (+/- 100ms).

Here is perf report data for the first and for the second run:

The first run (something fishy with kernel calls):
  19,60%  postgres  [kernel.kallsyms]  [k] filemap_map_pages
  15,86%  postgres  postgres           [.] hash_search_with_hash_value
   8,20%  postgres  postgres           [.] heap_hot_search_buffer
   8,20%  postgres  postgres           [.] heap_page_prune_opt
   5,72%  postgres  postgres           [.] PinBuffer
   4,38%  postgres  [kernel.kallsyms]  [k] page_fault
   4,04%  postgres  [kernel.kallsyms]  [k] page_waitqueue
   3,55%  postgres  [kernel.kallsyms]  [k] __wake_up_bit
   2,95%  postgres  postgres           [.] LWLockAcquire
   2,31%  postgres  [kernel.kallsyms]  [k] unlock_page
   1,96%  postgres  [vdso]             [.] __vdso_gettimeofday
   1,83%  postgres  [kernel.kallsyms]  [k] radix_tree_next_chunk
   1,77%  postgres  [kernel.kallsyms]  [k] page_add_file_rmap
   1,66%  postgres  postgres           [.] _bt_checkkeys
   1,27%  postgres  postgres           [.] LWLockRelease

The second run (looks perfectly OK for this kind of query):
  27,74%  postgres  postgres           [.] hash_search_with_hash_value
  15,51%  postgres  postgres           [.] heap_hot_search_buffer
  14,28%  postgres  postgres           [.] heap_page_prune_opt
   9,22%  postgres  postgres           [.] PinBuffer
   3,94%  postgres  [vdso]             [.] __vdso_gettimeofday
   3,32%  postgres  postgres           [.] _bt_checkkeys
   3,29%  postgres  postgres           [.] LWLockAcquire
   2,86%  postgres  postgres           [.] LWLockRelease
   1,54%  postgres  postgres           [.] UnpinBuffer


So it looks like something strange is going on inside Linux kernel 3.16.0
memory management (it is a vanilla kernel on bare hardware, no
virtualization, swap off).
The question is: does this work as expected (in which case it is probably a
good idea to use pgbouncer even for one-shot analytical queries), or is it a
sign of a potential issue with the kernel?



-- 
Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.com/

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
Jabber: maxim.bo...@gmail.com

"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."
