On Mon, 2 May 2022 at 16:44, Matthias van de Meent <boekewurm+postg...@gmail.com> wrote:
> On Mon, 2 May 2022 at 16:09, Pavel Stehule <pavel.steh...@gmail.com> wrote:
> >
> > On Mon, 2 May 2022 at 16:02, Pavel Stehule <pavel.steh...@gmail.com> wrote:
> >> Only shared_buffers was changed, to 32GB, and work_mem, to 70MB.
> >> Unfortunately it is a production environment with customer data, so I
> >> cannot experiment too much.
> >>
> >> This is the perf profile of the slow query:
> >>
> >>   25,94%  postmaster  [kernel.kallsyms]  [k] clear_page_erms
> >>   11,06%  postmaster  [kernel.kallsyms]  [k] page_fault
> >>    5,51%  postmaster  [kernel.kallsyms]  [k] prepare_exit_to_usermode
> >>    5,18%  postmaster  [kernel.kallsyms]  [k] __list_del_entry_valid
> >>    5,15%  postmaster  libc-2.28.so       [.] __memset_avx2_erms
> >>    3,99%  postmaster  [kernel.kallsyms]  [k] unmap_page_range
> >>    3,07%  postmaster  postgres           [.] hash_search_with_hash_value
> >>    2,73%  postmaster  [kernel.kallsyms]  [k] cgroup_throttle_swaprate
> >>    2,49%  postmaster  postgres           [.] heap_page_prune_opt
> >>    1,92%  postmaster  [kernel.kallsyms]  [k] try_charge
> >>    1,85%  postmaster  [kernel.kallsyms]  [k] swapgs_restore_regs_and_return_to_usermode
> >>    1,82%  postmaster  [kernel.kallsyms]  [k] error_entry
> >>    1,73%  postmaster  postgres           [.] _bt_checkkeys
> >>    1,48%  postmaster  [kernel.kallsyms]  [k] free_pcppages_bulk
> >>    1,35%  postmaster  [kernel.kallsyms]  [k] get_page_from_freelist
> >>    1,20%  postmaster  [kernel.kallsyms]  [k] __pagevec_lru_add_fn
> >>    1,08%  postmaster  [kernel.kallsyms]  [k] percpu_ref_put_many.constprop.84
> >>    1,08%  postmaster  postgres           [.] 0x00000000003c1be6
> >>    1,06%  postmaster  [kernel.kallsyms]  [k] get_mem_cgroup_from_mm.part.49
> >>    0,86%  postmaster  [kernel.kallsyms]  [k] __handle_mm_fault
> >>    0,79%  postmaster  [kernel.kallsyms]  [k] mem_cgroup_charge
> >>    0,70%  postmaster  [kernel.kallsyms]  [k] release_pages
> >>    0,61%  postmaster  postgres           [.] _bt_checkpage
> >>    0,61%  postmaster  [kernel.kallsyms]  [k] free_pages_and_swap_cache
> >>    0,60%  postmaster  [kernel.kallsyms]  [k] handle_mm_fault
> >>    0,57%  postmaster  postgres           [.] tbm_iterate
> >>    0,56%  postmaster  [kernel.kallsyms]  [k] __count_memcg_events.part.70
> >>    0,55%  postmaster  [kernel.kallsyms]  [k] __mod_memcg_lruvec_state
> >>    0,52%  postmaster  postgres           [.] 0x000000000015f6e5
> >>    0,50%  postmaster  [kernel.kallsyms]  [k] prep_new_page
> >>    0,49%  postmaster  [kernel.kallsyms]  [k] __do_page_fault
> >>    0,46%  postmaster  [kernel.kallsyms]  [k] _raw_spin_lock
> >>    0,44%  postmaster  [kernel.kallsyms]  [k] do_anonymous_page
> >>
> >> This is the fast one:
> >>
> >>   21,13%  postmaster  postgres      [.] hash_search_with_hash_value
> >>   15,33%  postmaster  postgres      [.] heap_page_prune_opt
> >>   10,22%  postmaster  libc-2.28.so  [.] __memset_avx2_erms
> >>   10,00%  postmaster  postgres      [.] _bt_checkkeys
> >>    6,23%  postmaster  postgres      [.] 0x00000000003c1be6
> >>    4,94%  postmaster  postgres      [.] _bt_checkpage
> >>    2,85%  postmaster  postgres      [.] tbm_iterate
> >>    2,31%  postmaster  postgres      [.] nocache_index_getattr
> >>    2,13%  postmaster  postgres      [.] pg_qsort
> >>    1,58%  postmaster  postgres      [.] heap_hot_search_buffer
> >>    1,58%  postmaster  postgres      [.] FunctionCall2Coll
> >>    1,58%  postmaster  postgres      [.] 0x000000000015f6e5
> >>    1,17%  postmaster  postgres      [.] LWLockRelease
> >>    0,85%  postmaster  libc-2.28.so  [.] __memcmp_avx2_movbe
> >>    0,64%  postmaster  postgres      [.] 0x00000000003e4233
> >>    0,54%  postmaster  postgres      [.] hash_bytes
> >>    0,53%  postmaster  postgres      [.] 0x0000000000306fbb
> >>    0,53%  postmaster  postgres      [.] LWLockAcquire
> >>    0,42%  postmaster  postgres      [.] 0x00000000003c1c6f
> >>    0,42%  postmaster  postgres      [.] _bt_compare
> >>
> >
> > It looks like memoization allocates a lot of memory - maybe there is
> > some temporary memory leak.
>
> Memoization doesn't leak memory any more than hash tables do; so I
> doubt that that is the issue.
>
> > Performance counter stats for process id '4004464':
> >
> >              84,26 msec task-clock          #    0,012 CPUs utilized
> >                   3      context-switches   #    0,036 K/sec
> >                   0      cpu-migrations     #    0,000 K/sec
> >                  19      page-faults        #    0,225 K/sec
> >                   0      cycles             #    0,000 GHz
> >         106 873 995      instructions
> >          20 225 431      branches           #  240,026 M/sec
> >             348 834      branch-misses      #    1,72% of all branches
> >
> >        7,106142051 seconds time elapsed
> >
>
> Assuming the above was for the fast query
>
> > Performance counter stats for process id '4004464':
> >
> >           1 116,97 msec task-clock          #    0,214 CPUs utilized
> >                   4      context-switches   #    0,004 K/sec
> >                   0      cpu-migrations     #    0,000 K/sec
> >              99 349      page-faults        #    0,089 M/sec
> >                   0      cycles             #    0,000 GHz
> >         478 842 411      instructions
> >          89 495 015      branches           #   80,123 M/sec
> >           1 014 763      branch-misses      #    1,13% of all branches
> >
> >        5,221116331 seconds time elapsed
>
> ... and this for the slow one:
>
> It seems like this system is actively swapping memory, which has a
> negative impact on your system. This seems to be indicated by the high
> number of page faults and the large amount of time spent in the kernel
> (as per the perf report one mail earlier). Maybe too much (work) memory
> was assigned and the machine you're running on doesn't have that
> amount of memory left?
>

This computer has 354GB of RAM, and 63GB of RAM is free (unused memory).
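
For what it's worth, swap use by a single backend can be checked with something
like the commands below (only a sketch; 4004464 is simply the backend PID from
the perf stat runs quoted above):

    # swap used by that one backend, in kB; 0 means it is not swapped out
    grep VmSwap /proc/4004464/status

    # system-wide swap-in/swap-out activity (si/so columns), one sample per second
    vmstat 1 5

    # overall memory and swap headroom
    free -h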
> Either way, seeing that so much time is spent in the kernel, I don't
> think that PostgreSQL is the main/only source of the slow query here,
> so I don't think pgsql-hackers is the right place to continue with
> this conversation.
>

I can see this issue only when Memoize is enabled, so it looks like a Postgres
issue. 400MB of work mem is not too much. (A sketch of the on/off comparison I
mean is at the end of this mail.)

> Regards,
>
> Matthias
>
> PS. Maybe next time start off in
> https://www.postgresql.org/list/pgsql-performance/ if you have
> performance issues with unknown origin.
> The wiki also has some nice tips for debugging performance issues:
> https://wiki.postgresql.org/wiki/Slow_Query_Questions
>
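For clarity, the Memoize on/off comparison I am referring to looks roughly like
the sketch below (the database name and the t1/t2 tables and join are
placeholders, not the real production query; enable_memoize is the per-session
planner switch available since PostgreSQL 14):

    # run the query once with the default plan (the one that, in the problematic
    # case, contains a Memoize node)
    psql -d mydb -c "EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM t1 JOIN t2 ON t2.t1_id = t1.id"

    # and once with Memoize disabled, so the planner picks another join strategy
    psql -d mydb -c "SET enable_memoize = off; EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM t1 JOIN t2 ON t2.t1_id = t1.id"

In my case the slow behaviour shows up only in the first variant.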