On Mon, 2 May 2022 at 16:44, Matthias van de Meent <boekewurm+postg...@gmail.com> wrote:
> On Mon, 2 May 2022 at 16:09, Pavel Stehule <pavel.steh...@gmail.com> wrote:
> >
> > On Mon, 2 May 2022 at 16:02, Pavel Stehule <pavel.steh...@gmail.com> wrote:
> >> Only shared_buffers was changed, to 32GB, and work_mem, to 70MB.
> >> Unfortunately it is a production environment with customer data, so I
> >> cannot experiment too much.
> >>
> >> This is the perf profile of the slow query:
> >>
> >>   25,94%  postmaster  [kernel.kallsyms]  [k] clear_page_erms
> >>   11,06%  postmaster  [kernel.kallsyms]  [k] page_fault
> >>    5,51%  postmaster  [kernel.kallsyms]  [k] prepare_exit_to_usermode
> >>    5,18%  postmaster  [kernel.kallsyms]  [k] __list_del_entry_valid
> >>    5,15%  postmaster  libc-2.28.so       [.] __memset_avx2_erms
> >>    3,99%  postmaster  [kernel.kallsyms]  [k] unmap_page_range
> >>    3,07%  postmaster  postgres           [.] hash_search_with_hash_value
> >>    2,73%  postmaster  [kernel.kallsyms]  [k] cgroup_throttle_swaprate
> >>    2,49%  postmaster  postgres           [.] heap_page_prune_opt
> >>    1,92%  postmaster  [kernel.kallsyms]  [k] try_charge
> >>    1,85%  postmaster  [kernel.kallsyms]  [k] swapgs_restore_regs_and_return_to_usermode
> >>    1,82%  postmaster  [kernel.kallsyms]  [k] error_entry
> >>    1,73%  postmaster  postgres           [.] _bt_checkkeys
> >>    1,48%  postmaster  [kernel.kallsyms]  [k] free_pcppages_bulk
> >>    1,35%  postmaster  [kernel.kallsyms]  [k] get_page_from_freelist
> >>    1,20%  postmaster  [kernel.kallsyms]  [k] __pagevec_lru_add_fn
> >>    1,08%  postmaster  [kernel.kallsyms]  [k] percpu_ref_put_many.constprop.84
> >>    1,08%  postmaster  postgres           [.] 0x00000000003c1be6
> >>    1,06%  postmaster  [kernel.kallsyms]  [k] get_mem_cgroup_from_mm.part.49
> >>    0,86%  postmaster  [kernel.kallsyms]  [k] __handle_mm_fault
> >>    0,79%  postmaster  [kernel.kallsyms]  [k] mem_cgroup_charge
> >>    0,70%  postmaster  [kernel.kallsyms]  [k] release_pages
> >>    0,61%  postmaster  postgres           [.] _bt_checkpage
> >>    0,61%  postmaster  [kernel.kallsyms]  [k] free_pages_and_swap_cache
> >>    0,60%  postmaster  [kernel.kallsyms]  [k] handle_mm_fault
> >>    0,57%  postmaster  postgres           [.] tbm_iterate
> >>    0,56%  postmaster  [kernel.kallsyms]  [k] __count_memcg_events.part.70
> >>    0,55%  postmaster  [kernel.kallsyms]  [k] __mod_memcg_lruvec_state
> >>    0,52%  postmaster  postgres           [.] 0x000000000015f6e5
> >>    0,50%  postmaster  [kernel.kallsyms]  [k] prep_new_page
> >>    0,49%  postmaster  [kernel.kallsyms]  [k] __do_page_fault
> >>    0,46%  postmaster  [kernel.kallsyms]  [k] _raw_spin_lock
> >>    0,44%  postmaster  [kernel.kallsyms]  [k] do_anonymous_page
> >>
> >> This is the fast one:
> >>
> >>   21,13%  postmaster  postgres      [.] hash_search_with_hash_value
> >>   15,33%  postmaster  postgres      [.] heap_page_prune_opt
> >>   10,22%  postmaster  libc-2.28.so  [.] __memset_avx2_erms
> >>   10,00%  postmaster  postgres      [.] _bt_checkkeys
> >>    6,23%  postmaster  postgres      [.] 0x00000000003c1be6
> >>    4,94%  postmaster  postgres      [.] _bt_checkpage
> >>    2,85%  postmaster  postgres      [.] tbm_iterate
> >>    2,31%  postmaster  postgres      [.] nocache_index_getattr
> >>    2,13%  postmaster  postgres      [.] pg_qsort
> >>    1,58%  postmaster  postgres      [.] heap_hot_search_buffer
> >>    1,58%  postmaster  postgres      [.] FunctionCall2Coll
> >>    1,58%  postmaster  postgres      [.] 0x000000000015f6e5
> >>    1,17%  postmaster  postgres      [.] LWLockRelease
> >>    0,85%  postmaster  libc-2.28.so  [.] __memcmp_avx2_movbe
> >>    0,64%  postmaster  postgres      [.] 0x00000000003e4233
> >>    0,54%  postmaster  postgres      [.] hash_bytes
> >>    0,53%  postmaster  postgres      [.] 0x0000000000306fbb
> >>    0,53%  postmaster  postgres      [.] LWLockAcquire
> >>    0,42%  postmaster  postgres      [.] 0x00000000003c1c6f
> >>    0,42%  postmaster  postgres      [.] _bt_compare
> >>
> >
> > It looks like memoization allocates a lot of memory - maybe there is
> > some temporary memory leak.
>
> Memoization doesn't leak memory any more than hash tables do; so I
> doubt that that is the issue.
>
> > Performance counter stats for process id '4004464':
> >
> >              84,26 msec task-clock          #    0,012 CPUs utilized
> >                   3      context-switches   #    0,036 K/sec
> >                   0      cpu-migrations     #    0,000 K/sec
> >                  19      page-faults        #    0,225 K/sec
> >                   0      cycles             #    0,000 GHz
> >         106 873 995      instructions
> >          20 225 431      branches           #  240,026 M/sec
> >             348 834      branch-misses      #    1,72% of all branches
> >
> >        7,106142051 seconds time elapsed
> >
>
> Assuming the above was for the fast query
>
> > Performance counter stats for process id '4004464':
> >
> >           1 116,97 msec task-clock          #    0,214 CPUs utilized
> >                   4      context-switches   #    0,004 K/sec
> >                   0      cpu-migrations     #    0,000 K/sec
> >              99 349      page-faults        #    0,089 M/sec
> >                   0      cycles             #    0,000 GHz
> >         478 842 411      instructions
> >          89 495 015      branches           #   80,123 M/sec
> >           1 014 763      branch-misses      #    1,13% of all branches
> >
> >        5,221116331 seconds time elapsed
>
> ... and this for the slow one:
>
> It seems like this system is actively swapping memory, which has a
> negative impact on your system. This seems to be indicated by the high
> number of page faults and the large amount of time spent in the kernel
> (as per the perf report one mail earlier). Maybe too much (work) memory
> was assigned and the machine you're running on doesn't have that
> amount of memory left?
>

This computer has 354GB of RAM, and 63GB of RAM is free (unused memory).
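
For what it's worth, swap use by a single backend can be checked with something
like the commands below (only a sketch; 4004464 is simply the backend PID from
the perf stat runs quoted above):

    # swap used by that one backend, in kB; 0 means it is not swapped out
    grep VmSwap /proc/4004464/status

    # system-wide swap-in/swap-out activity (si/so columns), one sample per second
    vmstat 1 5

    # overall memory and swap headroom
    free -h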
> Either way, seeing that so much time is spent in the kernel, I don't
> think that PostgreSQL is the main/only source of the slow query here,
> so I don't think pgsql-hackers is the right place to continue with
> this conversation.
>

I can see this issue only when Memoize is enabled, so it looks like a Postgres
issue. 400MB of work mem is not too much. (A sketch of the on/off comparison I
mean is at the end of this mail.)

> Regards,
>
> Matthias
>
> PS. Maybe next time start off in
> https://www.postgresql.org/list/pgsql-performance/ if you have
> performance issues with unknown origin.
> The wiki also has some nice tips for debugging performance issues:
> https://wiki.postgresql.org/wiki/Slow_Query_Questions
>
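For clarity, the Memoize on/off comparison I am referring to looks roughly like
the sketch below (the database name and the t1/t2 tables and join are
placeholders, not the real production query; enable_memoize is the per-session
planner switch available since PostgreSQL 14):

    # run the query once with the default plan (the one that, in the problematic
    # case, contains a Memoize node)
    psql -d mydb -c "EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM t1 JOIN t2 ON t2.t1_id = t1.id"

    # and once with Memoize disabled, so the planner picks another join strategy
    psql -d mydb -c "SET enable_memoize = off; EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM t1 JOIN t2 ON t2.t1_id = t1.id"

In my case the slow behaviour shows up only in the first variant.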