On Mon, May 4, 2020 at 5:43 PM Peter <p...@citylink.dinoex.sub.org> wrote:
>
> Hi all,
> I have something that looks a bit insane:
>
> # ps axl | grep 6145
> UID   PID  PPID CPU PRI NI     VSZ    RSS MWCHAN STAT TT      TIME COMMAND
> 770  6145     1   0  20  0  241756    868 select SsJ  -    0:24.62 /usr/local/bin/postgres -D
> 770  6147  6145   0  23  0  243804 109784 select IsJ  -    3:18.52 postgres: checkpointer (
> 770  6148  6145   0  20  0  241756  21348 select SsJ  -    2:02.83 postgres: background writer
> 770  6149  6145   0  20  0  241756   7240 select SsJ  -   16:36.80 postgres: walwriter (pos
> 770  6150  6145   0  20  0   21980    876 select SsJ  -    0:13.92 postgres: archiver last w
> 770  6151  6145   0  20  0   21980    980 select SsJ  -    0:58.45 postgres: stats collector
> 770  6152  6145   0  20  0  241756   1268 select IsJ  -    0:02.07 postgres: logical replicati
> 770 43315  6145   0  21  0  251844   7520 select IsJ  -    1:07.74 postgres: admin postgres 19
> 770 43317  6145   0  25  0  251764   8684 select IsJ  -    1:28.89 postgres: admin bareos 192.
> 770 43596  6145   0  20  0  245620   4476 select IsJ  -    0:00.12 postgres: admin bareos 192.
> 770 43761  6145   0  20  0  245620   4476 select IsJ  -    0:00.15 postgres: admin bareos 192.
> 770 90206  6145   0  52  0 1331256 219720 racct  DsJ  -  563:45.41 postgres: bareos bareos 192
>
> PID 90206 is continuously growing. It is the unspecific, all-purpose
> worker for the www.bareos.com backup tool, so it is a bit difficult to
> figure out what precisely it does - but it tends to run rather simple,
> straightforward queries, so it is unlikely to have dozens of "internal
> sort operations and hash tables".
>
> What I can say is that at times this worker is completely idle in
> ClientRead, but does not shrink in memory. Is this normal behaviour?
>
> Here is a more dynamic picture: it continues to add 2048kB chunks (and
> does not do noticeable paging):
>
> UID   PID  PPID CPU PRI NI     VSZ    RSS MWCHAN STAT TT      TIME COMMAND
> Mon May 4 13:33:09 CEST 2020
> 770 90206  6145   0  91  0 1335352 226900 -      RsJ  -  569:09.19 postgres: bareos bareos SELECT (postgres)
> Mon May 4 13:33:39 CEST 2020
> 770 90206  6145   0  93  0 1335352 227696 -      RsJ  -  569:28.48 postgres: bareos bareos idle (postgres)
> Mon May 4 13:34:09 CEST 2020
> 770 90206  6145   0  92  0 1337400 228116 -      RsJ  -  569:47.46 postgres: bareos bareos SELECT (postgres)
> Mon May 4 13:34:39 CEST 2020
> 770 90206  6145   0  92  0 1337400 228596 -      RsJ  -  570:06.56 postgres: bareos bareos UPDATE (postgres)
> Mon May 4 13:35:09 CEST 2020
> 770 90206  6145   0  92  0 1337400 228944 -      RsJ  -  570:25.62 postgres: bareos bareos SELECT (postgres)
> Mon May 4 13:35:40 CEST 2020
> 770 90206  6145   0  52  0 1337400 229288 racct  DsJ  -  570:44.33 postgres: bareos bareos UPDATE (postgres)
> Mon May 4 13:36:10 CEST 2020
> 770 90206  6145   0  91  0 1337400 229952 -      RsJ  -  571:03.20 postgres: bareos bareos SELECT (postgres)
> Mon May 4 13:36:40 CEST 2020
> 770 90206  6145   0  52  0 1337400 223772 racct  DsJ  -  571:21.50 postgres: bareos bareos SELECT (postgres)
> Mon May 4 13:37:10 CEST 2020
> 770 90206  6145   0  91  0 1337400 224448 -      RsJ  -  571:40.63 postgres: bareos bareos idle (postgres)
> Mon May 4 13:37:40 CEST 2020
> 770 90206  6145   0  91  0 1339448 225464 -      RsJ  -  571:58.36 postgres: bareos bareos SELECT (postgres)
> Mon May 4 13:38:10 CEST 2020
> 770 90206  6145   0  52  0 1339448 215620 select SsJ  -  572:14.24 postgres: bareos bareos idle (postgres)
> Mon May 4 13:38:40 CEST 2020
> 770 90206  6145   0  81  0 1339448 215320 -      RsJ  -  572:21.09 postgres: bareos bareos idle (postgres)
> Mon May 4 13:39:10 CEST 2020
>
> OS is FreeBSD 11.3-RELEASE-p8 r360175M i386
> PostgreSQL 12.2 on i386-portbld-freebsd11.3, compiled by gcc9 (FreeBSD
> Ports Collection) 9.3.0, 32-bit
>
> autovacuum is Disabled.
>
> The memory-specific config is:
> shared_buffers = 200MB
> temp_buffers = 40MB
> work_mem = 80MB
> maintenance_work_mem = 250MB
> dynamic_shared_memory_type = posix
> random_page_cost = 2.0
> effective_cache_size = 1GB
> (others are left at default)
>
> I remember vaguely that there are means to have a closer look into
> what is using the memory, but do not recall the specifics. Some
> pointers or ideas on how to proceed would be gladly appreciated (Dtrace
> should work) - processes will usually fail with OOM at this size, due
> to the machine configuration - I'm waiting for that now (it is a very
> very old pentium3 machine ;) ).
One idea is that you can attach to the process with gdb and call
MemoryContextStats(TopMemoryContext). This will show which context is
using how much memory. So basically you can call this function 2-3 times
at some interval and see in which context the memory is continuously
increasing.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
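For illustration, a minimal session against the growing backend (PID 90206
from the ps output above) could look like the sketch below; it assumes gdb
is installed on the box and that the postgres binary was not stripped of
symbols:

  # attach to the running backend (run as root or as the postgres user)
  gdb -p 90206
  # inside gdb: dump the per-context memory statistics of that backend
  (gdb) call MemoryContextStats(TopMemoryContext)
  # let the backend continue and leave gdb
  (gdb) detach
  (gdb) quit

The dump is written to the backend's stderr, so it shows up in the server
log (or wherever stderr is redirected), not in the gdb session itself.
Repeating the call a few minutes apart and comparing the dumps should show
which context keeps growing.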