On Monday, December 10, 2018, 7:45:07 PM EST, Tom Lane <t...@sss.pgh.pa.us> wrote:

Thomas Carroll <tomfecarr...@yahoo.com> writes:
> On Monday, December 10, 2018, 5:50:22 PM EST, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Also, as mentioned upthread, it'd be interesting to see if there's
>> a memory context dump showing up in your server log.
> There are many memory context dumps, and they are long (well over 100 lines
> apiece).  To me these are unfamiliar and hard to read, but I tried to pick
> one that referenced all the same tables used in the function.  I put one such
> dump (188 lines) here:
> https://drive.google.com/file/d/1QI4nffdZByIehb_-GULOagI_dKpKElg_/view

Thanks.  Here's the smoking gun:

CacheMemoryContext: 221241400 total in 38 blocks; 152192 free (90 chunks); 221089208 used

This looks like something is leaking memory directly in CacheMemoryContext (not in any of its child contexts).  Can you confirm that all the memory context dumps in your logs are showing similarly very-large total space values for CacheMemoryContext?

This doesn't seem to be related to temp table usage, at least not directly.  (As a cross-check, I created and dropped 10000 temp tables in a session, and the process's VM footprint didn't budge.)  My guess is that something else you are doing is tickling the leak.

You said you'd been able to reproduce this problem outside production.  Any chance you could boil that down to a sharable test case?  I'd bet long odds that it's not related to your data particularly, but rather to your schema or something your function is doing, so you probably could reproduce it with some dummy data (and, perhaps, not even very much of that; repeating some DDL action many times is probably what is making the leak grow to something noticeable).

If I had to guess at this point, I'd wonder if it's something to do with partitioned tables.  That code is still relatively wet behind the ears, and memory leaks are the kind of bug I'd be expecting to still be shaking out of it now.  But you might not be using partitioned tables at all.  Anyway, anything you're doing with new-in-v10 features, particularly schema-related features, would be a good bet for the source of the problem.

			regards, tom lane


Thanks for your response, and sorry for the delay in getting back to this. Because holidays.

To refresh your memory, the issue I encountered is that a long-persistent connection that periodically issues function calls consumes more and more memory, eventually causing memory exhaustion. Admittedly, it does take weeks. This email is about creating a shareable test case.

The test case is simple - so simple that I am surprised no one has encountered this before, so what I really fear is that I am doing something stupid in my effort to make it shareable and simple. In creating this example I have not run the code until it crashed the server; instead I used the measuring methodology you will see below (monitoring the smaps file).

I have put the needed files here: https://drive.google.com/open?id=1ff074Qxn7Bx9Ud9GRaegDfuJn_Tf_NTi

Table of contents:

testbed.public.backup   # database dump, not compressed, includes "create database"
pg_hba.conf             # self explanatory; allows local postgres to log in as trusted
postgresql.conf         # self explanatory; nothing weird
loop_query.bash         # sends a query via named pipe to Postgres once per second (see the sketch below)
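For reference, loop_query.bash is essentially a one-second loop that writes the function call into the named pipe created in the steps below. A minimal sketch of that shape follows; the function name is a placeholder, and the actual script in the shared folder may differ in details:

#!/usr/bin/env bash
# Minimal sketch of loop_query.bash (the real script is in the shared folder).
# Once per second it writes the function call into the named pipe, so the
# single long-lived psql session keeps re-executing it over the same backend.
# "my_function()" is a placeholder; substitute the function defined in the
# testbed database.
while true; do
  echo "SELECT * FROM my_function();" > qpipe
  sleep 1
done

The query text itself is not important; what matters is that the same long-lived connection keeps executing the function call.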
To recreate the issue, load the "testbed" database into a Postgres server configured so that psql does not need to send a password for user postgres. I created the server with this:

initdb $PGDATA --locale="en_US.UTF-8" -E "UTF-8" -k

Then run as postgres:

cd /my/directory
mkfifo qpipe                                        # named pipe conduit to persistent Postgres connection
cat > qpipe &                                       # keep pipe from closing and terminating the PG connection
psql -h localhost -d testbed -U postgres < qpipe &  # listener connection
./loop_query.bash                                   # send the function call over and over, once per second
# Observe output; should be 78 rows per second.

# Switch to a new terminal session.
# Identify the PID of the psql process.
# Monitor with this command, plugging in the real PID for the 1234
# (a looping variant is in the P.S. below):
grep "^Private" /proc/1234/smaps | awk '{a+=$2}END{print a/1024.0}'

Give it a few minutes and you will observe memory usage climbing steadily (if my measurement technique is good). From last night to this morning I observed memory consumption grow from 3.5 MB to 46.6 MB. Tested on Postgres version 10.5.

Again, thanks for any help.

Tom Carroll
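P.S. To watch the trend over time rather than retyping the command, the same measurement can be wrapped in a small loop, for example (a sketch; 1234 is still a placeholder for the real PID, and the interval is arbitrary):

while true; do
  # timestamp, then total private (clean + dirty) memory from smaps, in MB
  date +%T
  grep "^Private" /proc/1234/smaps | awk '{a+=$2} END {print a/1024.0 " MB"}'
  sleep 60
done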