Hello Tom. Thanks for your response. Since then, I have spent most of my time looking for evidence and checking other installations with similar patterns.
> I'm also curious whether this installation is in the habit of doing
> pg_stat_statements_reset() a lot.

Resetting is very rare. How can I get "pgss->mean_query_len" via SQL?

> Maybe it does get truncated, but then the cycle repeats after a while?

It is possible, as the slowness happened three times, some days apart.

Question: besides the gc issue you mentioned, could a large (700MB or 1GB)
pgss_query_texts.stat by itself cause slowness in pg_stat_statements
processing, leading to slower query responses on a 32-bit PG? I'm thinking
of reducing pg_stat_statements.max from 10k to 3k. (A minimal sketch of the
size_t wraparound is appended after the quoted message below.)

Thanks

On Tue, Aug 2, 2022 at 3:14 PM Tom Lane <t...@sss.pgh.pa.us> wrote:

> I wrote:
> > bruno da silva <brunogi...@gmail.com> writes:
> >> Do you have a lot of especially long statements being tracked
> >> in the pg_stat_statements view? *well, the view was showing the query
> >> column null.*
> >> *but looking on pgss_query_texts.stat there are very large sql
> >> statements, of around ~400kb, multiple thousands.*
>
> I see one possible piece of the puzzle here: since you're using a 32-bit
> build, overflowing size_t is a reachable hazard. Specifically, in this
> test to see if we need to garbage-collect the query text file:
>
>     if (extent < pgss->mean_query_len * pgss_max * 2)
>         return false;
>
> You said earlier that pg_stat_statements.max = 10000, so a mean_query_len
> exceeding about 2^32 / 10000 / 2 = 214748.3648 would be enough to overflow
> size_t and break this comparison. Now, a mean SQL query length in excess
> of 200kB sounds mighty improbable, but it's really the mean length of the
> query texts in the view. If your "normal" queries fall into just a few
> patterns they might be represented by a relatively small number of view
> entries. And if the "big" queries are sufficiently not alike, they might
> each get their own view entry, which could potentially drive the mean high
> enough to cause trouble. It'd be interesting to track what
> "SELECT avg(length(query)) FROM pg_stat_statements" gives.
>
> However, even if we grant that mean_query_len is that big, overflow here
> would make garbage collection of the query text file more likely, not
> less so. What I'm speculating is that overflow is occurring and causing
> all processes to decide they need to run gc_qtexts() every time they
> insert a new query entry, even though the query texts file isn't actually
> bloated. That could possibly explain your performance issues: a garbage
> collection pass over a multi-gig file will take a while, and what's worse
> is that it's done under an exclusive lock, meaning that all the backends
> stack up waiting their turn to perform a useless GC pass.
>
> What this doesn't explain is why the condition doesn't clear once you
> observe one of those "out of memory" complaints, because that should
> lead to truncating the texts file. Maybe it does get truncated, but
> then the cycle repeats after a while? If you have a steady stream of
> incoming new 400kB queries, you could build back up to 2.2GB of text
> after five thousand or so of those.
>
> I'm also curious whether this installation is in the habit of doing
> pg_stat_statements_reset() a lot. It looks like that fails to
> reset mean_query_len, which might be intentional but perhaps it
> could play into getting a silly result here later on.
>
> regards, tom lane

--
Bruno da Silva
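
For anyone following the thread, here is a minimal, self-contained sketch
of the wraparound described above. It is not the pg_stat_statements code
itself: the numbers (mean_query_len = 215000, a ~700MB texts file) are
hypothetical values taken from this discussion, and uint32_t stands in for
the 32-bit size_t of this build so the behavior also reproduces on 64-bit
machines.

    #include <stdio.h>
    #include <inttypes.h>

    int main(void)
    {
        /* Hypothetical values from this thread: pg_stat_statements.max
         * is 10000, and the mean query text length sits just above the
         * overflow threshold of 2^32 / 10000 / 2 = 214748.3648 bytes.
         * uint32_t simulates the 32-bit size_t of the build in question. */
        uint32_t pgss_max = 10000;
        uint32_t mean_query_len = 215000;
        uint32_t extent = 700u * 1024 * 1024;    /* ~700MB texts file */

        /* 215000 * 10000 * 2 = 4,300,000,000, which exceeds 2^32 and
         * wraps around to 5,032,704. */
        uint32_t threshold = mean_query_len * pgss_max * 2;

        printf("threshold after wraparound: %" PRIu32 " bytes\n", threshold);

        /* Same shape as the test quoted above: an extent below the
         * threshold means "no gc needed". With the wrapped threshold,
         * even a modest file looks bloated on every insertion, so every
         * backend keeps attempting a gc pass under the exclusive lock. */
        if (extent < threshold)
            printf("no gc needed\n");
        else
            printf("gc pass triggered (spuriously)\n");

        return 0;
    }

On a build with a 64-bit size_t the same product is about 4.3GB and the
comparison behaves as intended; here the wrapped threshold is only about
5MB, so a 700MB file trips the gc path on every new entry, matching the
useless-GC-under-lock scenario described above.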