On 09.07.2020 19:19, Nikolay Samokhvalov wrote:
Hi Konstantin, a silly question: do you consider your workload
well-optimized? Can it be optimized further? Reading this thread I
have a strong feeling that a very basic set of regular optimization
actions is missing here (or is not explained): query analysis and
optimization based on pg_stat_statements (and maybe pg_stat_kcache),
some method to analyze the state of the server in general, resource
consumption, etc.
Do you have some monitoring that covers pg_stat_statements?
Before looking under the hood, I would use multiple pg_stat_statements
snapshots (can be analyzed using, say, postgres-checkup or pgCenter)
to understand the workload and identify the heaviest queries -- first
of all, in terms of total_time, calls, shared buffer reads/hits, and
temporary file generation. Which query groups are Top-N in each
category -- have you looked at that?
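For example, a first pass could look something like this (a rough
sketch, assuming the pg_stat_statements extension is installed; the
column is total_time before PG13 and total_exec_time from PG13 on):

    -- Top 10 query groups by total execution time; also shows the other
    -- dimensions worth ranking by: calls, buffer reads/hits, temp files.
    SELECT queryid,
           calls,
           round(total_time::numeric, 1) AS total_ms,
           shared_blks_hit,
           shared_blks_read,
           temp_blks_written,
           left(query, 60) AS query_snippet
    FROM pg_stat_statements
    ORDER BY total_time DESC
    LIMIT 10;

Running it at the start and end of an interval and diffing the counters
gives the per-interval snapshots that postgres-checkup and pgCenter
automate.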
You mentioned some crazy numbers for the planning time, but why not
analyze the picture holistically and look at the overall numbers? For
the queries whose planning time has increased, what is their share of
total_time in the overall picture, in %? (Unfortunately, we cannot see
Top-N by planning time in pg_stat_statements until PG13, but that
doesn't mean we cannot get a good understanding of the overall picture
today, it just requires more work).
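For completeness, on PG13+ the new planning columns make Top-N by
planning time direct; a sketch, assuming pg_stat_statements.track_planning
is enabled:

    -- PG13+ only: planning is tracked separately from execution.
    SELECT queryid,
           calls,
           round(total_plan_time::numeric, 1) AS plan_ms,
           round(total_exec_time::numeric, 1) AS exec_ms,
           round((100 * total_plan_time
                  / nullif(total_plan_time + total_exec_time, 0))::numeric, 1)
                                              AS plan_pct
    FROM pg_stat_statements
    ORDER BY total_plan_time DESC
    LIMIT 10;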
If workload analysis & optimization was done holistically already, or
not possible due to some reason — pardon me. But if not and if your
primary goal is to improve this particular setup ASAP, then the topic
could be started in the -performance mailing list first, discussing
the workload and its aspects, and only after it's done, raised in
-hackers. No?
Certainly, both we and the customer have done workload analysis &
optimization. It is not a problem of particular queries, bad plans,
resource exhaustion, ...
Unfortunately there are many scenarios in which Postgres does not
degrade gracefully as the workload increases, but instead suffers a
"snow avalanche", where negative feedback causes very fast paralysis of
the system.
This case is just one of these scenarios. It is hard to say for sure
what triggers the avalanche... a long-lived transaction, a huge number
of tables, aggressive autovacuum settings... But there is a cascade of
negative events which causes a system that normally functions for
months to stop working at all.
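As an aside, the first of those triggers is easy to spot from SQL; a
minimal sketch (the 5-minute threshold is an arbitrary example):

    -- Transactions open longer than 5 minutes: they hold back the xmin
    -- horizon and keep autovacuum generating work.
    SELECT pid, usename, state,
           now() - xact_start AS xact_age,
           left(query, 60)    AS current_query
    FROM pg_stat_activity
    WHERE xact_start IS NOT NULL
      AND now() - xact_start > interval '5 minutes'
    ORDER BY xact_age DESC;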
In this particular case we have the following chain:
- a long-lived transaction causes autovacuum to send a lot of
invalidation messages
- these messages overflow the invalidation message queue, forcing
backends to invalidate their caches and reload from the catalog
- the too-small fast-path lock cache causes many concurrent accesses
to the shared lock hash
- contention on the lock-partition LWLocks, caused by the small number
of lock partitions, causes starvation (a diagnostic sketch for these
last two steps follows below)
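For reference, the last two steps are visible from SQL while the
avalanche is happening; a rough diagnostic sketch (the wait event is
named lock_manager before PG13 and LockManager from PG13 on; 16 is the
per-backend fast-path slot count in current releases):

    -- Backends waiting on LWLocks; the lock-manager partition locks
    -- showing up at the top is the starvation symptom.
    SELECT wait_event_type, wait_event, count(*)
    FROM pg_stat_activity
    WHERE wait_event_type = 'LWLock'
    GROUP BY 1, 2
    ORDER BY count(*) DESC;

    -- Locks per backend that missed the 16 fast-path slots and had to
    -- go through the shared lock hash.
    SELECT pid, count(*) FILTER (WHERE NOT fastpath) AS non_fastpath_locks
    FROM pg_locks
    GROUP BY pid
    ORDER BY non_fastpath_locks DESC
    LIMIT 10;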