On 05.08.2019 22:35, Daniel Migowski wrote:
I think that including information in pg_prepared_statements about the
memory used by a statement is very useful.
A CachedPlanMemoryUsage function may be useful not only for this view;
for example, it is also needed in my autoprepare patch.
I would love to use your work once it's done, and would also love to
work together here. I am quite a novice in C though, so I might take my
time to get things right.
Right now I reused your implementation of the CachedPlanMemoryUsage function :)
Before, I took into account only the memory used by plan->context, but
not plan->query_context and plan->gplan->context (although the
query_context for the raw parse tree seems to be much smaller).
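For reference, a rough sketch of what I mean by summing the three
contexts (this is not the actual patch; MemoryContextTreeSize() is a
hypothetical helper, one possible implementation of which is sketched
at the end of this mail):

    #include "utils/plancache.h"

    /* hypothetical helper, sketched at the end of this mail */
    extern Size MemoryContextTreeSize(MemoryContext context);

    static Size
    CachedPlanMemoryUsage(CachedPlanSource *plansource)
    {
        /* context holds the CachedPlanSource itself and the raw parse tree */
        Size        size = MemoryContextTreeSize(plansource->context);

        /* query_context holds the analyzed-and-rewritten query list */
        if (plansource->query_context != NULL)
            size += MemoryContextTreeSize(plansource->query_context);

        /* gplan->context holds the generic plan, once one has been built */
        if (plansource->gplan != NULL)
            size += MemoryContextTreeSize(plansource->gplan->context);

        return size;
    }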
I wonder if you would consider going further and not only reporting but
also controlling the memory used by prepared statements?
For example, implementing some LRU replacement discipline on top of the
prepared statement cache, which could evict rarely used prepared
statements to avoid memory overflow.
THIS! Having some kind of safety net here would finally make sure that
my precious processes will not grow endlessly until all memory is eaten
up, even with prepared statement count limits.
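To make the idea concrete, a minimal sketch of such an LRU discipline
over the existing prepared-statement hash table (prepared_queries and
DropPreparedStatement() exist in commands/prepare.c; the last_use field,
which would have to be bumped in FetchPreparedStatement(), and
TotalPreparedStatementMemory() are hypothetical):

    static void
    EvictPreparedStatementsLRU(Size mem_limit)
    {
        while (TotalPreparedStatementMemory() > mem_limit)
        {
            HASH_SEQ_STATUS seq;
            PreparedStatement *entry;
            PreparedStatement *victim = NULL;

            /* find the least recently used statement */
            hash_seq_init(&seq, prepared_queries);
            while ((entry = hash_seq_search(&seq)) != NULL)
            {
                if (victim == NULL || entry->last_use < victim->last_use)
                    victim = entry;
            }

            if (victim == NULL)
                break;          /* cache is empty, nothing to evict */

            DropPreparedStatement(victim->stmt_name, false);
        }
    }

The full scan per eviction is O(n), but for the typical number of
prepared statements per backend that should be negligible compared to
replanning costs.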
While working on this I noticed there are three things stored in a
CachedPlanSource: the raw query tree (a relatively small thing), the
query list (the analyzed-and-rewritten query tree), which takes up the
most memory (at least here, maybe different with your use cases), and
(often after the 6th call) the CachedPlan, which is more optimized than
the query list and often needs less memory (half of the query list here).
The query list seems to take the most time to create here, because I
hit the GEQO engine, but it could be recreated easily (up to 500ms for
some queries). Creating the CachedPlan afterwards takes 60ms in some
use cases. If we could just invalidate them from time to time, that
would be great.
Also, invalidating just the query list or the CachedPlan would not
invalidate the whole prepared statement, which would break clients'
expectations; it would just make them slower, adding much to the
stability of the system. I would pay that price, because I don't use
manually named prepared statements anyway and just autogenerate them as
performance sugar, without thinking about what really needs to be
prepared. There is an option in the JDBC driver to switch to
server-side prepared statements automatically after you have used them
a few times.
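A hedged sketch of what that partial invalidation could look like,
loosely modeled on plancache.c's internal ReleaseGenericPlan();
DiscardCachedPlans() itself is not an existing API:

    static void
    DiscardCachedPlans(CachedPlanSource *plansource)
    {
        /* drop the generic plan, as ReleaseGenericPlan() does */
        if (plansource->gplan)
        {
            ReleaseCachedPlan(plansource->gplan, false);
            plansource->gplan = NULL;
        }

        /*
         * Mark the source invalid: RevalidateCachedQuery() then rebuilds
         * the analyzed-and-rewritten query list from the raw parse tree
         * on the next use, freeing the old query_context.  The statement
         * stays usable, just slower on its next execution.
         */
        plansource->is_valid = false;
    }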
I have noticed that cached plans for implicitly prepared statements in
stored procedures are not shown in the pg_prepared_statements view.
It may not be a problem in your case (if you are accessing Postgres
through JDBC and not using stored procedures), but it can cause memory
overflow in applications that actively use stored procedures, because
unlike explicitly created prepared statements, statements implicitly
prepared by plpgsql are very difficult to estimate and control.
I am not sure what the best solution would be in this case. Adding yet
another view for implicitly prepared statements? Or including them in
the pg_prepared_statements view?
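Whichever way we choose, the implicitly prepared plans are at least
reachable: plpgsql saves its plans through SPI into the same plan
cache, and plancache.c links every saved plan into saved_plan_list. A
sketch of totalling them (saved_plan_list is static in plancache.c
today, so this would have to live there or the list would need to be
exposed; CachedPlanMemoryUsage() is the function discussed above):

    static Size
    TotalSavedPlanMemory(void)
    {
        Size        total = 0;
        dlist_iter  iter;

        dlist_foreach(iter, &saved_plan_list)
        {
            CachedPlanSource *plansource =
                dlist_container(CachedPlanSource, node, iter.cur);

            total += CachedPlanMemoryUsage(plansource);
        }

        return total;
    }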
We have such a patch for PgPro-EE, but it limits only the number of
prepared statements, not taking into account the amount of memory used
by them.
I think that a memory-based limit will be more accurate (although it
adds more overhead).
Limiting them by number is already done automatically here and would
really not be of much value, but having a memory limit would be great.
We could have a combined memory limit for your autoprepared statements
as well as the manually prepared ones, so clients can know for sure
that the server processes won't eat up more than e.g. 800MB for
prepared statements. I would also like to have this value spread across
all client processes, e.g. specifying
max_prepared_statement_total_mem=5G for the server, and maybe
max_prepared_statement_mem=1G for client processes. Of course we would
have to implement cross-process invalidation here, and I don't know
whether communicating client processes are even intended.
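If the per-backend limit were a plain GUC, the entry in guc.c could
look roughly like this (the name is the one proposed above; default,
minimum and unit are just placeholders):

    /* static int max_prepared_statement_mem; -- in kilobytes */

    /* in guc.c, ConfigureNamesInt[] */
    {
        {"max_prepared_statement_mem", PGC_USERSET, RESOURCES_MEM,
            gettext_noop("Sets the maximum memory used by prepared "
                         "statements in a single backend."),
            NULL,
            GUC_UNIT_KB
        },
        &max_prepared_statement_mem,
        1048576, 64, MAX_KILOBYTES,
        NULL, NULL, NULL
    },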
Anyway, a memory limit won't really add that much more overhead. At
least not more than having no prepared statements at all for fear of
server OOMs, or having just a small number of them. I was even thinking
about a prepared statement reaper that checks pg_prepared_statements
every few minutes to clean things up manually, but having this in the
server would be of great value to me.
Right now a memory context has no field containing the amount of
currently used memory.
This is why the context->methods->stats implementation has to traverse
all blocks to calculate the size of memory used by the context.
That may not be so fast for large contexts. I do not expect contexts of
prepared statements to be very large, although I have dealt with
customers who issued queries whose text was larger than a few
megabytes; I am afraid to estimate the size of the plan for such
queries.
This is the reason for my concern that calculating memory context size
may have a negative effect on performance. But it has to be done only
once, when the statement is prepared, so maybe it is not a problem at
all.
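For completeness, the kind of traversal I mean, written against the
v12-style stats callback (the callback signature differs between
versions, so treat this strictly as a sketch):

    #include "nodes/memnodes.h"

    Size
    MemoryContextTreeSize(MemoryContext context)
    {
        MemoryContextCounters totals;
        MemoryContext child;
        Size        size;

        memset(&totals, 0, sizeof(totals));

        /* walks all blocks of this context -- the overhead discussed above */
        context->methods->stats(context, NULL, NULL, &totals);
        size = totals.totalspace;

        /* recurse into child contexts hanging off this one */
        for (child = context->firstchild; child != NULL; child = child->nextchild)
            size += MemoryContextTreeSize(child);

        return size;
    }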
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company