Hello,

While examining the reasons for excessive memory usage in prepared statements I noticed that RTE_JOIN-kind RTEs contain a bunch of columnNames and joinaliasvars, that are irrelevant after the Query after has been rewritten. I have some queries that join about 20 tables and select only a few values, mainly names of objects from those tables.

The attached patch adds a small cleanup function that iterates thought the query and cleans stuff up. I may have missed some places that could also be cleaned up but for now the memory requirements for my largest statements have dropped from 31.2MB to 10.4MB with this patch.

After the statement has be executed seven times a generic plan is stored in the statement, resulting in an extra 8,8MB memory usage, but still this makes a difference of more than 50% total.

But the most interesting thing was that this patch reduced query execution time by 50% (~110ms vs. 55ms) when no generic was created yet, and by 35% (7.5ms vs. 5.1ms) when the global query plan had been created.

All tests still pass with my cleanup command, but I am afraid the tests might not contain queries that still need that info after statement preparation.

If anyone might have a look at it and hint me to a situation where this might crash later on? Also, would it be possible for someone to run a benchmark after applying this test to ensure my findings are not totally off? I tested on a Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz with SSDs, but everything should have been in memory when I ran the test.

Regards,
Daniel Migowski


diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index b945b15..af8dbc9 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -51,6 +51,36 @@ static ParamListInfo EvaluateParams(PreparedStatement 
*pstmt, List *params,
 static Datum build_regtype_array(Oid *param_types, int num_params);
 
 /*
+ * cleanUpQuery - Cleanup queries after they have been rewritten.
+ *
+ * This nulls all the joinaliasvars and colnames in RTEs of kind RTE_JOIN
+ * in CMD_SELECT queries after the Query has been analyzed and rewritten.
+ * For some join-heavy queries this results in a 70% memory usage reduction.
+ */
+static void cleanQuery(Query* q) {
+       ListCell* lc2;
+       if( q->commandType == CMD_SELECT ) {
+               foreach(lc2, q->rtable) {
+                       RangeTblEntry* rte = (RangeTblEntry*)lfirst(lc2);
+                       if( rte->rtekind == RTE_JOIN ) {
+                               ListCell* lc3;
+                               foreach(lc3, rte->joinaliasvars) {
+                                       lc3->data.ptr_value = NULL;
+                               }
+                               if( rte->eref ) {
+                                       foreach(lc3, rte->eref->colnames) {
+                                               lc3->data.ptr_value = NULL;
+                                       }
+                               }
+                       }
+                       if( rte->rtekind == RTE_SUBQUERY ) {
+                               cleanQuery(rte->subquery);
+                       }
+               }
+       }
+}
+
+/*
  * Implements the 'PREPARE' utility statement.
  */
 void
@@ -64,6 +94,7 @@ PrepareQuery(PrepareStmt *stmt, const char *queryString,
        Query      *query;
        List       *query_list;
        int                     i;
+       ListCell*   lc;
 
        /*
         * Disallow empty-string statement name (conflicts with protocol-level
@@ -99,7 +130,6 @@ PrepareQuery(PrepareStmt *stmt, const char *queryString,
        if (nargs)
        {
                ParseState *pstate;
-               ListCell   *l;
 
                /*
                 * typenameTypeId wants a ParseState to carry the source query 
string.
@@ -111,9 +141,9 @@ PrepareQuery(PrepareStmt *stmt, const char *queryString,
                argtypes = (Oid *) palloc(nargs * sizeof(Oid));
                i = 0;
 
-               foreach(l, stmt->argtypes)
+               foreach(lc, stmt->argtypes)
                {
-                       TypeName   *tn = lfirst(l);
+                       TypeName   *tn = lfirst(lc);
                        Oid                     toid = typenameTypeId(pstate, 
tn);
 
                        argtypes[i++] = toid;
@@ -163,6 +193,11 @@ PrepareQuery(PrepareStmt *stmt, const char *queryString,
        /* Rewrite the query. The result could be 0, 1, or many queries. */
        query_list = QueryRewrite(query);
 
+       /* Clearing joinaliasvars from join rte's now */
+       foreach(lc, query_list) {
+               cleanQuery((Query*)lfirst(lc));
+       }
+
        /* Finish filling in the CachedPlanSource */
        CompleteCachedPlan(plansource,
                                           query_list,

Reply via email to