The report in https://www.postgresql.org/message-id/flat/3690074f-abd2-56a9-144a-aa5545d7a291%40postgrespro.ru
set off substantial alarm bells for me about whether outfuncs/readfuncs processing had any additional problems we'd failed to notice. I thought that to investigate that, it'd be a good idea to invent a debugging option comparable to COPY_PARSE_PLAN_TREES, except it passes all parse and plan trees through nodeToString + stringToNode rather than copyObject. I did so, and darn if a number of creepy-crawlies didn't get shaken out of the woodwork immediately. The first attached patch adds that debugging option (plus some hackery I'll explain in a moment). I propose that some form of this should go into HEAD and we should configure at least one buildfarm animal to enable it. The second attached patch fixes the bugs I found; parts of it need to get back-patched. This debugging option also exposes the XMLNAMESPACES issue shown in the aforementioned thread. So the patch shown there needs to go in first, or you get a core dump in xml-enabled builds. But the sum total of all three patches does pass make check-world. The hackery in patch 0001 has to do with copying queryId, stmt_location, and stmt_len fields forward across nodeToString + stringToNode. The core regression tests pass fine without that, but pg_stat_statements falls over, because it needs that data to survive parse analysis as well as planning. As far as queryId goes, I'm satisfied with the hack shown here of just having pg_rewrite_query propagate the field forward across the test. There's a case to be made for fixing it by storing queryId in stored rules instead, but that would require a far higher commitment to stability and universality of the queryId computation than we've made so far. Such a change seems like something to consider separately if at all. (Note that no hack is needed for plan trees, because _outPlannedStmt/_readPlannedStmt already transmit the queryId for a plan.) The location fields are a much more ticklish matter, because the behavior for those ties into a bunch of historical and possible future behaviors. Currently, although outfuncs.c prints node location fields just fine, readfuncs.c ignores the stored values and inserts -1. The reason for that is to be able to distinguish nodes that actually came from the current query (for which the locations actually mean something with respect to a query text we have at hand) from nodes that came from some stored view or rule (for which we lack corresponding query text). So readfuncs.c marks nodes of the latter type as "unknown location" and all is well. As this patch stands, when the debug option is on, all nodes coming out of the parser will have unknown locations, except for the top-level Query or PlannedStmt node that we specially hacked. That's not great; a debugging option shouldn't risk behavioral changes like that. Right now, we pass regression tests anyway because pretty much nothing after parse analysis looks at the location fields, but there have been discussions about changing that so as to get better execution-time error reporting. I thought for a bit about replacing those hacks with something like this in readfuncs.c: #ifdef WRITE_READ_PARSE_PLAN_TREES #define READ_LOCATION_FIELD(fldname) READ_INT_FIELD(fldname) #else // existing definition of READ_LOCATION_FIELD #endif but that breaks the property of clearing the location info coming in from stored rules, so it's unlikely to be satisfactory. Another idea is to have a runtime switch in readfuncs.c, allowing stringToNode to treat location fields as it does now while there'd be an additional entry point that would read location fields like plain ints. Then we'd use the latter for the test code. This is a bit ugly but it might be the best solution. Patch 0002 addresses several more-or-less-independent issues that are exposed by running the regression tests with patch 0001 activated: * Query.withCheckOptions fails to propagate through write+read, because it's intentionally ignored by outfuncs/readfuncs on the grounds that it'd always be NIL anyway in stored rules. This seems like premature optimization of the worst sort, since it's not even saving very much: if the assumption holds, what's suppressed from a stored Query is only " :withCheckOptions <>", hardly a big savings. And of course if the assumption ever fails to hold, it's just broken, and broken in a non obvious way too (you'd only notice if you expected a check option failure and didn't get one). So this is pretty dumb and I think we ought to fix it by treating that field normally in outfuncs/readfuncs. That'd require a catversion bump, but we only need to do it in HEAD. The only plausible alternative is to change _equalQuery to ignore withCheckOptions, which does not sound like a good idea at all. * The system expects TABLEFUNC RTEs to have coltypes, coltypmods, and colcollations lists, but outfuncs doesn't dump them and readfuncs doesn't restore them. This doesn't cause obvious failures, because the only things that look at those fields are expandRTE() and get_rte_attribute_type(), which are mostly used during parse analysis. But expandRTE() is used in the rewriter and planner, so probably it's possible to cause a crash or misbehavior with a stored view that returns the result of an XMLTABLE FROM-item. I've not tried to build a test case, but I'm pretty sure we need to back-patch a fix for this as far back as we have XMLTABLE. Fortunately, because those fields are redundant with the TableFunc node, we can just link to the lists we read in the TableFunc node rather than force a catversion bump to store them separately. (This is pretty grotty, but since it's replicating the physical duplication established by addRangeTableEntryForTableFunc, I think it's OK.) * readfuncs.c omits read support for NamedTuplestoreScan plan nodes. This accidentally fails to break parallel query because a query using a named tuplestore would never be considered parallel-safe anyway, so we don't have to ship it to workers. Still, I'm pretty certain this is somebody's sloppy oversight not an intentional omission, and propose that we should back-patch that addition as far back as NamedTuplestoreScan exists (ie v10). There seem to be no other cases where we've omitted plan node read support, so this one has little excuse to live. * Several places that convert a view RTE into a subquery RTE, or similar manipulations, failed to clear out fields that were specific to the original RTE type and should be zero in a subquery RTE. This has no real impact on the system, but since readfuncs.c will leave such fields as zero, equalfuncs.c thinks the nodes are different leading to a reported failure. I'm satisfied with patching this in HEAD. (You could alternatively argue that we should change _equalRangeTblEntry to ignore fields that are not supposed to be set based on the rtekind, but I don't think that's a good idea.) * BuildOnConflictExcludedTargetlist randomly set the resname of some TargetEntries to "" not NULL. outfuncs/readfuncs don't distinguish those cases, and so the string will read back in as NULL ... but equalfuncs.c does distinguish. Perhaps we ought to try to make things more consistent in this area --- but it's just useless extra code space for BuildOnConflictExcludedTargetlist to not use NULL here, so I fixed it for now by making it do so. Again, changing this in HEAD seems sufficient. Comments? regards, tom lane
diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h index b309395..b036525 100644 --- a/src/include/pg_config_manual.h +++ b/src/include/pg_config_manual.h @@ -287,6 +287,13 @@ /* #define COPY_PARSE_PLAN_TREES */ /* + * Define this to force all parse and plan trees to be passed through + * outfuncs.c/readfuncs.c, to facilitate catching errors and omissions in + * those modules. + */ +/* #define WRITE_READ_PARSE_PLAN_TREES */ + +/* * Define this to force all raw parse trees for DML statements to be scanned * by raw_expression_tree_walker(), to facilitate catching errors and * omissions in that function. diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index 7a9ada2..180634f 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -633,6 +633,12 @@ pg_parse_query(const char *query_string) } #endif + /* + * Currently, outfuncs/readfuncs support is missing for many raw parse + * tree nodes, so we don't try to implement WRITE_READ_PARSE_PLAN_TREES + * here. + */ + TRACE_POSTGRESQL_QUERY_PARSE_DONE(query_string); return raw_parsetree_list; @@ -776,6 +782,48 @@ pg_rewrite_query(Query *query) } #endif +#ifdef WRITE_READ_PARSE_PLAN_TREES + /* Optional debugging check: pass querytree through outfuncs/readfuncs */ + { + List *new_list = NIL; + ListCell *lc; + + /* + * We currently lack outfuncs/readfuncs support for most utility + * statement types, so only attempt to write/read non-utility queries. + */ + foreach(lc, querytree_list) + { + Query *query = castNode(Query, lfirst(lc)); + + if (query->commandType != CMD_UTILITY) + { + char *str = nodeToString(query); + Query *new_query = stringToNode(str); + + /* + * These fields are not saved in stored rules, but we must + * preserve them here to avoid breaking pg_stat_statements. + */ + new_query->queryId = query->queryId; + new_query->stmt_location = query->stmt_location; + new_query->stmt_len = query->stmt_len; + + new_list = lappend(new_list, new_query); + pfree(str); + } + else + new_list = lappend(new_list, query); + } + + /* This checks both outfuncs/readfuncs and the equal() routines... */ + if (!equal(new_list, querytree_list)) + elog(WARNING, "outfuncs/readfuncs failed to produce equal parse tree"); + else + querytree_list = new_list; + } +#endif + if (Debug_print_rewritten) elog_node_display(LOG, "rewritten parse tree", querytree_list, Debug_pretty_print); @@ -830,6 +878,30 @@ pg_plan_query(Query *querytree, int cursorOptions, ParamListInfo boundParams) } #endif +#ifdef WRITE_READ_PARSE_PLAN_TREES + /* Optional debugging check: pass plan tree through outfuncs/readfuncs */ + { + char *str; + PlannedStmt *new_plan; + + str = nodeToString(plan); + new_plan = stringToNode(str); + pfree(str); + + /* + * equal() currently does not have routines to compare Plan nodes, so + * don't try to test equality here. Perhaps fix someday? + */ +#ifdef NOT_USED + /* This checks both outfuncs/readfuncs and the equal() routines... */ + if (!equal(new_plan, plan)) + elog(WARNING, "outfuncs/readfuncs failed to produce an equal plan tree"); + else +#endif + plan = new_plan; + } +#endif + /* * Print plan if debugging. */ diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index 69fd5b2..93f1e2c 100644 --- a/src/backend/nodes/outfuncs.c +++ b/src/backend/nodes/outfuncs.c @@ -1492,8 +1499,9 @@ _readPlannedStmt(void) READ_NODE_FIELD(invalItems); READ_NODE_FIELD(paramExecTypes); READ_NODE_FIELD(utilityStmt); - READ_LOCATION_FIELD(stmt_location); - READ_LOCATION_FIELD(stmt_len); + /* We preserve these fields for the convenience of pg_stat_statements */ + READ_INT_FIELD(stmt_location); + READ_INT_FIELD(stmt_len); READ_DONE(); }
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index 69fd5b2..93f1e2c 100644 --- a/src/backend/nodes/outfuncs.c +++ b/src/backend/nodes/outfuncs.c @@ -3003,7 +3003,7 @@ _outQuery(StringInfo str, const Query *node) WRITE_NODE_FIELD(rowMarks); WRITE_NODE_FIELD(setOperations); WRITE_NODE_FIELD(constraintDeps); - /* withCheckOptions intentionally omitted, see comment in parsenodes.h */ + WRITE_NODE_FIELD(withCheckOptions); WRITE_LOCATION_FIELD(stmt_location); WRITE_LOCATION_FIELD(stmt_len); } diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c index 3254524..b2db5a7 100644 --- a/src/backend/nodes/readfuncs.c +++ b/src/backend/nodes/readfuncs.c @@ -269,7 +269,7 @@ _readQuery(void) READ_NODE_FIELD(rowMarks); READ_NODE_FIELD(setOperations); READ_NODE_FIELD(constraintDeps); - /* withCheckOptions intentionally omitted, see comment in parsenodes.h */ + READ_NODE_FIELD(withCheckOptions); READ_LOCATION_FIELD(stmt_location); READ_LOCATION_FIELD(stmt_len); @@ -1366,6 +1366,13 @@ _readRangeTblEntry(void) break; case RTE_TABLEFUNC: READ_NODE_FIELD(tablefunc); + /* The RTE must have a copy of the column type info, if any */ + if (local_node->tablefunc) + { + local_node->coltypes = local_node->tablefunc->coltypes; + local_node->coltypmods = local_node->tablefunc->coltypmods; + local_node->colcollations = local_node->tablefunc->colcollations; + } break; case RTE_VALUES: READ_NODE_FIELD(values_lists); @@ -1910,6 +1918,21 @@ _readCteScan(void) } /* + * _readNamedTuplestoreScan + */ +static NamedTuplestoreScan * +_readNamedTuplestoreScan(void) +{ + READ_LOCALS(NamedTuplestoreScan); + + ReadCommonScan(&local_node->scan); + + READ_STRING_FIELD(enrname); + + READ_DONE(); +} + +/* * _readWorkTableScan */ static WorkTableScan * @@ -2693,6 +2716,8 @@ parseNodeString(void) return_value = _readTableFuncScan(); else if (MATCH("CTESCAN", 7)) return_value = _readCteScan(); + else if (MATCH("NAMEDTUPLESTORESCAN", 19)) + return_value = _readNamedTuplestoreScan(); else if (MATCH("WORKTABLESCAN", 13)) return_value = _readWorkTableScan(); else if (MATCH("FOREIGNSCAN", 11)) diff --git a/src/backend/optimizer/prep/prepjointree.c b/src/backend/optimizer/prep/prepjointree.c index c3f46a2..688b3a1 100644 --- a/src/backend/optimizer/prep/prepjointree.c +++ b/src/backend/optimizer/prep/prepjointree.c @@ -586,10 +586,13 @@ inline_set_returning_functions(PlannerInfo *root) funcquery = inline_set_returning_function(root, rte); if (funcquery) { - /* Successful expansion, replace the rtable entry */ + /* Successful expansion, convert the RTE to a subquery */ rte->rtekind = RTE_SUBQUERY; rte->subquery = funcquery; + rte->security_barrier = false; + /* Clear fields that should not be set in a subquery RTE */ rte->functions = NIL; + rte->funcordinality = false; } } } diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c index c601b6d..c020600 100644 --- a/src/backend/parser/analyze.c +++ b/src/backend/parser/analyze.c @@ -1116,7 +1116,7 @@ BuildOnConflictExcludedTargetlist(Relation targetrel, * the Const claims to be. */ var = (Var *) makeNullConst(INT4OID, -1, InvalidOid); - name = ""; + name = NULL; } else { diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c index d830569..327e5c3 100644 --- a/src/backend/rewrite/rewriteHandler.c +++ b/src/backend/rewrite/rewriteHandler.c @@ -1582,15 +1582,18 @@ ApplyRetrieveRule(Query *parsetree, rule_action = fireRIRrules(rule_action, activeRIRs); /* - * Now, plug the view query in as a subselect, replacing the relation's - * original RTE. + * Now, plug the view query in as a subselect, converting the relation's + * original RTE to a subquery RTE. */ rte = rt_fetch(rt_index, parsetree->rtable); rte->rtekind = RTE_SUBQUERY; - rte->relid = InvalidOid; - rte->security_barrier = RelationIsSecurityView(relation); rte->subquery = rule_action; + rte->security_barrier = RelationIsSecurityView(relation); + /* Clear fields that should not be set in a subquery RTE */ + rte->relid = InvalidOid; + rte->relkind = 0; + rte->tablesample = NULL; rte->inh = false; /* must not be set for a subquery */ /* diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h index 07ab1a3..62209a8 100644 --- a/src/include/nodes/parsenodes.h +++ b/src/include/nodes/parsenodes.h @@ -168,9 +168,8 @@ typedef struct Query List *constraintDeps; /* a list of pg_constraint OIDs that the query * depends on to be semantically valid */ - List *withCheckOptions; /* a list of WithCheckOption's, which are - * only added during rewrite and therefore - * are not written out as part of Query. */ + List *withCheckOptions; /* a list of WithCheckOption's (added + * during rewrite) */ /* * The following two fields identify the portion of the source text string @@ -1034,7 +1033,7 @@ typedef struct RangeTblEntry bool self_reference; /* is this a recursive self-reference? */ /* - * Fields valid for table functions, values, CTE and ENR RTEs (else NIL): + * Fields valid for CTE, VALUES, ENR, and TableFunc RTEs (else NIL): * * We need these for CTE RTEs so that the types of self-referential * columns are well-defined. For VALUES RTEs, storing these explicitly @@ -1042,7 +1041,9 @@ typedef struct RangeTblEntry * ENRs, we store the types explicitly here (we could get the information * from the catalogs if 'relid' was supplied, but we'd still need these * for TupleDesc-based ENRs, so we might as well always store the type - * info here). + * info here). For TableFuncs, these fields are redundant with data in + * the TableFunc node, but keeping them here allows some code sharing with + * the other cases. * * For ENRs only, we have to consider the possibility of dropped columns. * A dropped column is included in these lists, but it will have zeroes in