Hi! Thank you for the work with this subject, I think it is really
important.
On 10.02.2025 22:58, Alexander Korotkov wrote:
Hi!
On Mon, Feb 10, 2025 at 7:19 AM Andrei Lepikhov <lepi...@gmail.com> wrote:
On 9/2/2025 18:41, Alexander Korotkov wrote:
Regarding adjust_relid_set() and replace_relid(). I think they are
now strictly equivalent, except for the case then old relid is given
and not found. In this case adjust_relid_set() returns the original
relids while replace_relid() returns a copy. The behavior of
adjust_relid_set() appears more desirable as we don't need extra
copying when no modification is done. So, I've replaced all
replace_relid() with adjust_relid_set().
Ok, I glanced into it, and it makes sense to merge these routines.
I think the comment to adjust_relid_set() should be arranged, too. See
the attachment for a short variant of such modification.
Also, I did some grammar correction to your new comment in tests.
Thanks!
I've further revised adjust_relid_set() header comment.
Looking back to the work done since previous attempt to commit this to
pg17, I can highlight following.
1) We're now using more of existing infrastructure including
adjust_relid_set() and ChangeVarNodes(). The most of complexity is
still there though.
2) We've checked few ways to further simplify this patch. But yet the
current way still feels to be best possible.
3) For sure, several bugs were fixed.
I think we could give it another chance for pg18 after some further
polishing (at least commit message still needs to be revised). Any
thoughts on this? Tom?
I didn't find any mistakes, I just have a refactoring remark. I think
the part where we add non-redundant expressions with the
binfo_candidates, jinfo_candidates
check can be moved to a separate function, otherwise the code is very
repetitive in this place. I did it and attached diff file
--
Regards,
Alena Rybakina
Postgres Professional
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index d0d9b80be9c..d31427693da 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -1578,6 +1578,48 @@ restrict_infos_logically_equal(RestrictInfo *a, RestrictInfo *b)
return result;
}
+/*
+ * The functions adds all non-redundant clauses to the keeping relation.
+ * Contradictory operation. On the one side, we reduce the length of
+ * restrict lists that can impact planning or executing time.
+ * Additionally, we improve the accuracy of cardinality estimation. On the
+ * other side, it is one more place that can make planning time much
+ * longer in specific cases. It would have been better to avoid calling
+ * the equal() function here, but it's the only way to detect duplicated
+ * inequality expressions.
+ */
+static void
+add_non_redundant_clauses(PlannerInfo *root,
+ List *rinfo_candidates,
+ List *keep_rinfo_list,
+ Index removed_relid)
+{
+ foreach_node(RestrictInfo, rinfo, rinfo_candidates)
+ {
+ bool is_redundant = false;
+
+ Assert(!bms_is_member(removed_relid, rinfo->required_relids));
+
+ foreach_node(RestrictInfo, src, keep_rinfo_list)
+ {
+ if (!bms_equal(src->clause_relids, rinfo->clause_relids))
+ /* Can't compare trivially different clauses */
+ continue;
+
+ if (src == rinfo ||
+ (rinfo->parent_ec != NULL &&
+ src->parent_ec == rinfo->parent_ec) ||
+ restrict_infos_logically_equal(rinfo, src))
+ {
+ is_redundant = true;
+ break;
+ }
+ }
+ if (!is_redundant)
+ distribute_restrictinfo_to_rels(root, rinfo);
+ }
+}
+
/*
* Remove a relation after we have proven that it participates only in an
* unneeded unique self join.
@@ -1654,62 +1696,10 @@ remove_self_join_rel(PlannerInfo *root, PlanRowMark *kmark, PlanRowMark *rmark,
/*
* Now, add all non-redundant clauses to the keeping relation.
- * Contradictory operation. On the one side, we reduce the length of
- * restrict lists that can impact planning or executing time.
- * Additionally, we improve the accuracy of cardinality estimation. On the
- * other side, it is one more place that can make planning time much
- * longer in specific cases. It would have been better to avoid calling
- * the equal() function here, but it's the only way to detect duplicated
- * inequality expressions.
*/
- foreach_node(RestrictInfo, rinfo, binfo_candidates)
- {
- bool is_redundant = false;
-
- Assert(!bms_is_member(toRemove->relid, rinfo->required_relids));
-
- foreach_node(RestrictInfo, src, toKeep->baserestrictinfo)
- {
- if (!bms_equal(src->clause_relids, rinfo->clause_relids))
- /* Const and non-const expressions can't be equal */
- continue;
-
- if (src == rinfo ||
- (rinfo->parent_ec != NULL
- && src->parent_ec == rinfo->parent_ec)
- || restrict_infos_logically_equal(rinfo, src))
- {
- is_redundant = true;
- break;
- }
- }
- if (!is_redundant)
- distribute_restrictinfo_to_rels(root, rinfo);
- }
- foreach_node(RestrictInfo, rinfo, jinfo_candidates)
- {
- bool is_redundant = false;
-
- Assert(!bms_is_member(toRemove->relid, rinfo->required_relids));
+ add_non_redundant_clauses(root, binfo_candidates, toKeep->baserestrictinfo, toRemove->relid);
+ add_non_redundant_clauses(root, jinfo_candidates, toKeep->joininfo, toRemove->relid);
- foreach_node(RestrictInfo, src, toKeep->joininfo)
- {
- if (!bms_equal(src->clause_relids, rinfo->clause_relids))
- /* Can't compare trivially different clauses */
- continue;
-
- if (src == rinfo ||
- (rinfo->parent_ec != NULL
- && src->parent_ec == rinfo->parent_ec)
- || restrict_infos_logically_equal(rinfo, src))
- {
- is_redundant = true;
- break;
- }
- }
- if (!is_redundant)
- distribute_restrictinfo_to_rels(root, rinfo);
- }
list_free(binfo_candidates);
list_free(jinfo_candidates);