On Fri, Jan 3, 2025 at 8:44 AM Robert Haas <robertmh...@gmail.com> wrote: > > Actually, I noticed that we are failing to honor that in the places > > where we inject "*SELECT*" and "*SELECT* %d" names, because that > > code puts those names into RTE->alias not only RTE->eref. > > I experimented with the attached patch to not do that anymore, > > which is sort of a subset of what you did but just focused on > > not lying about what's generated versus user-written. We could > > alternatively keep the current generated names by extending > > addRangeTableEntryForSubquery's API so that alias and generated eref > > are passed separately. (I didn't look to see if anyplace else > > is messing up this distinction similarly.) > > Hmm, I definitely like not lying about what is generated vs. what is > user-written. I don't have a strong opinion right now on the best way > of accomplishing that.
I rediscovered, or re-encountered, this problem today, which motivated me to have a closer look at your (Tom's) patch. My feeling is that it's the right approach. I agree that we could try to keep the current generated names by extending addRangeTableEntryForSubquery, but I'm tentatively inclined to think that we shouldn't. Perhaps that's partly a stylistic preference on my part: I think unnamed_subquery_N looks nicer than "*SELECT * N", but there's also something to be said for keeping the code simple. I think it would be reasonable to instead extend addRangeTableEntryForSubquery if we find that the naming change breaks something, but if it doesn't then I like what you've done better. There's also an argument from consistency: even without the patch, unnamed_subquery leaks out in some contexts, and I think it's nicer to use unnamed_subquery everywhere than to use that in some places and *SELECT* in others. I then went looking for other places that have similar issues. I pretty quickly discovered that convert_ANY_sublink_to_join is another caller of addRangeTableEntryForSubquery that is fabricating an alias when it could just pass NULL; in that case, the fabricated name is ANY_subquery. Also, it seems like the recursive CTE stuff can just set the alias to NULL and the eref as it's currently doing, instead of setting both alias and eref as the code does currently. So, PFA 0001, a rebased version of Tom's patch; 0002, addressing ANY_subquery; and 0003, addressing *TLOCRN* and *TROCRN*. If we decide to adopt all of these, we may want to do some squashing before commit, but we have a little time to figure that out since I think this is v19 material anyway. Apart from those cases, and at least AFAICS, everything that's using a wholly fabricated name seems to be either (1) a case where we intend for the name to be referenceable (old, new, excluded) or (2) a value that is assigned only to eref and not to alias (e.g. *GROUP*). However, I did come across one other mildly interesting case. expand_single_inheritance_child has this: /* * We just duplicate the parent's table alias name for each child. If the * plan gets printed, ruleutils.c has to sort out unique table aliases to * use, which it can handle. */ childrte->alias = childrte->eref = makeAlias(parentrte->eref->aliasname, child_colnames); What I find curious about this is that we're assigning the parent's eref to both the child's eref and the child's alias. Maybe there's something I don't understand here, or maybe it just doesn't matter, but why wouldn't we assign eref to eref and alias to alias? Or even alias to alias and generate a new eref? -- Robert Haas EDB: http://www.enterprisedb.com
v2-0002-Don-t-generate-fake-ANY_subquery-aliases-either.patch
Description: Binary data
v2-0003-Don-t-generate-fake-TLOCRN-or-TROCRN-aliases-eith.patch
Description: Binary data
v2-0001-Don-t-generate-fake-SELECT-or-SELECT-d-subquery-a.patch
Description: Binary data