On Fri, Aug 23, 2024 at 11:56 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
> Richard Guo <guofengli...@gmail.com> writes:
> > I agree that it’s always desirable to postpone work from path-creation
> > time to plan-creation time.  In this case, however, it’s a little
> > different.  The projection step could actually be avoided from the
> > start if we perform the correct check in create_ordered_paths.
>
> Well, the question is how expensive is the "correct check" compared
> to what we're doing now.  It might be cheaper than creating an extra
> level of path node, or it might not.  An important factor here is
> that we'd pay the extra cost of a more complex check every time,
> whether it avoids creation of an extra path node or not.

Fair point.  After looking at the code for a while, I believe it is
sufficient to compare PathTarget.exprs after we've checked that the
two targets have different pointers.

The sorted_path here should have projected the correct target required
by the preceding steps of sort, i.e. sort_input_target.  We need to
determine whether this target matches final_target.  If this target is
the same pointer as sort_input_target, a simple pointer comparison, as
the current code does, is sufficient, because if no post-sort
projection is needed, sort_input_target will always be equal to
final_target.

However, sorted_path's target might not be the same pointer as
sort_input_target, because in apply_scanjoin_target_to_paths, if the
target to be applied has the same expressions as the existing
reltarget, we only inject the sortgroupref info into the existing
pathtargets, rather than create projection paths.  As a result,
pointer comparison in create_ordered_paths is not reliable.

Instead, we can compare PathTarget.exprs to determine whether a
projection step is needed.  If the expressions match, we can be
confident that a post-sort projection is not required.

If this conclusion is correct, I think the extra cost of comparing
PathTarget.exprs only when pointer comparison fails should be
acceptable.  We have already done this for
apply_scanjoin_target_to_paths, and I think the rationale there
applies here as well:

* ...  By avoiding the creation of
* projection paths we save effort both immediately and at plan creation time.

Besides, it can help avoid a Result node in the final plan in some
cases, as shown by my initial example.

Hence, I propose the attached fix.  There are two ensuing plan diffs
in the regression tests, but they look reasonable and are exactly what
we are fixing here.

Thanks
Richard

Attachment: v1-0001-Avoid-unnecessary-post-sort-projection.patch
Description: Binary data

Reply via email to