On Fri, Mar 1, 2019 at 5:47 AM Etsuro Fujita <fujita.ets...@lab.ntt.co.jp> wrote: > Robert, I CCed you because you are the author of that commit. Before > that commit ("Rewrite the code that applies scan/join targets to > paths."), apply_scanjoin_target_to_paths() had a boolean parameter named > modify_in_place, and used apply_projection_to_path(), not > create_projection_path(), to adjust scan/join paths when modify_in_place > was true, which allowed us to save cycles at plan creation time by > avoiding creating projection paths, which I think would be a good thing, > but that commit removed that. Why?
One of the goals of the commit was to properly account for the cost of computing the target list. Before parallel query and partition-wise join, it didn't really matter what the cost of computing the target list was, because every path was going to have to do the same work, so it was just a constant factor getting added to every path. However, parallel query and partition-wise join mean that some paths can compute the final target list more cheaply than others, and that turns out to be important for things like PostGIS. One of the complaints that provoked that change was that PostGIS was picking non-parallel plans even when a parallel plan was substantially superior, because it wasn't accounting for the fact that in the parallel plan, the cost of computing the target-list could be shared across all the workers rather than paid entirely by the leader. In order to accomplish this goal of properly accounting for the cost of computing the target list, we need to create a new path, not just jam the target list into an already-costed path. Note that we did some performance optimization around the same time to minimize the performance hit here (see d7c19e62a8e0a634eb6b29f8f1111d944e57081f, and I think there may have been something else as well although I can't find it right now). > The real reason for this question is: I noticed that projection paths > put on foreign paths will make it hard for FDWs to detect whether there > is an already-well-enough-sorted remote path in the pathlist for the > final scan/join relation as an input relation to GetForeignUpperPaths() > for the UPPERREL_ORDERED step (the IsA(path, ForeignPath) test would not > work well enough to detect remote paths!), so I'm wondering whether we > could revive that parameter like the attached, to avoid the overhead at > plan creation time and to make the FDW work easy. Maybe I'm missing > something, though. I think this would be a bad idea, for the reasons explained above. I also think that it's probably the wrong direction on principle. I think the way we account for target lists is still pretty crude and needs to be thought of as more of a real planning step and less as something that we can just ignore when it's inconvenient for some reason. I think the FDW just needs to look through the projection path and see what's underneath, but also take the projection path's target list into account when decide whether more can be pushed down. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company