On 19 February 2018 at 15:11, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote: > 1) I can confirm that it indeed eliminates the Append overhead, using > the example from [1], modified to use table with a single partition. Of > course, this is a pretty formal check, because the patch simply removes > the Append node and so it'd be utterly broken if the overhead did not > disappear too. > > [1] > https://www.postgresql.org/message-id/CAKJS1f9UXdk6ZYyqbJnjFO9a9hyHKGW7B%3DZRh-rxy9qxfPA5Gw%40mail.gmail.com
Thanks for testing that. > 2) If I had to guess where the bugs will be, my guess would be a missing > finalize_plan_tlist call - not necessarily in existing code, but perhaps > in future improvements. I agree with that. I'd say though that additionally for the future that we might end up with some missing translation calls to replace_translatable_exprs(). To try to ensure I didn't miss anything, I did previously modify the regression test tables "tenk" and "tenk1" to become partitioned tables with a single partition which allowed all possible values to try to ensure I'd not missed anywhere. I just failed to find a reasonable way to make this a more permanent test without enforcing that change on the tables as a permanent change. > I wonder if we could automate this call somehow, say by detecting when > create_plan_recurse is called after build_path_tlist (for a given path), > and if needed calling finalize_plan_tlist from create_plan_recurse. Sounds nice, but I'm unsure how we could do that. > Although, now that I look at it, promote_child_relation may be doing > something like that by adding stuff to the translate_* variables. Do you mean this? /* * Record this child as having been promoted. Some places treat child * relations in a special way, and this will give them a VIP ticket to * adulthood, where required. */ root->translated_childrelids = bms_add_members(root->translated_childrelids, child->relids); That's to get around a problem in use_physical_tlist() where there's different behaviour for RELOPT_OTHER_MEMBER_REL than there is for RELOPT_BASEREL. Another option might be to instead change the RELOPT_OTHER_MEMBER_REL into RELOPT_BASEREL, although I was a little too scared that might result in many other areas requiring being changed. I may be wrong about that though. We do, for example, occasionally change a RELOPT_BASEREL into a RELOPT_DEADREL, so RelOptInfos changing their RelOptKind is not a new concept, so perhaps it's fine. > I'm > not quite sure why we need to append to the lists, though? Multiple Append rels may be replaced by their only-child relation, so we'd need to be able to translate the targetlists for both. For example, joins above the Appends may contain Vars which belong to either of the Appends and those need to be translated into the promoted child relation's Vars. > 3) I'm not quite sure what replace_translatable_exprs does, or more > precisely why do we actually need it. At this point the comment pretty > much says "We need to do translation. This function does translation." > Perhaps it should mention why removing/skipping the AppendPath implies > the need for translation or something? It does mention "Expr translation is required to translate the targetlists of nodes above the Append", but perhaps I can make that a bit more clear. Let's say we have a query such as: SELECT * FROM fact f INNER JOIN dim1 d ON f.dim1_id = d.id WHERE f.date BETWEEN '2018-01-01' AND '2018-01-31'; Which results in a hash join and a single Jan 2018 partition being scanned for "fact". A plan without the Append node having been removed would have the Append targetlist containing the Vars from the "fact" table, as would the Hash Join node... The problem this function is trying to solve is translating the Vars in the Hash Join node (or any node above the Append) to change the "fact" Vars into "fact_jan2018" Vars. In create_plan.c we skip over any isproxy Append paths and instead return the recursive call to the single proxy-path. Without the translation we'd run into problems when trying to find Vars in the targetlists later. > 4) The thread talks about "[Merge]Append" but the patch apparently only > tweaks create_append_path and does absolutely nothing to "merge" case. > While looking into why, I notices that generate_mergeappend_paths always > gets all_child_pathkeys=NULL in this case (with single subpath), and so > we never create the MergePath at all. I suppose there's some condition > somewhere responsible for doing that, but I haven't found it ... Yeah, only Append paths are used as proxy paths. The confusion here is that the path creation logic has also been changed in allpaths.c (see generate_proxy_paths()), so that we no longer build MergeAppend paths when there's a single subpath. It's probably just the fact that Append is being hi-jacked to act as ProxyPath that's causing the confusion there. If you look over generate_proxy_paths and where it gets called from, you'll see that we, instead of creating Append/MergeAppend paths, we just add each path from pathlist and partial_pathlist from the only-child subpath to the Append rel. This what gives the planner the same flexibility to generate a plan as if the only child partition had been written the query, instead of the parent. > 5) The comment before AppendPath typedef says this: > > * An AppendPath with a single subpath can be set up to become a "proxy" > * path. This allows a Path which belongs to one relation to be added to > * the pathlist of some other relation. This is intended as generic > * infrastructure, but its primary use case is to allow Appends with > * only a single subpath to be removed from the final plan. > > I'm not quite sure how adding a 'isproxy' flag into AppendPath node > makes this a generic infrastructure? Wouldn't an explicit ProxyPath be a > better way to achieve that? It's supposed to be generic in the sense that I didn't aim to code it just to allow Appends with a single subpath to build a plan with the single subpath instead of the Append. I'll give an example a bit below. It evolved that way due to an email early on in this thread. I proposed 2 ways to do this, which included ProxyPath as 1 option. Tom proposed hi-jacking Append to save on the additional boilerplate code. At the time to implement that, I imagined that I could have gotten away with just adding the two translation Lists to the AppendPath struct, and have code determine it's a proxy rather than an Append by the fact that the Lists are non-empty, but that fails due to lack of anything to translate for SELECT COUNT(*) FROM ... type queries which have empty target lists. I don't have a huge preference as to which gets used here, but I don't want to change it just to have to change it back again later. Although, I'm not ruling out the fact that Tom might change his mind when he sees that I had to add an "isproxy" flag to AppendPath to get the whole thing to work. It's certainly not quite as convenient as how dummy paths work. > What other proxy paths do you envision, and why should they be > represented as AppendPath? For example: CREATE MATERIALIZED VIEW mytable_count AS SELECT COUNT(*) FROM mytable; which would allow queries such as: SELECT COUNT(*) FROM mytable; to have an isproxy AppendPath added to the pathlist for the RelOptInfo for mytable which just performs a seqscan on mytable_count instead. It would likely come out a much cheaper Path. The translation exprs, in this case would translate the COUNT(*) Aggref into the Var belonging to mytable_count. Of course, there are other unrelated reasons why that particular example can't work yet, but that was just an example of why I tried to keep it generic. > FWIW one advantage of a separate ProxyPath would be that it would be a > much clearer breakage for third-party code (say, extensions tweaking > paths using hooks), either at compile or execution time. With just a new > flag in existing node, that may easily slip through. On the other hand > no one promises the structures to be stable API ... hmm true, but you could probably have said that for overloading an Agg node to become Partial Aggregate and Finalize Aggregate nodes too. > 6) I suggest we add this assert to create_append_path: > > Assert(!isproxy || (isproxy && (list_length(subpaths)==1))); That kind of indirectly exists already as two seperate Asserts(), although, what I have is equivalent to: Assert((!isproxy && translate_from == NIL) || (isproxy && (list_length(subpaths)==1))); ... as two separate Asserts(), which, if I'm not mistaken, is slightly more strict than the one you mentioned. > and perhaps we should do s/isproxy/is_proxy/ which seems like the usual > naming for boolean variables. You're right. I'll change that and post an updated patch. I'll also reprocess your email again and try to improve some comments to make the intent of the code more clear. Thanks for the review! -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services