I apologize for not having paid much attention to this thread so far. It kept getting stuck on my "to look at later" queue. Anyway, I've taken a preliminary look at the v7 patch now.
While the patch seems roughly along the lines of what we talked about last PGCon, I share Stephen's unease about a lot of the details. It's not entirely clear that these hooks are really good for anything, and it's even less clear what APIs the hook functions should be expected to depend on. I really do not like the approach embodied in the later patches of "oh, we'll just expose whatever static planner functions seem convenient". That's not an approach that's available when doing actual external development of an extension, and even if it were that doesn't make it a good idea. The larger the exposed surface of functions the harder it is to know what's safe to change. Anyway, on to specifics: * Please drop the whole register_custom_provider/get_custom_provider API. There is no reason other than debugging for a provider to have a name at all, and if we expect providers to have unique names then that creates a collision risk for independently-created extensions. AFAICS, it's sufficient for a function hooked into one of the add-a-path hooks to include a pointer to a struct-of-function-pointers in the Path object it returns, and similarly the CustomScan Plan object can contain a pointer inserted when it's created. I don't object to having a name field in the function pointer structs for debugging reasons, but I don't want any lookups being done on it. * The function-struct pointers should be marked const in the referencing nodes, to indicate that the core code won't be modifying or copying them. In practice they'd probably be statically allocated constants in the extensions anyway. * The patch does lots of violence to the separation between planner and executor, starting with the decision to include nodes/relation.h in executor.h. That will not do at all. I see that you did that because you wanted to make ExecSupportsMarkRestore take a Path, but we need some other answer. One slightly grotty answer is to invent two different customscan Plan types, one that supports mark/restore and one that doesn't, so that ExecSupportsMarkRestore could still just look at the Plan type tag. (BTW, path->pathtype is supposed to contain the node tag of the Plan node that the path would produce. Putting T_CustomPath in it is entirely tone-deaf.) Another way would be to remove ExecSupportsMarkRestore in favor of some new function in the planner; but it's good to keep it in execAmi.c since that has other knowledge of which plan types support mark/restore. * More generally, I'm not convinced that exactly one Path type and exactly one Plan type is going to get us very far. It seems rather ugly to use the same Plan type for both scan and join nodes, and what will happen if somebody wants to build a custom Append node, or something else that has neither zero nor two subplans? * nodeCustom.h is being completely abused. That should only export the functions in nodeCustom.c, which are going to be pretty much one-liners anyway. The right place to put the function pointer struct definitions is someplace else. I'd be inclined to start by separating the function pointers into two structs, one for functions needed for a Path and one for functions needed for a Plan, so that you don't have this problem of having to import everything the planner knows into an executor header or vice versa. Most likely you could just put the Path function pointer struct declaration next to CustomPath in relation.h, and the one for Plans next to CustomPlan (or the variants thereof) in plannodes.h. * The set of fields provided in CustomScan seems nonsensical. I'm not even sure that it should be derived from Scan; that's okay for plan types that actually are scans of a base relation, but it's confusing overhead for anything that's say a join, or a custom sort node, or anything like that. Maybe one argument for multiple plan node types is that one would be derived from Scan and one directly from Plan. * More generally, what this definition for CustomScan exposes is that we have no idea whatever what fields a custom plan node might need. I'm inclined to think that what we should be assuming is that any custom path or plan node is really an object of a struct type known only to its providing extension, whose first field is the CustomPath or CustomPlan struct known to the core backend. (Think C++ subclassing.) This would imply that copyfuncs/equalfuncs/outfuncs support would have to be provided by the extension, which is in principle possible if we add function pointers for those operations to the struct linked to from the path/plan object. (Notationally this might be a bit of a pain, since the macros that we use in the functions in copyfuncs.c etc aren't public. Not sure if it's worth exposing those somewhere, or if people should just copy/paste them.) This approach would probably also avoid the need for the klugy bitmapset representation you propose in patch 3. * This also implies that create_customscan_plan is completely bogus. A custom plan provider *must* provide a callback function for that, because only it will know how big a node to palloc. There are far too many assumptions in create_customscan_plan anyway; I think there is probably nothing at all in that function that shouldn't be getting done by the custom provider instead. * Likewise, there is way too much hard-wired stuff in explain.c. It should not be optional whether a custom plan provider provides an explain support function, and that function should be doing pretty much everything involved in printing the node. * I don't believe in the hard-wired implementation in setrefs.c either. * Get rid of the CUSTOM_VAR business, too (including custom_tupdesc). That's at best badly thought out, and at worst a source of new bugs. Under what circumstances is a custom plan node going to contain any Vars that don't reduce to either scan or input-plan variables? None, because that would imply that it was doing something unrelated to the requested query. * The API for add_join_path_hook seems overcomplex, as well as too full of implementation details that should remain private to joinpath.c. I particularly object to passing the mergeclause lists, which seem unlikely to be of interest for non-mergejoin plan types anyway. More generally, it seems likely that this hook is at the wrong level of abstraction; it forces the hook function to concern itself with a lot of stuff about join legality and parameterization (which I note your patch3 code fails to do; but that's not an optional thing). * After a bit of reflection I think the best thing to do might be to ditch add_join_path_hook for 9.4, on these grounds: 1. You've got enough to do to get the rest of the patch committable. 2. Like Stephen, I feel that the proposed usage as a substitute for FDW-based foreign join planning is not the direction we want to travel. 3. Without that, the use case for new join nodes seems pretty marginal, as opposed to say alternative sort-node implementations (a case not supported by this patch, except to the extent that you could use them in explicit-sort mergejoins if you duplicated large parts of joinpath.c). * Getting nitpicky ... what is the rationale for the doubled underscore in the CUSTOM__ flag names? That's just a typo waiting to happen IMO. * Why is there more than one call site for add_scan_path_hook? I don't see the need for the calling macro with the randomly inconsistent name, either. * The test arrangement for contrib/ctidscan is needlessly complex, and testing it in the core tests is a bogus idea anyway. Why not just let it contain its own test script like most other contrib modules? That's all I've got for now. I've not really looked at the extension code in either patch2 or patch3, just at the changes in the core code. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers