On Sun, Jun 11, 2023 at 10:23 PM James Coleman <jtc...@gmail.com> wrote: > > ... > > And while trying the v9 patch I came across a crash with the query > > below. > > > > set min_parallel_table_scan_size to 0; > > set parallel_setup_cost to 0; > > set parallel_tuple_cost to 0; > > > > explain (costs off) > > select * from pg_description t1 where objoid in > > (select objoid from pg_description t2 where t2.description = > > t1.description); > > QUERY PLAN > > -------------------------------------------------------- > > Seq Scan on pg_description t1 > > Filter: (SubPlan 1) > > SubPlan 1 > > -> Gather > > Workers Planned: 2 > > -> Parallel Seq Scan on pg_description t2 > > Filter: (description = t1.description) > > (7 rows) > > > > select * from pg_description t1 where objoid in > > (select objoid from pg_description t2 where t2.description = > > t1.description); > > WARNING: terminating connection because of crash of another server process > > > > Seems something is wrong when extracting the argument from the Param in > > parallel worker. > > With what I'm trying to change I don't think this plan should ever be > generated since it means we'd have to pass a param from the outer seq > scan into the parallel subplan, which we can't do (currently). > > I've attached the full backtrace to the email, but as you hinted at > the parallel worker is trying to get the param (in this case > detoasting it), but the param doesn't exist on the worker, so it seg > faults. Looking at this further I think there's an existing test case > that exposes the misplanning here (the one right under the comment > "Parallel Append is not to be used when the subpath depends on the > outer param" in select_parallel.sql), but it doesn't seg fault because > the param is an integer, doesn't need to be detoasted, and therefore > (I think) we skate by (but probably with wrong results in depending on > the dataset). > > Interestingly this is one of the existing test queries my original > patch approach didn't change, so this gives me something specific to > work with improving the path. Thanks for testing this and bringing > this to my attention!
Here's what I've found debugging this: There's only a single gather path ever created when planning this query, making it easy to know which one is the problem. That gather path is created with this stacktrace: frame #0: 0x0000000105291590 postgres`create_gather_path(root=0x000000013081ae78, rel=0x000000013080c8e8, subpath=0x000000013081c080, target=0x000000013081c8c0, required_outer=0x0000000000000000, rows=0x0000000000000000) at pathnode.c:1971:2 frame #1: 0x0000000105208e54 postgres`generate_gather_paths(root=0x000000013081ae78, rel=0x000000013080c8e8, override_rows=false) at allpaths.c:3097:4 frame #2: 0x00000001052090ec postgres`generate_useful_gather_paths(root=0x000000013081ae78, rel=0x000000013080c8e8, override_rows=false) at allpaths.c:3241:2 frame #3: 0x0000000105258754 postgres`apply_scanjoin_target_to_paths(root=0x000000013081ae78, rel=0x000000013080c8e8, scanjoin_targets=0x000000013081c978, scanjoin_targets_contain_srfs=0x0000000000000000, scanjoin_target_parallel_safe=true, tlist_same_exprs=true) at planner.c:7696:3 frame #4: 0x00000001052533cc postgres`grouping_planner(root=0x000000013081ae78, tuple_fraction=0.5) at planner.c:1611:3 frame #5: 0x0000000105251e9c postgres`subquery_planner(glob=0x00000001308188d8, parse=0x000000013080caf8, parent_root=0x000000013080cc38, hasRecursion=false, tuple_fraction=0.5) at planner.c:1062:2 frame #6: 0x000000010526b134 postgres`make_subplan(root=0x000000013080cc38, orig_subquery=0x000000013080ff58, subLinkType=ANY_SUBLINK, subLinkId=0, testexpr=0x000000013080d848, isTopQual=true) at subselect.c:221:12 frame #7: 0x0000000105268b8c postgres`process_sublinks_mutator(node=0x000000013080d6d8, context=0x000000016b0998f8) at subselect.c:1950:10 frame #8: 0x0000000105268ad8 postgres`SS_process_sublinks(root=0x000000013080cc38, expr=0x000000013080d6d8, isQual=true) at subselect.c:1923:9 frame #9: 0x00000001052527b8 postgres`preprocess_expression(root=0x000000013080cc38, expr=0x000000013080d6d8, kind=0) at planner.c:1169:10 frame #10: 0x0000000105252954 postgres`preprocess_qual_conditions(root=0x000000013080cc38, jtnode=0x000000013080d108) at planner.c:1214:14 frame #11: 0x0000000105251580 postgres`subquery_planner(glob=0x00000001308188d8, parse=0x0000000137010d68, parent_root=0x0000000000000000, hasRecursion=false, tuple_fraction=0) at planner.c:832:2 frame #12: 0x000000010525042c postgres`standard_planner(parse=0x0000000137010d68, query_string="explain (costs off)\nselect * from pg_description t1 where objoid in\n (select objoid from pg_description t2 where t2.description = t1.description);", cursorOptions=2048, boundParams=0x0000000000000000) at planner.c:411:9 There aren't any lateral markings on the rels. Additionally the partial path has param_info=null (I found out from Tom in a separate thread [1] that this is only set for outer relations from the same query level). The only param that I could easily find at first was a single param of type PARAM_EXTERN in root->plan_params in make_subplan(). I spent a lot of time trying to figure out where we could find the PARAM_EXEC param that's being fed into the subplan, but it doesn't seem like we have access to any of these things at the point in the path creation process that it's interesting to us when inserting the gather nodes. Given all of that I settled on this approach: 1. Modify is_parallel_safe() to by default ignore PARAM_EXEC params. 2. Add is_parallel_safe_with_params() that checks for the existence of such params. 3. Store the required params in a bitmapset on each base rel. 4. Union the bitmapset on join rels. 5. Only insert a gather node if that bitmapset is empty. I have an intuition that there's some spot (e.g. joins) that we should be removing params from this set (e.g., when we've satisfied them), but I haven't been able to come up with such a scenario as yet. The attached v11 fixes the issue you reported. Thanks, James Coleman
v11-0002-Parallelize-correlated-subqueries.patch
Description: Binary data
v11-0001-Add-tests-before-change.patch
Description: Binary data