Hi, On 2017-12-20 12:12:48 -0500, Tom Lane wrote: > Andres Freund <and...@anarazel.de> writes: > > What's the workload you're testing? I'm mildly surprised to see > > ExecEvalParamExtern() show up, rather than just plpgsql_param_fetch() & > > exec_eval_datum(). Or were you just listing that to specify the > > callpath? > > I'm using several different test cases, but one that shows up the problem > is [...]
Thanks. > The outer do-block is just to get a run long enough for reliable perf > statistics. I find these relevant entries in the report: > > Children Self Samples Shared Object Symbol > + 98.29% 4.44% 10827 postgres [.] ExecInterpExpr > + 61.82% 3.23% 7878 plpgsql.so [.] exec_eval_expr > + 34.64% 3.91% 9521 postgres [.] ExecEvalParamExtern > + 29.78% 4.88% 11892 plpgsql.so [.] plpgsql_param_fetch > + 23.06% 6.90% 16831 plpgsql.so [.] exec_eval_datum > + 6.28% 2.89% 7049 plpgsql.so [.] setup_param_list > + 4.02% 4.01% 9774 postgres [.] bms_next_member > I think the ridiculous "children" percentage for ExecInterpExpr can be > discounted, because that's a result of essentially everything happening > inside the outer call of ptest4_rec. Right. Unfolding this gives: - 79.73% 5.66% postgres postgres [.] ExecInterpExpr - 92.90% ExecInterpExpr plpgsql_call_handler plpgsql_exec_function exec_stmt_block - exec_stmts - 100.00% exec_for_query - 68.11% exec_stmts - exec_assign_expr - 60.96% ExecInterpExpr - 99.10% ExecEvalParamExtern - plpgsql_param_fetch + 61.82% SPI_fnumber 15.87% SPI_getbinval 14.29% nocachegetattr 4.18% bms_is_member 3.84% SPI_gettypeid 0.90% int4pl + 18.95% SPI_plan_get_cached_plan 11.48% bms_next_member + 5.13% exec_assign_value + 3.13% ReleaseCachedPlan + 16.70% SPI_cursor_fetch + 7.19% CreateTupleDescCopy + 5.49% heap_copytuple 1.26% AllocSetFree + 6.13% 0xffffffffffffffff + 0.71% 0x5624318c8d6f Which certainly seems interesting. The outer ExecInterpExpr() indeed doesn't do that much, it's the inner call that's the most relevant piece. That so much time is spent in SPI_fnumber() and the functions it calls, namely strcmp(), certainly doesn't seem right. I suspect that without addressing that cost, a lot of the other potential improvements aren't going to mean much. Looking at the function cost excluding children: + 7.79% postgres plpgsql.so [.] exec_assign_expr + 7.39% postgres plpgsql.so [.] plpgsql_param_fetch - 6.71% postgres libc-2.25.so [.] __strncmp_sse42 - __strncmp_sse42 + 99.97% SPI_fnumber + 5.66% postgres postgres [.] ExecInterpExpr - 4.60% postgres postgres [.] bms_next_member - bms_next_member - 99.98% exec_assign_expr - 4.59% postgres postgres [.] CreateTupleDescCopy - CreateTupleDescCopy - 92.93% exec_for_query exec_stmts exec_stmt_block plpgsql_exec_function plpgsql_call_handler + 4.40% postgres postgres [.] AllocSetAlloc - 3.77% postgres postgres [.] SPI_fnumber + SPI_fnumber + 3.49% postgres plpgsql.so [.] exec_for_query + 2.93% postgres postgres [.] GetCachedPlan + 2.90% postgres postgres [.] nocachegetattr + 2.85% postgres postgres [.] ExecEvalParamExtern + 2.68% postgres postgres [.] heap_getnext + 2.64% postgres postgres [.] SPI_getbinval + 2.39% postgres plpgsql.so [.] exec_assign_value + 2.22% postgres postgres [.] heap_copytuple + 2.21% postgres plpgsql.so [.] exec_stmts The time in exec_assign_expr() is largely spent doing bms_next_member() via the inlined setup_param_list(). Certainly shows that there's some expression eval related overhead. But it seems the lowest hanging fruits aren't quite there, and wouldn't necessarily be addressed by the type of change we're discussing here. > >> So the execution would look more or less like > >> op->eval_param(state, op, econtext); > >> and there'd need to be some extra fields (at least a void*) in the op > >> struct where plpgsql could keep private data. > > > I think I'd redo the parameters to the callback slightly, but generally > > that sounds sane. Was thinking of something more like > > Um, you left out something here? I don't mind changing the callback > signature, but it seems like it generally ought to look the same as > the other out-of-line eval functions. Yes, I did. Intercontinental travel does wonders. I was thinking that it might be better not to expose the exact details of the expression evaluation step struct to plpgsql etc. I'm kinda forseeing a bit of further churn there that I don't think other parts of the code necessarily need to be affected by. We could have the callback have a signature roughly like Datum callback(void *private_data, ExprContext econtext, bool *isnull); > > One question I have is how we will re-initialize the relevant state > > between exec_simple_expr() calls. I guess the most realistic one is that > > the op will have a pointer into an array managed by exec_simple_expr() > > that can get reset? > > I'm not seeing anything that needs to be reset? I was kind of thinking of the params_dirty business in plpgsql_param_fetch(), setup_param_list() etc. But I'm not quite sure how you're thinking of representing parameters on the plpgsql side after changing the callbacks style. > > Hm. We could have a version of ExecInitExpr() / ExecInitExprRec that > > first gives a callback chance to handle an operation. [ ... ] > > Yeah, I thought of that idea too, but at least for what I'm trying to > do here, it doesn't seem all that helpful. [ ... ] Ah, makes sense. Greetings, Andres Freund