On 2018-01-24 15:10:56 -0500, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > On 2018-01-16 17:05:01 -0500, Tom Lane wrote:
> >> I'm curious to know whether Andres has some other ideas, or whether he
> >> feels this is all a big wart on the compiled-expression concept.
>
> > I don't have too many "artistic" concerns from the compiled expression
> > POV. The biggest issue I see is that it'll make it a bit harder to
> > separate out the expression compilation phase from the expression
> > instantiation phase - something I think we definitely want.
>
> Hmm, there's no such distinction now, so could you explain what you
> have in mind there?

There are a few related concerns I have:

- For OLTP workloads using prepared statements, we often spend the
  majority of the time doing ExecInitExpr() and related tasks like
  computing tupledescs.

- For OLTP workloads the low allocation density for things hanging off
  ExprStates and PlanState nodes is a significant concern. The number
  of allocations causes overhead, and that overhead wastes memory and
  lowers cache hit ratios.

- For JIT we currently end up encoding specific pointer values into the
  generated code. As these obviously prevent reuse of the generated
  function, this noticeably narrows the set of use cases JITing is
  applicable to. JITing is actually quite beneficial for a lot of OLTP
  workloads too, but it's too expensive to do for every query.

To address these, I think we may want to split the division of labor a
bit. Expression instantiation (i.e. ExecReadyExpr()) should happen at
executor startup, but in a lot of cases "compiling" the steps
themselves should happen at plan time. Obviously that means the steps
themselves can't contain plain pointers, as the per-execution memory
will be located in different places.

So I think what we should have is that expression initialization just
computes the size of required memory for all steps and puts *offsets*
into that in the steps. After that, expression instantiation either
leaves them alone and evaluation uses relative pointers (cheap-ish
e.g. on x86 due to lea), or just turns the relative pointers into
absolute ones. That means that all the memory for all steps of an
ExprState would be allocated in one chunk, reducing allocation overhead
and increasing cache hit ratios considerably.

I've experimented a bit with a rough hack of the above (purely at
execution time), and it doesn't seem too hard.

> Keeping the stored value of a CachedExpr in a Param slot is an
> interesting idea indeed.

We keep coming back to this; IIRC we had a pretty similar discussion
around redesigning caseValue_datum/isNull and domainValue_datum/isNull
to be less ugly. There also was
https://www.postgresql.org/message-id/20171116182208.kcvf75nfaldv3...@alap3.anarazel.de
where we discussed using something similar to PARAM_EXEC Param nodes to
allow inlining of volatile functions.

ISTM there might be some value in considering all of them in the design
of the new mechanism.

Greetings,

Andres Freund
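
To make the offsets idea from above a bit more concrete, here is a
minimal sketch in C. It is not PostgreSQL code: SketchStep, SketchExpr
and all the sketch_* functions are made-up names for illustration, the
only "step" is a 64-bit integer addition, and error handling is
omitted. It only shows the shape of the split: the steps are built once
and store byte offsets, and each execution gets its own single chunk of
memory that the steps are evaluated against.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_STEPS 8

/* Hypothetical step: operands are byte offsets into the per-execution
 * chunk rather than absolute pointers, so the step array is reusable. */
typedef struct SketchStep
{
    size_t left_off;
    size_t right_off;
    size_t result_off;
} SketchStep;

/* Hypothetical "compiled" expression: steps plus total memory needed. */
typedef struct SketchExpr
{
    SketchStep steps[SKETCH_MAX_STEPS];
    int        nsteps;
    size_t     mem_size;    /* bytes of per-execution memory required */
} SketchExpr;

/* Compile time: reserve one int64 slot and return its offset. Every slot
 * has the same size, so offsets stay suitably aligned for int64_t. */
static size_t
sketch_reserve_slot(SketchExpr *expr)
{
    size_t off = expr->mem_size;

    expr->mem_size += sizeof(int64_t);
    return off;
}

/* Compile time: append "result = left + right", return the result offset. */
static size_t
sketch_add_step(SketchExpr *expr, size_t left_off, size_t right_off)
{
    SketchStep *step;

    assert(expr->nsteps < SKETCH_MAX_STEPS);
    step = &expr->steps[expr->nsteps++];
    step->left_off = left_off;
    step->right_off = right_off;
    step->result_off = sketch_reserve_slot(expr);
    return step->result_off;
}

/* Instantiation: one zeroed allocation covers the memory of all steps. */
static char *
sketch_instantiate(const SketchExpr *expr)
{
    return calloc(1, expr->mem_size);
}

/* Store an input value into a previously reserved slot. */
static void
sketch_set_input(char *mem, size_t off, int64_t value)
{
    *(int64_t *) (mem + off) = value;
}

/* Evaluation: resolve offsets relative to this execution's chunk and
 * return the result of the last step. */
static int64_t
sketch_eval(const SketchExpr *expr, char *mem)
{
    int64_t *result = NULL;

    assert(expr->nsteps > 0 && mem != NULL);
    for (int i = 0; i < expr->nsteps; i++)
    {
        const SketchStep *step = &expr->steps[i];
        const int64_t *left = (const int64_t *) (mem + step->left_off);
        const int64_t *right = (const int64_t *) (mem + step->right_off);

        result = (int64_t *) (mem + step->result_off);
        *result = *left + *right;
    }
    return *result;
}
```

A caller would reserve the input slots and add the steps once, then
call sketch_instantiate() per execution, fill the inputs with
sketch_set_input() and run sketch_eval(). Two instantiations of the
same SketchExpr share the step array but not the chunk. This sketch
takes the "leave the offsets alone and use relative addressing" route;
the other option mentioned above would rewrite the offsets into
absolute pointers in a per-execution copy of the steps.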