Re: WIP Patch: Precalculate stable functions, infrastructure v1

Andres Freund Wed, 24 Jan 2018 12:37:39 -0800

On 2018-01-24 15:10:56 -0500, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > On 2018-01-16 17:05:01 -0500, Tom Lane wrote:
> >> I'm curious to know whether Andres has some other ideas, or whether he
> >> feels this is all a big wart on the compiled-expression concept.
>
> > I don't have too many "artistic" concerns from the compiled expression
> > POV. The biggest issue I see is that it'll make it a bit harder to
> > separate out the expression compilation phase from the expression
> > instantiation phase - something I think we definitely want.
>
> Hmm, there's no such distinction now, so could you explain what you
> have in mind there?


There are a few related concerns I have:
- For OLTP workloads using prepared statements, we often spend the
  majority of the time doing ExecInitExpr() and related tasks like
  computing tupledescs.
- For OLTP workloads the low allocation density for things hanging off
  ExprStates and PlanState nodes is a significant concern. The number
  of allocations causes overhead, and that overhead wastes memory and
  lowers cache hit ratios.
- For JIT we currently end up encoding specific pointer values into the
  generated code. As these obviously prevent reuse of the generated
  function, this noticeably narrows the set of use cases where JITing
  is applicable. JITing is actually quite beneficial for a lot of OLTP
  workloads too, but it's too expensive to do for every query.

To address these, I think we may want to change the division of labor
a bit. Expression instantiation (i.e. ExecReadyExpr()) should happen at
executor startup, but in a lot of cases "compiling" the steps should
itself happen at plan time. Obviously that means the steps themselves
can't contain plain pointers, as the per-execution memory will be
located in different places.  So I think what we should have is that
expression initialization just computes the size of required memory for
all steps and puts *offsets* into that in the steps. After that
expression instantiation either leaves them alone and evaluation uses
relative pointers (cheap-ish e.g. on x86 due to lea), or just turns the
relative pointers into absolute ones.
That means that all the memory for all steps of an ExprState would be
allocated in one chunk, reducing allocation overhead and increasing
cache hit ratios considerably.
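
As an illustration only (this is not code from any patch), the offset
scheme described above could be sketched in C roughly as follows. All
names here (SketchStep, SketchProgram, sketch_reserve() and so on) are
hypothetical and do not exist in PostgreSQL; every slot is assumed to
hold an int64_t and every step is assumed to be an addition, purely to
keep the example small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical step: operands are byte offsets into a per-execution
 * chunk rather than pointers, so the step array can be built once
 * (e.g. at plan time) and shared by many executions. */
typedef struct SketchStep
{
    size_t arg1_off;
    size_t arg2_off;
    size_t result_off;
} SketchStep;

typedef struct SketchProgram
{
    SketchStep steps[8];
    int        nsteps;
    size_t     chunk_size;  /* total per-execution memory required */
} SketchProgram;

/* "Compile" phase: reserve space for one slot and return its offset. */
static size_t
sketch_reserve(SketchProgram *p, size_t size)
{
    size_t align = sizeof(int64_t);
    size_t off = (p->chunk_size + align - 1) & ~(align - 1);

    p->chunk_size = off + size;
    return off;
}

/* "Instantiate" phase: a single zeroed allocation covers all steps. */
static char *
sketch_instantiate(const SketchProgram *p)
{
    return calloc(1, p->chunk_size);
}

/* Turn a relative offset into an absolute pointer for one execution. */
static int64_t *
sketch_slot(char *base, size_t off)
{
    assert(off % sizeof(int64_t) == 0);
    return (int64_t *) (base + off);
}

/* Evaluation: every step addresses its operands relative to base. */
static void
sketch_eval(const SketchProgram *p, char *base)
{
    for (int i = 0; i < p->nsteps; i++)
    {
        const SketchStep *s = &p->steps[i];

        *sketch_slot(base, s->result_off) =
            *sketch_slot(base, s->arg1_off) +
            *sketch_slot(base, s->arg2_off);
    }
}
```

In this sketch a caller would reserve three slots, append one step
referring to their offsets, and then call sketch_instantiate() once per
execution; the same SketchProgram can be evaluated against any number
of independently allocated chunks because it holds no pointers.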

I've experimented a bit with a rough hack of the above (purely at
execution time), and it doesn't seem too hard.


> Keeping the stored value of a CachedExpr in a Param slot is an
> interesting idea indeed.

We keep coming back to this; IIRC we had a pretty similar discussion
around redesigning caseValue_datum/isNull and domainValue_datum/isNull
to be less ugly. There also was
https://www.postgresql.org/message-id/20171116182208.kcvf75nfaldv3...@alap3.anarazel.de
where we discussed using something similar to PARAM_EXEC Param nodes to
allow inlining of volatile functions.

ISTM there might be some value in considering all of them in the design
of the new mechanism.

Greetings,

Andres Freund
