On Sun, Apr 26, 2020 at 5:49 PM David Rowley <dgrowle...@gmail.com> wrote:
> On Sun, 26 Apr 2020 at 19:08, Andy Fan <zhihui.fan1...@gmail.com> wrote:
> > If we want to handle this case as well, one of the changes would be
> > that it needs to cache multiple records for one input parameter, or
> > return one row each time but return multiple times for one input
> > parameter. Tuplestore may be a good option for this case, given its
> > full functionality like tuplestore_puttuple and tuplestore_gettupleslot.
> > But if we implement it with a tuplestore, the next question is how to
> > control the memory usage for this node. We can use a dedicated memory
> > context to know how much memory this node has used in total, but we
> > can't stop the tuplestore from using more memory. Or we can force the
> > current tuplestore's status to TSS_WRITEFILE and set allowedMem to 0
> > for the following tuplestores once we find too much memory is used.
> > However, this looks a bit of a hack.
>
> I didn't imagine a tuplestore would be that useful for this. A node
> like this will do its best work when the ratio of n_values /
> distinct_values of the parameters is high. The planner can often not
> be that great at knowing the number of distinct values, especially so
> when there is more than one expression to estimate the number of
> distinct values for (we added extended statistics to try to help with
> that). I think this node will do its best when a cache miss is barely
> any more expensive than scanning the subnode to get the results. If we
> can do that, then we'll see fewer regressions when we inject one of
> these nodes where it'll do no good, e.g. when we'll never get a
> repeated value. If we start spilling these tuples out to disk, then it
> adds overhead which might never pay off.
>
> I'd suggest a hash table to act as an MRU cache. We'd just evict old
> values when we run out of space, i.e. consume all of work_mem.
>
> I've got a bunch of code locally which is still a work in progress to
> do this. I'll finish it off and post it here.
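The cache David suggests (a hash table keyed by parameter values, evicting the least-recently-used entries once work_mem is exhausted) might be sketched roughly like this. This is only an illustrative toy in Python, not anything from the actual patch; the names `ParamCache` and `mem_budget` are invented here, with `mem_budget` standing in for work_mem:

```python
from collections import OrderedDict

class ParamCache:
    """Toy sketch: cache subnode results keyed by parameter values,
    evicting least-recently-used entries when a memory budget
    (standing in for work_mem) would be exceeded."""

    def __init__(self, mem_budget):
        self.mem_budget = mem_budget
        self.mem_used = 0
        self.entries = OrderedDict()  # param values -> (tuples, size)

    def lookup(self, params):
        entry = self.entries.get(params)
        if entry is not None:
            self.entries.move_to_end(params)  # mark as most recently used
            return entry[0]                   # cache hit: cached tuples
        return None                           # cache miss: rescan subnode

    def store(self, params, tuples, size):
        # Evict the oldest entries until the new one fits in the budget.
        while self.entries and self.mem_used + size > self.mem_budget:
            _, (_, old_size) = self.entries.popitem(last=False)
            self.mem_used -= old_size
        if size <= self.mem_budget:  # skip entries too big to ever fit
            self.entries[params] = (tuples, size)
            self.mem_used += size
```

Because eviction only touches the tail of the recency order, a hit or a store stays O(1), which matters for keeping the cache-miss path close to the cost of a plain subnode scan.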
I was thinking that we may have to maintain some extra state if we use a hash table rather than a tuplestore, but that might not be a major concern. I'll wait and look at your patch.

Best Regards
Andy Fan