On Sun, Apr 26, 2020 at 5:49 PM David Rowley <dgrowle...@gmail.com> wrote:
> On Sun, 26 Apr 2020 at 19:08, Andy Fan <zhihui.fan1...@gmail.com> wrote:
> > If we want to handle this case as well, one of the changes would be
> > that it needs to cache multiple records for one input parameter, or
> > return one row each time but return multiple times for one input
> > parameter. Tuplestore may be a good option for this case, given its
> > full functionality like tuplestore_puttuple and tuplestore_gettupleslot.
> > But if we implement it with a tuplestore, the next question is how to
> > control the memory usage for this node. We can use a dedicated memory
> > context to know how much memory this node has used in total, but we
> > can't stop the tuplestore from using more memory. Or we can force the
> > current tuplestore's status to TSS_WRITEFILE and set allowedMem to 0
> > for the following tuplestores once we find too much memory is used.
> > However, this looks a bit of a hack.
>
> I didn't imagine a tuplestore would be that useful for this. A node
> like this will do its best work when the ratio of n_values /
> distinct_values of the parameters is high. The planner can often not
> be that great at knowing the number of distinct values, especially so
> when there is more than one expression to estimate the number of
> distinct values for (we added extended statistics to try to help with
> that). I think this node will do its best when a cache miss is barely
> any more expensive than scanning the subnode to get the results. If we
> can do that, then we'll see fewer regressions when we inject one of
> these nodes where it'll do no good, e.g. when we'll never get a
> repeated value. If we start spilling these tuples out to disk, then it
> adds overhead which might never pay off.
>
> I'd suggest a hash table to act as an MRU cache. We'd just evict old
> values when we run out of space, i.e. consume all of work_mem.
>
> I've got a bunch of code locally which is still a work in progress to
> do this. I'll finish it off and post it here.
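The cache David suggests (a hash table keyed by parameter values, evicting the least-recently-used entries once work_mem is exhausted) might be sketched roughly like this. This is only an illustrative toy in Python, not anything from the actual patch; the names `ParamCache` and `mem_budget` are invented here, with `mem_budget` standing in for work_mem:

```python
from collections import OrderedDict

class ParamCache:
    """Toy sketch: cache subnode results keyed by parameter values,
    evicting least-recently-used entries when a memory budget
    (standing in for work_mem) would be exceeded."""

    def __init__(self, mem_budget):
        self.mem_budget = mem_budget
        self.mem_used = 0
        self.entries = OrderedDict()  # param values -> (tuples, size)

    def lookup(self, params):
        entry = self.entries.get(params)
        if entry is not None:
            self.entries.move_to_end(params)  # mark as most recently used
            return entry[0]                   # cache hit: cached tuples
        return None                           # cache miss: rescan subnode

    def store(self, params, tuples, size):
        # Evict the oldest entries until the new one fits in the budget.
        while self.entries and self.mem_used + size > self.mem_budget:
            _, (_, old_size) = self.entries.popitem(last=False)
            self.mem_used -= old_size
        if size <= self.mem_budget:  # skip entries too big to ever fit
            self.entries[params] = (tuples, size)
            self.mem_used += size
```

Because eviction only touches the tail of the recency order, a hit or a store stays O(1), which matters for keeping the cache-miss path close to the cost of a plain subnode scan.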
I was thinking that we may have to maintain some extra state if we use a hash table rather than a tuplestore, but that might not be a major concern. I'll wait and look at your patch.

Best Regards
Andy Fan