On Thu, 23 Mar 2023 at 19:46, Alexandr Nikulin <alexandr.s.niku...@gmail.com> wrote: > I propose to slightly improve the performance of nested loop join in the case > of partitioned inner table. > As I see in the code, the backend looks for the partition of the inner table > each time after fetch a new row from the outer table. > These searches can take a significant amount of time. > But we can skip this step if the nested loop parameter(s) was(re) not changed > since the previous row fetched from the outer table
I think if we were to do something like that, then it should be done in nodeAppend.c and nodeMergeAppend.c. That way you can limit it to only checking parameters that partition pruning needs to care about. That does mean you'd need to find somewhere to cache the parameter values, however. Doing it in nodeNestloop.c means you're doing it when the inner subplan is something that does not suffer from the additional overhead you want to avoid, e.g an Index Scan. Also, generally, if you want to get anywhere with a performance patch, you should post performance results from before and after your change. Also include your benchmark setup and relevant settings for how you got those results. For this case, you'll want a best case (parameter value stays the same) and a worst case, where the parameter value changes on each outer row. I expect you're going to add overhead to this case as your additional checks will always detect the parameter has changed as that'll always require partition pruning to be executed again. We'll want to know if that overhead is high enough for us not to want to do this. I'll be interested to see a test that as some varlena parameter of say a few hundred bytes to see how much overhead testing if that parameter has changed when the pruning is being done on a HASH partitioned table. HASH partitioning should prune quite a bit faster than both LIST and RANGE as the hashing is effectively O(1) vs O(log2 N) (N being the number of Datums in the partition bounds). I'd expect a measurable additional overhead with the patch when the parameter changes on each outer row. David