On Wed, 27 Oct 2021 at 15:58, Simon Riggs <simon.ri...@enterprisedb.com> wrote:
> The poor performance is traced to the planner cost estimates for > recursive queries. Specifically, the cost of the recursive arm of the > query is evaluated based upon both of these hardcoded assumptions: > > 1. The recursion will last for 10 loops > 2. The average size of the worktable will be 10x the size of the > initial query (non-recursive term). > > Taken together these assumptions lead to a very poor estimate of the > worktable activity (in this case), which leads to the plan changing as > a result. > > The factor 10 is a reasonably safe assumption and helps avoid worst > case behavior in bigger graph queries. However, the factor 10 is way > too large for many types of graph query, such as where the path > through the data is tight, and/or the query is written to prune bushy > graphs, e.g. shortest path queries. The factor 10 should not be > hardcoded in the planner, but should be settable, just as > cursor_tuple_fraction is. If you think this should be derived without parameters, then we would want a function that starts at 1 for 1 input row and gets much larger for larger input. The thinking here is that Graph OLTP is often a shortest path between two nodes, whereas Graph Analytics and so the worktable will get much bigger. So I'm, thinking we use rel->tuples = min(1, cte_rows * cte_rows/2); -- Simon Riggs http://www.EnterpriseDB.com/