On Thu, Feb 15, 2018 at 03:00:17PM +0100, Shay Rojansky wrote:
> Just wanted to say that I've seen more than 10% improvement in some real-world
> application when preparation was done properly. Also, I'm assuming that
> implementing this wouldn't involve "rewriting substantial part of Postgres
> core", and that even 10% is quite a big gain, especially if it's a
> transparent/
> free one as far as the user is concerned (no application changes).

I would like to step back on this issue. Ideally, every query would
get re-optimized, because we can only be sure the plan is optimal when
we use the supplied constants to generate the plan. But, of course,
parsing and planning take time, so ideally there would be a way to
avoid them. The question is always how much time is saved by avoiding
parsing/planning, and what the risk is of suboptimal plans.

Postgres uses a conservative method for reusing plans with previous
constants, as described in the PREPARE manual page:

    https://www.postgresql.org/docs/10/static/sql-prepare.html

    Prepared statements can use generic plans rather than re-planning
    with each set of supplied EXECUTE values. This occurs immediately
    for prepared statements with no parameters; otherwise it occurs
    only after five or more executions produce plans whose estimated
    cost average (including planning overhead) is more expensive than
    the generic plan cost estimate. Once a generic plan is chosen, it
    is used for the remaining lifetime of the prepared statement.
    Using EXECUTE values which are rare in columns with many duplicates
    can generate custom plans that are so much cheaper than the generic
    plan, even after adding planning overhead, that the generic plan
    might never be used.
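As an illustration of that behavior, you can watch the switch from
custom to generic plan with EXPLAIN EXECUTE, which plans the statement
just as EXECUTE does. This is only a minimal sketch; the table, index,
and statement names are made up:

    -- hypothetical skewed table: many 'done' rows, few 'failed' rows
    CREATE TABLE tasks (id int, status text);
    CREATE INDEX ON tasks (status);

    PREPARE get_tasks(text) AS
        SELECT * FROM tasks WHERE status = $1;

    -- The first five executions are planned with the actual constant
    -- (custom plans), so the plan condition shows the literal value,
    -- e.g. (status = 'failed'::text).
    EXPLAIN EXECUTE get_tasks('failed');

    -- From the sixth execution on, if the average custom-plan cost
    -- (including planning overhead) is not cheaper than the generic
    -- plan's cost estimate, the generic plan is chosen, and the
    -- condition shows the $1 placeholder instead.
    EXPLAIN EXECUTE get_tasks('failed');

Repeating the EXPLAIN six or more times shows whether the statement
settles on the generic plan or keeps getting custom plans.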
While I have heard people complain about how other databases cache
prepared plans, I have heard few complaints about the Postgres
approach, and I haven't even heard of people asking to control the
documented "five or more" behavior. I also know that other database
products have more sophisticated prepared-statement handling, but they
might have higher parse/plan overhead, or they might be more flexible
in handling specialized workloads that Postgres might not want to
handle, given the costs/complexity/overhead.

So, the directions for improvement are:

1.  Improve the existing "five or more" behavior
2.  Automatically prepare queries that are not sent as prepared queries
3.  Share plans among sessions

While #1 would be nice, #2 increases the number of applications that
can silently benefit from prepared queries, and #3 increases the
number of cases in which query plans can be reused. The issue with #3
is that the constants used are no longer local to the session (which
is the same issue with connection poolers reusing prepared plans).
When different sessions, with potentially more varied constants, reuse
plans, the probability of suboptimal plans increases.

I think the fact that pgbench shows a 2x improvement for prepared
statements, while real-world reports are closer to 10%, means we need
a better understanding of exactly which workloads can benefit from
this, and a comprehensive analysis of all three areas of improvement.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +