Jim C. Nasby wrote:
Been suggested before... the problem is actually doing something useful
with all that data that's collected, as well as how to collect it
without greatly impacting the system.

Identifying the best plan by actually running multiple candidate plans and timing them is useful. That would be the point.

As for "without greatly impacting the system", I don't think that is a real concern. The whole idea is to greatly impact the system *once*, at a time when the DBA doesn't mind impacting it (before going live on a production network, or between midnight and 3 AM, or whatever), and then store the best plan for future use.

The planner trades off the desire to find the best plan against the need to find a plan quickly. It also chooses a plan based on statistics rather than actual runtimes (there is a chicken-and-egg problem: how do you know which plan has the smallest runtime without running it?), so the plan that looks best according to the statistics might not actually be the best one.
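
You can see the mismatch directly, because EXPLAIN ANALYZE reports the planner's estimates and the actual runtimes side by side. A minimal illustration, with made-up table and column names:

  -- Estimated cost and row counts come from the statistics; "actual time"
  -- and actual row counts come from really executing the plan.  When they
  -- disagree badly, the plan the planner picked was probably not the fastest.
  EXPLAIN ANALYZE
  SELECT o.order_id, c.name
  FROM orders o
  JOIN customers c ON c.customer_id = o.customer_id
  WHERE o.order_date > DATE '2006-01-01';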

The idea I'm proposing sidesteps that trade-off by explicitly choosing to make the planner run really slowly and take a really long time. But it doesn't do that "at runtime", in the sense that you don't do it for each query. You do it once up front and are done with it. Of course, this is only useful for people with reasonably static queries and reasonably static table statistics, so that a good plan found up front continues to be a good plan as it is repeatedly used.
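
(For what it's worth, the existing settings already let you push the planner toward the slow-but-thorough end of that trade-off. This is not the exhaustive, timed search I'm describing, but it shows the kind of knob that exists today; the parameters below are real, the values are just examples.)

  -- Disable the genetic optimizer and raise the collapse limits so the
  -- planner exhaustively enumerates join orders instead of falling back to
  -- heuristics.  Planning gets slower; large joins often get better plans.
  SET geqo = off;
  SET join_collapse_limit = 20;
  SET from_collapse_limit = 20;
  -- ...then run EXPLAIN ANALYZE on the query of interest under these settings.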

My personal motivation is that I have tables whose statistics are quite static. The data itself changes, but the statistical distribution it is drawn from does not, so the table statistics stay about the same even as rows are added and deleted. On top of that, the planner keeps choosing the wrong plan; I know this because I can make individual queries run faster by structuring them in ways the planner can't see through and "optimize" away the particular plan I am effectively handing it. But this is a PITA, especially since I don't always know what the best plan might be and have to try them all until I find the right one. (With the added complication that I can't always figure out how to trick the planner into choosing a specific plan, and hence can't test it.) It would be *so much easier* to have an option that tells the planner to try them all.
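
Another way to coax a specific plan, short of restructuring the query, is with the enable_* parameters. Those settings are real, though they only penalize a node type rather than forbid it outright, and they still don't let me reach every plan shape; the table and query below are made-up placeholders:

  BEGIN;
  -- Strongly discourage the node types the planner keeps choosing, then
  -- time whatever plan it falls back to.  SET LOCAL reverts at transaction end.
  SET LOCAL enable_seqscan = off;
  SET LOCAL enable_nestloop = off;
  EXPLAIN ANALYZE
  SELECT * FROM measurements WHERE sensor_id = 42;
  ROLLBACK;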

mark

On Tue, Aug 08, 2006 at 08:23:05AM -0700, Mark Dilger wrote:
If this feature I'm proposing already exists, sorry for the waste of bandwidth, and could someone please point me to it? :)

What if there were a mode that told postgres to do an exhaustive search (or, if not exhaustive, then a much more extensive search) of all plans (or many plans), trying each plan, reporting the performance of each, and discarding the query results, much like "explain analyze" does? Postgres could then dump the best plan in a machine-readable (and semi-human-readable) form which the planner could parse and use at some later date in lieu of a SQL query.

This would allow people with reasonably static table statistics (where the best plan is not likely to change) to spend upfront cycles investigating the best plan and then embed that plan in their business logic. Since the stored plan is both written and read by postgres, it can get quite complicated without putting a burden on humans to read and write such complicated things. It would also remove the risk that the planner will occasionally (due to its nondeterministic workings) choose a really bad plan and stall a production system.
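
The closest existing relative I know of is a prepared statement, which plans a query once per session and then reuses that plan, but it neither searches exhaustively nor persists across sessions; a quick sketch with made-up names:

  -- PREPARE plans the statement once for this session and EXECUTE reuses
  -- that plan; it does not time alternative plans or survive a reconnect,
  -- which is the gap the proposed feature would fill.
  PREPARE report_q1 (date) AS
    SELECT r.region, sum(s.amount)
    FROM sales s
    JOIN regions r ON r.region_id = s.region_id
    WHERE s.sale_date > $1
    GROUP BY r.region;

  EXECUTE report_q1 ('2006-01-01');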

mark
