"Kevin Grittner" <[EMAIL PROTECTED]> writes:
> Note that I'm talking about a tool strictly to check the accuracy of
> the estimated costs of plans chosen by the planner, nothing else.

We could definitely do with some infrastructure for testing this.
I concur with Bruce's suggestion that you should comb the archives
for previous discussions --- but if you can work on it, great!

> (2) A large database must be created for these tests, since many
> issues don't show up in small tables. The same data must be generated
> in every database, so results are comparable and reproducible.

Reproducibility is way harder than it might seem at first glance.
What's worse, the obvious techniques for creating reproducible numbers
amount to eliminating variables that are important in the real world.
(One of which is size of database --- some people care about performance
of DBs that fit comfortably in RAM...)

Realistically, the planner is never going to have complete information.
We need to design planning models that generally get the right answer,
but are not so complicated that they (a) are impossible to maintain or
(b) take huge amounts of time to compute. (We're already getting some
flak on the time the planner takes.) So there is plenty of need for
engineering compromise here. Still, you can't engineer without raw
data, so I'm all for creating a tool that lets us gather real-world
cost data.

The only concrete suggestion I have at the moment is to not design the
tool directly around "measure the ratio of real time to cost". That's
only meaningful if the planner's cost model is already basically
correct and you just need to correct the cost multipliers. What we
need for the near term is ways of quantifying cases where the cost
models are just completely out of line with reality.

			regards, tom lane
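
As a rough illustration of the point about not building the tool around
a single time-to-cost ratio, here is a minimal Python sketch. It is not
part of any existing tool: the function name flag_cost_outliers, the
(label, estimated_cost, actual_ms) input format, and the default
factor-of-10 threshold are all assumptions made up for this example.
It treats the median ratio as the "typical" multiplier and then lists
the plans whose own ratio is off from that by more than the given
factor, which is one crude way to surface cases where the model itself,
rather than the multiplier, looks wrong.

```python
import math
from statistics import median


def flag_cost_outliers(samples, factor=10.0):
    """Find plans whose time/cost ratio is far from the typical ratio.

    samples: iterable of (label, estimated_cost, actual_ms) tuples; both
             numbers must be positive (hypothetical input format).
    factor:  how many times larger or smaller than the median ratio a
             plan's ratio must be to count as an outlier (assumed value).

    Returns (typical_ratio, outliers), where outliers is a list of
    (label, relative_error) sorted with the worst offender first.
    relative_error > 1 means the plan ran slower than its cost suggests.
    """
    samples = list(samples)
    if not samples:
        return None, []
    for label, cost, ms in samples:
        if cost <= 0 or ms <= 0:
            raise ValueError(f"non-positive cost or time for {label!r}")

    # Work in log space so "10x too high" and "10x too low" are symmetric.
    log_ratios = [math.log(ms / cost) for _, cost, ms in samples]
    typical = median(log_ratios)
    limit = math.log(factor)

    outliers = []
    for (label, _cost, _ms), log_ratio in zip(samples, log_ratios):
        deviation = log_ratio - typical
        if abs(deviation) > limit:
            outliers.append((label, math.exp(deviation)))

    # Largest deviation (in either direction) first.
    outliers.sort(key=lambda item: abs(math.log(item[1])), reverse=True)
    return math.exp(typical), outliers


if __name__ == "__main__":
    # Invented numbers, for illustration only; not measurements.
    demo = [("q1", 100.0, 10.0), ("q2", 200.0, 20.0), ("q3", 100.0, 5000.0)]
    typical_ratio, bad = flag_cost_outliers(demo)
    print("typical ms per cost unit:", typical_ratio)
    for label, error in bad:
        print(label, "is off by a factor of", error)
```

The median is used instead of a mean so that a few wildly wrong
estimates do not drag the "typical" multiplier toward themselves; any
robust estimator would serve the same purpose here.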