I have to keep a very narrow focus on this, or it is likely that nothing will come of it. The particular area I'm targeting here is the accuracy of the cost values on the subplans considered by the optimizer. As previously stated, we're getting hurt by cases where the optimizer looks at two plans and picks the slower one because the cost numbers don't correlate well with actual run time, even with accurate statistics. With the query I've been using as an example, the numbers are skewed against the faster plan whether nothing is cached or everything is cached.
Without reasonably accurate costing, all the work the optimizer does to come up with logically equivalent paths to the same result doesn't help as much as it should. You could have some very clever technique which was faster, but ignored (or slower, yet chosen).

Of course, if running with EXPLAIN ANALYZE significantly distorts the run time, the whole effort is doomed at the outset. Can you quantify the distortion you mention? Do you know what sorts of conditions are most affected? At a minimum, it sounds like the tool should have the capability to repeat the tests with and without EXPLAIN ANALYZE, and use the ratio of the two times as a reliability factor. Unfortunately, that means doubling the number of cache flushes, which is likely to be the most time-consuming part of running the tests. On the bright side, we would capture the top level run times you want.

You make a good point about the expected versus actual rows. The ratio I've been looking at should perhaps be multiplied by (actual rowcount / estimated rowcount). And with that information in the database, there would be more options for massaging it for different perspectives.

I was speaking loosely when I said that I would store the derived data -- I would probably start with a view to provide the heavily used derived values, and only materialize them in the base tables if necessary for performance. I would not want the application to calculate the heavily used derived values every time, because of the development and maintenance cost of repeating the expressions everywhere. (A rough sketch of what I have in mind follows the quoted message below.)

I had a thought over lunch regarding something else we might want to capture from the EXPLAIN ANALYZE output -- we could reference the enclosing plan level in bill-of-materials fashion. I don't see an immediate use case for it, but it is sitting there, pretty much free for the taking, and seems potentially useful.

-Kevin

>>> Tom Lane <[EMAIL PROTECTED]> 10/14/05 1:37 PM >>>
"Kevin Grittner" <[EMAIL PROTECTED]> writes:
> I propose capturing only three values from the output of explain
> analyze, and saving it with many columns of context information.

You really have to capture the rowcounts (est and actual) too. Otherwise you can't tell if it's a costing problem or a statistics problem.

More generally, I think that depending entirely on EXPLAIN ANALYZE numbers is a bad idea, because the overhead of EXPLAIN ANALYZE is both significant and variable depending on the plan structure. The numbers that I think we must capture are the top-level EXPLAIN cost and the actual runtime of the query (*without* EXPLAIN). Those are the things we would like to get to track closely. EXPLAIN ANALYZE is incredibly valuable as context for such numbers, but it's not the thing we actually wish to optimize.

> Besides the additional context info, I expect to be storing the log
> of the ratio, since it seems to make more sense to average and
> look for outliers based on that than the raw ratio.

Why would you store anything but raw data? Easily-derivable numbers should be computed while querying the database, not kept in it.

			regards, tom lane
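For concreteness, here is a rough sketch of the sort of capture table and derived-value view I have in mind. All of the table, view, and column names are placeholders rather than a settled schema, and the adjusted ratio is just one reading of the rowcount adjustment discussed above:

  -- Raw numbers captured per plan node from EXPLAIN ANALYZE, plus context.
  -- (Names are placeholders; the real table would carry many more
  -- context columns.)
  CREATE TABLE plan_sample (
      sample_id    serial PRIMARY KEY,
      test_label   text NOT NULL,      -- which test run / query / configuration
      node_type    text NOT NULL,      -- e.g. 'Seq Scan', 'Nested Loop'
      parent_id    integer REFERENCES plan_sample (sample_id),
                                       -- enclosing plan level, bill-of-materials style
      est_cost     numeric NOT NULL,   -- planner's estimated total cost
      actual_ms    numeric NOT NULL,   -- actual time reported by EXPLAIN ANALYZE
      est_rows     numeric NOT NULL,
      actual_rows  numeric NOT NULL
  );

  -- Heavily used derived values live in a view, so only raw numbers are
  -- stored and the expressions are written exactly once.
  CREATE VIEW plan_sample_ratio AS
  SELECT sample_id,
         test_label,
         node_type,
         actual_ms / nullif(est_cost, 0)                    AS cost_ratio,
         ln(nullif(actual_ms, 0) / nullif(est_cost, 0))     AS log_cost_ratio,
         actual_rows / nullif(est_rows, 0)                  AS row_ratio,
         ln(nullif(actual_ms, 0) / nullif(est_cost, 0))
           * (actual_rows / nullif(est_rows, 0))            AS adjusted_log_ratio
  FROM plan_sample;

If the view turns out to be too slow for the reporting queries, the same expressions could be materialized into the base table later without changing the application, which is the main point of starting with the view.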