Kevin, > I have to keep a very narrow focus on this, or there is likely that > nothing will come of it. The particular area which is my target > here is the accuracy of the cost values on the subplans > considered by the optimizer.
Sure. What the rest of us are focused on is helping you build a generally useful tool which can be used to solve future problems and unexpected performance issues as well. I really see needing to collect all of the information possible from EXPLAIN ANALYZE ... not collecting just three bits and throwing the rest of the stuff away. I'd use a structure more like: query_instances ( query text run_instance int estimated_cost float analyze_time float actual_time float time_run timestamp GUCs text[] ) query_steps ( query_instance FK step_id SERIAL parent_step FK node_name text node_type text FK cost_start float cost_end float est_rows INT8 time_start float time_end float actual_rows INT8 loops INT8 db_object name condition_type text FK condition_detail text ) so, for example, the query step: " -> Seq Scan on detail0009 (cost=0.00..20500.11 rows=26 width=1005) (actual time=453.000..5983.000 rows=53588 loops=1)" " Filter: ((txcontenttype ~~* '%html%'::text) AND ((vchost)::text ~~* '%www.%'::text))" Could be displayed as: query_instance 12008 step_id 14701 parent_step 14698 node_name Seq Scan on detail0009 node_type Seq Scan cost_start 0 cost_end 20500.11 est_rows 26 time_start 453.0 time_end 5983.0 actual_rows 53588 loops 1 db_object detail009 condition_type Filter condition_detail ((txcontenttype ~~* '%html%'::text) AND ((vchost)::text ~~* '%www.%'::text)) By collecting all of this data, you make it possible to perform other sorts of analysis on the cost estimates. For example, statistical analysis might tell us that 3-or-more-condition filters take significantly longer to execute than single-condition filters, which would be important to know for the cost model. Limiting it to collecting only 3 of the 13 bits of node data produced by EA would very much limit the usefulness of the tool and the reliability of its statistics. -- Josh Berkus Aglio Database Solutions San Francisco ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match