On Mon, 15 Jan 2007, Tom Lane wrote:

> So I've been working on the scheme I suggested a few days ago of
> representing "equivalence classes" of variables explicitly, and avoiding
> the current ad-hocery of generating and then removing redundant clauses
> in favor of generating only the ones we want in the first place. Any
> clause that looks like an equijoin gets sent to the EquivalenceClass
> machinery by distribute_qual_to_rels, and not put into the
> restrictlist/joinlist data structure at all. Then we make passes over
> the EquivalenceClass lists at appropriate times to generate the clauses
> we want. This is turning over well enough now to pass the regression
> tests,
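If I follow, the idea is that any two variables tied together by an
equijoin qual get folded into one shared class, and join clauses are
later derived from each class only as needed, instead of generating
every redundant pairing up front and pruning afterwards. A toy sketch
of that bookkeeping in Python - nothing like the real planner code, and
the variable names are invented purely for illustration:

    # Each equijoin qual "a = b" merges a and b into one equivalence
    # class; overlapping classes are folded together as quals arrive.
    def build_classes(equijoin_quals):
        classes = []                      # list of sets of variable names
        for a, b in equijoin_quals:
            merged = {a, b}
            keep = []
            for cls in classes:
                if cls & merged:
                    merged |= cls         # fold overlapping classes together
                else:
                    keep.append(cls)
            keep.append(merged)
            classes = keep
        return classes

    # One simple way to emit clauses: chain each class's members, which
    # yields n-1 clauses per class instead of n*(n-1)/2 redundant ones.
    def generate_clauses(classes):
        for cls in classes:
            members = sorted(cls)
            for x, y in zip(members, members[1:]):
                yield (x, y)

    quals = [("a.x", "b.y"), ("b.y", "c.z"), ("d.k", "e.k")]
    classes = build_classes(quals)    # e.g. [{a.x, b.y, c.z}, {d.k, e.k}]
    print(list(generate_clauses(classes)))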
That was quick...

> In short, this approach results in a whole lot less stability in the
> order in which WHERE clauses are evaluated. That might be a killer
> objection to the whole thing, but on the other hand we've never made
> any strong promises about WHERE evaluation order.

Showing my ignorance here, but I've never been a fan of "syntax based
optimization," though it is better than no optimization. If people are
counting on clause order for optimization, then, hmmm... If you can
provide a way to at least _try_ to do better, then don't worry about it.
It will improve with time.

> Instead, I'm thinking it might be time to re-introduce some notion of
> function execution cost into the system, and make use of that info to
> sort WHERE clauses into a reasonable execution order.

Ingres did/does it that way, IIRC. It's a solid strategy.

> This example
> would be fixed with even a very stupid rule-of-thumb about SQL functions
> being more expensive than C functions, but if we're going to go to the
> trouble it seems like it'd be a good idea to provide a way to label
> user-defined functions with execution costs.
>
> Would a simple constant value be workable, or do we need some more
> complex model (and if so what)?

Ingres would, if I'm not mistaken, gain that kind of knowledge through
historical use, via histograms. Short of that, you've got classes of
functions - aggregations, for example - and there's sure to be missing
information for making a great decision at planning time. However, I
take it that the cost here is primarily CPU and not I/O. I therefore
propose that the engine evaluate - benchmark, if you will - all
functions as they are ingested, or vacuum-like at some later date (when
valid data for testing may exist), and assign each a cost relative to
what it already knows - the built-ins, for example. Doing so could
allow this strategy to be functional in short order and improved over
time, so all the work doesn't have to be implemented on day 1. And
DBA/sys-admin tweaking can always be done by updating the catalogues.
A rough sketch of what I have in mind is below my signature.

HTH,
Richard

--
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/
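Roughly the kind of thing I mean, as a toy Python sketch rather than
anything resembling the backend (abs() standing in for a cheap built-in
baseline and expensive_udf for a user-defined function, both invented
for illustration): time each function against a built-in the system
already knows, store the ratio as its cost, and let the planner put the
cheapest otherwise-equivalent WHERE clauses first.

    import time

    def cost_per_call(fn, arg, iterations=100000):
        # Crude wall-clock cost per call; a real implementation would
        # want representative arguments and repeated trials.
        start = time.perf_counter()
        for _ in range(iterations):
            fn(arg)
        return (time.perf_counter() - start) / iterations

    # Calibrate against something the system already knows; here the
    # built-in abs() stands in for a cheap C-level built-in.
    baseline = cost_per_call(abs, -42)

    def relative_cost(fn, arg):
        # Cost expressed as a multiple of the baseline, so the number
        # stays meaningful across machines of different speeds.
        return cost_per_call(fn, arg) / baseline

    def expensive_udf(x):
        # Stand-in for a user-defined SQL or PL/pgSQL function.
        return sum(i * i for i in range(100)) + x

    costs = {
        "abs": 1.0,
        "expensive_udf": relative_cost(expensive_udf, -42),
    }

    # With per-call costs in hand, run cheap tests first so they get a
    # chance to reject rows before the expensive ones ever execute.
    # (A real planner would weigh selectivity as well as cost.)
    where_clauses = [("expensive_udf(x) > 0", "expensive_udf"),
                     ("abs(x) < 10",          "abs")]
    cheapest_first = sorted(where_clauses, key=lambda c: costs[c[1]])
    print(costs)
    print(cheapest_first)

The stored ratios would just be one more catalogue column, so a DBA
could still override them by hand.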