On Wed, Jan 25, 2012 at 1:24 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Also, you're assuming that the changes have no upside whatsoever, which
> I fondly hope is not the case.  Large join problems tend not to execute
> instantaneously --- so nobody is going to complain if the planner takes
> awhile longer but the resulting plan is enough better to buy that back.
> In my test cases, the planner *is* finding better plans, or at least
> ones with noticeably lower estimated costs.  It's hard to gauge how
> much that translates to in real-world savings, since I don't have
> real data loaded up.  I also think, though I've not tried to measure,
> that I've made planning cheaper for very simple queries by eliminating
> some overhead in those cases.

I had a 34-table join on one of the last applications I maintained
that planned and executed in less than 2 seconds.  That was pushing
it, but I had many joins in the 10-20 table range that planned and
executed in 100-200 ms.  I agree that if you are dealing with a
terabyte table - or even a gigabyte table - then the growth of
planning time will probably not bother anyone even if you fail to find
a better plan, and will certainly make the user very happy if you do.
But on tables with only a megabyte of data, it's not nearly so
clear-cut.  In an ideal world, I'd like the amount of effort we spend
planning to be somehow tied to the savings we can expect to get, and
deploy optimizations like this only in cases where we have a
reasonable expectation of that effort being repaid.

AIUI, this is mostly going to benefit cases like small LJ (big1 IJ
big2) and, of course, those cases aren't going to arise if your query
only involves small tables, or even if you have something like big IJ
small1 IJ small2 IJ small3 IJ small4 LJ small5 LJ small6 IJ small7,
which is a reasonably common pattern for me.  Now, if you come back
and say, ah, well, those cases aren't the ones that are going to be
harmed by this, then maybe we should have a more detailed conversation
about where the mines are.  Or maybe it is helping in more cases than
I'm thinking about at the moment.

>> To be clear, I'd love to have this feature.  But if there is a choice
>> between reducing planning time significantly for everyone and NOT
>> getting this feature, and increasing planning time significantly for
>> everyone and getting this feature, I think we will make more people
>> happy by doing the first one.
>
> We're not really talking about "are we going to accept or reject a
> specific feature".
> We're talking about whether we're going to decide
> that the last two years worth of planner development were headed in
> the wrong direction and we're now going to reject that and try to
> think of some entirely new concept.  This isn't an isolated patch,
> it's the necessary next step in a multi-year development plan.  The
> fact that it's a bit slower at the moment just means there's still
> work to do.

I'm not proposing that you should never commit this.  I'm proposing
that any commit by anyone that introduces a 35% performance regression
is unwise, and doubly so at the end of the release cycle.  I have
every confidence that you can improve the code further over time, but
the middle of the last CommitFest is not a great time to commit code
that, by your own admission, needs a considerable amount of additional
work.  Sure, there are some things that we're not going to find out
until the code goes into production, but it seems to me that you've
already uncovered a fairly major performance problem that is only
partially fixed.  Once this is committed, it's not coming back out, so
we're either going to have to figure out how to fix it before we
release, or release with a regression in certain cases.  If you got it
down to 10% I don't think I'd be worried, but a 35% regression that we
don't know how to fix seems like a lot.

On another note, nobody besides you has looked at the code yet, AFAIK...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company