Re: [PERFORM] Setting Statistics on Functional Indexes

2012-11-14 Thread Tom Lane
Robert Haas writes: > Shouldn't there be a separate estimator for scalarlesel? Or should > the existing estimator be adjusted to handle the two cases > differently? Well, it does handle it differently to some extent, in that the operator itself is invoked when checking the MCV values, so we get

Re: [PERFORM] Setting Statistics on Functional Indexes

2012-11-14 Thread Claudio Freire
On Wed, Nov 14, 2012 at 5:36 PM, Robert Haas wrote: > Shouldn't there be a separate estimator for scalarlesel? Or should > the existing estimator be adjusted to handle the two cases > differently? Woulnd't adding eqsel to scalar(lt|gt)sel work? (saving duplication with mvc_selectivity) -- Sen

Re: [PERFORM] Setting Statistics on Functional Indexes

2012-11-14 Thread Robert Haas
On Fri, Oct 26, 2012 at 5:08 PM, Tom Lane wrote: > So the bottom line is that this is a case where you need a lot of > resolution in the histogram. I'm not sure there's anything good > we can do to avoid that. I spent a bit of time thinking about whether > we could use n_distinct to get some id

Re: [PERFORM] Setting Statistics on Functional Indexes

2012-10-29 Thread Kevin Grittner
Shaun Thomas wrote: > I know that current_date seems like an edge case, but I can't see > how getting the most recent activity for something is an uncommon > activity. Tip tracking is actually the most frequent pattern in the > systems I've seen. Yeah, this has been a recurring problem with datab

Re: [PERFORM] Setting Statistics on Functional Indexes

2012-10-29 Thread Shaun Thomas
On 10/26/2012 04:08 PM, Tom Lane wrote: So the bottom line is that this is a case where you need a lot of resolution in the histogram. I'm not sure there's anything good we can do to avoid that. I kinda hoped it wouldn't be something like that. For the particularly painful instance, it was e

Re: [PERFORM] Setting Statistics on Functional Indexes

2012-10-26 Thread Claudio Freire
On Fri, Oct 26, 2012 at 7:04 PM, Claudio Freire wrote: > On Fri, Oct 26, 2012 at 7:01 PM, Tom Lane wrote: >> Claudio Freire writes: >>> Because once you've accessed that last index page, it would be rather >>> trivial finding out how many duplicate tids are in that page and, with >>> a small CPU

Re: [PERFORM] Setting Statistics on Functional Indexes

2012-10-26 Thread Claudio Freire
On Fri, Oct 26, 2012 at 7:01 PM, Tom Lane wrote: > Claudio Freire writes: >> Because once you've accessed that last index page, it would be rather >> trivial finding out how many duplicate tids are in that page and, with >> a small CPU cost (no disk access if you don't query other index pages) >>

Re: [PERFORM] Setting Statistics on Functional Indexes

2012-10-26 Thread Tom Lane
Claudio Freire writes: > Because once you've accessed that last index page, it would be rather > trivial finding out how many duplicate tids are in that page and, with > a small CPU cost (no disk access if you don't query other index pages) > you could verify the assumption of near-uniqueness. I

Re: [PERFORM] Setting Statistics on Functional Indexes

2012-10-26 Thread Claudio Freire
On Fri, Oct 26, 2012 at 6:08 PM, Tom Lane wrote: > > Interestingly, this is a case where the get_actual_variable_range patch > (commit 40608e7f, which appeared in 9.0) makes the results worse. > Before that, there was a (very arbitrary) lower bound on what we'd > believe as the selectivity of a >=

Re: [PERFORM] Setting Statistics on Functional Indexes

2012-10-26 Thread Tom Lane
Shaun Thomas writes: > But I just noticed the lag in your response. :) It turns out, even > though I was substituting 2012-10-24 or 2012-10-25, what I really meant > was current_date. That does make all the difference, actually. Ah. [ pokes at that for awhile... ] OK, this has nothing to do w

Re: [PERFORM] Setting Statistics on Functional Indexes

2012-10-26 Thread Shaun Thomas
On 10/26/2012 02:35 PM, Tom Lane wrote: So I'm wondering exactly what "9.1" version you're using, and also whether you've got any nondefault planner cost parameters. Just a plain old 9.1.6 from Ubuntu 12.04. Only thing I personally changed was the default_statistics_target. Later, I bumped up

Re: [PERFORM] Setting Statistics on Functional Indexes

2012-10-26 Thread Tom Lane
Shaun Thomas writes: > On 10/24/2012 02:31 PM, Shaun Thomas wrote: >> The main flaw with my example is that it's random. But I swear I'm not >> making it up! :) > And then I find a way to make it non-random. Hooray: I can't reproduce this. In 9.1 for instance, I get Sort (cost=9.83..9.83 row

Re: [PERFORM] Setting Statistics on Functional Indexes

2012-10-24 Thread Shaun Thomas
On 10/24/2012 02:31 PM, Shaun Thomas wrote: The main flaw with my example is that it's random. But I swear I'm not making it up! :) And then I find a way to make it non-random. Hooray: CREATE TABLE date_test ( id SERIAL, col1 varchar, col2 numeric, action_date TIMESTAMP WITHOUT TIME Z

Re: [PERFORM] Setting Statistics on Functional Indexes

2012-10-24 Thread Shaun Thomas
On 10/24/2012 02:11 PM, Tom Lane wrote: It's not particularly (not that you've even defined what you think "optimistic" is, much less mentioned what baseline you're comparing to). The main flaw with my example is that it's random. But I swear I'm not making it up! :) There seems to be a par

Re: [PERFORM] Setting Statistics on Functional Indexes

2012-10-24 Thread Tom Lane
Shaun Thomas writes: > 1. Is there any way to specifically set stats on a functional index? Sure, the same way you would for a table. regression=# create table foo (f1 int, f2 int); CREATE TABLE regression=# create index fooi on foo ((f1 + f2)); CREATE INDEX regression=# \d fooi Index "pub

[PERFORM] Setting Statistics on Functional Indexes

2012-10-24 Thread Shaun Thomas
Hey everyone, So recently we upgraded to 9.1 and have noticed a ton of our queries got much worse. It turns out that 9.1 is *way* more optimistic about our functional indexes, even when they're entirely the wrong path. So after going through the docs, I see that the normal way to increase stat