Re: [HACKERS] Better estimates of index correlation

2011-03-15 Thread Jeff Davis
On Sun, 2011-03-13 at 19:40 -0400, Tom Lane wrote: > It strikes me that it'd be possible to have btvacuumcleanup directly > measure order correlation when it's processing a btree index, yielding a > reliable answer for any btree index regardless of number of columns. > We could do that by comparing

Re: [HACKERS] Better estimates of index correlation

2011-03-14 Thread Josh Berkus
> I don't understand, are they going years between vacuums because their > data is static? In which case the index correlation won't change. Or > is it append-only, in which case I suspect the newly appended data is > likely to have the same correlation as the old data. Append-only. And yes, on

Re: [HACKERS] Better estimates of index correlation

2011-03-14 Thread Greg Stark
On Tue, Mar 15, 2011 at 12:27 AM, Josh Berkus wrote: > Note that if this requires VACUUM rather than ANALYZE, it introduces a > problem for data warehousing users, who can go years between vacuums of > their largest tables. I don't understand, are they going years between vacuums because their da

Re: [HACKERS] Better estimates of index correlation

2011-03-14 Thread Josh Berkus
On 3/14/11 5:51 PM, Tom Lane wrote: > Josh Berkus writes: >>> I'm not convinced you can get a sufficiently good estimate from a small >>> subset of pages. > >> Note that if this requires VACUUM rather than ANALYZE, it introduces a >> problem for data warehousing users, who can go years between va

Re: [HACKERS] Better estimates of index correlation

2011-03-14 Thread Tom Lane
Josh Berkus writes: >> I'm not convinced you can get a sufficiently good estimate from a small >> subset of pages. > Note that if this requires VACUUM rather than ANALYZE, it introduces a > problem for data warehousing users, who can go years between vacuums of > their largest tables. It's likel

Re: [HACKERS] Better estimates of index correlation

2011-03-14 Thread Josh Berkus
>> As Heikki says, maybe this wouldn't be an issue at all if we can do it >> during ANALYZE instead, but I don't know if that works. > > I'm not convinced you can get a sufficiently good estimate from a small > subset of pages. Note that if this requires VACUUM rather than ANALYZE, it introduces

Re: [HACKERS] Better estimates of index correlation

2011-03-14 Thread Tom Lane
Robert Haas writes: > On Mon, Mar 14, 2011 at 10:38 AM, Tom Lane wrote: >> Note that we could remove the correlation >> calculations from ANALYZE altogether. > Only if you don't mind having them only get updated when somebody > vacuums. If a table is mostly getting inserted into, it may not get

Re: [HACKERS] Better estimates of index correlation

2011-03-14 Thread Robert Haas
On Mon, Mar 14, 2011 at 10:38 AM, Tom Lane wrote: > Note that we could remove the correlation > calculations from ANALYZE altogether. Only if you don't mind having them only get updated when somebody vacuums. If a table is mostly getting inserted into, it may not get vacuumed very often (or poss

Re: [HACKERS] Better estimates of index correlation

2011-03-14 Thread Tom Lane
Alvaro Herrera writes: > Excerpts from Robert Haas's message of lun mar 14 11:18:24 -0300 2011: >> Does it really matter? What Tom was describing sounded embarassingly cheap. That was my thought exactly. If you could even measure the added cost of doing that, I'd be astonished. It'd be adding

Re: [HACKERS] Better estimates of index correlation

2011-03-14 Thread Robert Haas
On Mon, Mar 14, 2011 at 10:25 AM, Alvaro Herrera wrote: > Excerpts from Robert Haas's message of lun mar 14 11:18:24 -0300 2011: >> On Mon, Mar 14, 2011 at 10:09 AM, Alvaro Herrera >> wrote: > >> > It sure would be nice to be able to do it only during the last scan. >> >> Does it really matter?  

Re: [HACKERS] Better estimates of index correlation

2011-03-14 Thread Alvaro Herrera
Excerpts from Robert Haas's message of lun mar 14 11:18:24 -0300 2011: > On Mon, Mar 14, 2011 at 10:09 AM, Alvaro Herrera > wrote: > > It sure would be nice to be able to do it only during the last scan. > > Does it really matter? What Tom was describing sounded embarassingly cheap. Well, you

Re: [HACKERS] Better estimates of index correlation

2011-03-14 Thread Robert Haas
On Mon, Mar 14, 2011 at 10:09 AM, Alvaro Herrera wrote: > Excerpts from Joshua D. Drake's message of dom mar 13 23:20:01 -0300 2011: >> On Sun, 2011-03-13 at 19:40 -0400, Tom Lane wrote: >> >> > I'm not planning to do anything about this idea right now, since I'm >> > still hip-deep in collations,

Re: [HACKERS] Better estimates of index correlation

2011-03-14 Thread Heikki Linnakangas
On 14.03.2011 16:09, Alvaro Herrera wrote: Excerpts from Joshua D. Drake's message of dom mar 13 23:20:01 -0300 2011: On Sun, 2011-03-13 at 19:40 -0400, Tom Lane wrote: I'm not planning to do anything about this idea right now, since I'm still hip-deep in collations, but I thought I'd throw it

Re: [HACKERS] Better estimates of index correlation

2011-03-14 Thread Alvaro Herrera
Excerpts from Joshua D. Drake's message of dom mar 13 23:20:01 -0300 2011: > On Sun, 2011-03-13 at 19:40 -0400, Tom Lane wrote: > > > I'm not planning to do anything about this idea right now, since I'm > > still hip-deep in collations, but I thought I'd throw it out to get > > it on the record. >

Re: [HACKERS] Better estimates of index correlation

2011-03-13 Thread Joshua D. Drake
On Sun, 2011-03-13 at 19:40 -0400, Tom Lane wrote: > I'm not planning to do anything about this idea right now, since I'm > still hip-deep in collations, but I thought I'd throw it out to get > it on the record. > > Comments? One question: Where is the overhead increase? JD > >

[HACKERS] Better estimates of index correlation

2011-03-13 Thread Tom Lane
Currently, we don't measure any statistics about the ordering correlation of multi-column indexes, which means that btcostestimate has to pick a number out of the air if there's more than one column. We've been around on that at least once already: it used to use first column's correlation divided