Re: [PERFORM] index v. seqscan for certain values

2004-04-15 Thread Manfred Koizar
On Tue, 13 Apr 2004 13:55:49 -0400, Tom Lane <[EMAIL PROTECTED]> wrote: >Possibly the >nonuniform clumping of CID has something to do with the poor results. It shouldn't. The sampling algorithm is designed to give each tuple the same chance of ending up in the sample, and tuples are selected inde

Re: [PERFORM] index v. seqscan for certain values

2004-04-13 Thread Robert Treat
On Tue, 2004-04-13 at 14:04, Jeremy Dunn wrote: > > > There's a hard limit of 1000, I believe. Didn't it give you > > a warning saying so? > > No warning at 2000, and no warning at 100,000 either! > > Remember we are still on 7.2.x. The docs here > http://www.postgresql.org/docs/7.2/static/sql

Re: [PERFORM] index v. seqscan for certain values

2004-04-13 Thread Jeremy Dunn
> There's a hard limit of 1000, I believe. Didn't it give you > a warning saying so? No warning at 2000, and no warning at 100,000 either! Remember we are still on 7.2.x. The docs here http://www.postgresql.org/docs/7.2/static/sql-altertable.html don't say anything about a limit. This is go

Re: [PERFORM] index v. seqscan for certain values

2004-04-13 Thread Tom Lane
"Jeremy Dunn" <[EMAIL PROTECTED]> writes: > Interestingly, I tried increasing the stat size for the CID column to > 2000, analyzing, and checking the accuracy of the stats again. There's a hard limit of 1000, I believe. Didn't it give you a warning saying so? At 1000 the ANALYZE sample size woul

Re: [PERFORM] index v. seqscan for certain values

2004-04-13 Thread Jeremy Dunn
> > When I just tried it again with a value of 300, analyze, > then run the query, I get a *worse* result for an estimate. I don't understand > > this. > > That's annoying. How repeatable are these results --- if you > do ANALYZE over again several times, how much does the row > count estima

Re: [PERFORM] index v. seqscan for certain values

2004-04-12 Thread Tom Lane
"Jeremy Dunn" <[EMAIL PROTECTED]> writes: > Agreed. However, given that count(*) is a question that can be answered > _solely_ using the index (without reference to the actual data blocks), As Bruno noted, that is not the case in Postgres; we must visit the table rows anyway. > When I just tried

Re: [PERFORM] index v. seqscan for certain values

2004-04-12 Thread Bruno Wolff III
On Mon, Apr 12, 2004 at 15:05:02 -0400, Jeremy Dunn <[EMAIL PROTECTED]> wrote: > > Agreed. However, given that count(*) is a question that can be answered > _solely_ using the index (without reference to the actual data blocks), > I'd expect that the break-even point would be considerably highe

Re: [PERFORM] index v. seqscan for certain values

2004-04-12 Thread Jeremy Dunn
> "Jeremy Dunn" <[EMAIL PROTECTED]> writes: > > The question: why does the planner consider a sequential scan to be > > better for these top 10 values? > > At some point a seqscan *will* be better. In the limit, if > the key being sought is common enough to occur on every page > of the table,

Re: [PERFORM] index v. seqscan for certain values

2004-04-12 Thread Tom Lane
"Jeremy Dunn" <[EMAIL PROTECTED]> writes: > The question: why does the planner consider a sequential scan to be > better for these top 10 values? At some point a seqscan *will* be better. In the limit, if the key being sought is common enough to occur on every page of the table, it's certain that

Re: [PERFORM] index v. seqscan for certain values

2004-04-12 Thread Stephan Szabo
On Mon, 12 Apr 2004, Jeremy Dunn wrote: >explain analyze select count(*) from xxx where cid=6223341; >Aggregate (cost=74384.19..74384.19 rows=1 width=0) (actual > time=11614.89..11614.89 rows=1 loops=1) > -> Index Scan using xxx_cid on emailrcpts (cost=0.00..74329.26 > rows=21974

Re: [PERFORM] index v. seqscan for certain values

2004-04-12 Thread Jeremy Dunn
] index v. seqscan for certain values Quick bit of input, since you didn't mention it. How often do you run ANALYZE? I found it interesting that a database I was doing tests on sped up by a factor of 20 after ANALYZE. If your data changes a lot, you should probably schedule ANALYZE to run

Re: [PERFORM] index v. seqscan for certain values

2004-04-12 Thread Bill Moran
Quick bit of input, since you didn't mention it. How often do you run ANALYZE? I found it interesting that a database I was doing tests on sped up by a factor of 20 after ANALYZE. If your data changes a lot, you should probably schedule ANALYZE to run with VACUUM. Jeremy Dunn wrote: I've searche

[PERFORM] index v. seqscan for certain values

2004-04-12 Thread Jeremy Dunn
Title: Message I've searched the archives and can't find an answer to this seemingly simple question.  Apologies if it's too common.   The table in question has ~1.3M rows.  It has 85 columns, 5 of which have single-column indexes.   The column in question (CID) has 183 distinct values.  Fo