On Tue, 13 Apr 2004 13:55:49 -0400, Tom Lane <[EMAIL PROTECTED]> wrote:
>Possibly the nonuniform clumping of CID has something to do with the
>poor results.

It shouldn't. The sampling algorithm is designed to give each tuple the
same chance of ending up in the sample, and tuples are selected
independently.
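A quick way to see how well a sample captured the real distribution is
to compare the stored statistics against actual counts. A sketch (not
from the thread itself; the table/column names xxx and cid are taken
from the messages below):

  -- what ANALYZE stored for the column:
  SELECT n_distinct, most_common_vals, most_common_freqs
  FROM pg_stats
  WHERE tablename = 'xxx' AND attname = 'cid';

  -- actual frequency of one value, for comparison:
  SELECT count(*)::float / (SELECT count(*) FROM xxx) AS actual_freq
  FROM xxx
  WHERE cid = 6223341;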
> There's a hard limit of 1000, I believe. Didn't it give you
> a warning saying so?

No warning at 2000, and no warning at 100,000 either!

Remember we are still on 7.2.x. The docs here
http://www.postgresql.org/docs/7.2/static/sql-altertable.html don't say
anything about a limit.
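For reference, the statistics target is set per column with ALTER TABLE
and takes effect at the next ANALYZE; a sketch using the names from
this thread:

  ALTER TABLE xxx ALTER COLUMN cid SET STATISTICS 1000;
  ANALYZE xxx (cid);   -- re-gather stats for just that column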
"Jeremy Dunn" <[EMAIL PROTECTED]> writes:
> Interestingly, I tried increasing the stat size for the CID column to
> 2000, analyzing, and checking the accuracy of the stats again.
There's a hard limit of 1000, I believe. Didn't it give you a warning
saying so?
At 1000 the ANALYZE sample size would be 300,000 rows (ANALYZE samples
300 rows per unit of statistics target, so 300 x 1000).
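One way to verify what target value actually got stored after the ALTER
is to look at the catalogs directly; a sketch (7.2-era catalogs, names
assumed from the thread):

  SELECT a.attname, a.attstattarget
  FROM pg_attribute a, pg_class c
  WHERE a.attrelid = c.oid
    AND c.relname = 'xxx'
    AND a.attname = 'cid';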
> When I just tried it again with a value of 300, analyze, then run the
> query, I get a *worse* result for an estimate. I don't understand
> this.

That's annoying. How repeatable are these results --- if you do ANALYZE
over again several times, how much does the row count estimate change?
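Checking that repeatability is mechanical, since each ANALYZE draws a
fresh random sample; a sketch with the thread's names:

  ANALYZE xxx;
  EXPLAIN SELECT count(*) FROM xxx WHERE cid = 6223341;  -- note rows=
  ANALYZE xxx;
  EXPLAIN SELECT count(*) FROM xxx WHERE cid = 6223341;  -- compare rows=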
"Jeremy Dunn" <[EMAIL PROTECTED]> writes:
> Agreed. However, given that count(*) is a question that can be answered
> _solely_ using the index (without reference to the actual data blocks),
As Bruno noted, that is not the case in Postgres; we must visit the
table rows anyway.
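Since the heap has to be visited either way, the choice is purely a
cost question, and the two plans can be timed directly by disabling one
plan type at a time; a sketch (thread's names; the enable_* settings
are per-session):

  SET enable_seqscan = off;     -- force the index plan
  EXPLAIN ANALYZE SELECT count(*) FROM xxx WHERE cid = 6223341;
  RESET enable_seqscan;
  SET enable_indexscan = off;   -- force the sequential plan
  EXPLAIN ANALYZE SELECT count(*) FROM xxx WHERE cid = 6223341;
  RESET enable_indexscan;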
On Mon, Apr 12, 2004 at 15:05:02 -0400,
Jeremy Dunn <[EMAIL PROTECTED]> wrote:
>
> Agreed. However, given that count(*) is a question that can be answered
> _solely_ using the index (without reference to the actual data blocks),
> I'd expect that the break-even point would be considerably higher.
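Where that break-even point falls is governed mostly by
random_page_cost, the planner's assumed ratio of random to sequential
page-fetch cost (default 4); a sketch of experimenting with it
per-session:

  SHOW random_page_cost;
  SET random_page_cost = 2;   -- tell the planner random I/O is cheaper
  EXPLAIN SELECT count(*) FROM xxx WHERE cid = 6223341;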
> "Jeremy Dunn" <[EMAIL PROTECTED]> writes:
> > The question: why does the planner consider a sequential scan to be
> > better for these top 10 values?
>
> At some point a seqscan *will* be better. In the limit, if
> the key being sought is common enough to occur on every page
> of the table,
"Jeremy Dunn" <[EMAIL PROTECTED]> writes:
> The question: why does the planner consider a sequential scan to be
> better for these top 10 values?
At some point a seqscan *will* be better. In the limit, if the key
being sought is common enough to occur on every page of the table,
it's certain that an indexscan would have to visit every table page
anyway, while also paying the extra cost of reading the index.
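That limit can be estimated from the table's physical statistics: once
a value occurs more often than once per page, an indexscan is expected
to touch every page. A sketch (names assumed):

  -- rows per page; a value with frequency above 1/rows_per_page of the
  -- table will, on average, appear on every page
  SELECT relpages, reltuples, reltuples / relpages AS rows_per_page
  FROM pg_class
  WHERE relname = 'xxx';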
On Mon, 12 Apr 2004, Jeremy Dunn wrote:
> explain analyze select count(*) from xxx where cid=6223341;
> Aggregate (cost=74384.19..74384.19 rows=1 width=0)
>           (actual time=11614.89..11614.89 rows=1 loops=1)
>   -> Index Scan using xxx_cid on emailrcpts
>      (cost=0.00..74329.26 rows=21974
Quick bit of input, since you didn't mention it.
How often do you run ANALYZE? I found it interesting that a database I
was doing tests on sped up by a factor of 20 after ANALYZE. If your
data changes a lot, you should probably schedule ANALYZE to run with
VACUUM.
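Pre-autovacuum, that typically meant a scheduled job; a minimal sketch:

  -- e.g. nightly from cron:  psql -d mydb -c 'VACUUM ANALYZE;'
  VACUUM ANALYZE;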
Jeremy Dunn wrote:
I've searched the archives and can't find an answer to this seemingly
simple question. Apologies if it's too common.

The table in question has ~1.3M rows. It has 85 columns, 5 of which
have single-column indexes.

The column in question (CID) has 183 distinct values.
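The skew among those 183 values is what drives the planner's choice,
and it's easy to inspect; a sketch with the thread's names:

  SELECT cid, count(*) AS n
  FROM xxx
  GROUP BY cid
  ORDER BY n DESC
  LIMIT 10;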