Re: [GENERAL] select distinct and index usage

2008-04-08 Thread Alvaro Herrera
Gregory Stark escribió: > I thought our DISTINCT index scan does do that but it still has to read the > index leaf pages sequentially. It doesn't back-track up the tree structure and > refind the next key. The way to back-track is to start the scan over from the root page down, keeping a stack of

Re: [GENERAL] select distinct and index usage

2008-04-08 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes: > Alvaro Herrera <[EMAIL PROTECTED]> writes: >> Tom Lane escribió: >>> What I think you'll find, though, is that once you do force an indexscan >>> to be picked it'll be slower. Full-table index scans are typically >>> worse than seqscan+sort, unintuitive th

Re: [GENERAL] select distinct and index usage

2008-04-08 Thread David Wilson
On Mon, Apr 7, 2008 at 9:11 PM, Stephen Denne <[EMAIL PROTECTED]> wrote: > > You may be able to make use of an index by rearranging your query to > generate a series between your min & max values, testing whether each value > is in your table. > > You've got 4252 distinct values, but what is th

Re: [GENERAL] select distinct and index usage

2008-04-07 Thread Martijn van Oosterhout
On Sun, Apr 06, 2008 at 07:32:58PM -0400, David Wilson wrote: > I have a reasonably large table (~75m rows,~18gb) called "vals". It > includes an integer datestamp column with approximately 4000 unique > entries across the rows; there is a normal btree index on the > datestamp column. When I attemp

Re: [GENERAL] select distinct and index usage

2008-04-07 Thread Stephen Denne
Alban Hertroys wrote > Something that might help you, but I'm not sure whether it > might hurt > the performance of other queries, is to cluster that table on > val_datestamp_idx. That way the records are already (mostly) sorted > on disk in the order of the datestamps, which seems to be the

Re: [GENERAL] select distinct and index usage

2008-04-07 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Tom Lane escribió: >> What I think you'll find, though, is that once you do force an indexscan >> to be picked it'll be slower. Full-table index scans are typically >> worse than seqscan+sort, unintuitive though that may sound. > Hmm, should we switch

Re: [GENERAL] select distinct and index usage

2008-04-07 Thread Alvaro Herrera
Tom Lane escribió: > What I think you'll find, though, is that once you do force an indexscan > to be picked it'll be slower. Full-table index scans are typically > worse than seqscan+sort, unintuitive though that may sound. Hmm, should we switch the CLUSTER code to do that? -- Alvaro Herrera

Re: [GENERAL] select distinct and index usage

2008-04-07 Thread Tom Lane
"David Wilson" <[EMAIL PROTECTED]> writes: > It appears to be doing a sequential scan regardless of the set, as if > it doesn't believe it can use the index for some reason More likely, it's getting a cost estimate for the indexscan that's so bad that it even exceeds the 1-unit thumb on th

Re: [GENERAL] select distinct and index usage

2008-04-07 Thread Stephen Denne
> Still doing the sequential scan on the table, but at least it's > avoiding the expensive disk merge sort. It still seems as if I ought > to be able to coax it into using an index for this type of query, > though- especially since it's using one on the other table. Is there > perhaps some way to r

Re: [GENERAL] select distinct and index usage

2008-04-07 Thread Alvaro Herrera
David Wilson escribió: > explain analyze select datestamp from vals group by datestamp; > QUERY > PLAN >

Re: [GENERAL] select distinct and index usage

2008-04-07 Thread David Wilson
On Mon, Apr 7, 2008 at 7:57 PM, Gregory Stark <[EMAIL PROTECTED]> wrote: > > You could try changing it to the equivalent GROUP BY query. The planner, > unfortunately, doesn't know they're equivalent and has two separate sets of > plans available. In this case where there are only 4,000 distinct

Re: [GENERAL] select distinct and index usage

2008-04-07 Thread Gregory Stark
"David Wilson" <[EMAIL PROTECTED]> writes: > I appreciate the responses so far! I'm used to several minutes for > some of the complex queries on this DB, but 12.5 minutes for a select > distinct just seems wrong. :) You could try changing it to the equivalent GROUP BY query. The planner, unfortun

Re: [GENERAL] select distinct and index usage

2008-04-07 Thread David Wilson
On Mon, Apr 7, 2008 at 1:42 PM, Alban Hertroys <[EMAIL PROTECTED]> wrote: > > Have you tried this query with enable_seqscan=off? If my guess is right > (and the planners, in that case) it'd be even slower. set enable_seqscan=off; explain select distinct datestamp from vals;

Re: [GENERAL] select distinct and index usage

2008-04-07 Thread Joshua D. Drake
On Mon, 7 Apr 2008 19:42:02 +0200 Alban Hertroys <[EMAIL PROTECTED]> wrote: > > explain analyze select distinct datestamp from vals; > >QUERY > > PLAN > > -- > >

Re: [GENERAL] select distinct and index usage

2008-04-07 Thread Alban Hertroys
On Apr 7, 2008, at 9:47 AM, David Wilson wrote: On Mon, Apr 7, 2008 at 2:05 AM, Alban Hertroys <[EMAIL PROTECTED]> wrote: The databases estimates seem consistent with yours, so why is it doing this? Could you provide an EXPLAIN ANALYSE? It shows the actual numbers next to the estimates, alth

Re: [GENERAL] select distinct and index usage

2008-04-07 Thread David Wilson
On Mon, Apr 7, 2008 at 2:05 AM, Alban Hertroys <[EMAIL PROTECTED]> wrote: > On Apr 7, 2008, at 1:32 AM, David Wilson wrote: > > > > > The databases estimates seem consistent with yours, so why is it doing > this? Could you provide an EXPLAIN ANALYSE? It shows the actual numbers next > to the estim

Re: [GENERAL] select distinct and index usage

2008-04-06 Thread Alban Hertroys
On Apr 7, 2008, at 1:32 AM, David Wilson wrote: I have a reasonably large table (~75m rows,~18gb) called "vals". It includes an integer datestamp column with approximately 4000 unique entries across the rows; there is a normal btree index on the datestamp column. When I attempt something like "se

Re: [GENERAL] select distinct and index usage

2008-04-06 Thread David Wilson
Both tables are vacuumed and analyzed. I have other queries that are using various indices on the vals table in an intelligent fashion. I can try increasing the stats, certainly, although they're at the defaults for both tables. The variation is definitely identical- the set of datestamps in the v

Re: [GENERAL] select distinct and index usage

2008-04-06 Thread Craig Ringer
David Wilson wrote: On another table in the same database with a much smaller number of total rows (~15m rows), I have the exact same situation- but in this case the index on the datestamp column *is* used: Have you run ANALYZE on both tables? It might be worth increasing the stats collected

[GENERAL] select distinct and index usage

2008-04-06 Thread David Wilson
I have a reasonably large table (~75m rows,~18gb) called "vals". It includes an integer datestamp column with approximately 4000 unique entries across the rows; there is a normal btree index on the datestamp column. When I attempt something like "select distinct datestamp from vals", however, expla