Re: [GENERAL] SELECT DISTINCT very slow

2009-07-10 Thread Jeff Davis
On Fri, 2009-07-10 at 01:36 +0100, Greg Stark wrote: > Arguably the missing feature here is skip-scans where we scan the > index but only pull out one record for each distinct value. I'm not > sure there's anything particularly stopping Postgres from being able > to do them, but it might be a lot o

Re: [GENERAL] SELECT DISTINCT very slow

2009-07-10 Thread Greg Stark
On Fri, Jul 10, 2009 at 1:41 PM, Ben Harper wrote: > > Unfortunately I can't use GROUP BY, because what I'm really doing is > SELECT DISTINCT ON(unique_field) id FROM table; You could do that using GROUP BY if you define a first() aggregate. In this case that would just be SELECT first(id) AS id f

Re: [GENERAL] SELECT DISTINCT very slow

2009-07-10 Thread Ben Harper
Thanks for all the feedback. Using GROUP BY is indeed much faster (about 1 second). Unfortunately I can't use GROUP BY, because what I'm really doing is SELECT DISTINCT ON(unique_field) id FROM table; I'm not familiar with the Postgres internals, but in my own DB system that I have written, I do

Re: [GENERAL] SELECT DISTINCT very slow

2009-07-09 Thread Tom Lane
Greg Stark writes: > Not really. The OP doesn't say how wide the record rows are but unless > they're very wide it wouldn't pay to use an index for this even if you > didn't have to access the heap also. It's going to be faster to scan > the whole heap and either sort or use a hash. Currently ther

Re: [GENERAL] SELECT DISTINCT very slow

2009-07-09 Thread Greg Stark
On Thu, Jul 9, 2009 at 4:47 PM, Andres Freund wrote: > AFAIK the primary cause is that indexes in pg do not store visibility > information. Not really. The OP doesn't say how wide the record rows are but unless they're very wide it wouldn't pay to use an index for this even if you didn't have to a

Re: [GENERAL] SELECT DISTINCT very slow

2009-07-09 Thread Pavel Stehule
Hello when you use older pg than 8.3, please, use GROUP BY. SELECT field FROM table GROUP BY field. Regards Pavel Stehule 2009/7/9 Ben Harper : > Hi, > Can anybody explain this: > > Records: 600,000 > Field Width: Approximately 20 UTF8 characters - type is VARCHAR(25) > Field is Indexed. > > SE

Re: [GENERAL] SELECT DISTINCT very slow

2009-07-09 Thread Andres Freund
On Thursday 09 July 2009 17:09:13 Ben Harper wrote: > Hi, > Can anybody explain this: > > Records: 600,000 > Field Width: Approximately 20 UTF8 characters - type is VARCHAR(25) > Field is Indexed. > > SELECT DISTINCT field FROM table; > > Takes about 6 seconds. There are 111 distinct items. > > On

Re: [GENERAL] SELECT DISTINCT very slow

2009-07-09 Thread Bill Moran
In response to Ben Harper : > Hi, > Can anybody explain this: > > Records: 600,000 > Field Width: Approximately 20 UTF8 characters - type is VARCHAR(25) > Field is Indexed. > > SELECT DISTINCT field FROM table; > > Takes about 6 seconds. There are 111 distinct items. What's the output of EXPLAI