Re: Faster distinct query?

Israel Brewster Thu, 23 Sep 2021 13:18:02 -0700

> On Sep 23, 2021, at 10:36 AM, Geoff Winkless <pgsqlad...@geoff.dj> wrote:
> 
> On Wed, 22 Sept 2021 at 21:05, Israel Brewster <ijbrews...@alaska.edu 
> <mailto:ijbrews...@alaska.edu>> wrote:
> I was wondering if there was any way to improve the performance of this query:
> 
> SELECT station,array_agg(distinct(channel)) as channels FROM data GROUP BY 
> station;
> 
> If you have tables of possible stations and channels (and if not, why not?), 
> then an EXISTS query, something like
> 
> SELECT stations.name <http://stations.name/>, ARRAY_AGG(channels.name 
> <http://channels.name/>) 
>   FROM stations, channels 
>   WHERE EXISTS 
>    (SELECT FROM data WHERE data.channels=channels.name 
> <http://channels.name/> AND data.station=stations.name 
> <http://stations.name/>)
> GROUP BY stations.name <http://stations.name/>
> 
> will usually be much faster, because it can stop scanning after the first 
> match in the index.


So that one ran in about 5 minutes as well - apparently the time it takes to 
scan the index, given the similarity of run times for each of the different 
queries: https://explain.depesz.com/s/w46h <https://explain.depesz.com/s/w46h>

It’s making me think though, because this similar (though incomplete, and 
therefore incorrect result) query runs in only 19ms 
(https://explain.depesz.com/s/iZnN <https://explain.depesz.com/s/iZnN>):

    SELECT
        stations.name,
        array_agg(channels.channel)
    FROM stations,channels
    WHERE EXISTS (SELECT
                                  FROM data
                                  WHERE data.station=stations.id)
GROUP BY stations.name

It’s only when I add in the AND data.channels=channels.channel that the query 
time blows up to 5+ minutes. I personally don’t understand why there would be 
such a large difference between the two queries - something wrong with my 
indexes?

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145

> 
> Geoff

Re: Faster distinct query?

Reply via email to