On 20 Jul 2012, at 22:30, David Johnston wrote:

> Hi!
>  
> Can someone please point me to a resource (or suggest a solution) that will 
> improve the performance of this query?  I have some thoughts but figure I 
> should avoid reinventing the wheel since this seems like something that has 
> to have been solved already. 
>  
> I am working on a query where I have a list of identifiers (sample set has 
> about 8,500 records) and I have three other queries that return a subset of 
> these 8,500 identifiers
>  
> Basic query is designed as such:
>  
> WITH
>   full_set AS ( ) -- 8,500 records
> , sub_1 AS () -- also about 8,500
> , sub_2 AS () -- maybe 5,000
> , sub_3 AS () - - maybe 3,000
> SELECT full_set.*
> , COALESCE(sub_1.field, FALSE)
> , COALESCE(sub_2.field, FALSE)
> , COALESCE(sub_2.field, FALSE)
>  
> FROM full_set
> LEFT JOIN sub_1
> LEFT JOIN sub_2
> LEFT JOIN sub_3
>  
> The goal is to output a boolean for each record in “full_set” specifying 
> whether a corresponding records exists in the sub-set.  If the record exists 
> “sub_x.field” is defined to be TRUE and thus is output otherwise sub_x.field 
> is NULL and coalesce returns FALSE.

You are creating a product of the result sets for sub_1 to _3 there, while you 
only seem to need the union of the three.

Perhaps something like this is what you're after?

WITH
  full_set AS ( )
, subs AS (
  SELECT 1 AS sub, TRUE AS field, ... FROM sub_1
  UNION ALL
  SELECT 2 AS sub, TRUE AS field, ... FROM sub_2
  UNION ALL
  SELECT 3 AS sub, TRUE AS field, ... FROM sub_3
)
SELECT ...
FROM full_set
LEFT JOIN subs

If you need those rows to be distinct, use UNION instead of UNION ALL, but the 
database needs to do more work for that.


Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Reply via email to