So far that one was the worst in terms of cost and time. Here are all the plans with buffers, more records, and results being returned. At this point I have good enough performance with my UNION approach but I'm just trying to learn now. WHY is the union approach the fastest? I would have expected the EXISTS or IN approaches to be faster or at least have the SAME cost? At this point I just want to understand.
dft1fjfv106r48=> explain (analyze, buffers, verbose) select * from contacts c where EXISTS( (select 1 from phone_numbers p where c.id = p.contact_id and p.national = 5038904993 and p.e164 = '+15038904993') union (select 1 id = e.contact_id and e.email = 'robert.x...@gmail.com')); QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Seq Scan on public.contacts c (cost=0.00..23238.90 rows=1425 width=95) (actual time=2.241..46.817 rows=7 loops=1) Output: c.id, c.owner_id, c.user_id, c.device_id, c.last_call, c.record_id, c.dtype, c.blocked, c.details_hash, c.fname, c.lname, c.fb_id Filter: (SubPlan 1) Rows Removed by Filter: 2843 Buffers: shared hit=11497 SubPlan 1 -> Unique (cost=8.13..8.13 rows=2 width=0) (actual time=0.015..0.015 rows=0 loops=2850) Output: (1) Buffers: shared hit=11440 -> Sort (cost=8.13..8.13 rows=2 width=0) (actual time=0.013..0.013 rows=0 loops=2850) Output: (1) Sort Key: (1) Sort Method: quicksort Memory: 25kB Buffers: shared hit=11440 -> Append (cost=0.06..8.13 rows=2 width=0) (actual time=0.009..0.009 rows=0 loops=2850) Buffers: shared hit=11440 -> Index Only Scan using idx_phone on public.phone_numbers p (cost=0.06..4.06 rows=1 width=0) (actual time=0.003..0.003 rows=0 loops=2850) Output: 1 Index Cond: ((p.contact_id = c.id) AND (p."national" = 5038904993::bigint) AND (p.e164 = '+15038904993'::text)) Heap Fetches: 11 Buffers: shared hit=5721 -> Index Only Scan using idx_email_full on public.email_addresses e (cost=0.06..4.06 rows=1 width=0) (actual time=0.003..0.003 rows=0 loops=2850) Output: 1 Index Cond: ((e.contact_id = c.id) AND (e.email = 'robert.x...@gmail.com'::text)) Heap Fetches: 5 Buffers: shared hit=5719 Total runtime: 46.897 ms (27 rows) dft1fjfv106r48=> explain (analyze, buffers, verbose) select * from contacts where id IN ( (select c.id from contacts c join phone_numbers p on c.id = p.contact_id and p.national = 5038904993 and p.e164 = '+15038904993') union (select c.id from contacts c join email_addresses e on c.id = e.contact_id and e.email = 'robert.x...@gmail.com')); QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=29.38..53.74 rows=6 width=95) (actual time=0.356..0.418 rows=7 loops=1) Output: contacts.id, contacts.owner_id, contacts.user_id, contacts.device_id, contacts.last_call, contacts.record_id, contacts.dtype, contacts.blocked, contacts.details_hash, contacts.fname, contacts.lname, contacts.fb_id Buffers: shared hit=87 -> HashAggregate (cost=29.32..29.34 rows=6 width=8) (actual time=0.347..0.354 rows=7 loops=1) Output: c.id Buffers: shared hit=66 -> Append (cost=0.11..29.32 rows=6 width=8) (actual time=0.047..0.316 rows=16 loops=1) Buffers: shared hit=66 -> Nested Loop (cost=0.11..8.12 rows=1 width=8) (actual time=0.045..0.169 rows=11 loops=1) Output: c.id Buffers: shared hit=43 -> Index Scan using idx_phone_address on public.phone_numbers p (cost=0.06..4.06 rows=1 width=8) (actual time=0.027..0.047 rows=11 loops=1) Output: p.id, p.contact_id, p."national", p.e164, p.raw_number Index Cond: ((p."national" = 5038904993::bigint) AND ((p.e164)::text = '+15038904993'::text)) Buffers: shared hit=9 -> Index Only Scan using idx_contacts_pkey_owner on public.contacts c (cost=0.06..4.06 rows=1 width=8) (actual time=0.005..0.006 rows=1 loops=11) Output: c.id, c.owner_id, c.user_id Index Cond: (c.id = p.contact_id) Heap Fetches: 11 Buffers: shared hit=34 -> Nested Loop (cost=2.12..21.17 rows=5 width=8) (actual time=0.057..0.114 rows=5 loops=1) Output: c_1.id Buffers: shared hit=23 -> Bitmap Heap Scan on public.email_addresses e (cost=2.06..8.85 rows=5 width=8) (actual time=0.044..0.055 rows=5 loops=1) Output: e.id, e.contact_id, e.email Recheck Cond: ((e.email)::text = ' robert.x...@gmail.com'::text) Buffers: shared hit=7 -> Bitmap Index Scan on idx_email_address (cost=0.00..2.06 rows=5 width=0) (actual time=0.031..0.031 rows=6 loops=1) Index Cond: ((e.email)::text = ' robert.x...@gmail.com'::text) Buffers: shared hit=2 -> Index Only Scan using idx_contacts_pkey_owner on public.contacts c_1 (cost=0.06..2.46 rows=1 width=8) (actual time=0.005..0.006 rows=1 loops=5) Output: c_1.id, c_1.owner_id, c_1.user_id Index Cond: (c_1.id = e.contact_id) Heap Fetches: 5 Buffers: shared hit=16 -> Index Scan using idx_contacts_pkey_owner on public.contacts (cost=0.06..4.06 rows=1 width=95) (actual time=0.003..0.004 rows=1 loops=7) Output: contacts.id, contacts.owner_id, contacts.user_id, contacts.device_id, contacts.last_call, contacts.record_id, contacts.dtype, contacts.blocked, contacts.details_hash, contacts.fname, contacts.lname, contacts.fb_id Index Cond: (contacts.id = c.id) Buffers: shared hit=21 Total runtime: 0.535 ms (40 rows) dft1fjfv106r48=> explain (analyze, buffers, verbose) select c.* from contacts c where EXISTS(select 1 from phone_numbers p where (p.national = 5038904993 and p.e164 = '+15038904993') and p.contact_id = c.id) or EXISTS(select 1 from email_addresses e where e.email = ' robert.x...@gmail.com' and e.contact_id = c.id); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------- Seq Scan on public.contacts c (cost=0.00..23213.25 rows=2138 width=95) (actual time=0.209..1.290 rows=7 loops=1) Output: c.id, c.owner_id, c.user_id, c.device_id, c.last_call, c.record_id, c.dtype, c.blocked, c.details_hash, c.fname, c.lname, c.fb_id Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR (alternatives: SubPlan 3 or hashed SubPlan 4)) Rows Removed by Filter: 2843 Buffers: shared hit=73 SubPlan 1 -> Index Only Scan using idx_phone on public.phone_numbers p (cost=0.06..4.06 rows=1 width=0) (never executed) Index Cond: ((p.contact_id = c.id) AND (p."national" = 5038904993::bigint) AND (p.e164 = '+15038904993'::text)) Heap Fetches: 0 SubPlan 2 -> Index Scan using idx_phone_address on public.phone_numbers p_1 (cost=0.06..4.06 rows=1 width=8) (actual time=0.033..0.056 rows=11 loops=1) Output: p_1.contact_id Index Cond: ((p_1."national" = 5038904994::bigint) AND ((p_1.e164)::text = '+15038904993'::text)) Buffers: shared hit=9 SubPlan 3 -> Index Only Scan using idx_email_full on public.email_addresses e (cost=0.06..4.06 rows=1 width=0) (never executed) Index Cond: ((e.contact_id = c.id) AND (e.email = ' robert.x...@gmail.com'::text)) Heap Fetches: 0 SubPlan 4 -> Bitmap Heap Scan on public.email_addresses e_1 (cost=2.06..8.85 rows=5 width=8) (actual time=0.040..0.050 rows=5 loops=1) Output: e_1.contact_id Recheck Cond: ((e_1.email)::text = 'robert.x...@gmail.com'::text) Buffers: shared hit=7 -> Bitmap Index Scan on idx_email_address (cost=0.00..2.06 rows=5 width=0) (actual time=0.030..0.030 rows=6 loops=1) Index Cond: ((e_1.email)::text = 'robert.x...@gmail.com '::text) Buffers: shared hit=2 Total runtime: 1.395 ms (27 rows) dft1fjfv106r48=> explain (analyze, buffers, verbose) select c.* from contacts c where c.id IN ( select p.contact_id from phone_numbers p where (p.national = 5038904993 and p.e164 = '+15038904993')) or c.id IN ( select e.contact_id from email_addresses e where e.email = ' robert.x...@gmail.com'); QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------- Seq Scan on public.contacts c (cost=12.92..81.32 rows=2138 width=95) (actual time=0.208..1.283 rows=7 loops=1) Output: c.id, c.owner_id, c.user_id, c.device_id, c.last_call, c.record_id, c.dtype, c.blocked, c.details_hash, c.fname, c.lname, c.fb_id Filter: ((hashed SubPlan 1) OR (hashed SubPlan 2)) Rows Removed by Filter: 2843 Buffers: shared hit=73 SubPlan 1 -> Index Scan using idx_phone_address on public.phone_numbers p (cost=0.06..4.06 rows=1 width=8) (actual time=0.032..0.054 rows=11 loops=1) Output: p.contact_id Index Cond: ((p."national" = 5038904993::bigint) AND ((p.e164)::text = '+15038904993'::text)) Buffers: shared hit=9 SubPlan 2 -> Bitmap Heap Scan on public.email_addresses e (cost=2.06..8.85 rows=5 width=8) (actual time=0.040..0.049 rows=5 loops=1) Output: e.contact_id Recheck Cond: ((e.email)::text = 'robert.x...@gmail.com'::text) Buffers: shared hit=7 -> Bitmap Index Scan on idx_email_address (cost=0.00..2.06 rows=5 width=0) (actual time=0.031..0.031 rows=6 loops=1) Index Cond: ((e.email)::text = 'robert.x...@gmail.com '::text) Buffers: shared hit=2 Total runtime: 1.371 ms (19 rows) dft1fjfv106r48=> On Fri, Nov 22, 2013 at 7:54 AM, Merlin Moncure <mmonc...@gmail.com> wrote: > On Thu, Nov 21, 2013 at 2:31 PM, desmodemone <desmodem...@gmail.com> > wrote: > > Hi Robert, could you try with "exists" ? > > > > SELECT c.* > > FROM contacts c > > WHERE exists ( SELECT 1 FROM phone p WHERE p.addr =? and > > p.contact_id=c.id ) > > OR exists (SELECT 1 FROM email e WHERE e.addr = ? and e.contact_id= > c.id ); > > hm, how about: > SELECT c.* > FROM contacts c > WHERE exists ( > SELECT 1 > FROM phone p > WHERE p.addr =? AND p.contact_id=c.id > UNION ALL > SELECT 1 FROM email e > WHERE e.addr = ? AND e.contact_id=c.id > ); > > merlin >