Hmmmm...I'm not sure why the buffers option didn't work for me, maybe the heroku psql is out of date. No, the query gets slower with a high load of data and runs pretty often.
I just created a small test dataset for this. When I have a larger one I will post new explain plans but the timings seem pretty consistent regardless of the results returns (usually only 2-200) even when there are millions of records in "contacts", "phone_numbers", and "email_addresses". In this case doesn't the correlated query have to do more work and access more columns than the subselect approach? On Thu, Nov 21, 2013 at 1:22 PM, Elliot <yields.falseh...@gmail.com> wrote: > On 2013-11-21 16:12, Robert DiFalco wrote: > > Sorry I couldn't get buffers to work but here is the explain analyze > verbose: > > dft1fjfv106r48=> explain analyze verbose select c.* > > from contacts c > > where c.id IN ( > > select > p.contact_id > > from phone_numbers p > > where (p.national = 5038904993 and p.e164 = ' > +15038904993')) > or c.id IN ( > > select > e.contact_id > > from email_addresses e > > where e.email = 'robert.xx...@gmail.com') > > ; > QUERY > PLAN > > ---------------------------------------------------------------------------------------------------------------------------------------------------- > Seq Scan on public.contacts c (cost=8.12..75.73 rows=1988 width=95) > (actual time=0.410..0.410 rows=0 loops=1) > Output: c.id, c.owner_id, c.user_id, c.device_id, c.last_call, > c.record_id, c.dtype, c.blocked, c.details_hash, c.fname, c.lname, c.fb_id > Filter: ((hashed SubPlan 1) OR (hashed SubPlan 2)) > Rows Removed by Filter: 2849 > SubPlan 1 > -> Index Scan using idx_phone_address on public.phone_numbers p > (cost=0.06..4.06 rows=1 width=8) (actual time=0.015..0.015 rows=0 loops=1) > Output: p.contact_id > Index Cond: ((p."national" = 5038904993::bigint) AND > ((p.e164)::text = '+15038904993'::text)) > SubPlan 2 > -> Index Scan using idx_email_address on public.email_addresses e > (cost=0.06..4.06 rows=1 width=8) (actual time=0.018..0.018 rows=0 loops=1) > Output: e.contact_id > Index Cond: ((e.email)::text = 'robert.xx...@gmail.com'::text) > Total runtime: 0.489 ms > (13 rows) > > dft1fjfv106r48=> explain analyze verbose select c.* > > from contacts c > > where exists( > > select 1 > > from > phone_numbers p > > where (p.national = 5038904993 and p.e164 = '+15038904993') and > p.contact_id = c.id) > or EXISTS( > > select 1 > > from > email_addresses e > > where e.email = 'robert.xx...@gmail.com' and e.contact_id = c.id) > > ; > > QUERY PLAN > > > ------------------------------------------------------------------------------------------------------------------------------------------------------ > Seq Scan on public.contacts c (cost=0.00..21596.38 rows=1988 width=95) > (actual time=0.479..0.479 rows=0 loops=1) > Output: c.id, c.owner_id, c.user_id, c.device_id, c.last_call, > c.record_id, c.dtype, c.blocked, c.details_hash, c.fname, c.lname, c.fb_id > Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR > (alternatives: SubPlan 3 or hashed SubPlan 4)) > Rows Removed by Filter: 2849 > SubPlan 1 > -> Index Scan using idx_phone_address on public.phone_numbers p > (cost=0.06..4.06 rows=1 width=0) (never executed) > Index Cond: ((p."national" = 5038904993::bigint) AND > ((p.e164)::text = '+15038904993'::text)) > Filter: (p.contact_id = c.id) > SubPlan 2 > -> Index Scan using idx_phone_address on public.phone_numbers p_1 > (cost=0.06..4.06 rows=1 width=8) (actual time=0.010..0.010 rows=0 loops=1) > Output: p_1.contact_id > Index Cond: ((p_1."national" = 5038904993::bigint) AND > ((p_1.e164)::text = '+15038904993'::text)) > SubPlan 3 > -> Index Scan using idx_email_address on public.email_addresses e > (cost=0.06..4.06 rows=1 width=0) (never executed) > Index Cond: ((e.email)::text = 'robert.xx...@gmail.com'::text) > Filter: (e.contact_id = c.id) > SubPlan 4 > -> Index Scan using idx_email_address on public.email_addresses e_1 > (cost=0.06..4.06 rows=1 width=8) (actual time=0.016..0.016 rows=0 loops=1) > Output: e_1.contact_id > Index Cond: ((e_1.email)::text = 'robert.xx...@gmail.com > '::text) > Total runtime: 0.559 ms > (21 rows) > > dft1fjfv106r48=> explain analyze verbose select * from contacts where id > IN ( > (select c.id > > from contacts c > > join phone_numbers p on > c.id = p.contact_id and p.national = 5038904993 and p.e164 = '+15038904993') > union > (select c.id > > from contacts c > > join email_addresses e on c.id = e.contact_id > and e.email = 'robert.xx...@gmail.com')); > > > QUERY PLAN > > --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Nested Loop (cost=16.31..24.39 rows=2 width=95) (actual > time=0.060..0.060 rows=0 loops=1) > Output: contacts.id, contacts.owner_id, contacts.user_id, > contacts.device_id, contacts.last_call, contacts.record_id, contacts.dtype, > contacts.blocked, contacts.details_hash, contacts.fname, contacts.lname, > contacts.fb_id > -> Unique (cost=16.26..16.26 rows=2 width=8) (actual > time=0.057..0.057 rows=0 loops=1) > Output: c.id > -> Sort (cost=16.26..16.26 rows=2 width=8) (actual > time=0.055..0.055 rows=0 loops=1) > Output: c.id > Sort Key: c.id > Sort Method: quicksort Memory: 25kB > -> Append (cost=0.11..16.25 rows=2 width=8) (actual > time=0.034..0.034 rows=0 loops=1) > -> Nested Loop (cost=0.11..8.12 rows=1 width=8) > (actual time=0.013..0.013 rows=0 loops=1) > Output: c.id > -> Index Scan using idx_phone_address on > public.phone_numbers p (cost=0.06..4.06 rows=1 width=8) (actual > time=0.011..0.011 rows=0 loops=1) > Output: p.id, p.contact_id, > p."national", p.e164, p.raw_number > Index Cond: ((p."national" = > 5038904993::bigint) AND ((p.e164)::text = '+15038904993'::text)) > -> Index Only Scan using > idx_contacts_pkey_owner on public.contacts c (cost=0.06..4.06 rows=1 > width=8) (never executed) > Output: c.id, c.owner_id, c.user_id > Index Cond: (c.id = p.contact_id) > Heap Fetches: 0 > -> Nested Loop (cost=0.11..8.12 rows=1 width=8) > (actual time=0.018..0.018 rows=0 loops=1) > Output: c_1.id > -> Index Scan using idx_email_address on > public.email_addresses e (cost=0.06..4.06 rows=1 width=8) (actual > time=0.016..0.016 rows=0 loops=1) > Output: e.id, e.contact_id, e.email > Index Cond: ((e.email)::text = ' > robert.xx...@gmail.com'::text) > -> Index Only Scan using > idx_contacts_pkey_owner on public.contacts c_1 (cost=0.06..4.06 rows=1 > width=8) (never executed) > Output: c_1.id, c_1.owner_id, c_1.user_id > Index Cond: (c_1.id = e.contact_id) > Heap Fetches: 0 > -> Index Scan using idx_contacts_pkey_owner on public.contacts > (cost=0.06..4.06 rows=1 width=95) (never executed) > Output: contacts.id, contacts.owner_id, contacts.user_id, > contacts.device_id, contacts.last_call, contacts.record_id, contacts.dtype, > contacts.blocked, contacts.details_hash, contacts.fname, contacts.lname, > contacts.fb_id > Index Cond: (contacts.id = c.id) > Total runtime: 0.332 ms > (31 rows) > > > > The buffers option is 9.0+ and is used like "explain (analyze, verbose, > buffers) select 1". > To your original question, the union output there runs slightly faster > than the "in" approach, although this may not be a good example - your > inputs don't return any data, so this might not be realistic - and those > numbers are so low that the difference might just be noise. > Are you tuning a <0.5ms-runtime query? Or is this just curiosity? FWIW I > tend to write queries like this using an exists check first, then if that's > still not good enough (all things like proper indexing taken in to account) > I'll try an in check, then finally a union if that's still not good enough. > >