Thanks Kevin, the blocked should not be NULLABLE. I will fix that. This is with a pretty tiny dataset. I'm a little paranoid that with a large one I will have issues.
Believe it or not the query became faster when I put the tests for user_id IS NOT NULL in there (and added an index for that) then without the tests and index. It kinda makes me wonder if (from a performance perspective) I should change the schema to pull user_id out of contacts and created a related table with {contacts.id, user_id} where user_id is never null. On Mon, Jun 3, 2013 at 7:26 AM, Kevin Grittner <kgri...@ymail.com> wrote: > Robert DiFalco <robert.difa...@gmail.com> wrote: > > > CREATE TABLE contacts > > ( > > id BIGINT PRIMARY KEY NOT NULL, // generated > > > > blocked BOOL, > > owner_id BIGINT NOT NULL, > > user_id BIGINT, > > FOREIGN KEY ( owner_id ) REFERENCES app_users ( id ) ON DELETE > CASCADE, > > > > FOREIGN KEY ( user_id ) REFERENCES app_users ( id ) ON DELETE SET > NULL > > ); > > CREATE INDEX idx_contact_owner ON contacts ( owner_id ); > > CREATE INDEX idx_contact_mutual ON contacts ( owner_id, user_id ) WHERE > user_id IS NOT NULL AND NOT blocked; > > Well, the first thing I note is that "blocked" can be NULL. You > exclude rows from the result where it IS NULL in either row. That > may be what you really want, but it seems worth mentioning. If you > don't need to support missing values there, you might want to add a > NOT NULL constraint. If it should be NULL when user_id is, but not > otherwise, you might want a row-level constraint. You might shave > a tiny amount off the runtime by getting rid of the redundant tests > for NOT NULL on user_id; it cannot compare as either TRUE on either > = or <> if either (or both) values are NULL. > > > explain analyze verbose > > select c.user_id > > from contact_entity c left outer join contact_entity c1 on c1.owner_id = > c.user_id and c1.user_id = c.owner_id > > where NOT c.blocked AND NOT c1.blocked AND c.owner_id = 24 AND c.user_id > != 24 > > AND c.user_id IS NOT NULL AND c1.user_id IS NOT NULL > > group by c.user_id; > > > Group (cost=0.00..9.00 rows=1 width=8) (actual time=0.170..0.301 rows=8 > loops=1) > > Output: c.user_id > > -> Merge Join (cost=0.00..9.00 rows=1 width=8) (actual > time=0.166..0.270 rows=17 loops=1) > > Output: c.user_id > > Merge Cond: (c.user_id = c1.owner_id) > > -> Index Scan using idx_contact_mutual on public.contact_entity > c (cost=0.00..5.10 rows=2 width=16) (actual time=0.146..0.164 rows=11 > loops=1) > > Output: c.id, c.blocked, c.first_name, c.last_name, > c.owner_id, c.user_id > > Index Cond: ((c.owner_id = 24) AND (c.user_id IS NOT NULL)) > > Filter: (c.user_id <> 24) > > Rows Removed by Filter: 1 > > -> Index Scan using idx_contact_mutual on public.contact_entity > c1 (cost=0.00..6.45 rows=1 width=16) (actual time=0.012..0.049 rows=18 > loops=1) > > Output: c1.id, c1.blocked, c1.first_name, c1.last_name, > c1.owner_id, c1.user_id > > Index Cond: ((c1.user_id IS NOT NULL) AND (c1.user_id = > 24)) > > Total runtime: 0.388 ms > > > explain analyze verbose > > select c.user_id > > from contact_entity c > > where c.owner_id=24 and c.user_id<>24 and c.user_id IS NOT NULL and NOT > c.blocked and (exists( > > select 1 > > from contact_entity c1 > > where NOT c1.blocked and c1.owner_id=c.user_id and c1.user_id IS NOT > NULL and c1.user_id=c.owner_id)) > > group by c.user_id; > > > Group (cost=0.00..9.00 rows=1 width=8) (actual time=0.048..0.159 rows=8 > loops=1) > > Output: c.user_id > > -> Merge Semi Join (cost=0.00..9.00 rows=1 width=8) (actual > time=0.044..0.137 rows=9 loops=1) > > Output: c.user_id > > Merge Cond: (c.user_id = c1.owner_id) > > -> Index Scan using idx_contact_mutual on public.contact_entity > c (cost=0.00..5.10 rows=2 width=16) (actual time=0.024..0.042 rows=11 > loops=1) > > Output: c.id, c.blocked, c.first_name, c.last_name, > c.owner_id, c.user_id > > Index Cond: ((c.owner_id = 24) AND (c.user_id IS NOT NULL)) > > Filter: (c.user_id <> 24) > > Rows Removed by Filter: 1 > > -> Index Scan using idx_contact_mutual on public.contact_entity > c1 (cost=0.00..6.45 rows=1 width=16) (actual time=0.011..0.047 rows=16 > loops=1) > > Output: c1.id, c1.blocked, c1.first_name, c1.last_name, > c1.owner_id, c1.user_id > > Index Cond: ((c1.user_id IS NOT NULL) AND (c1.user_id = > 24)) > > Total runtime: 0.224 ms > > So, it looks like you can get about 3000 to 4000 of these per > second on a single connection -- at least in terms of server-side > processing. Were you expecting more than that? > > -- > Kevin Grittner > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company >