Thanks Kevin, the blocked should not be NULLABLE. I will fix that.  This is
with a pretty tiny dataset. I'm a little paranoid that with a large one I
will have issues.

Believe it or not the query became faster when I put the tests for user_id
IS NOT NULL in there (and added an index for that) then without the tests
and index.

It kinda makes me wonder if (from a performance perspective) I should
change the schema to pull user_id out of contacts and created a related
table with {contacts.id, user_id} where user_id is never null.




On Mon, Jun 3, 2013 at 7:26 AM, Kevin Grittner <kgri...@ymail.com> wrote:

> Robert DiFalco <robert.difa...@gmail.com> wrote:
>
> > CREATE TABLE contacts
> > (
> >     id BIGINT PRIMARY KEY NOT NULL, // generated
> >
> >     blocked BOOL,
> >     owner_id BIGINT NOT NULL,
> >     user_id BIGINT,
> >     FOREIGN KEY ( owner_id ) REFERENCES app_users ( id ) ON DELETE
> CASCADE,
> >
> >     FOREIGN KEY ( user_id ) REFERENCES app_users ( id ) ON DELETE SET
> NULL
> > );
> > CREATE INDEX idx_contact_owner ON contacts ( owner_id );
> > CREATE INDEX idx_contact_mutual ON contacts ( owner_id, user_id ) WHERE
> user_id IS NOT NULL AND NOT blocked;
>
> Well, the first thing I note is that "blocked" can be NULL.  You
> exclude rows from the result where it IS NULL in either row.  That
> may be what you really want, but it seems worth mentioning.  If you
> don't need to support missing values there, you might want to add a
> NOT NULL constraint.  If it should be NULL when user_id is, but not
> otherwise, you might want a row-level constraint.  You might shave
> a tiny amount off the runtime by getting rid of the redundant tests
> for NOT NULL on user_id; it cannot compare as either TRUE on either
> = or <> if either (or both) values are NULL.
>
> > explain analyze verbose
> > select c.user_id
> > from contact_entity c left outer join contact_entity c1 on c1.owner_id =
> c.user_id and c1.user_id = c.owner_id
> > where NOT c.blocked AND NOT c1.blocked AND c.owner_id = 24 AND c.user_id
> != 24
> > AND c.user_id IS NOT NULL AND c1.user_id IS NOT NULL
> > group by c.user_id;
>
> > Group  (cost=0.00..9.00 rows=1 width=8) (actual time=0.170..0.301 rows=8
> loops=1)
> >   Output: c.user_id
> >   ->  Merge Join  (cost=0.00..9.00 rows=1 width=8) (actual
> time=0.166..0.270 rows=17 loops=1)
> >         Output: c.user_id
> >         Merge Cond: (c.user_id = c1.owner_id)
> >         ->  Index Scan using idx_contact_mutual on public.contact_entity
> c  (cost=0.00..5.10 rows=2 width=16) (actual time=0.146..0.164 rows=11
> loops=1)
> >               Output: c.id, c.blocked, c.first_name, c.last_name,
> c.owner_id, c.user_id
> >               Index Cond: ((c.owner_id = 24) AND (c.user_id IS NOT NULL))
> >               Filter: (c.user_id <> 24)
> >               Rows Removed by Filter: 1
> >         ->  Index Scan using idx_contact_mutual on public.contact_entity
> c1  (cost=0.00..6.45 rows=1 width=16) (actual time=0.012..0.049 rows=18
> loops=1)
> >               Output: c1.id, c1.blocked, c1.first_name, c1.last_name,
> c1.owner_id, c1.user_id
> >               Index Cond: ((c1.user_id IS NOT NULL) AND (c1.user_id =
> 24))
> > Total runtime: 0.388 ms
>
> > explain analyze verbose
> > select c.user_id
> > from contact_entity c
> > where c.owner_id=24 and c.user_id<>24 and c.user_id IS NOT NULL and NOT
> c.blocked and (exists(
> >   select 1
> >   from contact_entity c1
> >   where NOT c1.blocked and c1.owner_id=c.user_id and c1.user_id IS NOT
> NULL and c1.user_id=c.owner_id))
> > group by c.user_id;
>
> > Group  (cost=0.00..9.00 rows=1 width=8) (actual time=0.048..0.159 rows=8
> loops=1)
> >   Output: c.user_id
> >   ->  Merge Semi Join  (cost=0.00..9.00 rows=1 width=8) (actual
> time=0.044..0.137 rows=9 loops=1)
> >         Output: c.user_id
> >         Merge Cond: (c.user_id = c1.owner_id)
> >         ->  Index Scan using idx_contact_mutual on public.contact_entity
> c  (cost=0.00..5.10 rows=2 width=16) (actual time=0.024..0.042 rows=11
> loops=1)
> >               Output: c.id, c.blocked, c.first_name, c.last_name,
> c.owner_id, c.user_id
> >               Index Cond: ((c.owner_id = 24) AND (c.user_id IS NOT NULL))
> >               Filter: (c.user_id <> 24)
> >               Rows Removed by Filter: 1
> >         ->  Index Scan using idx_contact_mutual on public.contact_entity
> c1  (cost=0.00..6.45 rows=1 width=16) (actual time=0.011..0.047 rows=16
> loops=1)
> >               Output: c1.id, c1.blocked, c1.first_name, c1.last_name,
> c1.owner_id, c1.user_id
> >               Index Cond: ((c1.user_id IS NOT NULL) AND (c1.user_id =
> 24))
> > Total runtime: 0.224 ms
>
> So, it looks like you can get about 3000 to 4000 of these per
> second on a single connection -- at least in terms of server-side
> processing.  Were you expecting more than that?
>
> --
> Kevin Grittner
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

Reply via email to