Hi -

I have a table of words and a table linking words in various ways:

create table allWords (
  wordID        serial  PRIMARY KEY,
  word          text
);
create unique index ix_allwords_word ON allwords (word);

create table allWordRelations (
  word1ID       integer references allWords,
  word2ID       integer references allWords,
  pos1          integer references posTypes,
  pos2          integer references posTypes,
  relID         integer references allRelationTypes,
  confidence    float,
  primary key (word1ID, word2ID, pos1, pos2, relID)
);
create index ix_allWordRelations_word1_pos1 on allWordRelations (word1ID, pos1); create index ix_allWordRelations_word2_pos2 on allWordRelations (word2ID, pos2);

I have two queries for looking up related words which I think should be equivalent, but 7.4.8 comes up with very different plans. The first query joins the word table to itself explicitly via the relations table - this is very fast. The second query uses an IN against the link table in the where clause, and is very slow. I'm sure I can affect this by adding indexes, but I'm mainly trying to understand what difference the planner is seeing. EXPLAIN ANALYZE output is below - can anyone explain? Are my two queries subtly different in terms of NULLs, or something like that? Thanks.

- John Burger
  MITRE


explain analyze select w2.word from allwords w1 join allwordrelations as r on (w1.wordid = r.word1id) join allwords w2 on (w2.wordid = r.word2id) where w1.word = 'dogging'; QUERY PLAN ------------------------------------------------------------------------ ------------------------------------------------------------------------ -------------------- Nested Loop (cost=0.00..579.05 rows=81 width=15) (actual time=0.607..30.509 rows=59 loops=1) -> Nested Loop (cost=0.00..333.94 rows=81 width=4) (actual time=0.564..29.032 rows=59 loops=1) -> Index Scan using ix_allwords_word on allwords w1 (cost=0.00..3.49 rows=1 width=4) (actual time=0.326..0.329 rows=1 loops=1)
               Index Cond: (word = 'dogging'::text)
-> Index Scan using ix_allwordrelations_word1_pos1 on allwordrelations r (cost=0.00..329.36 rows=87 width=8) (actual time=0.220..28.564 rows=59 loops=1)
               Index Cond: ("outer".wordid = r.word1id)
-> Index Scan using allwords_pkey on allwords w2 (cost=0.00..3.01 rows=1 width=19) (actual time=0.018..0.020 rows=1 loops=59)
         Index Cond: (w2.wordid = "outer".word2id)
Total runtime: 30.713 ms



explain analyze select w2.word from allwords w1, allwords w2 where (w1.wordid, w2.wordid) in (select word1id, word2id from allwordrelations ) and w1.word = 'dogging'; QUERY PLAN ------------------------------------------------------------------------ ------------------------------------------------------------------------ ----- Nested Loop (cost=760422.86..817628.29 rows=1 width=15) (actual time=99277.403..111291.862 rows=59 loops=1) -> Hash Join (cost=760422.86..817625.27 rows=1 width=4) (actual time=99277.110..111270.093 rows=59 loops=1)
         Hash Cond: ("outer".word1id = "inner".wordid)
-> Unique (cost=760419.36..794740.32 rows=4576128 width=8) (actual time=96713.791..107843.446 rows=4302242 loops=1) -> Sort (cost=760419.36..771859.68 rows=4576128 width=8) (actual time=96713.785..102973.088 rows=4576035 loops=1) Sort Key: allwordrelations.word1id, allwordrelations.word2id -> Seq Scan on allwordrelations (cost=0.00..79409.28 rows=4576128 width=8) (actual time=0.008..8668.255 rows=4576035 loops=1) -> Hash (cost=3.49..3.49 rows=1 width=4) (actual time=0.078..0.078 rows=0 loops=1) -> Index Scan using ix_allwords_word on allwords w1 (cost=0.00..3.49 rows=1 width=4) (actual time=0.067..0.070 rows=1 loops=1)
                     Index Cond: (word = 'dogging'::text)
-> Index Scan using allwords_pkey on allwords w2 (cost=0.00..3.01 rows=1 width=19) (actual time=0.360..0.363 rows=1 loops=59)
         Index Cond: (w2.wordid = "outer".word2id)
Total runtime: 111292.449 ms



---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
      subscribe-nomail command to [EMAIL PROTECTED] so that your
      message can get through to the mailing list cleanly

Reply via email to