Re: [PERFORM] "SELECT .. WHERE NOT IN" query running for hours

2011-01-14 Thread Robert Haas
2011/1/10 Mladen Gogala : > Well, I really hoped that Bruce, Robert or Greg would take on this one, but > since there are no more qualified takers, I'll take a shot at this one. For > the "NOT IN (result of a correlated sub-query)", the sub-query needs to be > executed for every row matching the co

Re: [PERFORM] "SELECT .. WHERE NOT IN" query running for hours

2011-01-10 Thread Scott Carey
On 1/10/11 12:37 PM, "Kevin Grittner" wrote: >Scott Carey wrote: > >> Often, the best query plans result from 'LEFT JOIN WHERE right >> side is NULL' rather than NOT EXISTS however. I often get >> performance gains by switching NOT EXISTS queries to LEFT JOIN >> form. > >Even in 8.4 and lat

Re: [PERFORM] "SELECT .. WHERE NOT IN" query running for hours

2011-01-10 Thread Kevin Grittner
Scott Carey wrote: > Often, the best query plans result from 'LEFT JOIN WHERE right > side is NULL' rather than NOT EXISTS however. I often get > performance gains by switching NOT EXISTS queries to LEFT JOIN > form. Even in 8.4 and later? I would think that the anti-join that Tom added in 8

Re: [PERFORM] "SELECT .. WHERE NOT IN" query running for hours

2011-01-10 Thread Scott Carey
On 1/7/11 1:29 AM, "??? " wrote: > >So my follow-up question on the subject is this: > >Are there any particular semantics for the "NOT IN" statement that cause >the correlated query to execute for every row of the outter query, as >opposed to the "NOT EXISTS" ? => select * from

Re: [PERFORM] "SELECT .. WHERE NOT IN" query running for hours

2011-01-10 Thread Mladen Gogala
Γιωργος Βαλκανας wrote: Are there any particular semantics for the "NOT IN" statement that cause the correlated query to execute for every row of the outter query, as opposed to the "NOT EXISTS" ? Or are there any other practical reasons, related to "IN / NOT IN", for this to be happening? O

Re: [PERFORM] "SELECT .. WHERE NOT IN" query running for hours

2011-01-07 Thread Γιωργος Βαλκανας
Fair enough! I also turned seqscan off, so the new plan (for the NOT EXISTS) is: Merge Anti Join (cost=0.00..212686.89 rows=1 width=313) (actual time=0.426..14921.344 rows=63836 loops=1) Merge Cond: ((d2.hwdocid)::text = (d.hwdocid)::text) -> Index Scan using hwdocid2_uniq on "Doc2" d2 (cost=0.

Re: [PERFORM] "SELECT .. WHERE NOT IN" query running for hours

2011-01-06 Thread Mladen Gogala
On 1/6/2011 9:36 PM, Γιωργος Βαλκανας wrote: 1) Why is it taking *so* long for the first query (with the "NOT IN" ) to do even the simple select? Because NOT IN has to execute the correlated subquery for every row and then check whether the requested value is in the result set, usually by doi

[PERFORM] "SELECT .. WHERE NOT IN" query running for hours

2011-01-06 Thread Γιωργος Βαλκανας
Hi all, I'm using postgres 8.4.2 on a Ubuntu Linux machine. I have several tables, one of which is named Document, which of course represents information I need about my documents. I also have another table, similar to the first one, called Doc2. The schema of both tables is the following: CREAT