Re: [PERFORM] Using ctid column changes plan drastically

2012-08-01 Thread Kevin Grittner
Thomas Kellerer wrote: > I finally found a solution that runs fine: > > DELETE FROM dupes a > WHERE EXISTS (SELECT 1 >FROM dupes b >WHERE b.first_name = a.first_name > AND b.last_name = a.last_name > AND b.ctid > a.ctid); How

Re: [PERFORM] Using ctid column changes plan drastically

2012-07-25 Thread Thomas Kellerer
Tom Lane, 24.07.2012 19:12: Well, it would only help if you're running a PG version that's new enough to recognize the NOT EXISTS as an anti-join; and even then, it's possible that joining on a tid column forecloses enough plan types that you don't get any real benefit. But I'm just guessing. Ca

Re: [PERFORM] Using ctid column changes plan drastically

2012-07-24 Thread Tom Lane
Thomas Kellerer writes: > Tom Lane wrote on 24.07.2012 17:55: >> FWIW, it might be helpful to cast this as a NOT EXISTS rather than >> NOT IN subquery. > Hmm. How would you change that into an NOT EXISTS clause (so that one of the > duplicates remains) > Everything I come up with is in fact slow

Re: [PERFORM] Using ctid column changes plan drastically

2012-07-24 Thread Thomas Kellerer
Tom Lane wrote on 24.07.2012 17:55: Joins on tid columns just aren't supported very well at the moment. Partly that's from lack of round tuits, and partly it's because it doesn't seem all that wise to encourage people to use them. There are gotchas if any of the rows receive concurrent updates.

Re: [PERFORM] Using ctid column changes plan drastically

2012-07-24 Thread Tom Lane
Thomas Kellerer writes: > Tom Lane, 24.07.2012 16:23: >> IIRC, type tid doesn't have any hash support. > So the "bad" plan is expected? Joins on tid columns just aren't supported very well at the moment. Partly that's from lack of round tuits, and partly it's because it doesn't seem all that wis

Re: [PERFORM] Using ctid column changes plan drastically

2012-07-24 Thread Thomas Kellerer
Tom Lane, 24.07.2012 16:23: Thomas Kellerer writes: DELETE FROM dupes WHERE id NOT IN (SELECT min(b.id) FROM dupes b GROUP BY first_name, last_Name HAVING count(*) > 1); Doesn't that kill the non-duplicates too? Ah right - another g

Re: [PERFORM] Using ctid column changes plan drastically

2012-07-24 Thread Tom Lane
Thomas Kellerer writes: > DELETE FROM dupes > WHERE id NOT IN (SELECT min(b.id) > FROM dupes b > GROUP BY first_name, last_Name > HAVING count(*) > 1); Doesn't that kill the non-duplicates too? > Why does the usage of the CTID column change