Thomas Kellerer wrote:
> I finally found a solution that runs fine:
>
> DELETE FROM dupes a
> WHERE EXISTS (SELECT 1
>FROM dupes b
>WHERE b.first_name = a.first_name
> AND b.last_name = a.last_name
> AND b.ctid > a.ctid);
How
Tom Lane, 24.07.2012 19:12:
Well, it would only help if you're running a PG version that's new
enough to recognize the NOT EXISTS as an anti-join; and even then,
it's possible that joining on a tid column forecloses enough plan
types that you don't get any real benefit. But I'm just guessing.
Ca
Thomas Kellerer writes:
> Tom Lane wrote on 24.07.2012 17:55:
>> FWIW, it might be helpful to cast this as a NOT EXISTS rather than
>> NOT IN subquery.
> Hmm. How would you change that into an NOT EXISTS clause (so that one of the
> duplicates remains)
> Everything I come up with is in fact slow
Tom Lane wrote on 24.07.2012 17:55:
Joins on tid columns just aren't supported very well at the moment.
Partly that's from lack of round tuits, and partly it's because it
doesn't seem all that wise to encourage people to use them. There
are gotchas if any of the rows receive concurrent updates.
Thomas Kellerer writes:
> Tom Lane, 24.07.2012 16:23:
>> IIRC, type tid doesn't have any hash support.
> So the "bad" plan is expected?
Joins on tid columns just aren't supported very well at the moment.
Partly that's from lack of round tuits, and partly it's because it
doesn't seem all that wis
Tom Lane, 24.07.2012 16:23:
Thomas Kellerer writes:
DELETE FROM dupes
WHERE id NOT IN (SELECT min(b.id)
FROM dupes b
GROUP BY first_name, last_Name
HAVING count(*) > 1);
Doesn't that kill the non-duplicates too?
Ah right - another g
Thomas Kellerer writes:
> DELETE FROM dupes
> WHERE id NOT IN (SELECT min(b.id)
> FROM dupes b
> GROUP BY first_name, last_Name
> HAVING count(*) > 1);
Doesn't that kill the non-duplicates too?
> Why does the usage of the CTID column change