Hi,

I face the following problem: I have a large table with 12 million addresses, 
referenced by 20 other tables (some containing about one million entries). 
There are indexes on the foreign keys.

Now I wanted to delete about 10 million addresses (that are not referenced 
anymore from anywhere), and have a statement like:

DELETE FROM address
WHERE id NOT IN (SELECT address_id FROM bank where address_id IS NOT NULL)
   AND id NOT IN (SELECT poboxaddress_id FROM bank where poboxaddress_id IS NOT 
NULL)
   AND id NOT IN (SELECT address_id FROM bankconnection where address_id IS NOT 
NULL)
...lots more...

This takes more than 10 hours here (I had to cancel the statement).

I have two suggestions:


1.       Currently for each row to be deleted, a SELECT is done in each column 
referencing the deleted entry. This takes really a lot of time. It is possible 
to check in an elegant way if an entry can be deleted, like in the above query. 
 I know it is not easy to autocreate such a statement, but this would make 
deletions much faster.

2.       I would have loved a special option "UNREREFENCED" given to the delete 
statement, so all rows referenced from anywhere would automagically be excluded 
from my delete statement. When this keyword is given, no FK checks have to be 
done, because FK referenciality cannot be violated anyway.



DELETE UNREFERENCED FROM address WHERE ...;

Thanks for your time and this great database product.

Regards,
Daniel Migowski

Reply via email to