I am very likely going to change the ORM to use EXISTS subqueries
instead of IN subqueries. I know this is a good idea on PostgreSQL but
I don't have enough experience of other databases to know if this is a
good idea or not.

There are two main reasons for doing this. First, exists should
perform better on some databases, and exists allows for filter
conditions other than single column equality on all databases. So,
EXISTS subqueries are needed in the ORM in any case, the question is
if they should be the only option.

The semantics of NOT IN are harder to optimize for the DB than NOT
EXISTS, and this can result in large performance differences. See for
example this post from pgsql-hacker mailing list:
http://www.postgresql.org/message-id/[email protected]

It is easy to construct cases where NOT IN results in runtime of days
and NOT EXISTS in runtime of seconds. Just have a large enough table
in the subquery and PostgreSQL will choke.

Quick testing indicates that Oracle and MySQL seem to perform about
the same for IN and EXISTS variants, and SQLite seems to be a bit
faster when using EXISTS over IN. The docs of MySQL suggests using
EXISTS: 
https://dev.mysql.com/doc/refman/5.5/en/subquery-optimization-with-exists.html
(see the part about "very useful optimization"). My experience of
using these databases is very limited, so I might be missing some
known problematic cases.

So, the question is if there are situations where performance of
EXISTS is a lot worse than IN?

It will be possible to have a
connection.features.prefers_exists_subqueries flag and use that to
decide if the query should be generated as IN or EXISTS subquery.
However, always using EXISTS is a lot simpler.

 - Anssi

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


Reply via email to