I have no idea how EXISTS performs on MySQL, however I can say that IN + subqueries on MySQL are so atrocious that we outright banned that where I work, so I don't see how it could be worse :)
Alex On Mon, Mar 25, 2013 at 8:37 AM, Anssi Kääriäinen <[email protected]>wrote: > On 25 maalis, 13:23, Simon Riggs <[email protected]> wrote: > > On 25 March 2013 10:58, Tim Chase <[email protected]> > wrote: > > > > > On 2013-03-25 03:40, Anssi Kääriäinen wrote: > > >> I am very likely going to change the ORM to use EXISTS subqueries > > >> instead of IN subqueries. I know this is a good idea on PostgreSQL > > >> but I don't have enough experience of other databases to know if > > >> this is a good idea or not. > > > > > I can only speak for testing IN-vs-EXISTS speed on MSSQLServer at > > > $OLD_JOB, but there it's usually about the same, occasionally with IN > > > winning out. However, the wins were marginal, and MSSQL is a 2nd-class > > > citizen in the Django world, so I'm +1 on using EXISTS instead of IN, > > > if the results are assured to be the same. > > > > The results are definitely different because NOT IN has some quite > > strange characteristics: if the subquery returns a NULL then the whole > > result is "unknown". It is that weirdness that makes it hard to > > optimize for, or at least, not-yet-optimized for in PostgreSQL. > > > > In most cases it is the NOT EXISTS behaviour that people find natural > > and normal anyway and that is the best mechanism to use. > > When doing an .exclude() that requires subquery Django automatically > generates the queries so that the inner query's select clause can't > contain nulls. For example: > >>> print D.objects.exclude(e__id__gte=0).query > SELECT `table_d`.`id`, `table_d`.`a`, `table_d`.`b` FROM `table_d` > WHERE NOT (`table_d`.`id` IN (SELECT U1.`d_id` FROM `table_e` U1 WHERE > (U1.`id` >= 0 AND U1.`d_id` IS NOT NULL))) > > However it is possible to generate NOT IN query where the SQL > semantics are in effect when using __in lookup: > >>> print > D.objects.exclude(id__in=E.objects.filter(id__gte=0).values_list('d_id')).query > SELECT `table_d`.`id`, `table_d`.`a`, `table_d`.`b` FROM `table_d` > WHERE NOT (`table_d`.`id` IN (SELECT U0.`d_id` FROM `table_e` U0 WHERE > U0.`id` >= 0 )) > > The results of the latter case could change (assuming d_id can contain > null values). > > I think that this could be considered a bug fix. Django's ORM doesn't > try to mimic SQL semantics, it tries to have Python semantics for the > query. So an exclude(__in) lookup should behave like Python's "value > not in list", not like SQL's NOT IN. > > On the other hand having __in lookups that do EXISTS in SQL might be a > bit surprising. The way __in works is documented as generating SQL IN > lookup: https://docs.djangoproject.com/en/dev/ref/models/querysets/#in. > > I feel pretty strongly that NOT EXISTS semantics are wanted. The NOT > IN semantics are likely there just because that is how the > implementation was originally done, not because there was any decision > to choose those semantics. Also, multicolumn NOT IN lookups aren't > supported on all databases (SQLite at least), so for that case NOT > EXISTS semantics is going to happen anyways. > > - Anssi > > -- > You received this message because you are subscribed to the Google Groups > "Django developers" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > To post to this group, send email to [email protected]. > Visit this group at http://groups.google.com/group/django-developers?hl=en > . > For more options, visit https://groups.google.com/groups/opt_out. > > > -- "I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire) "The people's good is the highest law." -- Cicero -- You received this message because you are subscribed to the Google Groups "Django developers" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at http://groups.google.com/group/django-developers?hl=en. For more options, visit https://groups.google.com/groups/opt_out.
