Hello, everyone, I just want to get some more feedback on this, before 
posting it into the tracker.

So according to the docs 
<https://docs.djangoproject.com/en/1.11/topics/db/aggregation/#order-of-annotate-and-filter-clauses>,
 
the order of *annotate* and *filter* matters, since they are not 
commutative operations.

The *annotation* is over the full queryset if applied before *.filter*, and 
the *annotation* gets filtered if applied after 
*.filter*We can even use both strategies together, i.e.

publishers = (Publisher.objects
                        .annotate(num_books=Count('book', distinct=True))
                        .filter(book__rating__gt=3.0)
                        .annotate(num_rated=Count('book', distinct=True))
                        .filter(num_books=F('num_rated'))
)

will produce the following SQL

SELECT 
    "store_publisher"."id",
    "store_publisher"."name",
    "store_publisher"."num_awards",
    COUNT(DISTINCT "store_book"."id") AS "num_books",
    COUNT(DISTINCT T3."id") AS "num_rated"
FROM "store_publisher" 
LEFT OUTER JOIN "store_book" ON ("store_publisher"."id" = "store_book".
"publisher_id")
INNER JOIN "store_book" T3 ON ("store_publisher"."id" = T3."publisher_id")
WHERE 
    T3."rating" > 3.0 
GROUP BY 
    "store_publisher"."id",
    "store_publisher"."name",
    "store_publisher"."num_awards" 
HAVING COUNT(DISTINCT "store_book"."id") = (COUNT(DISTINCT T3."id"))

which is exactly what I expected. 

However, in this example the *annotation* + *filter* is done over a 
ForeignKey, if do this over a M2M field:

stores = (Store.objects
                .annotate(num_books=Count('books', distinct=True))
                .filter(books__rating__gt=3.0)
                .annotate(num_rated=Count('books', distinct=True))
                .filter(num_books=F('num_rated'))
)

we get the following SQL:

SELECT 
    "store_store"."id",
    "store_store"."name",
    "store_store"."registered_users",
    COUNT(DISTINCT "store_store_books"."book_id") AS "num_books",
    COUNT(DISTINCT "store_store_books"."book_id") AS "num_rated"
FROM "store_store" 
LEFT OUTER JOIN "store_store_books" ON ("store_store"."id" = 
"store_store_books"."store_id") 
INNER JOIN "store_store_books" T4 ON ("store_store"."id" = T4."store_id") 
INNER JOIN "store_book" T5 ON (T4."book_id" = T5."id") 
WHERE 
    T5."rating" > 3.0 
GROUP BY 
    "store_store"."id",
    "store_store"."name",
    "store_store"."registered_users" 
HAVING 
    COUNT(DISTINCT "store_store_books"."book_id") = (COUNT(DISTINCT 
"store_store_books"."book_id"))

which is incorrect compared to the first one.
The second annotation is not over the filtered set (i.e. its not using the 
T4/T5 tables).
What I would have expected here, is the same result as the previous query 
with publishers.

Any thoughts?

PS. I'm testing this on Django 1, 11, 1

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-users+unsubscr...@googlegroups.com.
To post to this group, send email to django-users@googlegroups.com.
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/88096e99-d3d8-4b56-8934-21e23fbbe2c8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to