On Mon, 2009-04-27 at 08:51 -0700, NoviceSortOf wrote:
> 
> Thanks for the as_sql() function as can now can
> see the SQL detail. Which reveals a problem in
> getting DISTINCT to draw from a fields or fields that can in fact
> return the unique result of the query. Perhaps it is my usage
> of the function so I'm listing an examble
> below.
> 
> To further simplify my example and testing I'm  using an internal
> product number that in this instance refers to all editions of the
> book Moby Dick by Herman Melville.
> 
> qset = (Q(editionid__contains=q_bookid))
> 
> results = Titles.objects.filter(qset).order_by().distinct()
> assert False, resultsa.query.as_sql()
> 
> Here is the SQL it generates results...
> 
> ('SELECT DISTINCT "books_titles"."id", "books_titles"."keywords",
> "books_titles"."editionid", "books_titles"."author",
> "books_titles"."title", "books_titles"."price",
> "books_titles"."bookid" FROM "books_titles" WHERE
> "books_titles"."author"::text LIKE %s ', (u'%842838%',))
> 
> Now with the above if I read correctly the DISTINCT is being placed on
> book_titles.id

That isn't correct. That SQL is saying each set of selected results will
be distinct. The "DISTINCT" modifier is applied to the tuple of output
columns and any tuples that occur more than once have duplicates
discarded from the result set.

[...]

> So for output I get something like...
> 
> EditionId BookID Author    Title
> ========= ====== ========= =====================
> 2001      200    Melville  Moby does Captain Bly
> 2002      200    Melville  Moby does Captain Bly
> 2004      200    Melville  Moby does Captain Bly

Those are all distinct rows. The EditionID value is different in each
set of output columns. There's no error there. I think you're
misunderstanding how distinct works in SQL and Django.

> When in fact all I need for the seeding the
> initial search results is
> 
> EditionId BookID Author    Title
> ========= ====== ========= =====================
> 2001      200    Melville  Moby does Captain Bly

How is Django meant to know that it should throw away a whole bunch of
rows and only keep the first EditionId value and not the others? It
can't.

Your original queryset explicitly asked for things matching a particular
set of editionid values ("all the editioid values in this set"). Even I,
as a human, not a computer, would have given you the first set of
results and not the second, since you indicated the different edition id
values were significant. Your expectation isn't particularly intuitive
even in human terms.

If you only care about distinct (BookID, Author, Title) tuples, then use
the values() method on the queryset. The distinct() modifier will be
applied only to those values (plus any ordering), since they're then the
only columns selected.

Regards,
Malcolm


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to