On Mon, 2008-03-03 at 06:04 -0800, Matt Hoskins wrote:
> > Having read this again in light of your complaint in #6701, I should
> > point out that it's not going to work like this if you're always
> > filtering on Contract.
> 
> Not really a complaint in #6701 - you didn't say at first that you
> were going to remove support for m2m order_by, so I was concerned that
> in the future people would do m2m order_bys and then hit the issue
> that .count() wouldn't match len(queryset). My apologies for the drift
> into discussing the meaning of m2m order_by though - I just replied
> there 'cos that's where you said up that you didn't think it made
> sense for django to do m2m order_by.

It's more than a preference and it's very important to realise that. It
is not possible for it to work unless you pick a behaviour arbitrarily.

Suppose you have the following objects:

        A : related to x1, x2, x3
        B : related to x1 and x3
        C : related to nothing.
        
One possible "ordering" on the related m2m field here is C, B, A: using
a "result set size" ordering. Two others are "C, A, B" and "A, B, C",
using lexicographic ordering, after ordering the m2m set and with two
alternatives because there are two possibilities for how to order the
empty set. Another possibility is "A, B, C" based on doing it all in one
SQL query and picking the first row (so only the first many-to-many
result will have an effect) and where "A" happens to sort before "B" for
some reason. You happen to want A, B, A, A, B with C either first or
last (plus a couple of permutations depending upon how A and B order
themselves).
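
To make the ambiguity concrete, here is a minimal sketch in plain Python
(not Django code) that reproduces the orderings listed above. The
`related` mapping and all variable names are hypothetical, invented only
for this illustration.

```python
# Hypothetical data mirroring the example: each object maps to the
# m2m values it is related to.
related = {
    "A": ["x1", "x2", "x3"],
    "B": ["x1", "x3"],
    "C": [],
}

# 1. "Result set size" ordering: fewest related values first.
by_size = sorted(related, key=lambda k: len(related[k]))

# 2. Lexicographic ordering on the sorted related values, with the
#    empty set sorting first.
lex_empty_first = sorted(related, key=lambda k: sorted(related[k]))

# 3. The same lexicographic ordering, with the empty set sorting last.
lex_empty_last = sorted(
    related, key=lambda k: (not related[k], sorted(related[k]))
)

# 4. One row per (related value, object) pair, which is what a plain
#    join produces. Objects repeat, and C drops out entirely here.
pairs = sorted((x, k) for k, xs in related.items() for x in xs)
per_row = [k for _, k in pairs]

print(by_size)          # ['C', 'B', 'A']
print(lex_empty_first)  # ['C', 'A', 'B']
print(lex_empty_last)   # ['A', 'B', 'C']
print(per_row)          # ['A', 'B', 'A', 'A', 'B']
```

Each key function is a defensible reading of "order by the related
field", and they all disagree.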

All of these orderings are possible, although yours is probably the
least logical on the grounds that it changes the result set. The problem
is that none of them is canonically correct.

This isn't some personal preference issue where I think that we could
support it but it isn't worth it. There is no right answer to doing
this. It's simply not well-defined at the logical and relational levels.

> > A queryset is going to return one object for each
> > distinct object if you do the equivalent of Contract.objects.all(), not
> > on per many-to-many result.
> 
> So the .distinct() method is going to become redundant in qf-rf
> because it will effectively always be the case?

Not at all. I explicitly said the all() query, which is essentially what
you are doing. It's impossible to automatically tell if distinct() will
be wanted/needed or not, so it cannot be removed for the general
filtering case. There are still going to be cases when a query returns
multiple results unless you use distinct(). However, adjusting the
order_by() fragment will not be one of those cases. It only orders the
result set (at the SQL level); it does not change the result set.
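
As a rough illustration of why the row count changes once a many-to-many
join is involved, here is a sketch using Python's sqlite3 module. The
table and column names are made up for the example and are not the SQL
that Django generates.

```python
import sqlite3

# In-memory database with a hypothetical contract/supplier schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contract (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE supplier (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE contract_suppliers (contract_id INTEGER, supplier_id INTEGER);
    INSERT INTO contract VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO supplier VALUES (1, 'x1'), (2, 'x2'), (3, 'x3');
    INSERT INTO contract_suppliers VALUES (1, 1), (1, 2), (1, 3), (2, 1), (2, 3);
""")

# Ordering on the contract's own column: one row per contract.
plain = [r[0] for r in conn.execute(
    "SELECT name FROM contract ORDER BY name"
)]

# Ordering on the supplier name forces a join across the m2m table,
# so each contract comes back once per related supplier.
joined = [r[0] for r in conn.execute("""
    SELECT c.name
    FROM contract c
    LEFT JOIN contract_suppliers cs ON cs.contract_id = c.id
    LEFT JOIN supplier s ON s.id = cs.supplier_id
    ORDER BY s.name, c.name
""")]

print(len(plain))   # 3
print(len(joined))  # 6
```

The extra rows come from the join, not from the ORDER BY clause. That is
the mismatch between .count() and len(queryset) mentioned earlier in the
thread.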

Queryset-refactor doesn't introduce new magical powers to querysets.
They will still behave basically the way they did before, only with fewer
bugs. Your current attempt to order by a many-to-many happened to work
by accident, mostly because I didn't insert all the extra overhead to do
that checking (constructing a query is expensive enough as it is) and I
expected most people would work out that it isn't a well-defined
operation. At the time I punted on whether to enforce it or document it
(keep in mind that the branch isn't finished yet), since I wasn't sure
which way it would pan out. Recently I had to add support for another
code path that should allow us to error out correctly when somebody
tries to add a multi-valued field to the ordering, so we'll probably go
with the error path.

> > So if you want to order by suppliers like this, you'll need to do a
> > queryset based on suppliers and pull back their related contract objects
> > and then you can order on suppliers however you like.
> 
> So in the future with qs-rf I won't even be able to achieve this using
> on a query of contracts by using .extras() to join in the suppliers
> table for the purposes of expanding/sorting the results?

The extra() call will work as it does now, except that it has its own
ordering clause, because mixing it with the queryset's ordering was too
fragile in a lot of cases. But using extra() is entirely different from
accessing a many-to-many relation. I suggested one approach to solving
this, which would seem to be the most logical for most situations: if
you're driving off the suppliers, use that as the queryset root, rather
than trying to get multiple results for the contracts. Using extra()
might be a different solution; it depends on your circumstances and
query performance characteristics.
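
A sketch of the "use suppliers as the root" idea, again in plain Python
rather than the ORM, with a hypothetical `contracts_by_supplier` mapping
standing in for the real data:

```python
# Hypothetical data: each supplier maps to the contracts it is related to.
contracts_by_supplier = {
    "x3": ["A", "B"],
    "x1": ["B", "A"],
    "x2": ["A"],
}

# Drive the iteration from the suppliers: order them however you like,
# then pull back each supplier's contracts.
rows = [
    (supplier, contract)
    for supplier in sorted(contracts_by_supplier)
    for contract in sorted(contracts_by_supplier[supplier])
]

for supplier, contract in rows:
    print(supplier, contract)
```

Here the repetition of contracts is explicit in what you asked for,
rather than a side effect of the ordering. Contracts with no suppliers
(C in the earlier example) never appear, so they would need to be
fetched separately if you want them.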

*All* I am trying to point out in this sub-thread is that expecting the
objects that are returned in the result to change because you put in an
order_by() is mistaken. Using order_by() only orders existing results
and it only does so at the SQL level. The fact that it didn't raise an
error when you passed in a many-to-many doesn't mean that it should work.
It just confirms that there are many logically undefined things you can
do in programming and raising an error for every single situation isn't
always going to happen. The solution is "don't do that".

Regards,
Malcolm

-- 
Works better when plugged in. 
http://www.pointy-stick.com/blog/

