On Fri, 2008-11-21 at 22:37 -0500, Michael Keselman wrote: > Hello, > > http://pastie.org/private/xvqf5tgjnimiolakhawgg (relevant code) > > Django's QuerySets allow you to do ORs (unions) pretty nicely with > Model.manager.filter(Q(...) | Q(...)). However, ANDs don't work quite > as nicely in my situation because I want AND to do what > Model.manager.filter(Q(...)).filter(Q(...)) would do, but > Model.manager.filter(Q(...) & Q(...)) does something very different. > It does not compare fields with other rows in the table but rather > with the same row against itself. For example, I have a Tag model with > a CharField I call 'name'. If I were to do > Tag.objects.filter(Q(name='text') & Q(name='password')), it would > compare each tag to check if the tag's name equals "text" AND that the > same tag's name equals "password".
Yeah, this is the edge-case that isn't handled out of the box. On the (likely) probability that it's much less common than the alternative behaviour, we decided to go with the current approach. You use-case isn't invalid, but it's never the goal to support every possible use-case. Speaking as an implementor, I'll also note that it's the really, really hard case to make it work efficiently in all cases (particularly with highly nested trees of ANDs and ORs), so it falls close to, or possibly over the line marked "out of scope for the ORM; custom SQL is your friend". I can think of four(!) possible approaches for your problem. In rough order from least intrusive to the one requiring the most changes (and roughly least efficient to most efficient), we have the following. I haven't written code for any of these, so I'm using English rather than Python. But my intuition is that they should all work. (1) After you have parsed the query, rewrite it in conjunctive normal form (CNF). That's a fairly routine transformation. Then each clause in the CNF can be written as Q(...)|Q(...)|... and can be put in a separate call to filter() (so you end up with one filter() call per clause in the CNF). Guaranteed to give the right answer and requires no understanding of Django's internals. The downside is that more comparisons than are possibly necessary will be made. Still, in practice, the difference possibly won't be noticed on a well-index table. (2) Write your own variation on Q() -- let's call it Q1 here, for brevity. This requires understanding what's going on when you call filter() a bit, but since you are looking at a case that isn't normally handled, you need to roll up your sleeves and dive in. It's not that bad, really. Basically, filter() wraps up its parameters in a Q() object and then calls Query.add_q(). So calling Queryset.filter() is equivalent to an outer call to Query.add_q(). The way a single call to add_q() knows that it can (and should) reuse certain aliases is the used_aliases parameter passed into add_q(). So you could force a particular alias not to be reused if you adjusted used_aliases. Notice that used_aliases is also passed to the add_to_query() method of any Q-like object you might pass in. In practice, that means you write a class Q1 that subclasses Q and which has an add_to_query() method. In your code that constructs the filter call, you use Q1 when you are creating a conjunction of terms, rather than Q -- for example, Q1(Q(name="a") & Q(name="b")). Your Q1.add_to_query() method will then generally call Q.add_q(), except that after each return it will remove any newly added aliases from used_aliases so that they aren't reused inside that Q1. You can do this by deep copying used_aliases before and restoring it afterwards. Just before finally returning from your add_to_query() method, you might want to add back all the used aliases so that any outer calls can reuse them. It's this nested removing and readding of aliases that makes this a hard problem (I would strongly suggest experimenting with something like (A & (B|(C&D)) to make sure things work as expected). (3) You could write your own equivalent to add_q() and call that for your case. Since filter() calls add_q(), subclass QuerySet and add your own disjoint_filter() method, say, that calls your own add_q() method -- which need not even be a separate method if you don't want to subclass Query as well and don't mind calling setup_filter() directly. In your own add_q() version, you might choose to not update used_aliases for some or all situations. I think not updating used_aliases at all will lead to very inefficient queries -- lots of unnecessary tables involved -- so this could well reduce to option (2), above, in which case I'd go with option (2). But have a think about it and see. (4) Since it looks like you're doing the equivalent of text searching with boolean operators, my fourth solution would be to use a text searching tool. Solr or Sphinx or something like that which indexes the tag field in question. The reason for suggesting this option is that most of those search engines have built-in support for boolean logic combinators. It would require custom SQL to do the searching (or you could look at the djangosearch project), but might well save a lot of effort. Maybe that will give you some ideas. Regards, Malcolm --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-users@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/django-users?hl=en -~----------~----~----~----~------~----~------~--~---