Hello,

  I'm using django to redevelop an old PHP app I wrote a while back. 
Some of the queries I had written use the GROUP BY and HAVING clauses
in my sql queries (which may indicate insanity on my part, but it
seemed like a good idea at the time).  The app itself is a
non-hierarchical keyword-based bookmark manager (I find it useful :).

  Simplified, my model looks like:


class Bookmark(meta.Model):
    title = meta.CharField(maxlength=64)
    url = meta.CharField(maxlength=256)

class Tag(meta.Model):
    name = meta.CharField(maxlength=32, unique=True)
    bookmarks = meta.ManyToManyField(Bookmark, filter_interface=meta.HORIZONTAL)


  If I select a couple of tags then I only want to see the bookmarks
that have all of those tags.  So, I can do this easily enough
(non-essentials simplified):


from django.models.samwise2 import bookmarks, tags
def listing(request, **kw):
    cur_tags_set = set(kw.get('cur_tags','').split('/'))   # grab the
tags off the URL and parse

    marks = bookmarks.get_list(select_related=True)  # grab *all* bookmarks
    marks_tags = [ (mark,set([ t.name for t in mark.get_tag_list() ]))
for mark in marks ]
    if cur_tags:
        marks = [ mark for (mark,tagset) in marks_tags if
cur_tags_set.issubset(tagset) ]
        marks_tags = [ (mark,tagset) for (mark,tagset) in marks_tags
if mark in marks ]


  And that gives me the result I want (modulo typos).  But, I expect
these tables to be relatively large, and the listing is probably the
most common action, so in the old PHP version, I did an SQL query that
looked like:

SELECT B.*
  FROM bookmarks B, tags T, bookmark_tags BT
 WHERE B.id=BT.bookmark_id AND BT.tag_id=T.id AND T.name IN ('tag1',
'tag2', 'tag3')
GROUP BY B.id
HAVING COUNT(B.id)=3

  Well, the actual query was much less pretty, but it dealt with
permissions and ownership and whatnot.  The general idea was to let
the DB do the hard work of sorting out the rows, because the DB has
got to be faster than PHP (and, ugly as SQL can be, it's better than
PHP for this kind of thing).

  Admittedly, the python equivalent above isn't too bad (three cheers
for python's set datatype), but I suppose what I'm looking for is
something like:


from django.models.samwise2 import bookmarks,tags
def listing(request, **kw):
    cur_tags_set = set(kw.get('cur_tags','').split('/'))
    cur_tags = list(cur_tags_set)

    if cur_tags:
        marks = bookmarks.get_list(tables=['samwise2_tags',
'samwise2_tags_bookmarks'],
                                   where=['samwise2_tags.name IN %s' %
(tuple(cur_tags),),
                                         
'samwise2_bookmarks.id=samwise2_tags_bookmarks.bookmark_id',
                                         
'samwise2_tags.id=samwise2_tags_bookmarks.tag_id'],
                                   group_by=['samwise2_bookmarks.id'],
                                  
having=['COUNT(samwise2_bookmarks.id)=%d' % len(cur_tags)],
                                   select_related=True)
    else:
        marks = bookmarks.get_list(select_related=True)


  Are there any plans to incorporate the GROUP BY and HAVING clauses
into the model API somehow, or is current best practice to just write
python to do it?

cheers,
--joey
(I hope this makes sense to someone)

Reply via email to