On Mon, 2008-12-08 at 07:43 -0800, DavidA wrote: > If I undestand the problem correctly, in MySQL you could do this in > one query as: > > select > m.*, > (select min(created) from model2 where id = m.model2_id) as > first_created, > (select max(created) from model2 where id = m.model2_id) as > last_created > from model1 m > ;
That will give the correct answer, however it's a bit inefficient, since the inner queries are run once for each model id (and in the original post model2 and model3 were used, so it's worse than it looks here). Since that means 6000 rows, based on Eric's original numbers, it's worth looking deeper. Let me cut-and-paste the original description for reference: (behold the power of a fully-functional, non top-posting email client!) > > > objects = Model1.objects.filter(user=3).select_related() #about 6,000 > > > objects > > > > > > data = {} > > > for o in objects: > > > data[o.name] = [o.field1, o.field2] > > > data[o.name].append(o.field3.model2_set.all().latest('created')) > > > #get latest row from related model2 > > > data[o.name].append(o.model3_set.all().order_by('created')[0]) > > > #get earliest row from related model3 Ideally, there would be a way to run one inner query that does something like select max(created) from ... where model1.user_id = 3 group by model1.id except that isn't quite right because if the outer query only selected results with those created dates, it will get a superset of the right answer (some model2.id could have a created date that matches some other maximum but isn't the maximum for that particular model2.id, so you need to tie the maximum date to the model id). If something along those lines can be made to work, that inner query only has to run once (sure it's a big query, with a sort, but it's only one query and it's in the sweet spot of what databases can do). It's the difference between a correlated inner query (which you have -- so it runs once for each outer row) and an uncorrelated query like the above. Still, using the above query and then filtering out the dupes at the Python level might well be a good alternate solution. So you end up with: select model2.*, model1.id, ... from ... where model2.created in (select max(model2.created) from .... where model1.user_id=3 group by model1.id) order by model2.created asc Then you process these results, storing them in a dictionary keyed from model1.id. When you get duplicate results for the same model1.id (the above superset situation), the 'latest' result will be after the earlier ones and will overwrite them. The above is the query to select all the model2 fields. There's another query to select all the model3 fields. So that reduces the original 12001 queries to 3 and in the two with nested inner queries, the nested queries only run once each (or once per user id, which seemed to be fixed in the original problem statement). You could fairly close to doing this through the ORM as well, although it's not clear that it's worth it. Even with aggregate support, here's one approach (I'll only worry about the two model situation; extending to three is analogous). I'll assume models that look like this (and I'm going to change the names to something comprehensible so that we can develop some intuition about what's going on): class Blog(models.Model): author = models.ForeignKey(User) class Entry(models.Model): blog = models.ForeignKey(Blog) created = models.DateTimeField(default=datetime.datetime.now) So each user has many blogs with many entries and we want to pull the latest entries for each blog for a particular user (user_id 3, in this case). The inner query and grouping stuff here requires poking at the internal API (although the inner query bit is, sadly, documented): blog_qs = Blog.objects.filter(user=3) inner_query = Entry.objects.filter(blog__user=3).extra(select={"max": "max(created)"}).values("max").query inner_query.group_by = ["blog_id"] entry_qs = Entry.objects.filter(created__in=inner_query).order_by("created") # Now merge the results into a dictionary keyed by blog_id. data = dict([(b.id, b) for b in blog_qs]) for entry in entry_qs: data[entry.blog_id] = entry Regards, Malcolm --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-users@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/django-users?hl=en -~----------~----~----~----~------~----~------~--~---