On Friday, March 2, 2012 7:24:00 PM UTC+2, Tom Evans wrote:
>
> Hi all
>
> I have a particular query that requires me to use a RawQuerySet - I
> need to left join the table to itself in order to select the highest
> 'priority' row from the table for each distinct value of a foreign key
> on the model, and I need to join to the related table in order to
> filter the results.
>
> Having generated the queryset, I then want to make a dictionary of {
> foreign_key_object : model_object }. So I have this working, but I
> cannot use select_related() with a RawQuerySet, and so this runs N+1
> queries, where N is the number of distinct foreign keys.
>
> Here is some code, which might explain it better:
>
> connection.queries=[]
> base_products_qs = Product.objects.raw(
>     """
>     SELECT idp_product.*, idp_productclass.*
>     FROM idp_product
>     JOIN idp_productclass
>         ON idp_product.product_class_id = idp_productclass.id
>     LEFT JOIN idp_product p2
>         ON idp_product.product_class_id = p2.product_class_id
>             AND p2.class_priority < idp_product.class_priority
>     WHERE p2.id IS NULL and idp_productclass.product_type != 4
>     """)
> base_products = dict([ (p.product_class, p) for p in base_products_qs ])
> len(connection.queries) # 7 queries (6 product classes)
>
> Is there any simple way around this? I can reduce it to two queries
> already, but it seems wrong to select out the info I want, throw it
> away, and then fetch it again.
>
 Some suggestions:
  - If the 2-query version is fast enough, use it. Premature optimization 
and all that... :)
  - You could of course manually iterate through the SQL and instantiate 
the models by hand.
  - Beware of the left join, it seems if you have a lot of rows with 
different class_priorities for each product_class_id you might be in 
trouble. The filtering on IS NULL happens after doing the join, so before 
the filter you could have a lot of intermediate rows. It might be there is 
no problem depending of the amount of rows and SQL vendor. Hard to know 
without testing. There are a couple of other ways to do the query, either 
by subquery doing a select min(class_priority), product_class_id or if you 
are using PostgreSQL, distinct on would do the job for you, too.

All that being said, I would really like the ability to have select_related 
available for RawQuerySet. It might be judged to be a feature not needed 
commonly enough, so it is possible it would not get in even with a good 
quality patch. However, if it had a nice API, and the implementation reused 
the code of QuerySet select_related, I think it could have a chance to get 
in.

 - Anssi

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/django-users/-/18iQZDj801EJ.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.

Reply via email to