Hi all, I think I've found a strange case where QuerySet.count() does not match len(queryset), and the behaviour is certainly unexpected. But I'm not sure whether this is a bug, some mistake on my part, or a known limitation of combining .extra(), .distinct() and .count(). I am aware of the default ordering interfering with .distinct(), and already add .order_by() to get rid of it.
I have a model called RadioSignal, with an integer field "rssi". I'm interested in finding out how many distinct values for "rssi / 10" there are (-10, -20, -30, etc). Here is a commented "./manage.py shell" record: >>> RadioSignal.objects.count() 523 >>> RadioSignal.objects.order_by().values('rssi').distinct().count() 49 >>> connection.queries[-1]['sql'] 'SELECT COUNT(DISTINCT "maps_radiosignal"."rssi") FROM "maps_radiosignal"' Looks okay so far. But I'm interested in each distinct tens of RSSI values, not every single value. I can compute these with .extra(): >>> len(RadioSignal.objects.order_by().extra({'tens': 'rssi / 10'}).values('tens').distinct()) 6 >>> RadioSignal.objects.order_by().extra({'tens': 'rssi / 10'}).values('tens').distinct() [{'tens': -8}, {'tens': -4}, {'tens': -5}, {'tens': -9}, {'tens': -6}, {'tens': -7}] >>> connection.queries[-1]['sql'] 'SELECT DISTINCT (rssi / 10) AS "tens" FROM "maps_radiosignal" LIMIT 21' Also looks good so far. But running len() on a queryset is unnecessary if I only need the count. >>> RadioSignal.objects.order_by().extra({'tens': 'rssi / 10'}).values('tens').distinct().count() 523 >>> connection.queries[-1]['sql'] 'SELECT COUNT(DISTINCT "maps_radiosignal"."id") FROM "maps_radiosignal"' Uhoh. Somehow .count() keeps the .distinct() part, but replaces the .extra() and .values() parts with counting primary keys? I tried it with values('tens'), values_list('tens'), and values_list('tens', flat=True), as well no change. So far I've tested Django 1.6 with Python 2.7 and PostgreSQL 9.3, and Django 1.7 with Python 3.4 and PostgreSQL 9.1, all seem to behave the same. I can work around this, since the possible resulting querysets are pretty small, and evaluating them with len() isn't too slow. But I'm wondering if it's a bug in Django, or something else I've missed? Mattias -- You received this message because you are subscribed to the Google Groups "Django users" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-users+unsubscr...@googlegroups.com. To post to this group, send email to django-users@googlegroups.com. Visit this group at http://groups.google.com/group/django-users. To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/21085de5-a899-437c-b13a-8584104df65f%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.