Thanks Tom, that's a great explanation!

Furbeenator

On Wed, Nov 2, 2011 at 10:46 AM, Tom Evans <tevans...@googlemail.com> wrote:
> On Wed, Nov 2, 2011 at 5:30 PM, Ian Clelland <clell...@gmail.com> wrote:
> > On Wed, Nov 2, 2011 at 8:25 AM, Thomas Guettler <h...@tbz-pariv.de> wrote:
> >>
> >> # This is my current solution
> >> if get_special_objects().filter(pk=obj.pk).count():
> >>     # yes, it is special
> >>
> >
> > I can't speak to the "why" of this situation; it seems to me that this could
> > always be converted into a more efficient database query without any
> > unexpected side-effects (and if I really wanted the side effects, I would
> > just write "if obj in list(qs)" instead). In this case, though, I would
> > usually write something like this:
> >
> > if get_special_objects().filter(pk=obj.pk).exists():
> >     # yes, it is special
> >
> > I believe that in some cases, the exists() query can be optimized to return
> > faster than a count() aggregation, and I think that the intent of the code
> > comes across more clearly.
> >
> > Ian
>
> OK, take this example. I have a Django model table with 70 million
> rows in it. Doing any kind of query on this table is slow, and
> typically the query is restricted by date - which MySQL will use as the
> optimum key, meaning any further filtering is a table scan on the
> filtered rows.
>
> Pulling a large query (say, all logins in a month, ~1 million rows)
> takes only a few seconds longer than counting the number of rows the
> query would find - after all, the database still has to do precisely
> the same amount of work, it just doesn't have to deliver the data.
>
> Say I have n entries that I want to test for membership in that result
> set, and I also want to iterate through the list, calculating some data
> and printing out each row. I can do the existence tests either in Python
> or in the database. If I do it in the database, I have n+1 expensive
> queries to perform.
> If I do it in Python, I have 1 expensive query to perform, and
> (worst case) n+1 full scans of the data retrieved (and I avoid
> locking the table for n+1 expensive queries).
>
> Depending on the size of the data set, as the developer I have the
> choice of which will be more appropriate for my needs. Sometimes I
> need "if qs.filter(pk=obj.pk).exists()", sometimes I need "if obj in
> qs".
>
> Cheers
>
> Tom
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django users" group.
> To post to this group, send email to django-users@googlegroups.com.
> To unsubscribe from this group, send email to
> django-users+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/django-users?hl=en.
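
For anyone reading this in the archive, here is a rough sketch of the trade-off Tom describes. It uses the standard-library sqlite3 module as a stand-in for the Django ORM so it runs on its own; the `login` table, its columns, and both function names are made up for illustration and are not from the thread. The first function issues one EXISTS query per candidate (roughly what `qs.filter(pk=obj.pk).exists()` does per object), the second fetches the filtered rows once and tests membership in memory (roughly what `obj in qs` does after the queryset has been evaluated).

```python
import sqlite3

# Hypothetical stand-in for the large table; all names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE login (id INTEGER PRIMARY KEY, day INTEGER)")
conn.executemany(
    "INSERT INTO login (id, day) VALUES (?, ?)",
    [(i, i % 30) for i in range(1, 1001)],
)

candidates = [5, 30, 45, 2000]  # primary keys we want to test for membership


def check_in_database(conn, candidates, day_limit):
    """One query per candidate: n round trips to the database."""
    found = []
    for pk in candidates:
        row = conn.execute(
            "SELECT EXISTS(SELECT 1 FROM login WHERE day < ? AND id = ?)",
            (day_limit, pk),
        ).fetchone()
        if row[0]:
            found.append(pk)
    return found


def check_in_python(conn, candidates, day_limit):
    """One query in total, then membership tests on the fetched rows."""
    rows = conn.execute(
        "SELECT id, day FROM login WHERE day < ?", (day_limit,)
    ).fetchall()
    # Collecting the keys into a set avoids rescanning the rows per candidate.
    pks = {row[0] for row in rows}
    return [pk for pk in candidates if pk in pks]


print(check_in_database(conn, candidates, 10))
print(check_in_python(conn, candidates, 10))
```

Both functions return the same answer; which one is cheaper depends on how large the filtered result is compared with the number of candidates, which is the judgment call Tom leaves to the developer. Building a set of keys is my own addition: it trades some memory for skipping the repeated full scans he mentions as the worst case.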