On Monday 03 June 2013, Patryk Zawadzki wrote:
> Here's the ticket:
> 
> https://code.djangoproject.com/ticket/6785
> 
> tl;dr: Calling .get() on a badly filtered queryset can result in lots
> of database rows being fetched and wrapped in Python objects for no
> gain.
> 

tl;dr: There's a general, valuable optimization to be made here, but it should 
be implemented at a lower level.

I have raised a related issue about a year and a half ago[0], thinking 
(mistakenly) that it was mostly an Oracle issue. Ian Kelly had pointed me in 
the right direction, but then life happened... The problem is that for all 
single-row queries except aggregates, Django uses the same strategy as used in 
get (and which you kept using in your patch): limit the query (usually) 
properly, then fetch until no more records are retrieved (fetching in done in 
chunks using fetchmany(), and not using fetchall()).

For single-row queries, this means two fetches; an unnecessary network 
roundtrip, for each such query, unless the backend or underlying driver take 
care to prevent it. On Oracle, at least, nobody does.

What we should do instead, in my opinion, is stop fetching as soon as a chunk 
arrives that isn't full. This alone should reduce the number of network 
accesses significantly on most Django projects.

While we're doing that, we could also define a new mode of sql execution (see 
SQLCompiler.results_iter(), in django.db.models.sql.compiler): Next to the 
current SINGLE (which uses fetchone()) and MULTI (the chunked loop described 
above) we could have SINGLE_VERIFY, which always fetches a single chunk of 
size 2. Make get() use that, and #6785 is fixed, while improving performance -- 
rather than hurting it on correct usage, like the proposed patch[1] does.

Assuming nobody raises objections, I intend to propose a patch along these 
lines sometime in the coming weeks (it's one of a few things on my to-do 
list), but I wouldn't mind at all if someone beat me to it.

Hope this helps,
        Shai.

[0] https://groups.google.com/d/msg/django-developers/SQ0Pltt_f0M/3ccQn0aauiEJ
[1] https://github.com/django/django/pull/1139

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


Reply via email to