Re: Mysql sleeping queries

Malcolm Tredinnick Thu, 21 Jun 2007 20:22:59 -0700

Hi Graham,

On Fri, 2007-06-22 at 02:44 +0000, Graham Dumpleton wrote:
> On Jun 22, 9:52 am, Malcolm Tredinnick <[EMAIL PROTECTED]>
> wrote:
> > On Thu, 2007-06-21 at 07:45 -0700, Ilya Semenov wrote:
> > > Malcolm,
> >
> > > I traced the problem and submitted the patch, see details at
> > >http://code.djangoproject.com/ticket/4650
> > > I'm not completely sure about the logic of signals though, the change
> > > may affect some middleware depending on it.
> >
> > Thanks, Ilya. I'd gotten that far, too. Unfortunately, though, it isn't
> > quite that easy. Well, it is that easy formod_python, however for WSGI
> > compliance, we can't do that (since the WSGI handler returns an
> > iterable).
> >
> > There are a few possibilities I've been bouncing around, but they all
> > have downsides: introducing greater coupling between templates, HTTP and
> > database so that we know when rendering stops (which will make it
> > possible to behave similarly to the modpython case) or making WSGI
> > return a single item iterable as recommended in PEP 333 for performance
> > (downside is that the greater part of the memory savings are lost).
> 
> Sorry, showing my ignorance here. Are you saying that Django can
> currently return to the WSGI adapter for the web server a Python
> string in some cases?


Not at all. That would be bad (as you note below). If the HttpResponse
class is created with a string, the __iter__ method returns that as a
single chunk (we iterate over the list [content], where "content" is the
string). This is set up in HttpResponse.__init__ so that the _container
attribute on the instance is always an iterable.

My comment above was referring to the fact that HttpResponse can also be
initialised with an iterator (usually a Template.iter_render() result
after [5482]) and will use that in the __iter__ method in that case. One
solution to the problems we are seeing would involve collecting all the
output from that iterator and then returning a new iterator that gave
back all the data in one go. The advantage of that approach is better
control over signalling when the request is finished and possibly
slightly better network performance due to larger chunks being available
to the WSGI server. The downside is larger memory usage, which was what
the original template rendering change was designed to alleviate.

> Looking at the submitted patch it almost seems that that is possible
> in some cases, otherwise I don't understand the need for the check:
> 
>       197                 if isinstance(self._response, basestring):
>       198                     return chain([self._response],
> IterConsumedSignal)

Brian might just be being careful. It's relatively harmless.

> I only ask as returning a string means that the WSGI adapter would
> treat it as an array of single character strings and by the WSGI
> specification the WSGI adapter is required to write each character out
> one at a time and perform a flush between each character. If doing
> this in Apache this would result in absolutely dreadful performance.

Yep. Understood. Fear not; we've haven't made a design decision to bring
the web server to its knees on every response. :-)

> As it stands, this patch would actually change the behaviour of a WSGI
> adapter in that the string would then be written out in one go and not
> one character at a time as it would if just the string were passed
> through. 

Technically, yes; unfortunately implemented adaptors will no longer
behave badly. However, it's going to be used with Django's HttpResponse,
which doesn't iterate over strings one character at a time in the output
stream.

> Another issue with this check is that it checks against
> 'basestring' whereas WSGI is only supposed to be given 'string'
> objects. The check would allow through 'unicode' string objects.
> 
> Also, I am presuming 'chain' comes from itertools but don't see that
> module being imported. Nor is it prefixed by the module name anyway.

Indeed. Both are problems. We'll fix those.

> BTW, why does one need to be using chain of iterables anyway. The WSGI
> specification requires that a WSGI adapter call close() on the
> iterable when it is done. Ie., quoting WSGI specification:

Yes. Brian (Harring -- the author of the second patch) also mentioned
this on the django-developers list shortly after he wrote the initial
patch. Sounds like the right thing to do.

[...]
> > For portability reasons, supporting WSGI is very important, so any
> > solution that only works with modpython is unfortunately not an option.
> 
> Why was the patch only suitable for mod_python?

(Note that I was talking about Ilya's original patch; not Brian's
subsequent one with WSGI support.) With mod_python, we write the full
response content out before we return from Django's handler (see line
169 in django/core/handlers/modpython.py), so we know we have finished
all the database accesses necessary for template rendering. Thus we can
move the "request_finished" signal to the end of that method. Ilya's
patch is correct, but notice that it didn't address WSGI at all. That
was all my comment was about: we can't just fix one half of the problem.

An identical small change doesn't work for WSGI. In the WSGI handler, we
pass the iterator upstream and the database accesses will be triggered
when the iterator is consumed by the WSGI server. The wrapping in the
second patch is so that we can tell when the rendering has been
finished. The close() method is the better approach, though, you're
right.

>  I admit I don't
> understand fully the underlying problem you are trying to solve,

For better or worse (I have opinions about this, not relevant to this
thread), we are using one database connection per request and closing it
upon completion. With the move to iterators and because template
rendering can require database accesses, we are currently needlessly
closing the database connection too early, which means a new one is
being opened during rendering and, for some reason, we are losing track
of it. Like i said in another post, I don't understand why the code
isn't automatically reusing it the next time that particular process or
thread is used and there is possibly some bullet-proofing in the backend
wrappers that is needed in addition to these changes, but I can't argue
with the evidence in this list thread.

Thanks for looking over this and asking the questions, anyway. It would
be nice if we were able to convince some onlookers that the approach
might work.

Regards,
Malcolm



--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en
-~----------~----~----~----~------~----~------~--~---

Re: Mysql sleeping queries

Reply via email to