Re: MySQL connection pooling - preferred method??

Anssi Kääriäinen Sat, 18 Feb 2012 02:07:04 -0800

On Feb 18, 1:01 am, Florian Apolloner <[email protected]> wrote:
> Yes, ABORT + DISCARD should do it for postgres (or ABORT; RESET ALL; SET
> SESSION AUTHORIZATION DEFAULT if pg < 8.2)


Inspired by this thread, I did some work for 3rd party database
connection pooling. What I have is at https://github.com/akaariai/django_pooled.
Quick summary: seems to work, except for Django's tests.

Now, there is a problem regarding connection state initialization.
Django doesn't separate creating a connection from initializing
its state. Both are done in ._cursor(). For pooling to work
reliably, the implementation of ._cursor() should be:

    def _cursor(self):
        if not self.connection:
            self.connection = self.new_connection()
            self.initialize_connection()
        return CursorWrapper(self.connection.cursor())

Now a pooling connection wrapper could just override new_connection()
in a subclass and everything should work. The connection returned from
the pool would still get properly initialized. This change would also
make sense for code clarity and for consistency between backends.
So, I think doing this refactoring would be a good idea.
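
To make the idea concrete, here is a minimal sketch of the split and of a
pooling subclass. It is not Django's actual backend code: BaseWrapper,
PooledWrapper, close() and the class-level _pool list are names I made up
for illustration, and sqlite3 stands in for a real backend so the example
runs on its own.

```python
import sqlite3


class BaseWrapper:
    """Hypothetical stand-in for a backend wrapper (not Django's API)."""

    def __init__(self, database=":memory:"):
        self.database = database
        self.connection = None

    def new_connection(self):
        # Only opens a raw DB-API connection; no session state is set here.
        return sqlite3.connect(self.database)

    def initialize_connection(self):
        # Per-session state; a real backend might consult settings here.
        self.connection.execute("PRAGMA foreign_keys = ON")

    def _cursor(self):
        if self.connection is None:
            self.connection = self.new_connection()
            self.initialize_connection()
        return self.connection.cursor()


class PooledWrapper(BaseWrapper):
    """Overrides only new_connection(); initialization still runs as usual."""

    # Idle raw connections shared by all wrappers in this process.
    # Assumption: a real pool would key this by connection parameters.
    _pool = []

    def new_connection(self):
        if self._pool:
            return self._pool.pop()
        return super().new_connection()

    def close(self):
        # Hand the raw connection back to the pool instead of closing it.
        if self.connection is not None:
            self.connection.rollback()
            self._pool.append(self.connection)
            self.connection = None
```

Because initialize_connection() runs every time a wrapper picks up a
connection, a connection taken from the pool gets the same session setup as
a freshly opened one.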

Note that the connection-state initialization problem doesn't really
matter in normal usage. However, in Django's test suite, where
connection initialization does different things depending on
overridden settings (settings.USE_TZ, for example), things will break.

The above-mentioned change is what I meant when I said that
Django should encourage extensibility: create nicely extensible
implementations. They need not be public API.

BTW, you should not run ABORT + DISCARD ALL as the connection reset
string in PostgreSQL from Python, for two reasons. First, ABORT is the
same as ROLLBACK, and issuing it as SQL means psycopg2 will lose track
of the transaction state. Second, DISCARD ALL will reset the connection
state and, because creating a connection is not separated from
initializing its state, the state will be incorrect from the second use
of a pooled connection onwards. Just do connection.rollback(). ABORT +
DISCARD ALL is still the right thing to do in external poolers
(pgpool2, pgbouncer etc).
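
As a rough illustration of the "just do connection.rollback()" advice, the
sketch below resets a connection through the DB-API method rather than by
sending reset SQL through a cursor. The helper name reset_for_reuse() is
hypothetical, and sqlite3 is used only so the example is runnable without a
PostgreSQL server; psycopg2 connections expose the same rollback() method.

```python
import sqlite3


def reset_for_reuse(connection):
    """Reset a DB-API connection before handing it back to an in-process pool.

    Going through connection.rollback() lets the driver keep its own
    transaction bookkeeping in sync, unlike executing "ABORT" as plain SQL.
    """
    connection.rollback()
    return connection


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.commit()

# Simulate a request that leaves uncommitted work behind.
conn.execute("INSERT INTO t VALUES (1)")

reset_for_reuse(conn)
remaining = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```

After the reset the uncommitted insert is gone, and the driver knows no
transaction is open.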

I think what I have should work for MySQL, too. I have tested it for
PostgreSQL and SQLite3, where things seem to work, except for the
above-mentioned state-init problem.

So, anybody interested in connection pooling should, in my opinion, work
on making Django's backends more extensible, and then on creating a 3rd
party connection pooler. What I have might be a good starting point,
or at least it might give some pointers on what to do.

Note that connection pooling in Python for speed reasons does not make
sense. You will get much better results from external pools, which can
view the application as a whole. An in-Django pool is limited to a
one-process-at-a-time view, which isn't good at all. However, there are
some other nice things you could do: report the most time-consuming or
most-used queries. Rewrite normal queries to prepared statements or
procedure calls. Track where you have left transactions open. Share
connections in auto-commit mode (this would actually make a _lot_ of
sense from a performance standpoint in read-only views). I did some of
those in another pooler experiment: https://github.com/akaariai/django-psycopg-pooled
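
For the query-reporting idea, a minimal sketch of a timing cursor wrapper
could look like this. It is not taken from either of my repositories:
TimingCursor, new_stats() and slowest_queries() are hypothetical names, and
sqlite3 is again just a stand-in for a real backend.

```python
import sqlite3
import time
from collections import defaultdict


def new_stats():
    # Maps SQL text -> [call count, total seconds].
    return defaultdict(lambda: [0, 0.0])


class TimingCursor:
    """Hypothetical wrapper that records cumulative time per SQL string."""

    def __init__(self, cursor, stats):
        self._cursor = cursor
        self._stats = stats

    def execute(self, sql, params=()):
        start = time.perf_counter()
        try:
            return self._cursor.execute(sql, params)
        finally:
            # Record the call even if the query raised.
            entry = self._stats[sql]
            entry[0] += 1
            entry[1] += time.perf_counter() - start

    def __getattr__(self, name):
        # Delegate fetchone(), fetchall(), etc. to the real cursor.
        return getattr(self._cursor, name)


def slowest_queries(stats, limit=5):
    """Return (sql, calls, total_seconds) tuples, slowest total first."""
    rows = [(sql, calls, total) for sql, (calls, total) in stats.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)[:limit]
```

A pooling wrapper could hand out TimingCursor objects from its _cursor()
and dump slowest_queries(stats) at the end of a request or on demand.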

 - Anssi

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.
