On Mon, Jun 05, 2006 at 07:34:05PM +0100, Steve Holden wrote:
> amberite wrote:
> > [EMAIL PROTECTED] wrote:
> >
> >>I am using cx_Oracle and MySQLdb to pull a lot of data from some tables
> >>and I find that the cursor.execute method uses a lot of memory that
> >>never gets garbage collected. Using fetchmany instead of fetchall does
> >>not seem to make any difference, since it's the execute that uses the
> >>memory. Breaking the query down to build lots of small tables doesn't
> >>help, since execute doesn't give its memory back; after reading enough
> >>small tables, execute returns a memory error. What is the trick to get
> >>memory back from execute in cx_Oracle and MySQLdb?
> >
> > cx_Oracle and MySQLdb must be handled differently, because MySQL does
> > not actually have cursors (MySQLdb fakes them for you).
> >
> > To handle large result sets efficiently in cx_Oracle, simply use the
> > cursor iteration idiom:
> >
> >     for row in cursor:
> >         # do stuff with the row
> >
> > cx_Oracle takes care of the fetching for you, and your memory usage
> > should remain fairly constant when using this idiom.
> >
> > To handle large result sets in MySQLdb, you have to resort to multiple
> > queries:
> >
> >     # MySQLdb's paramstyle is 'format': use %s placeholders
> >     # even for integer parameters.
> >     limit = 1000
> >     offset = 0
> >
> >     cursor.execute('SELECT foo FROM Bar LIMIT %s OFFSET %s',
> >                    (limit, offset))
> >     rows = cursor.fetchall()
> >     while len(rows) > 0:
> >         # process the fetched rows
> >         offset += limit
> >         cursor.execute('SELECT foo FROM Bar LIMIT %s OFFSET %s',
> >                        (limit, offset))
> >         rows = cursor.fetchall()
> >
> >     cursor.close()
> >
> > As you can see, the MySQLdb version is more involved, due to the lack
> > of real cursor support in the MySQL database. Any database with good
> > cursor support will likely have good cursor iteration support in the
> > corresponding DB API driver.
> >
> > Hope this helps,
> >
> > L. Daniel Burr
>
> The MySQLdb solution you give is way more complicated than it needs to
> be, thereby skewing your opinion towards cx_Oracle unnecessarily.
>
> Look up the .fetchmany() method of cursors in the DB API. There is no
> need to execute more than a single query, no matter how large the
> result set: you simply keep calling .fetchmany(N) (where N is whatever
> you've decided by testing is your optimum chunk size) until it returns
> fewer than N rows, at which point you have exhausted the query.
>
> It's very little more effort to wrap this all up as a generator that
> effectively lets you use the same solution as you quote for cx_Oracle.
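A minimal sketch of the generator Steve describes might look like the
following (the name result_iter and the chunk size of 1000 are
illustrative, not from the thread; any DB API cursor that has already
had execute() called on it should work, and looping until fetchmany()
comes back empty is the simplest exhaustion test):

    def result_iter(cursor, chunk_size=1000):
        # Pull rows from an already-executed cursor in fixed-size
        # chunks, so the full result set is never held in client
        # memory at once.
        while True:
            rows = cursor.fetchmany(chunk_size)
            if not rows:
                break
            for row in rows:
                yield row

    cursor.execute('SELECT foo FROM Bar')
    for row in result_iter(cursor, chunk_size=1000):
        pass  # process each row here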
MySQL will keep table locks until the results are all fetched, so even though the DB API allows fetchone() or fetchmany(), using those with MySQLdb is dangerous.

-Jack
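The lock behavior Jack warns about bites when results are streamed from
the server rather than buffered. A minimal sketch of the streaming
variant, assuming MySQLdb's server-side cursor class
MySQLdb.cursors.SSCursor and placeholder connection parameters (the
default cursor class instead buffers the entire result client-side at
execute() time, which is the memory cost the original poster is
fighting):

    import MySQLdb
    import MySQLdb.cursors

    # Placeholder connection parameters for illustration only.
    conn = MySQLdb.connect(host='localhost', user='user', passwd='secret',
                           db='test',
                           cursorclass=MySQLdb.cursors.SSCursor)
    cursor = conn.cursor()
    cursor.execute('SELECT foo FROM Bar')
    try:
        # Rows stream from the server one at a time; per Jack's warning,
        # the server holds locks until the result set is fully drained,
        # so consume promptly and avoid slow work inside this loop.
        for row in cursor:
            pass  # process each row here
    finally:
        cursor.close()  # frees the server-side result set (and its locks)
    conn.close()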