amberite wrote:
> [EMAIL PROTECTED] wrote:
>
>> I am using cx_Oracle and MySQLdb to pull a lot of data from some tables
>> and I find that the cursor.execute method uses a lot of memory that
>> never gets garbage collected. Using fetchmany instead of fetchall does
>> not seem to make any difference, since it's the execute that uses
>> memory. Breaking the query down to build lots of small tables doesn't
>> help, since execute doesn't give its memory back; after reading enough
>> small tables execute returns a memory error. What is the trick to get
>> memory back from execute in cx_Oracle and MySQLdb?
>
> cx_Oracle and MySQLdb must be handled differently, due to the fact that
> MySQL does not actually have cursors (MySQLdb fakes them for you).
>
> To handle large resultsets efficiently in cx_Oracle simply use the
> cursor iteration idiom:
>
>     for row in cursor:
>         # do stuff with the row
>
> cx_Oracle takes care of the fetching for you, and your memory usage
> should remain fairly constant when using this idiom.
>
> To handle large resultsets in MySQLdb, you have to resort to multiple
> queries:
>
>     l = 1000
>     o = 0
>
>     cursor.execute('SELECT foo FROM Bar LIMIT %d OFFSET %d', (l, o))
>     rows = cursor.fetchall()
>     while len(rows) > 0:
>         # process the fetched rows
>         o += l
>         cursor.execute('SELECT foo FROM Bar LIMIT %d OFFSET %d', (l, o))
>         rows = cursor.fetchall()
>
>     cursor.close()
>
> As you can see, the MySQLdb version is more involved, due to the lack
> of real cursor support in the MySQL database. Any database with good
> cursor support will likely have good cursor iteration support in the
> corresponding DBAPI driver.
>
> Hope this helps,
>
> L. Daniel Burr

The MySQLdb solution you give is way more complicated than it needs to
be, thereby skewing your opinion towards cx_Oracle unnecessarily.
Look up the .fetchmany() method of cursors in the DB API. There is no
need to execute more than a single query, no matter how large the result
set: you simply keep calling .fetchmany(N) (where N is whatever you've
decided by testing is your optimum chunk size) until it returns fewer
than N rows, at which point you have exhausted the query.

It's very little extra effort to wrap this all up as a generator that
effectively allows you to use the same solution as you quote for
cx_Oracle.

regards
 Steve
--
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Love me, love my blog  http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden
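[Editorial note: the generator described above can be sketched roughly as
follows. The function name `result_iter` and the `arraysize` default are
illustrative choices, not from the thread; only `fetchmany` itself is
standard DB API.]

```python
def result_iter(cursor, arraysize=1000):
    """Yield rows of an already-executed query one at a time,
    fetching them from the driver in chunks of `arraysize` rows
    so memory use stays roughly constant.
    """
    while True:
        # fetchmany returns at most `arraysize` rows; an empty
        # list means the result set is exhausted.
        rows = cursor.fetchmany(arraysize)
        if not rows:
            break
        for row in rows:
            yield row

# Usage with any DB API driver, e.g. MySQLdb:
#   cursor.execute('SELECT foo FROM Bar')
#   for row in result_iter(cursor):
#       ...  # same iteration idiom as cx_Oracle gives you for free
```

This runs the query once and pages through the single result set, rather
than issuing repeated LIMIT/OFFSET queries as in the quoted example.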