In article <mailman.3977.1364605026.2939.python-l...@python.org>, Chris Angelico <ros...@gmail.com> wrote:
> On Sat, Mar 30, 2013 at 11:41 AM, Roy Smith <r...@panix.com> wrote:
> > In article <mailman.3971.1364595940.2939.python-l...@python.org>,
> > Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
> >
> >> If using MySQLdb, there isn't all that much difference... MySQLdb is
> >> still compatible with MySQL v4 (and maybe even v3), and since those
> >> versions don't have "prepared statements", .executemany() essentially
> >> turns into something that creates a newline delimited "list" of
> >> "identical" (but for argument substitution) statements and submits that
> >> to MySQL.
> >
> > Shockingly, that does appear to be the case. I had thought during my
> > initial testing that I was seeing far greater throughput, but as I got
> > more into the project and started doing some side-by-side comparisons,
> > the differences went away.
>
> How much are you doing per transaction? The two extremes (everything
> in one transaction, or each line in its own transaction) are probably
> the worst for performance. See what happens if you pepper the code
> with 'begin' and 'commit' statements (maybe every thousand or ten
> thousand rows) to see if performance improves.
>
> ChrisA

We're doing it all in one transaction, on purpose. We start with an
initial dump, then get updates about once a day. We want to make sure
that the updates either complete without errors, or back out cleanly.
If we ever had a partial daily update, the result would be a mess.

Hmmm, on the other hand, I could probably try doing the initial dump
the way you describe. If it fails, we can just delete the whole thing
and start again.
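
For what it's worth, here's a rough sketch of the batched-commit idea.
The connection parameters, table, and column names are all made up for
illustration; the only real points are that MySQLdb leaves autocommit
off by default, so each commit() ends one transaction and the next
execute() implicitly starts another:

import MySQLdb

# Hypothetical connection details and schema, purely for illustration.
conn = MySQLdb.connect(host="localhost", user="me", passwd="secret",
                       db="mydb")
cursor = conn.cursor()

BATCH_SIZE = 10000  # commit every ten thousand rows, per ChrisA

def load_rows(rows):
    # rows is an iterable of (a, b) tuples.
    for i, row in enumerate(rows, 1):
        cursor.execute("INSERT INTO mytable (a, b) VALUES (%s, %s)", row)
        if i % BATCH_SIZE == 0:
            conn.commit()  # end this batch's transaction
    conn.commit()  # flush the final partial batch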
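
And for contrast, here's roughly what our all-or-nothing daily update
looks like, reusing the conn and cursor from the sketch above. If
anything fails mid-update, the rollback backs the whole thing out
cleanly instead of leaving a partial update behind:

def apply_daily_update(rows):
    try:
        cursor.executemany("INSERT INTO mytable (a, b) VALUES (%s, %s)",
                           rows)
        conn.commit()  # all rows went in; make the update permanent
    except MySQLdb.Error:
        conn.rollback()  # a partial daily update would be a mess
        raise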