In article <mailman.3977.1364605026.2939.python-l...@python.org>,
 Chris Angelico <ros...@gmail.com> wrote:

> On Sat, Mar 30, 2013 at 11:41 AM, Roy Smith <r...@panix.com> wrote:
> > In article <mailman.3971.1364595940.2939.python-l...@python.org>,
> >  Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
> >
> >> If using MySQLdb, there isn't all that much difference... MySQLdb is
> >> still compatible with MySQL v4 (and maybe even v3), and since those
> >> versions don't have "prepared statements", .executemany() essentially
> >> turns into something that creates a newline delimited "list" of
> >> "identical" (but for argument substitution) statements and submits that
> >> to MySQL.
> >
> > Shockingly, that does appear to be the case.  I had thought during my 
> > initial testing that I was seeing far greater throughput, but as I got 
> > more into the project and started doing some side-by-side comparisons, 
> > the differences went away.
> 
> How much are you doing per transaction? The two extremes (everything
> in one transaction, or each line in its own transaction) are probably
> the worst for performance. See what happens if you pepper the code
> with 'begin' and 'commit' statements (maybe every thousand or ten
> thousand rows) to see if performance improves.
> 
> ChrisA

We're doing it all in one transaction, on purpose.  We start with an 
initial dump, then get updates about once a day.  We want to make sure 
that the updates either complete without errors, or back out cleanly.  
If we ever had a partial daily update, the result would be a mess.
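In case it helps anyone following along, the flow we're using looks 
roughly like this (a minimal sketch, assuming MySQLdb and InnoDB tables; 
the "items" table, its columns, and the connection parameters are all 
made up for illustration):

    import MySQLdb

    def apply_daily_update(rows):
        """Apply one day's update atomically: commit everything or roll back."""
        # InnoDB is assumed; MyISAM tables ignore transactions entirely.
        db = MySQLdb.connect(host="localhost", user="app",
                             passwd="secret", db="feed")
        cursor = db.cursor()
        try:
            # rows is a sequence of (value, id) tuples
            cursor.executemany(
                "UPDATE items SET value = %s WHERE id = %s", rows)
            db.commit()    # all of the day's changes become visible at once
        except MySQLdb.Error:
            db.rollback()  # a partial daily update never touches the tables
            raise
        finally:
            db.close()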

Hmmm, on the other hand, I could probably try doing the initial dump the 
way you describe.  If it fails, we can just delete the whole thing and 
start again.
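Something like this is what I have in mind for the initial load (a 
rough sketch of the chunked-commit approach you describe; table and 
column names are again made up, and db is an already-open MySQLdb 
connection):

    def initial_load(db, rows, chunk_size=10000):
        """Load the initial dump, committing every chunk_size rows."""
        cursor = db.cursor()
        batch = []
        for row in rows:
            batch.append(row)
            if len(batch) >= chunk_size:
                cursor.executemany(
                    "INSERT INTO items (id, value) VALUES (%s, %s)", batch)
                db.commit()  # close out this transaction
                batch = []
        if batch:  # flush the final partial chunk
            cursor.executemany(
                "INSERT INTO items (id, value) VALUES (%s, %s)", batch)
            db.commit()

With the DB-API there's no explicit 'begin'; a new transaction starts 
implicitly after each commit.  And since atomicity doesn't matter for 
the initial load, a failure partway through just means truncating the 
table and starting over.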