In article <roy-a61512.20410329032...@news.panix.com>, Roy Smith <r...@panix.com> wrote:
> In article <mailman.3971.1364595940.2939.python-l...@python.org>,
>  Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
>
> > If using MySQLdb, there isn't all that much difference... MySQLdb is
> > still compatible with MySQL v4 (and maybe even v3), and since those
> > versions don't have "prepared statements", .executemany() essentially
> > turns into something that creates a newline delimited "list" of
> > "identical" (but for argument substitution) statements and submits
> > that to MySQL.
>
> Shockingly, that does appear to be the case. I had thought during my
> initial testing that I was seeing far greater throughput, but as I got
> more into the project and started doing some side-by-side comparisons,
> the differences went away.

OMG, this is amazing.

http://stackoverflow.com/questions/3945642/

It turns out that MySQLdb's executemany() runs a regex over your SQL
and picks one of two algorithms depending on whether it matches or not:

    restr = (r"\svalues\s*"
             r"(\(((?<!\\)'[^\)]*?\)[^\)]*(?<!\\)?'"
             r"|[^\(\)]|"
             r"(?:\([^\)]*\))"
             r")+\))")

Leaving aside the obvious line-noise aspects, the operative problem
here is that it only looks for "values" (in lower case). I've lost my
initial test script which convinced me that executemany() would be a
win; I'm assuming I used lower case for that. Our production code uses
"VALUES".

The slow way (i.e. with "VALUES"), I'm inserting 1000 rows about every
2.4 seconds. When I switch to "values", I'm getting more like 1000
rows in 100 ms!

A truly breathtaking bug.
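In case anyone wants to see the match behavior for themselves without
a database handy, here's a minimal standalone sketch (my own throwaway
demo, not MySQLdb's code; the table name and statements are made up):

    import re

    # The pattern quoted above, exactly as it appears in MySQLdb.
    restr = (r"\svalues\s*"
             r"(\(((?<!\\)'[^\)]*?\)[^\)]*(?<!\\)?'"
             r"|[^\(\)]|"
             r"(?:\([^\)]*\))"
             r")+\))")
    pattern = re.compile(restr)  # note: no re.IGNORECASE

    fast_sql = "INSERT INTO t (a, b) values (%s, %s)"
    slow_sql = "INSERT INTO t (a, b) VALUES (%s, %s)"

    # A match sends executemany() down the fast path: the formatted
    # value tuples get joined into a single multi-row INSERT.
    print(pattern.search(fast_sql))   # match object -> fast path

    # No match falls back to one execute() round trip per row, which
    # is where the ~24x difference (2.4 s vs. 100 ms) comes from.
    print(pattern.search(slow_sql))   # None -> slow path

Presumably the one-line fix is just to compile the pattern with
re.IGNORECASE.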