In article <roy-a61512.20410329032...@news.panix.com>, Roy Smith <r...@panix.com> wrote:
> In article <mailman.3971.1364595940.2939.python-l...@python.org>,
>  Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
>
> > If using MySQLdb, there isn't all that much difference... MySQLdb is
> > still compatible with MySQL v4 (and maybe even v3), and since those
> > versions don't have "prepared statements", .executemany() essentially
> > turns into something that creates a newline delimited "list" of
> > "identical" (but for argument substitution) statements and submits
> > that to MySQL.
>
> Shockingly, that does appear to be the case. I had thought during my
> initial testing that I was seeing far greater throughput, but as I got
> more into the project and started doing some side-by-side comparisons,
> the differences went away.

OMG, this is amazing.

http://stackoverflow.com/questions/3945642/

It turns out that MySQLdb's executemany() runs a regex over your SQL
and picks one of two algorithms depending on whether it matches or not:

    restr = (r"\svalues\s*"
             r"(\(((?<!\\)'[^\)]*?\)[^\)]*(?<!\\)?'"
             r"|[^\(\)]|"
             r"(?:\([^\)]*\))"
             r")+\))")

Leaving aside the obvious line-noise aspects, the operative problem
here is that it only looks for "values" (in lower case). I've lost my
initial test script which convinced me that executemany() would be a
win; I'm assuming I used lower case for that. Our production code uses
"VALUES".

The slow way (i.e. with "VALUES"), I'm inserting 1000 rows about every
2.4 seconds. When I switch to "values", I'm getting more like 1000
rows in 100 ms!

A truly breathtaking bug.
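In case anyone wants to see the match behavior for themselves without
a database handy, here's a minimal standalone sketch (my own throwaway
demo, not MySQLdb's code; the table name and statements are made up):

    import re

    # The pattern quoted above, exactly as it appears in MySQLdb.
    restr = (r"\svalues\s*"
             r"(\(((?<!\\)'[^\)]*?\)[^\)]*(?<!\\)?'"
             r"|[^\(\)]|"
             r"(?:\([^\)]*\))"
             r")+\))")
    pattern = re.compile(restr)  # note: no re.IGNORECASE

    fast_sql = "INSERT INTO t (a, b) values (%s, %s)"
    slow_sql = "INSERT INTO t (a, b) VALUES (%s, %s)"

    # A match sends executemany() down the fast path: the formatted
    # value tuples get joined into a single multi-row INSERT.
    print(pattern.search(fast_sql))   # match object -> fast path

    # No match falls back to one execute() round trip per row, which
    # is where the ~24x difference (2.4 s vs. 100 ms) comes from.
    print(pattern.search(slow_sql))   # None -> slow path

Presumably the one-line fix is just to compile the pattern with
re.IGNORECASE.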