Ben Lee wrote:
> hi folks --
>
> a quick python and sqlite3 performance question. i find that
> inserting a million rows of in-memory data into an in-memory database
> via a single executemany() is about 30% slower than using the sqlite3
> CLI and the .import command (reading the same data from a disk file,
> even.) i find this surprising, executemany() i assume is using a
> prepared statement and this is exactly what the .import command does
> (based on my quick perusal of the source.)
>
> is this discrepancy to be expected? where is the overhead coming
> from? [...]

Ok, I'll bite.

Well, first, a 30% slowdown with a Python DB-API wrapper compared to the
database's native command-line tool is still quite reasonable, in my
opinion.

About a year ago I compared the performance of pysqlite with the other
SQLite wrapper, APSW. At the time, APSW was a bit faster: not excessively,
but measurably. In meaningless benchmarks like yours ;-)

So I changed pysqlite here and there to get the same performance as APSW.
Mostly minor tweaks, nothing spectacular, plus a few hardcore tricks, like
special-casing *not-subclassed* classes. The result was pysqlite 2.3.5:
http://oss.itsystementwicklung.de/trac/pysqlite/wiki/2.3.5_Changelog

These optimizations are not yet in Python 2.5.x, but they can be found in
the sqlite3 module of the Python 2.6/3.0 betas.

Well, here are my results of your benchmark :-)

-- Gerhard

# with Python 2.5.2's sqlite3 module
[EMAIL PROTECTED]:~/tmp$ python t.py
generating data...
done!
testing 1000000 inserts...
[42.795290946960449, 44.337385892868042, 46.35642409324646]

# with pysqlite 2.5.0, which I released earlier today
[EMAIL PROTECTED]:~/tmp$ python t.py
generating data...
done!
testing 1000000 inserts...
[33.027599096298218, 32.73675012588501, 32.823790073394775]

# SQLite commandline
[EMAIL PROTECTED]:~/tmp$ time sqlite3 -init sqlcmds ':memory:' '.quit'
real 0m32.514s
[EMAIL PROTECTED]:~/tmp$ time sqlite3 -init sqlcmds ':memory:' '.quit'
real 0m32.576s
[EMAIL PROTECTED]:~/tmp$ time sqlite3 -init sqlcmds ':memory:' '.quit'
real 0m32.604s
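
For anyone who wants to repeat this kind of measurement, here is a minimal sketch of an executemany() timing run against an in-memory database. It is not the original t.py, which is not shown in the thread: the function name time_bulk_insert, the table t, its two-column schema, and the generated row contents are all assumptions made for illustration.

```python
import sqlite3
import time


def time_bulk_insert(n_rows, repeats=3):
    """Time one executemany() of n_rows into a fresh in-memory database.

    Returns a list with one wall-clock duration (in seconds) per repeat.
    """
    # Hypothetical test data; the data used in the thread is not shown.
    rows = [(i, "row %d" % i) for i in range(n_rows)]

    timings = []
    for _ in range(repeats):
        conn = sqlite3.connect(":memory:")
        # Assumed schema, chosen only to have something to insert into.
        conn.execute("CREATE TABLE t (id INTEGER, label TEXT)")

        start = time.perf_counter()
        # Single parameterized statement, executed once per row tuple.
        conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
        conn.commit()
        timings.append(time.perf_counter() - start)

        # Sanity check that every row actually arrived.
        count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
        if count != n_rows:
            raise RuntimeError("expected %d rows, found %d" % (n_rows, count))
        conn.close()
    return timings
```

Calling time_bulk_insert(1000000) would match the row count discussed above. The data generation is deliberately kept outside the timed region, so only the insert and commit are measured; absolute numbers will of course differ by machine and Python version.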