On 31/05/12 06:15, John Nagle wrote:
On 5/30/2012 6:57 PM, duncan smith wrote:
Hello,
I have been attempting to speed up some code by using an sqlite
database, but I'm not getting the performance gains I expected.

SQLite is a "lite" database. It's good for data that's read a
lot and not changed much. It's good for small data files. It's
so-so for large database loads. It's terrible for a heavy load of
simultaneous updates from multiple processes.


Once the table is created the data will not be changed at all. Corresponding integer codes will have to be generated for the columns. (I want to do this lazily because some columns might never be needed for output files, and processing all columns up front was relatively expensive in my initial solution.) After that it's a series of 'SELECT a, b, ... FROM table WHERE f="g" ORDER BY a, b, ...' style queries dumped to space-separated text files.
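Schematically, the dump step is something like the following (using the sqlite3 module; the table, column, value and file names are just placeholders for the real ones):

import csv
import sqlite3

conn = sqlite3.connect("data.db")
cur = conn.cursor()
# One query per output file; the sorting is done by SQLite via ORDER BY.
cur.execute('SELECT a, b FROM tbl WHERE f = ? ORDER BY a, b', ("g",))
with open("out.txt", "w", newline="") as out:
    writer = csv.writer(out, delimiter=" ")
    writer.writerows(cur)   # the cursor yields the result rows as tuples
conn.close()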

However, wrapping the inserts into a transaction with BEGIN
and COMMIT may help.


Unfortunately there's no discernible difference.
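(For reference, the batched version is essentially the pattern below; the table and rows are made up. The sqlite3 module begins a transaction implicitly before the first INSERT and keeps it open until commit(), so the whole batch lands in a single transaction.)

import sqlite3

rows = [("x", "y", 1), ("x", "z", 2)]   # stand-in for the real data

conn = sqlite3.connect("data.db")
cur = conn.cursor()
cur.execute("CREATE TABLE IF NOT EXISTS tbl (a, b, c)")
# All rows are inserted inside one implicit transaction;
# nothing is committed to disk until commit() is called.
cur.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)
conn.commit()
conn.close()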

If you have 67 columns in a table, you may be approaching the
problem incorrectly.


Quite possibly. The start and end points are fixed, though. The data are contained in text files, and I need to map the values to integer codes and generate output files for subsets of variables conditional on the levels of other variables. (I was doing the subsequent sorting separately, but if I'm using SQL I guess I might as well include that in the query.) The output files are inputs to other (C++) code that I have no control over.
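If it helps to see the lazy coding concretely, the idea is just one dict per column, built the first time that column is actually asked for (the names here are made up, only the general idea is real):

coders = {}   # column name -> {value: integer code}

def code_value(column, value):
    # Build the code dict for this column only on first use.
    codes = coders.setdefault(column, {})
    # A new value gets the next unused integer for that column.
    return codes.setdefault(value, len(codes))

So a column that never appears in an output file never gets coded at all.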

Any approach that doesn't consume large amounts of memory will do. Cheers.

Duncan

