pyt...@bdurham.com wrote:
> Thank you for your suggestion about looking at SQLite. I haven't
> compared the performance of SQLite to Python dictionaries, but I'm
> skeptical that SQLite would be faster than in-memory Python dictionaries
> for the type of analysis I'm doing.

I'd recommend at least trying a test just to see. For example, SQLite
uses indices, which will be faster than Python dicts for some set
operations. (And if you aren't careful, your various Python-based
optimizations will end up duplicating what SQLite does internally
anyway :-)

> Prior to my use of Python, my
> customer used a very expensive Oracle system to analyze their log files.
> My simple Python scripts are 4-20x faster than the Oracle PL/SQL they
> are replacing - and run on much cheaper hardware.

SQLite is not like Oracle or any similar database system. It does not
operate over a network or similar connection. It is a library in your
process that has an optimized disk storage format (a single file) and a
SQL parser that generates bytecode for a special-purpose virtual
machine, in pretty much the same way CPython operates.

The performance improvements you are seeing with Python over Oracle are
in exactly the same range people see with SQLite over Oracle. One
common usage reported on the SQLite mailing list is people copying data
out of Oracle and running their analysis in SQLite because of the
performance advantages.

> Note: Memory is currently not a concern for me so I don't need SQLite's
> ability to work with data sets larger than my physical memory.

The pragmas tune things like cache sizes. The SQLite default is 2MB,
relying on the operating system for caching beyond that. Bumping up
that kind of size was my suggestion :-)

Roger
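
A minimal sketch of the kind of test suggested above (it is not part of the original message). It uses Python's standard sqlite3 module with an in-memory database, raises the page cache with PRAGMA cache_size, and runs one set operation (hosts seen with two different status codes) both through an indexed SQLite query and through plain dicts of sets. The table layout, the column names, the helper function names, and the cache value of 100000 pages are all assumptions chosen for illustration; real log data and queries will behave differently, so treat any timings as a rough indication only.

```python
import sqlite3
import time


def build_db(rows, cache_pages=100_000):
    """Load (host, status) rows into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # A positive cache_size is a number of pages; 100000 is an arbitrary
    # illustrative value, not a recommendation.
    conn.execute(f"PRAGMA cache_size = {int(cache_pages)}")
    conn.execute("CREATE TABLE log (host TEXT, status INTEGER)")
    conn.executemany("INSERT INTO log VALUES (?, ?)", rows)
    # Index covering both columns used by the query below.
    conn.execute("CREATE INDEX log_status_host ON log (status, host)")
    conn.commit()
    return conn


def hosts_with_both_sql(conn, a, b):
    """Hosts that logged both status a and status b, computed by SQLite."""
    sql = ("SELECT host FROM log WHERE status = ? "
           "INTERSECT SELECT host FROM log WHERE status = ?")
    return {row[0] for row in conn.execute(sql, (a, b))}


def hosts_with_both_dict(rows, a, b):
    """The same question answered with a dict of sets."""
    by_status = {}
    for host, status in rows:
        by_status.setdefault(status, set()).add(host)
    return by_status.get(a, set()) & by_status.get(b, set())


if __name__ == "__main__":
    # Synthetic data standing in for parsed log records.
    rows = [(f"host{i % 5000}", (200, 404, 500)[i % 3])
            for i in range(100_000)]
    conn = build_db(rows)

    start = time.perf_counter()
    via_sql = hosts_with_both_sql(conn, 200, 404)
    sql_seconds = time.perf_counter() - start

    start = time.perf_counter()
    via_dict = hosts_with_both_dict(rows, 200, 404)
    dict_seconds = time.perf_counter() - start

    print("results agree:", via_sql == via_dict)
    print(f"sqlite: {sql_seconds:.4f}s  dict: {dict_seconds:.4f}s")
```

Note that the dict timing includes building the dict of sets while the SQLite timing excludes loading and indexing, so a fair comparison depends on how often the data is loaded versus queried.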