Hello --
I'm writing some Python code to do some analysis of my mail logs. I took a 10,000-line snippet from them (the full files usually run 5-6 million lines) to test my code with. I'm developing it on a PowerBook G4 1.2GHz with 1.25GB of RAM and the Apple-distributed Python*, and I tested my code on the 10,000-line snippet. It took 2 minutes and 10 seconds to process that snippet. Way too slow -- at that rate I'd be looking at about 20 hours to process a single daily log file.
Just for fun, I copied the same code and the same log snippet to a dual-proc P3 500MHz machine running Fedora Core 2* with 1GB of RAM and tested it there. This machine provides web services and domain control for my network, so it's moderately utilized. The same code took six seconds to execute.
Granted, I've got the GUI and all of that bogging down my Mac. However, I had nothing else fighting for CPU cycles and 700MB of RAM free when my testing was done. Even so, what would account for such a wide, wide, wide variation in the time required to process the data file? The code is 90% regular expressions and string finds.
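For what it's worth, the per-line loop is shaped roughly like this -- the pattern and the sample lines below are made-up stand-ins, not my real log format, but the structure (a cheap string find to skip lines, then a regex match) is the same:

```python
import re
import time

# Hypothetical pattern -- a stand-in for the real log format.
REJECT_RE = re.compile(r"reject: RCPT from (\S+)\[(\d+\.\d+\.\d+\.\d+)\]")

def process(lines):
    """Count reject lines; roughly the shape of my per-line loop."""
    hits = 0
    for line in lines:
        if "reject" in line:           # cheap string find first...
            if REJECT_RE.search(line): # ...then the regex
                hits += 1
    return hits

# Fabricated sample data, sized like my 10,000-line test snippet.
sample = [
    "May  7 10:31:40 mx postfix/smtpd[123]: reject: RCPT from bad.example.com[10.0.0.1]",
    "May  7 10:31:41 mx postfix/smtpd[123]: connect from ok.example.org[10.0.0.2]",
] * 5000

start = time.time()
count = process(sample)
print("%d hits in %.2f seconds" % (count, time.time() - start))
```

On the Linux box something this shape runs in seconds; on the Mac the equivalent takes minutes.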
Theories? Thanks!
-jag
* versions are: Python 2.3 (#1, Sep 13 2003, 00:49:11) [GCC 3.3 20030304 (Apple Computer, Inc. build 1495)] on darwin and Python 2.3.3 (#1, May 7 2004, 10:31:40) [GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)] on linux2
Joshua Ginsberg -- [EMAIL PROTECTED]
Brainstorm Internet Network Operations
970-247-1442 x131
-- http://mail.python.org/mailman/listinfo/python-list