abhinav wrote: > I want to strike a balance between development speed and crawler speed.
"The best performance improvement is the transition from the nonworking state to the working state." - John Ousterhout

Try to get there as soon as possible. You can figure out what that means. ;^)

When you do all your programming in Python, most of the code that is relevant for speed *is* written in C already. If performance is slow, measure! Use the profiler to see whether you are spending a lot of time in Python code. If that is your problem, take a close look at your algorithms and perhaps your data structures, and see what you can improve in Python. In the long run, going from e.g. O(n^2) to O(n log n) might mean much more than going from Python to C. A poor algorithm in machine code still sucks when you have to handle enough data. And changing your code to improve its algorithms and structure is a lot easier in Python than in C.

If you've done all these things, still have performance problems, and have identified a bottleneck in your Python code, it might be time to get that piece rewritten in C. The easiest and least intrusive way to do that might be with Pyrex. You might also want to try Psyco before you do this. Even if you end up writing the whole program in C, it's not unlikely that you will get to your goal faster if your first version is written in Python.

Good luck!

P.S. Why someone would want to write yet another web crawler is a puzzle to me. Surely there are plenty of good ideas that haven't been properly implemented yet! It's probably very difficult to beat Google on their home turf now, but I'd really like to see a good tool to manage all the information I got from the net, or through mail, or wrote myself. I don't think anyone has written that yet--although I'm sure they are trying.
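The measure-first advice above can be sketched with the standard-library profiler. This is just a minimal illustration: `slow_pairs` is a made-up O(n^2) hotspot, not anything from the original thread, and the sort key is one of cProfile's built-in options.

```python
# Sketch: profile before rewriting anything in C. cProfile and pstats
# are in the standard library; slow_pairs is a hypothetical hotspot.
import cProfile
import io
import pstats

def slow_pairs(items):
    # O(n^2): compares every pair of elements -- the kind of function
    # a profile run will surface as the dominant cost.
    count = 0
    for a in items:
        for b in items:
            if a < b:
                count += 1
    return count

profiler = cProfile.Profile()
profiler.enable()
slow_pairs(list(range(500)))
profiler.disable()

# Collect the report in a string instead of dumping it to stdout.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)  # show the top 5 entries by cumulative time
print(stream.getvalue())
```

If the report shows the time going into your own Python function rather than library code, that is the cue to rethink the algorithm first, and only then consider Pyrex or C.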