On May 18, 4:20 pm, David C. Ullrich <[EMAIL PROTECTED]> wrote:
> Are you going to be doing research _about_ the
> algorithms in question or is it going to be research
> _using_ these algorithms to draw conclusions
> about other things?
>
> Most of the replies seem to be assuming the latter.
> If it's the former then Python seems like definitely
> an excellent choice - when you have want to try
> something new it will be much faster trying it
> out in Python,

I second this. Hence my previous statement that "In scientific research, CPU time is cheap and time spent programming is expensive." If it was not clear what I meant, your post can serve as a clarification.

But whether Giner is 'developing' or 'using' algorithms, he should value his own labour more than the CPU's. CPU labour (i.e. computation) is very cheap. Manual labour (i.e. programming) is very expensive. He may in any case benefit from using Python. Today, the preferred computer languages among scientists are not Fortran 77 but various high-level languages like Matlab, S, IDL, Perl and Python.

A related question is: How much 'speed' is really needed? If Giner is analyzing datasets using conventional statistics (ANOVA, multiple regression, etc.), when will Python (with NumPy) cease to be sufficient? In my experience, conventional statistics on a dataset of 100,000 or 1,000,000 samples can be regarded as child's play on a modern desktop computer. One really needs HUGE amounts of data before it's worthwhile to use anything else. If one can save a couple of seconds of CPU time by spending several hours programming, then the effort is not just futile, it's downright wasteful and silly.

Something else that should be mentioned: The complexity of the algorithm (the big-O notation) is much more important for runtime performance than the choice of language. If you can replace an O(N*N) algorithm with an O(N log N), O(N) or O(1) one, it is always advisable to do so. An O(N*N) algorithm implemented in C is never preferable to an O(N) algorithm written in Python. The only time when C is preferred over Python is when N is large, but this is also when O(N*N) is most painful. Pay attention to the algorithm if things are running unbearably slowly.

Python has highly tuned datatypes like lists, dicts and sets, which a C programmer will have a hard time duplicating. This also applies to built-in algorithms like 'timsort'.
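
To make the big-O point concrete, here is a minimal sketch of my own (the function names are made up for illustration, and duplicate detection is just a convenient example problem). Both functions answer the same question; the first compares every pair, the second relies on Python's built-in set for lookups that are constant time on average. The timings it prints will vary from machine to machine, so treat them as an indication only.

```python
import timeit


def has_duplicates_quadratic(items):
    """O(N*N): compare every pair of elements."""
    n = len(items)
    for i in range(n):
        for j in range(i + 1, n):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_linear(items):
    """O(N) on average: one pass, using a set for the lookups."""
    seen = set()
    for x in items:
        if x in seen:
            return True
        seen.add(x)
    return False


if __name__ == "__main__":
    # No duplicates at all, so neither function can exit early.
    data = list(range(2000))
    for func in (has_duplicates_quadratic, has_duplicates_linear):
        seconds = timeit.timeit(lambda: func(data), number=3)
        print(f"{func.__name__}: {seconds:.4f} s for 3 runs")
```

Doubling the length of `data` should roughly quadruple the time of the quadratic version and roughly double the time of the linear one, and no amount of low-level tuning of the first version changes that.
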
qsort in the C standard library, or anything a C programmer can whip up within a reasonable amount of time, simply doesn't compare. C vs. Python benchmarks that don't take this into account will falsely put Python in a bad light.

-- 
http://mail.python.org/mailman/listinfo/python-list