I doubt that anyone would dispute that even as boosted by Numpy/Scipy, Python will almost certainly be notably slower than moderately well-written code in a compiled language. The reason Numpy exists, however, is not to deliver the best possible speed, but to deliver enough speed to make it possible to solve large numerical problems with the powerful and flexible Python language. As observed by Hans Latangen in _Python Scripting for Computational Science_, 2nd ed., Springer 2005, scientific computing is more than number crunching:
"Very often programming is about shuffling data in and out of different tools, converting one data format to another, extracting numerical data from a text, and administering numerical experiments involving a large number of data files and directories. Such tasks are much faster to accomplish in a language like Python than in Fortran, C, C++ or Java." He might well have mentioned the importance of developing nice-looking reports once the analysis is complete, and that development is simpler and more flexible in Python than a compiled language. In principle, I agree that heavy-duty number-crunching, at least if it has to be repeated again and again, should be accomplished by means of a compiled language. However, if someone has to solve many different problems just one time, or just a few times (for example if you are an engineering consultant), there is an excellent argument for using Python + Numpy. Unless it affects feasibility, I opine, computational speed is important primarily in context of regular production, e.g., computing the daily "value at risk" in a commodity trading portfolio, or making daily weather predictions. I am aware of the power and flexibility of the OCaml language, and particularly that an OCaml user can easily switch back and forth between interpreted and compiled implementation. I'm attacted to OCaml and, indeed, I'm in the process of reading Smith's (unfortunately not very well-written) _Practical OCaml_. However, I also understand that OCaml supports only double-precision implementation of real numbers; that its implementation of arrays is a little clunky compared to Fortran 95 or Numpy (and I suspect not as fast as Fortran's); and that the available libraries, while powerful, are by no means as comprehensive as those available for Python. For example, I am unaware of the existance of an HDF5 interface for OCaml. In summary, I think that OCaml is an excellent language, but I think that the question of whether to use it in preference to Python+Numpy for general-purpose numerical analysis must rest on much more than its computational speed. Jon Harrop wrote: > Filip Wasilewski wrote: > > Besides of that this code is irrelevant to the original one and your > > further conclusions may not be perfectly correct. Please learn first > > about the topic of your benchmark and different variants of wavelet > > transform, namely difference between lifting scheme and dwt, and try > > posting some relevant examples or use other topic for your benchmarks. > > Your lifting implementation is slightly longer and slower than the > non-lifting one but its characteristics are identical. For n=2^25 I get: > > 1.88s C++ (816 non-whitespace bytes) > 2.00s OCaml (741 b) > 2.33s F# (741 b) > 9.83s Your Python (1,002 b) > > The original python was 789 bytes. > > > Neglecting this issues, I wonder if it is equally easy to juggle > > arbitrary multidimensional arrays in a statically typed language and do > > operations on selected axis directly from the interactive interpreter > > like in the NumPy example from my other post - > > http://groups.google.com/group/comp.lang.python/msg/a71a5bf00a88ad57 ? > > I don't know much OCaml so it would be great if you could elaborate on > > this. > > It is probably just as easy. Instead of dynamic typing you have parametric > polymorphism. If you want to make your function generic over arithmetic > type then you can pass in the arithmetic operators. > > >> 0.56s C++ (direct arrays) > >> 0.61s F# (direct arrays) > >> 0.62s OCaml (direct arrays) > >> 1.38s OCaml (slices) > >> 2.38s Python (slices) > >> 10s Mathematica 5.1 > >> > >> Note that all implementations are safe (e.g. C++ uses a.at(i) instead of > >> a[i]). > > > > So why not use C++ instead of all others if the speed really matters? > > What is your point here? > > 1. Benchmarks should not just include two inappropriate languages. > 2. Speed aside, the other languages are more concise. > > > Could you please share full benchmark code, so we could make > > conclusions too? > > I'll paste the whole programs at the end... > > > I would like to get to know about your benchmark > > methodology. I wonder if you are allocating the input data really > > dynamically or whether it's size is a priori knowledge for the > > compiler. > > The knowledge is there for the compiler to use but I don't believe any of > them exploit it. > > >> In this specific context (discrete wavelet transform benchmark), I'd have > >> said that speed was the most important thing after correctness. > > > > Not necessarily, if you are doing research with different kinds of > > wavelets and need a general, flexible and easily adaptable tool or just > > the computation speed is not considered a bottleneck. > > Brevity is probably next. > > > Language speed is a great advantage, but if it always was the most > > important, people would talk directly to the CPUs or knit DSP chips in > > theirs backyards whenever they need to solve a calculation problem. > > Sure. > > > Please don't make this discussion heading towards `what is the fastest > > way of doing d4 transform and why OCaml rules` instead of `what is the > > optimal way of doing things under given circumstances and still have a > > free afternoon ;-)`. > > Comments like that seem to be based on the fundamental misconception that > writing C++ like this is hard. > > > Different tasks need different, well-balanced > > measures. Neither Python nor OCalm nor any other language is a panacea > > for every single problem. > > Absolutely. OCaml is as good as the next (compiled) language in this case. > Python and Matlab seem to be particularly bad at this task. > > Here's my complete OCaml: > > let a = sqrt 3. and b = 4. *. sqrt 2. > let h0, h1, h2, h3 = > (1. +. a) /. b, (3. +. a) /. b, (3. -. a) /. b, (1. -. a) /. b > let g0, g1, g2, g3 = h3, -.h2, h1, -.h0 > > let rec d4_aux tmp a n = > let n2 = n lsr 1 in > for i=0 to n2-2 do > tmp.(i) <- a.(i*2)*.h0+.a.(i*2+1)*.h1+.a.(i*2+2)*.h2+.a.(i*2+3)*.h3; > tmp.(i+n2) <- a.(i*2)*.g0+.a.(i*2+1)*.g1+.a.(i*2+2)*.g2+.a.(i*2+3)*.g3; > done; > tmp.(n2-1) <- a.(n-2)*.h0 +. a.(n-1)*.h1 +. a.(0)*.h2 +. a.(1)*.h3; > tmp.(2*n2-1) <- a.(n-2)*.g0 +. a.(n-1)*.g1 +. a.(0)*.g2 +. a.(1)*.g3; > Array.blit tmp 0 a 0 n; > if n > 4 then d4_aux tmp a (n lsr 1) > > let d4 a = > let tmp = Array.make (Array.length a) 0. in > d4_aux tmp a (Array.length a) > > let () = > print_endline "Generating test data..."; > let x = Array.init (1 lsl 25) (fun _ -> Random.float 1.) in > print_endline "Benchmarking..."; > let t1 = Sys.time() in > ignore(d4 x); > Printf.printf "Elapsed time is %.6f seconds\n" (Sys.time() -. t1) > > and C++: > > #include <iostream> > #include <vector> > #include <ctime> > #include <cmath> > > using namespace std; > > double a = sqrt(3), b = 4 * sqrt(2); > double h0 = (1 + a) / b, h1 = (3 + a) / b, h2 = (3 - a) / b, h3 = (1 - a) / > b; > double g0 = h3, g1 = -h2, g2 = h1, g3 = -h0; > > void d4(vector<double> &a) { > vector<double> tmp(a.size()); > for (int n = a.size(); n >= 4; n >>= 1) { > int half = n >> 1; > > for (int i=0; i<n-2; i+=2) { > tmp.at(i/2) = a.at(i)*h0 + a.at(i+1)*h1 + a.at(i+2)*h2 + a.at(i+3)*h3; > tmp.at(i/2+half) = a.at(i)*h3 - a.at(i+1)*h2 + a.at(i+2)*h1 - > a.at(i+3)*h0 > ; > } > > tmp.at(half-1) = a.at(n-2)*h0 + a.at(n-1)*h1 + a.at(0)*h2 + a.at(1)*h3; > tmp.at(2*half-1) = a.at(n-2)*h3 - a.at(n-1)*h2 + a.at(0)*h1 - > a.at(1)*h0; > > copy(tmp.begin(), tmp.begin() + n, a.begin()); > } > } > > int main() { > cout << "Generating data...\n"; > vector<double> a(1 << 25); > for (int i=0; i<a.size(); ++i) a.at(i) = rand(); > cout << "Benchmarking...\n"; > double t1 = clock(); > d4(a); > cout << "Elapsed time " << (clock() - t1) / CLOCKS_PER_SEC << "s\n"; > } > > -- > Dr Jon D Harrop, Flying Frog Consultancy > Objective CAML for Scientists > http://www.ffconsultancy.com/products/ocaml_for_scientists/index.html?usenet -- http://mail.python.org/mailman/listinfo/python-list