sturlamolden wrote:
> I don't agree that slicing is not the best way to approach this
> problem.
Indeed, the C++ approach can be written very succinctly using slicing:

  for i = 0 to n/2 - 1 do
    tmp[i] = dot a[2*i:] h
    tmp[i + n/2] = dot a[2*i + 1:] g

where a[i:] denotes the array starting at index "i".

> In both NumPy and Matlab, slicing (often) results in calls to
> BLAS/ATLAS, which are heavily optimized numerical routines. One can
> supply BLAS libraries that automatically exploits multiple CPUs, use
> FMA (fused multiply and add) whenever appropriate, and make sure cache
> is used in the most optimal way.

If the Python code were making only a few calls to BLAS, then I would
agree. However, in order to move as much computation as possible from the
interpreted language onto BLAS, you have had to call BLAS many times. So
the super-fast BLAS routines are now iterating over the arrays many times
instead of once, and the whole program is slower than a simple loop
written in C.

Indeed, this algorithm could be written concisely using pattern matching
over linked lists. I wonder if that would be as fast as using BLAS from
Python.

--
Dr Jon D Harrop, Flying Frog Consultancy
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists/index.html?usenet
--
http://mail.python.org/mailman/listinfo/python-list
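For concreteness, here is a minimal OCaml sketch of the per-level step in the
pseudocode above, written as the kind of single pass a simple C loop would
make rather than as a sequence of BLAS calls. It assumes periodic
(wrap-around) indexing at the end of the array; the names dot_at and
transform_level, and the filter arrays h and g, are illustrative and not
taken from the original thread.

  (* Dot product of filter f with the signal a starting at offset i,
     wrapping around the end of a (an assumed boundary treatment). *)
  let dot_at a f i =
    let n = Array.length a in
    let acc = ref 0.0 in
    for k = 0 to Array.length f - 1 do
      acc := !acc +. a.((i + k) mod n) *. f.(k)
    done;
    !acc

  (* One level of the transform: low-pass coefficients taken at even
     offsets fill the first half of tmp, high-pass coefficients taken at
     odd offsets fill the second half. *)
  let transform_level a h g =
    let n = Array.length a in
    let tmp = Array.make n 0.0 in
    for i = 0 to n / 2 - 1 do
      tmp.(i) <- dot_at a h (2 * i);
      tmp.(i + n / 2) <- dot_at a g (2 * i + 1)
    done;
    tmp

Each multiply-add is executed exactly once per output coefficient, so there
is none of the per-call dispatch overhead incurred by issuing n/2 small dot
products from the interpreter.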