On Tue, 23 Sep 2008 06:23:12 -0700 (PDT), sturlamolden
<[EMAIL PROTECTED]> wrote:

>I have recently been playing with a kd-tree for solving the "post
>office problem" in a 12-dimensional space. This is pure cpu bound
>number crunching, a task for which I suspected Python to be
>inefficient.

Well, Python is not a number-crunching language, however much we would
like it to be (would we? :-). No scripting language is.
Development time is shorter, I agree, but when you have, for example, a
problem that takes 8,31 minutes to run in optimized Fortran
code (like the one I had the other day), then that hardly matters.

>
>My prototype in Python 2.5 using NumPy required 0.41 seconds to
>construct the tree from 50,000 samples. Unfortunately, searching it
>felt a bit slow, finding the 11 nearest-neighbours of 1,000 points
>took 29.6 seconds (and there were still 49,000 to go). Naturally, I
>blamed this on Python. It would be 100 times faster if I used C++,
>right?


Not necessarily.
Before resorting to rewriting the program, try psyco. It sometimes
speeds things up.
Also (I'm not that familiar with Python yet, so I don't know how to
do it in Python), try finding the bottlenecks of your calculation. Is
most of the processing time spent in the loops, in disk
access, or ... ?
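
One way to look for such bottlenecks in Python, as a minimal sketch, is the
standard-library cProfile module (available since Python 2.5; the code below
is written for current Python). The function slow_distance_loop is a
hypothetical stand-in for the real search code, and profile_call is just an
illustrative helper name, not part of any library.

```python
import cProfile
import io
import pstats


def slow_distance_loop(n):
    """Hypothetical stand-in for the CPU-bound part of a search."""
    total = 0.0
    for i in range(n):
        for j in range(12):  # 12 dimensions, as in the quoted problem
            total += (i - j) ** 2
    return total


def profile_call(func, *args):
    """Run func(*args) under cProfile; return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and keep the ten most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


result, report = profile_call(slow_distance_loop, 20000)
print(report)
```

The report lists, per function, the number of calls and the time spent, so
it should show whether the loops or something else (such as I/O) dominate.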

>
>After having a working Python prototype, I resorted to rewrite the
>program in C++. The Python prototype took an hour to make, debug and
>verify. The same thing in C++ took me almost a day to complete, even
>with a working prototype as model. To my surprise, the resulting beast
>of C++ required 64.3 seconds to construct the same kd-tree. Searching
>the tree was not faster either, 1,000 points required 38.8 seconds. I
>wasted a day, only to find my Python prototype being the faster.



>
>We may conclude that I'm bad at programming C++, but I suspect that is
>not the case here. Albeit micro-benchmarks may indicate that Python is
>100-200 times slower than C++, they may not be applicable to the real
>world. Python can be very efficient. And when combined with libraries
>like NumPy, beating its performance with hand-crafted C++ is
>difficult. At least, my 10 years experience programming scientific
>software in various languages was not sufficient to beat my own Python
>prototype with C++.
>
>That is not to say I have never seen C++ run a lot faster than Python.
>But it tends to be very short pieces of CPU bound code, no more than a
>function or two. But as the problem grows in complexity, C++
>accumulates too much of its own bloat.
>

Well, personally, I try to combine Fortran (being a Fortran programmer
by trade) with Python (in the last few years), as I find Fortran to
be, by two grades, more comfortable for solving scientific problems
than C (or Python for that matter, although it has its merits),
starting from its capabilities for "normal" array handling, to
optimisation and easy readability, to whatnot.


Best regards
Bob
--
http://mail.python.org/mailman/listinfo/python-list
