Alan Gauld, 16.04.2010 10:29:
> "Stefan Behnel" wrote
>> import cython
>>
>> @cython.locals(result=cython.longlong, i=cython.longlong)
>> def add():
>>     result = 0
>>     for i in xrange(1000000000):
>>         result += i
>>     return result
>>
>> print add()
>>
>> This runs in less than half a second on my machine, including the
>> time to launch the CPython interpreter. I doubt that the JVM can
>> even start up in that time.
>
> I'm astonished at these results. What kind of C are you using? Even
> in assembler I'd expect the loop/sum to take at least 3s on a quad
> core 3GHz box.
>
> Or is cython doing the precalculation optimisations you mentioned?

Nothing surprising in the C code:

  __pyx_v_result = 0;
  for (__pyx_t_1 = 0; __pyx_t_1 < 1000000000; __pyx_t_1+=1) {
    __pyx_v_i = __pyx_t_1;
    __pyx_v_result += __pyx_v_i;
  }


> And if so when does it do them? Because surely, at some stage, it
> still has to crank the numbers?

Cython does a bit of constant folding (which helps with its internal
optimisation decisions), but apart from that the mantra is: just show
the C compiler explicit C code that it can understand well, and let it
do its job.
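
Here is a minimal standalone C sketch of what constant folding means;
the function and names are invented for illustration, not taken from
Cython's output. The constant arithmetic is evaluated at compile time,
so only a single multiplication survives into the binary:

  enum { SECONDS_PER_DAY = 60 * 60 * 24 };  /* folded to 86400 */

  long long seconds_in(long long days)
  {
      /* the only arithmetic left at run time */
      return days * SECONDS_PER_DAY;
  }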


> (We can of course do some fancy math to speed this particular sum up
> since the result for any power of ten has a common pattern, but I
> wouldn't expect the compiler optimiser to be that clever)

In this particular case, the C compiler actually stores the end result
in the binary module, so I assume that it simply applies the little
Gauß formula (the sum of 0..N-1 is N*(N-1)/2) as an optimisation, in
combination with loop variable aliasing.
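
To make that concrete, here is a standalone C sketch of the same loop
(my own illustration, not the generated module). With optimisation
enabled (e.g. gcc -O2), the compiler can apply final value replacement
and substitute the closed form, so both lines print
499999999500000000:

  #include <stdio.h>

  int main(void)
  {
      long long result = 0;
      long long i;
      for (i = 0; i < 1000000000LL; i++)
          result += i;
      printf("%lld\n", result);
      /* little Gauss: the sum of 0..N-1 is N*(N-1)/2 */
      printf("%lld\n", 1000000000LL * 999999999LL / 2);
      return 0;
  }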

Stefan
