"Stefan Behnel" <stefan...@behnel.de> wrote in message news:mailman.470.1283712666.29448.python-l...@python.org...
> BartC, 05.09.2010 19:09:
>> All those compilers that offer loop unrolling are therefore wasting
>> their time...
>
> Sometimes they do, yes.

Modifying the OP's code a little:

a = 0
for i in xrange(100000000):      # 100 million
    a = a + 10                  # add 10 or 100
print a

Manually unrolling such a loop four times (i.e. four copies of the body, counting only to 25 million) increased the speed by between 16% and 47% (i.e. runtime reduced by between 14% and 32%).

This depended on whether I added +10 or +100 (i.e. whether long integers are needed), whether the loop was inside or outside a function, and whether I was running Python 2 or 3. (BTW, why doesn't Python 3 just accept 'xrange' as a synonym for 'range'?)
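
The four-way unrolled version looked roughly like this (just a sketch; the added constant was 10 or 100 depending on the variant):

a = 0
for i in xrange(25000000):      # 25 million iterations, 4 additions each
    a = a + 10
    a = a + 10
    a = a + 10
    a = a + 10
print a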

These are just some simple tests on my particular machine and implementations, but they bring up some points:

(1) Loop unrolling does seem to have a benefit when the loop body is small.

(2) Integer arithmetic seems to go straight from 32-bit ints to long integers; why not use 64-bit integers before falling back to longs?
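
For example, on a 32-bit build (where sys.maxint is 2**31 - 1) the switch to longs happens as soon as that limit is crossed:

import sys
print type(sys.maxint)        # <type 'int'>
print type(sys.maxint + 1)    # <type 'long'> - promoted straight to long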

(3) Since the loop variable is never used, why not have a special loop statement that repeats code a given number of times? This could be very fast, since the loop counter need not be a Python object, and there would probably be no need for unrolling at all:

repeat 100000000:        # for example
   a = a + 10
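
As an aside, something close to this can already be approximated with itertools.repeat, which at least avoids creating a new integer object for the loop counter on each iteration; a rough sketch:

import itertools

a = 0
for _ in itertools.repeat(None, 100000000):    # loop 100 million times
    a = a + 10
print a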

--
Bartc
