wxjmfa...@gmail.com wrote:
> By chance and luckily, first attempt.
> c:\python32\python -m timeit "('€'*100+'€'*100).replace('€'
> , 'œ')"
> 1000000 loops, best of 3: 1.48 usec per loop
> c:\python33\python -m timeit "('€'*100+'€'*100).replace('€'
> , 'œ')"
> 100000 loops, best of 3: 7.62 usec per loop

OK, that is roughly a factor of 5. Let's see what I get:

$ python3.2 -m timeit '("€"*100+"€"*100).replace("€", "œ")'
100000 loops, best of 3: 1.8 usec per loop
$ python3.3 -m timeit '("€"*100+"€"*100).replace("€", "œ")'
10000 loops, best of 3: 9.11 usec per loop

That is a factor of 5, too. So I can replicate your measurement on an
AMD64 Linux system with a self-built 3.3 versus the system 3.2.

> Note
> The used characters are not members of the latin-1 coding
> scheme (btw an *unusable* coding).
> They are however characters in cp1252 and mac-roman.

You seem to imply that the slowdown is connected to the inability of
latin-1 to encode "œ" and "€" (to take the examples relevant to the
above microbenchmark). So let's repeat with latin-1 characters:

$ python3.2 -m timeit '("ä"*100+"ä"*100).replace("ä", "ß")'
100000 loops, best of 3: 1.76 usec per loop
$ python3.3 -m timeit '("ä"*100+"ä"*100).replace("ä", "ß")'
10000 loops, best of 3: 10.3 usec per loop

Hm, the slowdown is even a tad bigger. So we can safely dismiss your
theory that an unfortunate choice of the 8-bit encoding is causing it.
Do you agree?

-- 
http://mail.python.org/mailman/listinfo/python-list
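
For anyone who wants to repeat both measurements from a single script rather than from the shell, here is a minimal sketch using the standard timeit module. The helper name bench and its repeat/number defaults are my own assumptions, not something from the runs above. It only times the interpreter that executes it, so it has to be run once under 3.2 and once under 3.3 to get a comparison.

```python
import timeit


def bench(old, new, repeat=3, number=10000):
    """Best per-loop time in microseconds for the replace() microbenchmark.

    `bench` is a hypothetical helper; `repeat` and `number` are arbitrary
    defaults, not the values the timeit command line would pick on its own.
    """
    # Build the same expression as in the shell commands,
    # e.g. ('€'*100+'€'*100).replace('€', 'œ')
    stmt = "({0!r}*100+{0!r}*100).replace({0!r}, {1!r})".format(old, new)
    best = min(timeit.repeat(stmt, repeat=repeat, number=number))
    return best / number * 1e6


if __name__ == "__main__":
    cases = [
        ("not in latin-1", "€", "œ"),
        ("in latin-1", "ä", "ß"),
    ]
    for label, old, new in cases:
        print("%-15s %.2f usec per loop" % (label, bench(old, new)))
```

If the two rows come out close to each other under the same interpreter, that would be consistent with the observation above that the choice of characters is not what drives the difference between the two versions.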