* Michael Ströder (Thu, 06 Aug 2009 18:26:09 +0200)
> Thorsten Kampe wrote:
> > * Michael Ströder (Wed, 05 Aug 2009 16:43:09 +0200)
> > I don't think any measurable speed increase will be noticeable
> > between those two.
>
> Well, seems not to be true. Try yourself. I did (my console has UTF-8 as
> charset):
>
> Python 2.6 (r26:66714, Feb 3 2009, 20:52:03)
> [GCC 4.3.2 [gcc-4_3-branch revision 141291]] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import timeit
> >>> timeit.Timer("'äöüÄÖÜß'.decode('utf-8')").timeit(1000000)
> 7.2721178531646729
> >>> timeit.Timer("'äöüÄÖÜß'.decode('utf8')").timeit(1000000)
> 7.1302499771118164
> >>> timeit.Timer("unicode('äöüÄÖÜß','utf8')").timeit(1000000)
> 8.3726329803466797
> >>> timeit.Timer("unicode('äöüÄÖÜß','utf-8')").timeit(1000000)
> 1.8622009754180908
> >>> timeit.Timer("unicode('äöüÄÖÜß','utf8')").timeit(1000000)
> 8.651669979095459
> >>>
>
> Comparing again the two best combinations:
>
> >>> timeit.Timer("unicode('äöüÄÖÜß','utf-8')").timeit(10000000)
> 17.23644495010376
> >>> timeit.Timer("'äöüÄÖÜß'.decode('utf8')").timeit(10000000)
> 72.087096929550171
>
> That is significant! So the winner is:
>
> unicode('äöüÄÖÜß','utf-8')

Unless you are planning to write a loop that decodes "äöüÄÖÜß" one
million times, these benchmarks are meaningless.

Thorsten
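
To put the quoted numbers in per-call terms, here is a rough sketch that
divides the two totals from the 10,000,000-iteration run by the number of
calls. The totals are copied from the quoted session; the dictionary labels
and the helper name per_call_microseconds are made up for illustration, and
the result only describes that one machine and Python 2.6 build.

```python
# Totals (in seconds) copied from the quoted timeit session; each run
# executed the statement 10,000,000 times.
CALLS = 10000000
totals = {
    "unicode(s, 'utf-8')": 17.23644495010376,
    "s.decode('utf8')": 72.087096929550171,
}


def per_call_microseconds(total_seconds, calls):
    """Average cost of a single call, in microseconds (illustrative helper)."""
    return total_seconds / calls * 1e6


per_call = {name: per_call_microseconds(t, CALLS) for name, t in totals.items()}
for name, usec in per_call.items():
    print("%-22s %.2f usec per call" % (name, usec))

# Difference between the two variants for one call, and how many calls it
# takes before that difference adds up to one second of wall-clock time.
diff_usec = per_call["s.decode('utf8')"] - per_call["unicode(s, 'utf-8')"]
calls_per_wasted_second = 1e6 / diff_usec
print("difference: %.2f usec per call" % diff_usec)
print("calls needed to lose one second: %d" % calls_per_wasted_second)
```

Both variants cost a few microseconds per call, so the gap only adds up to
a second after well over a hundred thousand decodes. Whether that matters
depends on how often the code in question actually decodes.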