On Thu, 06 Aug 2009 20:05:52 +0200, Thorsten Kampe wrote:

> > That is significant! So the winner is:
> >
> > unicode('äöüÄÖÜß','utf-8')
>
> Unless you are planning to write a loop that decodes "äöüÄÖÜß" one
> million times, these benchmarks are meaningless.

What if you're writing a loop which takes one million different lines of
text and decodes them once each?

>>> import timeit
>>> setup = 'L = ["abc"*(n%100) for n in xrange(1000000)]'
>>> t1 = timeit.Timer('for line in L: line.decode("utf-8")', setup)
>>> t2 = timeit.Timer('for line in L: unicode(line, "utf-8")', setup)
>>> t1.timeit(number=1)
5.6751680374145508
>>> t2.timeit(number=1)
2.6822888851165771

Seems like a pretty meaningful difference to me.

--
Steven
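
For anyone who wants to repeat the comparison on Python 3, here is a rough sketch of the same experiment. It is not the session from the post: it assumes that bytes.decode("utf-8") and str(line, "utf-8") are the Python 3 counterparts of the two Python 2 calls, and the helper names (build_lines, decode_with_method, decode_with_constructor, best_time) are made up for illustration. The timings will differ by interpreter and machine, so no numbers are claimed here.

```python
import timeit


def build_lines(count):
    # Same shape as the data in the post: "abc" repeated 0..99 times,
    # but as bytes, since Python 3 only decodes bytes objects.
    return [b"abc" * (n % 100) for n in range(count)]


def decode_with_method(lines):
    # Assumed Python 3 counterpart of line.decode("utf-8").
    for line in lines:
        line.decode("utf-8")


def decode_with_constructor(lines):
    # Assumed Python 3 counterpart of unicode(line, "utf-8").
    for line in lines:
        str(line, "utf-8")


def best_time(func, lines, repeat=3):
    # Run the whole loop once per measurement and keep the fastest of
    # `repeat` measurements, which damps noise from other processes.
    timer = timeit.Timer(lambda: func(lines))
    return min(timer.repeat(repeat=repeat, number=1))


if __name__ == "__main__":
    # The post used one million lines; a smaller list keeps this quick.
    lines = build_lines(100000)
    print("line.decode('utf-8'):", best_time(decode_with_method, lines))
    print("str(line, 'utf-8'):  ", best_time(decode_with_constructor, lines))
```

Taking the minimum over a few repeats is a common timeit habit; the post itself used a single run with number=1, so this is a slightly different measurement.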