jmfauth <wxjmfa...@gmail.com> writes:

> Now replace i by a char, a representent of each "subset"
> of the FSR, select a method where this FST behave badly
> and take a look of what happen.

You insist on cherry-picking a single "method where this FST behave
badly", even when it is so obviously a corner case (IMHO, having
relatively big chunks of ASCII characters to which you add one single
non-ASCII char is hardly a common case...)

Anyway, these are my results on the opposite case, where you have a big
chunk of non-ASCII characters and a single ASCII char added:

Python 2.7.3 (default, Jan 2 2013, 13:56:14)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> timeit.repeat("'€' * 1000 + 'z'")
[0.2817099094390869, 0.2811391353607178, 0.2811310291290283]
>>> timeit.repeat("u'œ' * 1000 + u'\U00010001'")
[0.549591064453125, 0.5502040386199951, 0.5490291118621826]
>>> timeit.repeat("u'\U00010001' * 1000 + u'œ'")
[0.3823568820953369, 0.3823089599609375, 0.3820679187774658]
>>> timeit.repeat("u'\U00010002' * 1000 + 'a'")
[0.45046305656433105, 0.45000195503234863, 0.44980502128601074]

Python 3.3.0 (default, Mar 18 2013, 12:00:52)
[GCC 4.7.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> timeit.repeat("'€' * 1000 + 'z'")
[0.23264244200254325, 0.23299441300332546, 0.2325888039995334]
>>> timeit.repeat("'œ' * 1000 + '\U00010001'")
[0.3760241370036965, 0.37552819900156464, 0.3755163860041648]
>>> timeit.repeat("'\U00010001' * 1000 + 'œ'")
[0.28259182300098473, 0.2825558360054856, 0.2824251129932236]
>>> timeit.repeat("'\U00010002' * 1000 + 'a'")
[0.28227063300437294, 0.2815949220021139, 0.2829978369991295]

IIUC, while it may be true that Py3 is slightly slower than Py2 when the
string operation involves an internal representation change (all your
examples, and the second operation above), in the much more common case
it is considerably faster. This, and the fact that Py3 actually handles
the whole Unicode space without glitches, make it a better environment
in my eyes.

Kudos to the core team!

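For anyone who wants to repeat the comparison on a current interpreter,
here is a minimal sketch, assuming CPython 3.5 or later. The names
bytes_per_char, CASES and best_times are mine and purely illustrative.
It first estimates how many bytes the interpreter stores per character
for each kind of string, which shows when a concatenation has to switch
to a wider internal representation, and then times the same four
operations. The operands are passed in as variables rather than
literals because newer CPython versions may constant-fold an expression
such as '€' * 1000 + 'z' at compile time, which would leave nothing to
measure.

```python
import sys
import timeit


def bytes_per_char(ch, n=1000):
    """Estimate storage per character for a string made only of `ch`.

    Differencing two sizes cancels the fixed object header, leaving
    only the per-character cost. This relies on CPython's flexible
    string representation and is not guaranteed elsewhere.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n


# (fill, tail) pairs mirroring the four operations in the session above.
CASES = [
    ("\u20ac", "z"),            # '€' * 1000 + 'z'
    ("\u0153", "\U00010001"),   # 'œ' * 1000 + '\U00010001' (must widen)
    ("\U00010001", "\u0153"),   # '\U00010001' * 1000 + 'œ'
    ("\U00010002", "a"),        # '\U00010002' * 1000 + 'a'
]


def best_times(number=100000, repeat=3):
    """Return the best total time in seconds for each (fill, tail) pair."""
    results = {}
    for fill, tail in CASES:
        timings = timeit.repeat(
            "fill * 1000 + tail",
            globals={"fill": fill, "tail": tail},
            number=number,
            repeat=repeat,
        )
        results[(fill, tail)] = min(timings)
    return results


if __name__ == "__main__":
    for ch in ("a", "\u0153", "\u20ac", "\U00010001"):
        print("U+%04X: %d byte(s) per character" % (ord(ch), bytes_per_char(ch)))
    for (fill, tail), seconds in best_times().items():
        print("%.4f s  U+%04X * 1000 + U+%04X" % (seconds, ord(fill), ord(tail)))
```

Note that the default number here is ten times smaller than the
timeit default used in the session above, so the absolute figures are
not directly comparable; only the ratios between the four cases are.
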
Just my 0.45-0.28 cents,
ciao, lele.
--
nickname: Lele Gaifax | When I live off what I thought yesterday,
real: Emanuele Gaifas | I will start to fear those who copy me.
l...@metapensiero.it  |                 -- Fortunato Depero, 1929.