Robin Becker wrote:

> For fairly sensible reasons we changed the internal default to use unicode
> rather than bytes. After doing all that and making the tests compatible
> etc etc I have a version which runs in both and passes all its tests.
> However, for whatever reason the python 3.3 version runs slower

"For whatever reason" is right. Unfortunately, there's no real way to tell from the limited information you give what that reason might be.

Are you comparing against a 2.7 "wide" or "narrow" build? Do your tests use any so-called "astral characters" (characters in the supplementary planes, outside the Basic Multilingual Plane, i.e. characters with ord() > 0xFFFF)?

If I remember correctly, some early alpha(?) versions of Python 3.3 consistently ran Unicode operations a small but measurable amount slower than 3.2 or 2.7. That especially affected Windows. But I understand that this was sped up in the release version of 3.3.

There are some operations with Unicode strings in 3.3 which unavoidably are slower. If you happen to hit a combination of such operations (mostly to do with creating lots of new strings and then throwing them away without doing much work), your code may turn out to be a bit slower. But that's a pretty artificial set of code.

Generally, test code doesn't make a good benchmark. Tests only get run once, in arbitrary order, and the suite spends a lot of time setting up and tearing down test instances, so there are all sorts of confounding factors. This plays merry hell with modern hardware optimizations.

In addition, it's quite possible that you're seeing some other slowdown (the unittest module?) and misinterpreting it as related to string handling. But without seeing your entire code base and all the tests, who can say for sure?

> 2.7 Ran 223 tests in 66.578s
>
> 3.3 Ran 223 tests in 75.703s
>
> I know some of these tests are fairly variable, but even for simple things
> like paragraph parsing 3.3 seems to be slower. Since both use unicode
> internally it can't be that can it, or is python 2.7's unicode faster?

Faster in some circumstances, slower in others. If your application bottleneck is the availability of RAM for strings, 3.3 will potentially be faster, since its strings can need as little as 1/4 of the memory that a wide build uses.
If your application doesn't use much memory, or if it uses lots of strings which get created then thrown away, 3.3 may be no faster, and possibly a little slower.

> So far the superiority of 3.3 escapes me,

Yeah I know, I resisted migrating from 1.5 to 2.x for years. When I finally migrated to 2.3, at first I couldn't see any benefit either. New style classes? Super? Properties? Unified ints and longs? Big deal. Especially since I was still writing 1.5 compatible code and couldn't really take advantage of the new features.

When I eventually gave up on supporting versions pre-2.3, it was a load off my shoulders. Now I can't wait to stop supporting 2.4 and 2.5, which will make things even easier. And the day I can ignore everything below 3.3 will be a truly happy one.

> but I'm tasked with enjoying
> this process so I'm sure there must be some new 'feature' that will help.
> Perhaps 'yield from' or 'raise from None' or .......

No, you have this completely backwards. New features don't help you support old versions of Python that lack those new features. New features are an incentive to drop support for old versions.

> In any case I think we will be maintaining python 2.x code for at least
> another 5 years; the version gap is then a real hindrance.

Five years sounds about right.

--
Steven
--
https://mail.python.org/mailman/listinfo/python-list