Neil Hodgson added the comment:
Windows is the only widely used OS that has a 16-bit wchar_t. I can't recall
what OS/2 did but Python doesn't support OS/2 any more.
Python tracker
<http://bugs.python.o
Neil Hodgson added the comment:
Including the wmemcmp patch did not improve the times on MSC v.1600 32 bit - if
anything, the performance was a little slower for the test I used:
a=['C:/Users/Neil/Documents/λ','C:/Users/Neil/Documents/η']156
specialised:
[0.9125948707773204
Neil Hodgson added the comment:
A quick rewrite showed the single-level case slightly faster (1%) on average
but it's less readable/maintainable. Perhaps taking a systematic approach to
naming would allow Py_UCS1 to be deduced from PyUnicode_1BYTE_KIND and so avoid
repeating the information in
Neil Hodgson added the comment:
The patch fixes the performance regression on Windows. The 1:1 case is better
than either 3.2.4 or 3.3.1 downloads from python.org. Other cases are close to
3.2.4, losing at most around 2%. Measurements from 32-bit builds:
## Download 3.2.4
3.2.4 (default, Apr
Neil Hodgson added the comment:
Looking at the assembler output from gcc 4.7 on Linux shows that it specialises
the loop 9 times - once for each pair of kinds. This is why there was far less
slow-down on Linux.
Explicitly writing out the 9 loops is inelegant and would make accurate
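
Where the nine comes from can be modelled in a few lines. This is only a sketch, assuming the three PEP 393 storage widths (1, 2 and 4 bytes per character); kind_of is a hypothetical helper written for illustration, not a CPython function.

```python
from itertools import product

# PEP 393 storage widths, in bytes per character.
KINDS = (1, 2, 4)

def kind_of(s):
    """Hypothetical helper: the width needed to store every character of s."""
    top = max(map(ord, s), default=0)
    if top < 0x100:
        return 1
    if top < 0x10000:
        return 2
    return 4

# One specialised loop per (left kind, right kind) pair.
pairs = list(product(KINDS, repeat=2))
print(len(pairs))
print(kind_of("abc"), kind_of("\u03bb"), kind_of("\U0001F600"))
```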
Neil Hodgson added the comment:
For 32-bit Windows, the code generated for unicode_compare is quite slow.
There are either 1 or 2 kind checks in each call to PyUnicode_READ and 2
calls to PyUnicode_READ inside the loop. A compiler may decide to move the kind
checks out of the loop and
Neil Hodgson added the comment:
For 32 bits, whether wchar_t is signed shouldn't matter as Unicode is only
21 bits, so no character will be seen as negative. On Windows, wchar_t is
unsigned.
C11 has char16_t and char32_t which are both unsigned but it doesn't include
comparison
Neil Hodgson added the comment:
The common cases are likely to be 1:1, 2:2, and 1:2. There is already a
specialisation for 1:1. wmemcmp is widely available but is based on wchar_t, so
it handles different widths on Windows and Unix. On Windows it would handle the 2:2
case
New submission from Neil Hodgson:
On Windows, non-equal comparisons (<, <=, >, >=) between strings with common
prefixes are slower in Python 3.3 than 3.2. This is for both 32-bit and 64-bit
builds. Performance on Linux has not decreased for the same code. The attached
p
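
A measurement of this kind might look like the following. It is a minimal sketch, not the attached program: the helper time_lt and the iteration count are assumptions, and the two strings simply share a long prefix and differ in the last character.

```python
import timeit

# Two strings with a common prefix that differ only in the final character.
prefix = "C:/Users/Neil/Documents/"
a = prefix + "\u03bb"  # λ
b = prefix + "\u03b7"  # η

def time_lt(x, y, number=100_000):
    """Hypothetical helper: total seconds for `number` evaluations of x < y."""
    return timeit.timeit(lambda: x < y, number=number)

elapsed = time_lt(a, b)
print(f"{elapsed:.4f} s for 100000 comparisons")
```

Running the same script under each interpreter build gives figures that can be compared directly.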
New submission from Neil Hodgson:
Unicode includes Line Separator U+2028 and Paragraph Separator U+2029
line ending characters. The readlines method of the file object returned
by the built-in open does not treat these characters as line ends
although the object returned by codecs.open
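
The difference can be reproduced with a short sketch. It assumes a current Python 3, uses io.StringIO as a stand-in for a file opened in text mode, and uses str.splitlines for comparison as a method that does honour the Unicode separators.

```python
import io

text = "first\u2028second\u2029third\n"

# The io text layer only recognises \n, \r and \r\n as line endings.
file_lines = io.StringIO(text).readlines()

# str.splitlines also treats U+2028 and U+2029 as line boundaries.
split_lines = text.splitlines()

print(len(file_lines), len(split_lines))
```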
Neil Hodgson added the comment:
The recommended addition includes the 'excluded license' section which
appears unnecessary as Python does not distribute any source code
redistributables, only the .DLL file which is a binary executable.
Including this is l