Neil Hodgson, replying to self:
> The assembler (32-bit build) for each
> PyUnicode_READ looks like
I don't have 64-bit MSVC 2010 set up, but the code from 64-bit MSVC
2012 is better, since there are an extra 8 registers in 64-bit mode:
; 10431: c1 = PyUnicode_READ(kind1, data1, i);
        cmp     rsi, 1
        jne     SHORT $LN17@unicode_co
        lea     rax, QWORD PTR [r9+rcx]
        movzx   r8d, BYTE PTR [rax+rbx]
        jmp     SHORT $LN16@unicode_co
$LN17@unicode_co:
        cmp     rsi, 2
        jne     SHORT $LN15@unicode_co
        movzx   r8d, WORD PTR [r9+r11]
        jmp     SHORT $LN16@unicode_co
$LN15@unicode_co:
        mov     r8d, DWORD PTR [r9+r10]
$LN16@unicode_co:
All the variables used in the loop are now in registers, but the
tests and branches are the same as in the 32-bit build. This lines up
with 64-bit being better than 32-bit on Windows but not as good as
Python 3.2 or Unix.
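
For anyone reading the listing without the C source to hand, the
compare-and-branch pattern corresponds roughly to a per-character
dispatch on the string kind. This is only a minimal sketch, assuming
kind values 1 and 2 select 1-byte and 2-byte storage and anything else
falls through to 4-byte storage (as the cmp rsi, 1 / cmp rsi, 2 tests
suggest). read_char and the stdint types are illustrative names, not
the actual CPython macro or typedefs:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PyUnicode_READ(kind, data, i).
 * Assumption: kind is the character width in bytes (1, 2, or 4). */
static uint32_t read_char(int kind, const void *data, size_t i)
{
    if (kind == 1)                          /* cmp rsi, 1 / movzx BYTE PTR */
        return ((const uint8_t *)data)[i];
    if (kind == 2)                          /* cmp rsi, 2 / movzx WORD PTR */
        return ((const uint16_t *)data)[i];
    return ((const uint32_t *)data)[i];     /* mov DWORD PTR */
}
```

The kind does not change while the loop runs, yet the listing shows
the test being made again for every character read, which is why the
extra registers help but do not remove the branches.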
Neil